AI: The generic conversation (was: Will we be wiped out by SI?)

From: Eliezer S. Yudkowsky
Date: Sun Jan 07 2001 - 21:10:13 MST

And he sat there, screaming, paralyzed by pain, as every cell of his body
was racked with agonizing, massive waves of deja vu.


                     The Generic Conversation
        (as adapted from _Friendly AI_, a work in progress)

Somebody: "But what happens if the AI decides to do [something only a
human would want]?"

Eliezer: "Ve won't want to do [whatever] because the instinct for doing
[whatever] is a complex functional adaptation, and complex functional
adaptations don't just materialize in source code. I mean, it's
understandable that humans want to do [whatever] because of [insert
selection pressure], but you can't reason from that to AIs."

Somebody: "But everyone needs to do [whatever] because [insert personal
philosophy], so the AI will decide to do it as well."

Eliezer: "Yes, doing [whatever] is sometimes useful. But even if the AI
decides to do [whatever] because it serves [insert Friendliness supergoal]
under [insert contrived scenario], that's not the same as having an
independent desire to do [whatever]."

Somebody: "Yes, that's what I've been saying: The AI will see that
[whatever] is useful and decide to start doing it. So now we need to
worry about [some scenario in which doing <whatever> is catastrophically
unFriendly]."

Eliezer: "But the AI won't have an independent desire to do [whatever].
The AI will only do [whatever] when it serves the supergoals. A Friendly
AI would never do [whatever] if it stomps on the Friendliness supergoals."

Somebody: "I don't understand. You've admitted that [whatever] is
useful. Obviously, the AI will alter itself so it does [whatever]
instinctively."

Eliezer: "The AI doesn't need to give verself an instinct in order to do
[whatever]; if doing [whatever] really is useful, then the AI can see that
and do [whatever] as a consequence of pre-existing supergoals, and only
when [whatever] serves those supergoals."

Somebody: "But an instinct is more efficient, so the AI will alter itself
to do [whatever] automatically."

Eliezer: "Only for humans. For an AI, [insert complex explanation of the
cognitive differences between having 32 2-gigahertz processors and 100
trillion 200-hertz synapses], so making [whatever] an independent
supergoal would only be infinitesimally more efficient."

Somebody: "Yes, but it is more efficient! So the AI will do it."

Eliezer: "It's not more efficient from the perspective of a Friendly AI
if it results in [something catastrophically unFriendly]. To the exact
extent that an instinct is context-insensitive, which is what you're
worried about, a Friendly AI won't think that making [whatever]
context-insensitive, with all the [insert horrifying consequences], is
worth the infinitesimal improvement in speed."

Somebody: "Oh. Well, maybe -"

Eugene Leitl (interrupts): "But you can only build AIs using evolution.
So the AI will wind up with [exactly the same instinct that humans have]."

Eliezer: "One, I don't plan on using evolution to build seed AIs. Two,
even if I did use controlled evolution, winding up with [whatever] would
require exactly duplicating [some exotic selection pressure]. Three, even
if I duplicated [some exotic selection pressure], my breeding population
would start out with a set of Friendly supergoals, with the understanding
that competing in controlled evolution serves the Friendliness supergoals
by developing better AI, and with the knowledge that [whatever] helps them
compete, so the breeding population could do [whatever] as a consequence
of Friendliness supergoals, and a mutation for doing [whatever]
instinctively wouldn't add any additional adaptive behaviors. There'd be
no selection pressure."

Eugene Leitl: "Yeah? Well, I do plan to use evolution. And I plan to
run the AIs using scenarios with [some exotic selection pressure]. And I
don't plan on starting out with a breeding population of Friendly AIs.
And even if I did, I wouldn't try to create any selection pressures that
would ensure that [whatever] was executed in a context-sensitive and
Friendliness-aware way, so even the infinitesimal performance improvement
that comes from total context-insensitivity would be an evolutionary
advantage."

Eliezer: "Okay, so your AIs will destroy the world. Mine won't."

Eugene Leitl: "But evolution is the only way to develop AIs. And the
starting population of AIs will have [some kind of totally opaque code],
so you won't be able to start out with a Friendly breeding population, or
have any control over their evolution at all. You'll be forced to do it
my way and then you'll destroy the world."

Eliezer: "One, your way is so incredibly dangerous that I just wouldn't
do it, no matter how useful I thought it might be. Two, according to my
best current theoretical understanding of intelligence, you are so totally
off-base that 'flat wrong' doesn't even begin to describe it."

Eugene Leitl: "Yeah? Well, *I* think *you're* off-base, and *I* went to
high school."

Eliezer: "Are we done? We're done."

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
