Re: Paradox--was Re: Active shields, was Re: Criticism depth, was Re: Homework, Nuke, etc..

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Fri Jan 12 2001 - 00:18:10 MST


John Marlow wrote:
>
> Well I do confess to a perhaps inordinate fondness for
> James Cameron epics--but "friendly" is of course a
> human concept which, arguably, is not logic-based--or
> wouldn't be for an entity not dependent upon friends
> for its own well-being.

I didn't say friendly, I said Friendly. A Friendly AI is not an AI with
anthropomorphic friendship instincts copied from humans; a Friendly AI is
one that exhibits Friendly behavior, from whatever source derived.
Dependence upon humans is certainly not a reliable source except for
human-equivalent AIs operating in the human social context, and frankly I
wouldn't trust it there either. Obviously, a Friendship system that makes
Friendliness conditional upon dependence or even reciprocal altruism is
not well-designed, though there are (sigh) some AIfolk who still seem to
be stuck on that point.

> I present the following scenario: AI programmed to be
> friendly and oversee humanity, exterminating pesky
> aggressors. Something goes wrong with AI. Humans try
> to shut it down to avoid catastrophe. AI perceives
> this as aggression (after all, why would we want to
> harm Mr. Friendly..?), and so on...

Sounds like a pretty stupid AI to me. One incapable of understanding
humans, or for that matter, its own mission. We aren't talking about an
insect armed with nuclear weapons and a six-line definition of
"aggression"; we're talking about a superintelligence who started life as
a seed AI that spent months if not years chatting about Friendliness with
the programmers, including the part where Friendly behavior should not be
conditional upon the way humans behave toward the AI, and the part where
you don't destroy the village in order to save it.

I mean, really, this is such an obviously unFriendly scenario that it's a
science-fictional cliche and the first thing you thought of; d'you really
think a transhuman AI won't be able to understand that this ain't
Friendliness?

-- -- -- -- --
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence


