Friend or Foe? was Re: Paradox--was Re: Active shields blah blah

From: Michael M. Butler
Date: Sat Jan 13 2001 - 02:35:56 MST

We have some extant examples of persons of whom it can be said they are
smart and trustworthy. These persons might not be the same for all 6+
billion persons now living--I'd be quite surprised if they were.

Trust in other persons is sometimes (mis)placed for unexamined reasons.
An AI (I prefer lately to speak of APs, to emphasize that they ought to
be persons, and treated as such, if at all possible) which is 10 times
smarter than Marilyn vos Savant but that shares the conflicted, occluded
worldview of a modern US Presidential candidate would not be an entity
I'd pay money to have around--to put it mildly.

Even so, a Superintelligence that doesn't share cultural values with
(for instance) some Ayatollahs might have to be very smart and
diplomatic and compassionate and clever. Perhaps it could be.

Would it still be Friendly if it didn't believe that everyone should be
a Muslim and subject to Muslim law? Would you think so if you were an
Ayatollah? Would you be right?

Another big Q.: can we (or "can Eliezer") build an AI (I prefer to call
it an AP) that is capable of self-improvement that can't/won't edit out
Friendliness as some sort of "dead code optimization"?

I've been calling such an uneditable attribute an "L-Box" in my private
writings. Humans have 'em. "L" can stand for Loyalty if you want. My
current thinking is that modeling the human limbic system is important
in order for any AP to understand humans. There is some risk of creating
an emotionally unbalanced AP. The same is true of people today. We need
to do a better job of raising APs than we do of raising many kids if
those APs are going to be trustworthy. If they are able to go SI, it's
even more important.

"Eliezer S. Yudkowsky" wrote:
> John Marlow wrote:
> >
> > **True enough--but their ancestors did. And you feel
> > no obligation to refrain from killing them--much less
> > to look after them--because their ancestors wrote your
> > source code.
>
> That's *right*. An AI, even a Friendly AI, feels no obligation because
> *we* wrote the source code. Not unless someone puts it there, and while
> there will someday be AIs that are good drinking companions and fit
> participants in the human drama, and these AIs may wax sentimental about
> their creators, the Friendly AIs come *first* - the Sysop, or the
> Guardians.
>
> An AI can be Friendly because there's nothing there except what you put
> there, what you share with the AI. The task is nontrivial because you
> don't always know what it is you're putting there, but that blank slate,
> that vast silent space, is what makes the task possible.
>
> I don't want to sound like it's a question of coercion. The paradigm of
> Friendly AI is to create unity between the will of the Friendly AI and the
> decisions of an idealized human altruist. It is not a question of
> control. You have to identify with the Friendly AI you build, because a
> human thinks about controlling different humans, wheedling or coercing the
> Other, but the only time we *build* a mind is when we build ourselves.
>
> The Friendly AI I want to build is the same being I would make myself into
> if that were the only way to get a Sysop (a Guardian AI, if you prefer). A
> Sysop might not turn out to be a real person, and I'd back myself up first
> if I could - but a Sysop is a valid thing for a mind to be, a valid state
> for a mind to occupy, a valid part of the world, not a slave or someone
> we've tricked into servitude.
>
> You keep trying to understand AI in human terms. Everything that you use
> to model other minds is specialized on understanding humans, and an AI
> isn't human. A Friendly AI isn't Friendly to us because *we* built it; it
> would be just as Friendly if the identical source code materialized from
> thin air, and will go on to be just as Friendly to aliens if a
> pre-Singularity civilization is ever found orbiting another star. That
> lack of sentiment doesn't make it dangerous. My benevolence towards other
> sentient beings isn't conditional on their having created me; why would I
> want to build a conditionally Friendly AI?
> -- -- -- -- --
> Eliezer S. Yudkowsky
> Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2b30 : Mon May 28 2001 - 09:56:18 MDT