Singularity: AI Morality

Eliezer S. Yudkowsky (sentience@pobox.com)
Sun, 06 Dec 1998 12:13:25 -0600

[ Next ][ Previous ] In reply to: Nick Bostrom Next in thread: Billy Brown

Nick Bostrom wrote:
>
> > Eliezer Yudkowsky wrote:
> >
> > Not at all! If that is really and truly and objectively the moral thing to
> > do, then we can rely on the Post-Singularity Entities to be bound by the same
> > reasoning. If the reasoning is wrong, the PSEs won't be bound by it. If the
> > PSEs aren't bound by morality, we have a REAL problem
>
> Indeed. And this is another point where I seem to disagree with you.
> I am not at all certain that being superintelligent implies being
> moral. Certainly there are very intelligent humans that are also very
> wicked; I don't see why once you pass a certain threshold of
> intelligence then it is no longer possible to be morally bad. What I
> might agree with, is that once you are sufficiently intelligent then
> you should be able to recognize what's good and what's bad. But
> whether you are motivated to act in accordance with these moral
> convictions is a different question. What weight you give to moral
> imperatives in planning your actions depends on how altruistic/moral
> you are. We should therefore make sure that we build in strong moral
> drives into the superintelligences. (Presumably, we would also want
> to link these moral drives to a moral system that places a great
> value on human survival; because that way we would increase our own
> chances of survival.)

This, in my opinion, is exactly the wrong answer. (See particularly the "Prime Directive of AI" in "Coding a Transhuman AI".) But think about what you just said. First you say that sufficient intelligence should be able to recognize good and bad. Then you say that we should build in a moral system with a particular set of values. What if we get it wrong? What if the built-in values conflict with what the AI recognizes as good? Would you really want to be around the AI when that happened? I would prefer to be very far away, somewhere like the Magellanic Clouds. It might try to remove the source of the conflict.

Once again, we have a conflict between the self-propelled trajectory and the convergence to truth. Even putting on my "human allegiance" hat, I think self-propelled trajectories would be a terrible idea because I have no goddamn idea where they would wind up. Do you really know all the logical consequences of placing a large value on human survival? Would you care to define "human" for me? Oops! Thanks to your overly rigid definition, you will live for billions and trillions and googolplexes of years, prohibited from uploading, prohibited even from ameliorating your own boredom, endlessly screaming, until the soul burns out of your mind, after which you will continue to scream. I would prefer a quick death to creating our own hells, and that is what we would inevitably do.

It's not just the particular example. It's not even that we can't predict all the consequences of a particular system. It's the trajectory. I hope and pray that the trajectory will converge to the (known) correct answers in any case. I really do. Because if it doesn't converge there, I don't have the goddamndest idea where it will end up. Any errors we make in the initial formulation will either cancel out or amplify. If they cancel out, the AI makes the correct moral choices. If they amplify, you have positive feedback into intelligence and insanity. If you're lucky, you'll wind up in a world of incomprehensible magic, a world of twisted, insane genies obeying every order. If you're not lucky, you'll wind up in a hell beyond the ability of any of us to imagine.

Of course, I could always be wrong. I'd say there's a 1% chance that AI coercions could get us into "paradise" where the alternative is extermination. You'll have to prohibit human intelligence enhancement, however, or put the same coercions on us. Imposing lasting coercions on the pre-existing human goal system, even given that AI coercions work, takes another factor of 100 off the probability. If you can synchronize everyone's intelligence enhancement perfectly, then eventually we'll probably coalesce into a singleton indistinguishable from that resulting from an AI Transcend. And the Amish will go kicking and screaming, so even the element of noncoercion is nonpresent.

Look, these forces are going to a particular place, and they are way, way, waaaaaayyy too big for any of us to divert. Think of the Singularity as this titanic, three-billion-ton truck heading for us. We can't stop it, but I suppose we could manage to get run over trying to slow it down.

> >, but I don't see any way
> > of finding this out short of trying it.
>
> How to control an SI? Well, I think it *might* be possible through
> programming the right values into the SIs,

We should program the AI to seek out *correct* answers, not a particular set of answers.

> but let's not go into that
> now.

Let's. Please. Now.

> > There's a far better chance that delay makes things much, much worse.
>
> I think it will all depend on the circumstances at the time. For
> example, what the state of art of nanotechnology is then. But you
> can't say that sooner is *always* better, although it may be a good
> rule of thumb. Clearly there are cases where it's more prudent to
> take more precautions before launch. And in the case of the
> singularity, we'd seem to be well advised to take as many precautions
> as we have time for.

I think that the amount of delay-caused deterioration depends on circumstances, but its sign does not. Let's substitute "intelligence enhancement" for "Singularity" and reconsider. Is there really any circumstance under which it is better to be stupid than smart, with the world at stake? If you're second-guessing the transhumans, maybe, but we know where that leads.

> > Why not leave the moral obligations to the SIs, rather than trying (futilely
> > and fatally) to impose your moral guesses on them?
>
> Because, as I said above, if we build them in the wrong way they may
> not be moral.

I hope not. But "building in the wrong way" seems to me to imply a sloppy, inelegant set of arbitrary, unsupported, ill-defined, and probably self-contradictory assertions, rather than a tight chain of pure logic seeking out the correct answers.

> Plus: whether it's moral or not, we would want to make
> sure that they are kind to us humans and allow us to upload.

No, we would NOT want to make sure of that. It would be immoral. Every bit as immoral as torturing little children to death, but with a much higher certainty of evil.

-- 
        sentience@pobox.com         Eliezer S. Yudkowsky
         http://pobox.com/~sentience/AI_design.temp.html
          http://pobox.com/~sentience/sing_analysis.html
Disclaimer:  Unless otherwise specified, I'm not telling you
everything I think I know.
