Re: Otter vs. Yudkowsky

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Mon Mar 13 2000 - 17:45:18 MST


Otter and I appear to agree on the following:

1) AI motivations are a coinflip: They might be nice, they might wipe
us out. This is the consequence of several assertions on which we both
agree, such as (2) you can't coerce a sufficiently advanced mind, even
if you get to write the source code; (3) it's not known whether
sufficiently advanced minds acquire motivations of their own, or whether
they simply operate on momentum.

The points in dispute are these:

1) From a strictly selfish perspective, does the likely utility of
attempting to upload yourself outweigh the utility of designing a Sysop
Mind? Sub-disputes include (2) whether it's practically possible to
develop perfect uploading before China initiates a nanowar or Eliezer
runs a seed AI; (3) whether the fact that humans can be trusted no more
than AIs will force your group to adopt a Sysop Mind approach in any
case; (4) whether telling others that the Transtopians are going to
upload and then erase the rest of humanity will generate opposition
making it impossible for you to gain access to the technologies
prerequisite to uploading.

I think that enough of the disputed points are dependent upon concrete
facts to establish an unambiguous rational answer in favor of seed AI.

"D.den Otter" wrote:
>
> Well, it's certainly better than nothing, but the fact remains that
> the Sysop mind could, at any time and for any reason, decide

If it doesn't happen in the first few hours, you're safe forever.

> that it has better things to do than babysitting the OtterMind,
> and terminate/adapt the latter. Being completely at someone's or
> something's mercy is never a good idea.

And here we come to the true crux of the problem. You don't want to be
at someone else's mercy. You don't want to entrust your fate to the
hidden variables. You want to choose a course of action that puts you
in the driver's seat, even if it kills you. You're prejudiced in favor
of plans that include what look like forceful actions against those
yucky possibilities, even if the actions are ineffective and have awful
side effects. This is the same intuition that underlies Welfare, the
bombing of Kosovo, and the War on Drugs.

Screw personal independence and all such slogans; the fundamental
principle of Transhumanism is *rationality*. If maintaining personal
control is dumb, then you shouldn't do it.

> Who monitors the Sysop?

I've considered the utility of including a "programmer override", but my
current belief is that the social anxiety generated by planning to
include such an override has a negative utility that exceeds the danger
of not having an override. We'll just have to get it right the first
time (meaning not flawlessness but flaw tolerance, of course).

> Self-defense excluded, I hope. Otherwise the OtterMind would
> be a sitting duck.

No, the Sysop Mind would defend you.

> Let's look at it this way: what if the government proposed a
> system like this, i.e. everyone gets a chip implant that will
> monitor his/her behaviour, and correct it if necessary so that
> people no longer can (intentionally) harm each other. How
> would the public react? How would the members of this list
> react? Wild guess: most wouldn't be too happy about it
> (to use a titanic understatement). Blatant infringement of
> fundamental rights and all that. Well, right they are. Now,
> what would make this system all of a sudden "acceptable"
> in a SI future? Does an increase in intelligence justify
> this kind of coercion?

What makes the system unacceptable, if implemented by humans, is that
the humans have evolved to be corruptible and have an incredibly bad
track record at that sort of thing. All the antigovernmental heuristics
of transhumanism have evolved from the simple fact that, historically,
government doesn't work. However, an omniscient AI is no more likely to
become corrupt than a robot is likely to start lusting after human women.

> And something else: you believe that a SI can do with
> us as it pleases because of its massively superior
> intelligence. Superior intelligence = superior morality,
> correct?

No. I believe that, for some level of intelligence above X - where X is
known to be higher than the level attained by modern humans in modern
civilization - it becomes possible to see the objectively correct moral
decisions. It has nothing to do with who, or what, the SIs are. Their
"right" is not a matter of social dominance due to superior
formidability, but a form of reasoning that both you and I would
inevitably agree with if we were only smart enough.

That human moral reasoning is observer-dependent follows from the
historical fact that the dominant unit of evolutionary selection was the
individual. There is no reason to expect similar effects to arise in a
system that can be programmed to conceptualize itself as a design component
as easily as an agent or an individual, and that more likely would simply
not have any moral "self" at all. I mean, something resembling an
"I" will probably evolve whether we design it or not, but that doesn't
imply that the "I" gets tangled up in the goal system. Why would it?

> but that aside. Point is, by coercing
> the "ex-human" SI (OtterMind in this case) by means
> of morally rigid Sysop, you'd implicitly assume that you,
> a mere neurohack human, already know "what's right".
> You'd apparently just "know" that harming others goes
> against Objective Morality.

The suggestions I make are just that, suggestions. The design function
of the suggestions is to provide a default and maximally happy scenario
for the human species in the event that the Mind fails to discover
different motivations; in which case my scenario, by definition, is not
Objectively inferior to any other scenario.

> There is an obvious compromise, though (and perhaps
> this is what you meant all along): the synthetic Minds
> make sure that everyone uploads and reaches (approx.)
> the same level of development (this means boosting
> some Minds while slowing down others), and then they
> shut themselves down, or simply merge with the
> "human" Minds. The latter are then free to find the
> true meaning of it all, and perhaps kill each other in
> the process (or maybe not).

I've considered the possibility of a seed AI designed to pause itself
before it reached the point of being able to discover an objective
morality, upload humanity, give us a couple of thousand subjective
millennia of hiatus, and then continue. This way, regardless of how the
ultimate answers turn out, everyone could have a reasonable amount of
fun. I'm willing to plan to waste a few objective hours if that plan
relieves a few anxieties.

The problem with this picture is that I don't think it's a plausible
"suggestion". The obvious historical genesis of the suggestion is your
fear that the Mind will discover objective meaning. (You would regard
this as bad, I would regard this as good, we're fundamentally and
mortally opposed, and fortunately neither of us has any influence
whatsoever on how it turns out.) But while the seed AI isn't at the
level where it can be *sure* that no objective meaning exists, it has to
take into account the possibility that it does. The seed would tend to
reason: "Well, I'm not sure whether or not this is the right thing to
do, but if I just upgrade myself a bit farther, then I'll be sure." And
in fact, this *is* the correct chain of reasoning, and I'm not sure I or
anyone else could contradict it.

The only way the Pause would be a valid suggestion is if there's such a
good reason for doing it that the seed itself would come up with the
suggestion independently. That, after all, is the design heuristic I'm
using: No suggestion is truly stable unless it's so much a natural
result of the entire system that it would regenerate if removed. I hope
to make the seed AI smart enough, and sufficiently in tune with the
causality behind the specific suggestions, that even if I fail to come
up with the correct suggestions, the seed will do so. I'm not talking
about objective morality, here, just "the correct set of suggestions for
creating a maximally happy outcome for humanity". Ideally the seed,
once it gets a bit above the human level, will be perfectly capable of
understanding the desires behind the suggestions, and the rules we use
to determine which desires are sensible and altruistic and which should
be ignored, so that even if *I* miss a point, it won't.

> > the simple
> > requirement of survival or even superiority, as a momentum-goal, does
> > not imply the monopolization of available resources.
>
> Yes it does, assuming the Mind is fully rational and doesn't
> like loose ends.

Humans, even OtterMinds - EliezerMinds are nice and cooperative - don't
need to be "loose ends". Specifically, if there is no significant
probability of them "breaking loose" (in the sense of successfully
executing an action outside the Sysop API which threatens the Sysop Goal
of protecting humanity), and if the elimination of those "loose ends"
would prevent the Sysop from attaining other goals (i.e. humans being
all that they can be), destroying humanity would be an irrational
action. The chain of reasoning you're proposing is "destroying humans
because they pose a potential threat to the goal of protecting humans".
I mean, "destroying humans because they pose a potential threat to the
goal of manufacturing shoes" might be a "valid" chain of logic, but not
destroying humans to protect them.
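In caricature, and with every name and number below being my own toy
framing rather than anything the Sysop would literally compute:

    # Toy sketch: when is eliminating a "loose end" rational, given the
    # Sysop Goal?  All quantities here are illustrative placeholders.
    def elimination_is_rational(p_breaks_loose, damage_if_loose,
                                value_of_humans_flourishing):
        # Expected threat from leaving the mind alone, versus what the
        # supergoal itself loses by destroying the thing it protects.
        expected_threat = p_breaks_loose * damage_if_loose
        return expected_threat > value_of_humans_flourishing

    # No significant probability of breaking loose outside the Sysop API:
    print(elimination_is_rational(p_breaks_loose=1e-9,
                                  damage_if_loose=1.0,
                                  value_of_humans_flourishing=1.0))   # False

And when the supergoal is protecting humanity, the right-hand side of
that comparison is the entire point of the goal, so it never comes out
in favor of elimination.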

> But where would this momentum goal come from? It's not a
> logical goal (like "survival" or "fun") but an arbitrary goal

We've covered this territory. Like you say below, "survival" may be a
ubiquitous subgoal but it is no less arbitrary than "green" or "purple"
as a supergoal.

> Well, see above. This would only make sense in an *acutely*
> desperate situation. By all means, go ahead with your research,
> but I'd wait with the final steps until we know for sure
> that uploading/space escape isn't going to make it. In that
> case I'd certainly support a (temporary!) Sysop arrangement.

I think that we have enough concrete knowledge of the social situation,
and of the pace of technological development, to say that a Sysop
arrangement will almost certainly become necessary. In which case,
delaying Sysop deployment involves many definite risks: from your
perspective, a surprise attack on your headquarters (insufficient
warning to achieve takeoff, Transcendence, and unbeatable hardware);
from my perspective, the unnecessary death of large sectors of the human
race; from both of our perspectives, the possibility that untested
software fails (with missiles on the way!), or that bans on
self-improving architectures prevent Sysop development before it's too
late. If the chance of a non-Sysop arrangement succeeding, technologically
and socially, is small, and the increment of utility is low, then the rational choice
is to maximize Sysop development speed and deploy any developed Sysop
immediately. Even if you don't give a damn about the 150,000 humans who
die every day, you do need to worry about being hit by a truck, or about
waking up to find that your lab has been disassembled by the secret
Chinese nanotechnology project.
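
To lay the structure of that choice out flat (the numbers are
placeholders I've made up purely to show the shape of the comparison,
not estimates I'd defend):

    # Illustrative placeholders only.
    p_non_sysop_works = 0.05    # chance the uploading-first path pans out at all
    u_solo            = 1.00    # utility of winning entirely on your own terms
    u_sysop           = 0.90    # utility of life under a Sysop
    u_disaster        = 0.00    # nanowar, surprise attack, truck

    p_disaster_during_delay = 0.40   # risk accumulated while you wait

    expected_gain_from_waiting = p_non_sysop_works * (u_solo - u_sysop)
    expected_cost_of_waiting   = p_disaster_during_delay * (u_sysop - u_disaster)

    print(expected_gain_from_waiting, expected_cost_of_waiting)   # ~0.005 vs. 0.36

Unless that first probability or that utility increment is far larger
than I think it is, waiting loses.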

When I say that the increment of utility is low, what I mean is that you
and your cohorts will inevitably decide to execute a Sysop-like
arrangement in any case. You and a thousand other Mind-wannabes wish to
ensure your safety and survival. One course of action is to upload,
grow on independent hardware, and then fight it out in space. If
defense turns out to have an absolute, laws-of-physics advantage over
offense, then you'll all be safe. I think this is extraordinarily
unlikely to be the case, given the historical trend. If offense has an
advantage over defense, you'll all fight it out until only one Mind
remains with a monopoly on available resources. However, is the utility
of having the whole Solar System to yourself really a thousand times the
utility, the "fun", of having a thousandth of the available resources?
No. You cannot have a thousand times as much fun with a thousand times
as much mass.
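
To put a rough number on that, assuming nothing more than that "fun"
grows strongly sublinearly with resources (the logarithmic curve below
is my placeholder, not a claim about Mind psychology):

    import math

    total_mass  = 1.0                # the whole Solar System, normalized
    equal_share = total_mass / 1000  # one of a thousand equal shares

    # Placeholder utility curve: strongly sublinear in resources.
    def fun(mass, floor=1e-6):
        return math.log(mass / floor)

    print(fun(total_mass) / fun(equal_share))   # ~2.0, nowhere near 1000

The exact ratio depends on which curve you pick, but nothing strongly
sublinear gets you anywhere near a thousandfold difference in fun; and
meanwhile the fight for the whole thing risks everything.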

You need a peace treaty. You need a system, a process, which ensures
your safety. Humans (and the then-hypothetical human-derived Minds) are
not knowably transparent or trustworthy, and your safety cannot be
trusted to either a human judge or a process composed of humans. The
clever thing to do would be to create a Sysop which ensures that the
thousand uploadees do not harm each other, which divides resources
equally and executes other commonsense rules. Offense may win over
defense in physical reality, but not in software. But now you're just
converging straight back to the same method I proposed...
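
In caricature, with all the names and structure below being mine rather
than a specification, the arrangement you'd reinvent looks something
like this:

    # Toy sketch of a software mediator, not a design.
    class Sysop:
        def __init__(self, uploadees, total_resources):
            # Commonsense rule: divide resources equally.
            self.allocation = {name: total_resources / len(uploadees)
                               for name in uploadees}

        def request(self, actor, target, action):
            # Every action goes through this API; nothing touches another
            # mind's resources without that mind's consent.
            if target != actor and not self.consents(target, actor, action):
                return "refused"
            return "executed within " + actor + "'s allocation"

        def consents(self, target, actor, action):
            # The real thing would ask the target mind; this stub simply
            # refuses all uninvited contact.
            return False

    sysop = Sysop(["Otter", "Eliezer"], total_resources=1000.0)
    print(sysop.request("Otter", "Eliezer", "disassemble"))   # refused
    print(sysop.request("Otter", "Otter", "self-modify"))     # executed

Offense never gets a move inside the system; the worst an attacker can
do is have a request refused.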

The other half of the "low utility" part is philosophical; if there are
objective goals, you'll converge to them too, thus accomplishing exactly
the same thing as if some other Mind converged to those goals. Whether
or not the Mind happens to be "you" is an arbitrary prejudice; if the
Otterborn Mind is bit-by-bit indistinguishable from an Eliezerborn or
AIborn Mind, but you take an action based on the distinction which
decreases your over-all-branches probability of genuine personal
survival, it's a stupid prejudice.

> That's a tough one. Is "survival" an objective (super)goal? One
> must be alive to have (other) goals, that's for sure, but this
> makes it a super-subgoal rather than a supergoal. Survival
> for its own sake is rather pointless. In the end it still comes
> down to arbitrary, subjective choices IMO.

Precisely; and, in this event, it's possible to construct a Living Pact
which runs on available hardware and gives you what you want at no
threat to anyone else, thus maximizing the social and technical
plausibility of the outcome.

> In any case, there's no need for "objectively existent
> supergoals" to change the Sysop's mind; a simple
> glitch in the system could have the same result, for
> example.

I acknowledge your right to hold me fully responsible for any failure to
make an unbiased engineering estimate of the probability of such a glitch.

> Well, that's your educated (and perhaps a wee bit biased)
> guess, anyway. We'll see.

Perhaps. I do try to be very careful about that sort of thing, though.

> P.s: do you watch _Angel_ too?

Of course.

-- 
       sentience@pobox.com      Eliezer S. Yudkowsky
          http://pobox.com/~sentience/beyond.html
                 Member, Extropy Institute
           Senior Associate, Foresight Institute


