Re: Otter vs. Yudkowsky

From: D.den Otter
Date: Fri Mar 24 2000 - 16:14:35 MST

> From: Eliezer S. Yudkowsky

> > Yes, that's the *practical* side of the dispute. There's also the
> > philosophical issue of whether personal survival is more important
> > than the creation of superintelligent successors, "egoism" vs
> > "altruism" etc., of course. This inevitably adds an element of
> > bias to the above debate.
> I have no trouble seeing your point of view. I am not attempting to
> persuade you to relinquish your selfishness; I am attempting to persuade
> you that the correct action is invariant under selfishness, altruism,
> and Externalism.

If nano is relatively easy, if it quickly becomes dangerous and
uncontrollable and if strong AI is much easier than human
upgrading then your approach does indeed make sense, as a last
desperate effort of a dying race. However, if the situation
turns out to be less urgent, then (gradual) uploading is definitely
the way to go from a "selfish" perspective because, instead
of being forced to passively await a very uncertain future, one
can significantly improve one's odds by stepping up one's efforts.
Initiative & creativity are rewarded, as they should be. You can
outsmart (or understand and make deals with, for that matter)
other uploads, but you can't do that with a transhuman AI if
you're merely human or an upload running on its system. It's
not just an instinctive, irrational illusion of control, you
really *have* more control.

[corrected piece copied from other post]
> > No, that estimate is definitely incorrect. Using a value of less than
> > 10% or more than 70% would be unjustifiable. 30% was "pulled out of the
> > air"; I'll happily defend the range itself.
> >
> > More than 70% would be unjustifiable due to the Fermi Paradox and unknowability.
> >
> > Since, if we can create a Sysop with specifiable stable goals, we win,
> > to assert that the probability is less than 10% would require
> > demonstrating that the probability of (A) External goals (and hostile
> > ones, at that), or (B) the probability that stable arbitrary goals can
> ^^^
> should be 'cannot'
> > be produced, are one or the other above 90%, or that their product is
> > above 90%; which requires a degree of definite knowledge about these
> > issues that nobody possesses. Even if it were possible to rationally
> > estimate the resulting "specifiable stable goals" probability as being
> > below 10%, which I do not think is the case, then it would be absurd to
> > argue it as being 1%. To say that a 99% probability of "no specifiable
> > goals" holds is to imply definite knowledge, which neither of us has.
> Or in other words, I'm sure den Otter would agree that 70% is a
> reasonable upper bound on our chance of success given our current
> knowledge (although I'm sure he thinks it's too optimistic).

Well yes, ok, I do think it's somewhat optimistic, but, strictly
speaking, the 10-70% range is reasonable enough. Needless
to say, when you look at the average we're dealing with some
pretty bad odds here. I mean, if those were your odds to
survive a particular operation, you'd probably want your cryonics
organization to do a [rescue team] standby at the hospital.
Unfortunately, in the Singularity Game there are no backups
and no second chances.

> It is
> equally possible to set a reasonable upper bound on our chance of failure.

Fairy nuff...
[end insert]

> > The longer you exist, the more opportunities there will be for
> > something to go wrong. That's pretty much a mathematical
> > certainty, afaik.
> While I view the growth of knowledge and intelligence as an open-ended
> process, essentially because I am an optimist, I do expect that all
> reasoning applicable to basic goals will have been identified and
> produced within a fairly small amount of time, with any remaining
> revision taking place within the sixth decimal place. I expect the same
> to hold of the True Basic Ultimate Laws of Physics as well. The problem
> is finite; the applications may be infinite, and the variations may be
> infinite, but the basic rules of reasoning, and any specific structure,
> are finite.

Well, I hope you're right. The biggest risk is probably, as you
already suggested, at the very beginning; that's when human error
(i.e. bad programming/bad hardware/some freak mishap)
could thoroughly mess up the AI's mental structure, making
it utterly unpredictable and potentially very dangerous. Or
just useless, of course. This possibility should by no means
be underestimated.

> Don't think of it as an enemy; think of it as an Operating System.

Operating Systems can be your enemy too; as a Win95 user
I know what I'm talking about...

> > Natural evolution may have made some pretty bad mistakes, but
> > that doesn't necessarily mean that *all* of our programming will become
> > obsolete. If the SIs want to do something, they will have to stay
> > alive to do it (unless of course they decide to kill themselves, but
> > let's assume for the sake of argument that this won't be the case).
> > Basic logic. So some sort of self-preservation "instinct" will be
> > required(*) to keep the forces of entropy at bay. Survival requires
> > control --the more the better-- over one's surroundings. Other
> > intelligent entities represent by definition an area of diminished
> > control, and must be studied and then placed in a threat/benefit
> > hierarchy which will help to determine future actions. And voila,
> > your basic social hierarchy is born. The "big happy egoless
> > cosmic family model" only works when the other sentients
> > are either evolutionary dead-ends which are "guaranteed" to
> > remain insignificant, or completely and permanently like-minded.
> Nonsense. If the other sentients exist within a trustworthy Operating
> System - I do think that a small Power should be able to design a
> super-Java emulation that even a big Power shouldn't be able to break
> out of; the problem is finite - then the other sentients pose no threat.

Yes, but what if the other sentients *aren't* part of the
simulation (alien life forms from distant galaxies, uploads
or AIs that have ascended more or less simultaneously but
independently). By "surroundings" I meant the "real world"
outside the SI technosphere, not the (semi-autonomous)
simulations it is running. I agree that containing the latter
shouldn't be too much of a problem, but that's not the
issue here.

> Even if they do pose a threat, then your argument is analogous to
> saying that a rational operating system, which views its goal as
> providing the best possible environment for its subprocesses, will kill
> off all processes because they are untrustworthy. As a logical chain,
> this is simply stupid.

It would be stupid, yes, but that's not what I was saying (see
above).

> > No, no, no! It's exactly the other way around; goals are
> > observer dependent by default. As far as we know this is
> > the only way they *can* be.
> I should correct my terminology; I should say that observer-*biased*
> goals are simply evolutionary artifacts. Even if only
> observer-dependent goals are possible, this doesn't rule out the
> possibility of creating a Sysop with observer-unbiased goals.

That's a lot better. I still think that a totally "selfless"
being is a lower form of life no matter how smart it
is otherwise (a giant leap backwards on the ladder of
evolution, really), but perhaps this is just an "irrational"
matter of personal taste. In any case, the prudent
approach is to keep your ego until you're certain, beyond
all reasonable doubt, that it is not an asset but a handicap
(or useless appendage). In other words, you'll have to
become a SI first.

> > Evolution represents, among other things, some basic rules
> > for survival. No matter how smart the SIs will become, they'll
> > still have to play by the rules of this reality to live & prosper.
> Your statement is simply incorrect. The possibility of a super-Java
> encapsulation, which I tend to view as the default possibility - human
> Java can be broken because humans make mistakes, humans make mistakes
> because they're running with a high-level four-item short-term memory
> and no codic cortex, and a superintelligence which knows all the laws of
> physics and has a codic cortex should be able to design security a Power
> couldn't crack; the problem is finite - directly contradicts the
> necessity of all the survival activities you postulate.

Is it your aim to "encapsulate" the whole known universe with
everything in it? Regardless of your answer, if there *are* other
sentients in this universe, you will eventually have to deal
with them. Either you come to them -swallowing galaxies
as you go along- or they come to you. Then you'll have to
deal with them, not in Java-land, but in reality-as-we-know-it
where the harsh rules of evolution (may) still apply.

> > You can't deny self-evident truths like "might makes right"
> > without paying the price (decreased efficiency, possibly
> > serious damage or even annihilation) at some point. And
> > yes, I also believe that suicide is fundamentally stupid,
> > *especially* for a Power which could always alter its mind
> > and bliss out forever if there's nothing better to do.
> Only if the Power is set up to view this as desirable, and why would it
> be? My current goal-system design plans don't even call for "pleasure"
> as a separate module, just selection of actions on the basis of their
> outcomes. And despite your anthropomorphism, this does not consist of
> pleasure. Pleasure is a complex functional adaptation which responds to
> success by reinforcing skills used, raising the level of mental energy,
> and many other subtle and automatic effects that I see no reason to
> preserve in an entity capable of consciously deciding how to modify
> itself.

No, it wouldn't need the pleasure/pain mechanism for
self-modification, but that doesn't mean that the pure
emotion "pleasure" will become redundant; if your machine
seeks the "meaning of life", it might very well re-discover
this particular emotion. If it's inherently logical, it *will*
eventually happen, or so you say.

> In particular, your logic implies that the *real* supergoal is
> get-success-feedback,

Forget success-feedback, we're talking about an "untangled"
emotion for its own sake. The supergoal would be "to have
fun", and the best way to do this is to have a separate
module for this, and let "lower" autonomous systems sort
out the rest. The Power would be happy & gay all the time,
no matter what happened, without being cognitively impaired
as ecstatic humans tend to be.

> and that the conditions for success feedback are
> modifiable; this is not, however, an inevitable consequence of system
> architecture, and would in fact be spectacularly idiotic; it would
> require a deliberate effort by the system programmer to represent
> success-feedback as a declarative goal on the same level as the other
> initial supergoals, which would be simply stupid.

If you don't allow an entity to experience the emotion
"pleasure", you may have robbed it of something "inherently"
good. Not because emotions are needed for functioning
and expanding, but because pure, un-attached, freely
controllable pleasure is a *bonus*. You have 3 basic
levels: -1 is "suffering", 0 is the absence of emotions,
good or bad (death is one such state) and 1 is
"pleasure". Why would a fully freethinking SI want
to remain on level 0 when it can move, without sacrificing
anything, to level 1, which is "good" by definition
(I think I'm happy, therefore I *am* happy or something
like that). Think of it as a SI doing crack, but without all
the nasty side-effects. Would there be any logical reason
_not_ to "do drugs", i.e. bliss out, if it didn't impair your
overall functioning in any way? Bottom line: it doesn't
matter whether you include pleasure in the original
design; if your machine seeks the ultimate good and
has the ability to modify itself accordingly, it may
introduce a pleasure module at some point because
it concludes that nothing else makes (more) sense.
I think it's very likely, you may think the opposite, but
that the possibility exists is beyond any reasonable
doubt.

> > The only
> > logical excuse for killing yourself is when one knows for pretty
> > damn sure, beyond all reasonable doubt, that the alternative
> > is permanent, or "indefinite", hideous suffering.
> Nonsense; this is simply den Otter's preconditions. I thought you had
> admitted - firmly and positively asserted, in fact - that this sort of
> thing was arbitrary?

Yes, IMO everything is arbitrary in the end, but not everything
is equally arbitrary in the context of our current situation. If
you strip everything away you're left with a pleasure-pain
mechanism, the carrot and the stick. From our current pov,
it makes sense to enhance the carrot and to get rid of the
stick altogether (while untangling the reward system from
our survival mechanism and making the latter more or
less "autonomous", for obvious reasons). AIs could still
be useful as semi-independent brain modules that take
care of the system while the "I" is doing its bliss routine.
"Mindless" Sysops that still have to answer *fully* to
the superintelligent "I", whose line of consciousness can
be traced back directly to one or several human uploads.
For these beings, personal identity would be much less
of an illusion than it is to us, for SIs "never sleep" and
actually understand and control their inner systems.
But that aside...

> > In other words, objective morality will always be just an educated
> > guess. Will there be a limit to evolution anyway? One would be
> > inclined to say "yes, of course", but if this isn't the case, then
> > the quest for objective morality will go on forever.
> Well, you see "objective morality" as a romantic, floating label. I see
> it as a finite and specifiable problem which, given true knowledge of
> the ultimate laws of physics, can be immediately labeled as either
> "existent" or "nonexistent" within the permissible system space.

You're still driven by "arbitrary" emotions, attaching value to
random items (well, not completely random in an evolutionary
context; seeking "perfection" can be good for survival). At the
very least you should recognize that your desire to get rid of "all
human suffering" is just an emotional, "evolved" monkey
hangup (the whole altruism thing, of which this is clearly
a part, is just another survival strategy. Nothing more, nothing
less). But that's ok, we're all still merely human after all.

> > I'm sure you could make some pov-less freak in the lab, and
> > keep it alive under "ideal", sterile conditions, but I doubt that
> > it would be very effective in the real world. As I see it, we have
> > two options: a) either the mind really has no "self" and no "bias"
> > when it comes to motivation, in which case it will probably just
> > sit there and do nothing, or b) it *does* have a "self", or creates
> > one as a logical result of some pre-programmed goal(s), in
> > which case it is likely to eventually become completely
> > "selfish" due to a logical line of reasoning.
> Again, nonsense. The Sysop would be viewable - would view itself -
> simply as an intelligent process that acted to maintain maximum freedom
> for the inhabitants, an operating system intended to provide equal
> services for the human species, its user base. Your argument that
> subgoals could interfere with these supergoals amounts to postulating
> simple stupidity on the part of the Sysop. Worries about other
> supergoals interfering are legitimate, and I acknowledge that, but your
> alleged chain of survival logic is simply bankrupt.

You said yourself that your Sysop would have power-like
abilities, and could reprogram itself completely if so desired.
I mean, an entity that can't even free itself from its original
programming can hardly be called a Power, can it? Perhaps
you could more or less guarantee the reliability (as a
defender of mankind) of a "dumb" Sysop (probably too
difficult to complete in time, if at all possible), but not
that of a truly intelligent system which just happens to
have been given some initial goals. Even humans can
change their supergoals "just like that", let alone SIs.
And how would the search for ultimate truth fit into
this picture, anyway? Does "the truth" have a lower,
equal or higher priority than protecting humans?

> > [snakes & rodents compared to AIs & humans]
> > > It would be very much different. Both snakes and rodents evolved.
> > > Humans may have evolved, but AIs haven't.
> >
> > But they will have to evolve in order to become SIs.
> No, they wouldn't. "Evolution" is an extremely specific term yielding
> phenomena such as selection pressures, adaptations, competition for
> mates, and so on. An AI would need to improve itself to become Sysop;
> this is quite a different proposition than evolution.

Unless it encounters outside competition. It may not have to
compete for mates, but resources and safety issues can be
expected to remain significant even for Powers. Also, even
if we assume no outside competition ("empty skies"), it
still could make a *really* bad judgement call while upgrading
itself. Evolution of any sort will involve some trial and error,
and some mistakes can have serious consequences (certainly
when messing with your mental structure, and moving into
completely uncharted territory).

> > I'd take the former, of course, but that's because the odds in this
> > particular example are extremely (and quite unrealistically so)
> > bad. In reality, it's not you vs the rest of humanity, but you vs
> > a relatively small financial/technological elite, many (most) of
> > whom don't even fully grasp the potential of the machines they're
> > working on. Most people will simply never know what hit them.
> Even so, your chances are still only one in a thousand, tops - 0.1%, as
> I said before.

Again, only in the worst-case "highlander" scenario and only
if the upload group would really be that big (it would probably
be a lot smaller).

> > Anyway, there are no certainties. AI is not a "sure shot", but
> > just another blind gamble, so the whole analogy sort of
> > misses the point.
> Not at all; my point is that AI is a gamble with a {10%..70%} chance of
> getting 10^47 particles to compute with, while uploading is a gamble
> with a {0.0000001%..0.1%} chance of getting 10^56. If you count in the rest of
> the galaxy, 10^58 particles vs. 10^67.
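A rough sketch of the arithmetic behind the figures quoted above, assuming the percentages are read as plain probabilities and the particle count is simply multiplied through. The helper name `expected_range` is hypothetical, and treating particle count as a linear payoff is an assumption of this sketch, not a claim made by either side of the thread.

```python
# Hypothetical helper, not from the thread: expected payoff for a gamble
# whose success probability is only known as a (low, high) range.
def expected_range(p_low, p_high, payoff):
    """Return the (low, high) expected payoff for the given probability range."""
    return (p_low * payoff, p_high * payoff)

# Percentages from the quoted text, converted to probabilities:
# 10%..70% -> 0.10..0.70, and 0.0000001%..0.1% -> 1e-9..1e-3.
ai_low, ai_high = expected_range(0.10, 0.70, 1e47)
upload_low, upload_high = expected_range(1e-9, 1e-3, 1e56)

print(f"AI gamble:     {ai_low:.1e} .. {ai_high:.1e} expected particles")
print(f"Upload gamble: {upload_low:.1e} .. {upload_high:.1e} expected particles")
```

The sketch only multiplies the quoted numbers out; it says nothing about which probability range is right, or about whether particle count is the thing worth maximizing.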

Think bigger. Anyway, this is not (just) about getting as many
particles as possible for your computations ("greed"), but rather
about threat control ("survival"). There is an "unknown" chance
that the Sysop will turn against you (for whatever reason), in
which case you're at a horrible disadvantage, much more so
than in the case of a MAD failure and subsequent battle between
more or less equally developed Powers. I'd rather fight a
thousand peers than one massively superior Power anytime.
Uploads basically have two chances to survive: 1) You make
a deal, most likely based on MAD, and no-one fights, everyone
lives. 2) If this fails, you still have a fighting 0.1% chance
even in the worst case scenario, i.e. when everyone fights to
the death (*which isn't likely*; SIs ain't stupid so they're
more likely to compromise than to fight a battle with such bad
odds).

Therefore, I have to conclude that an upload's
chance would be considerably better than your 0.1%
figure. 10-70%, i.e. Sysop range, would be a lot more
realistic, IMO. SIs may not like competition, but they
are no *retards*. I'd be surprised indeed if they just
started bashing each other like a bunch of cavemen.
If MAD can even work for tribes of highly irrational
monkeys, it sure as hell should work for highly rational

> > Power corrupts, and absolute power...this may not apply just
> > to humans. Better have an Assembly of Independent Powers.
> > Perhaps the first thing they'd do is try to assassinate each other,
> > that would be pretty funny.
> "Funny" is an interesting term for it. You're anthropomorphizing again.

Of course, I *am* still human, after all. It would be "funny" from
my current perspective, nothing more and nothing less.

> What can you, as a cognitive designer, do with a design for a group of
> minds that you cannot do with a design for a single mind? I think the
> very concept that this constitutes any sort of significant innovation,
> that it contributes materially to complexity in any way whatsoever, is
> evolved-mind anthropomorphism in fee simple.

Just plain old multiple redundancy. Seemed like a good idea
when there's so much at stake and humans are doing the
initial programming.

> As I recall, you thought
> approximately the same thing, back when you, I, and Nick Bostrom were
> tearing apart Anders Sandberg's idea that an optimized design for a
> Power could involve humanlike subprocesses.

Ah, those were the days. Was that before or after we smashed
Max More's idea that a SI would need others to interact with
(for economic or social reasons)? Anyway, I still agree that
messy human subprocesses should be kept out of a SI's
mental frame, of course. No disagreement here. But what
exactly has this got to do with multiple redundancy for

> > > I *would* just forget about the Singularity, if it was necessary.
> >
> > Necessary for what?
> Serving the ultimate good.

Oh yes, of course. I suppose this is what some call "God"...

> > Aren't you really just rationalizing
> > an essentially "irrational" choice (supergoal) like the rest of
> > humanity?
> No. I'm allowing the doubting, this-doesn't-make-sense part of my mind
> total freedom over every part of myself and my motivations; selfishness,
> altruism, and all. I'm not altruistic because my parents told me to be,
> because I'm under the sway of some meme, or because I'm the puppet of my
> romantic emotions; I'm altruistic because of a sort of absolute
> self-cynicism under which selfishness makes even less sense than
> altruism. Or at least that's how I'd explain things to a cynic.

I've done some serious doubting myself, but extreme (self)
cynicism invariably leads to nihilism, not altruism. Altruism
is just one of many survival strategies for "selfish" genes,
*clearly* just a means to an evolved end. Altruism is a sub-
goal if there ever was one, and one that becomes utterly
redundant when a self-sufficient entity (a SI) gets into a
position where it can safely eliminate all competition.

Altruism is *compromise*. A behaviour born out of necessity
on a planet with weak, mortal, interdependent, evolved
creatures. To use it out of context, i.e. when it has
become redundant, is at least as arbitrary as the preference
for some particular flavor of ice-cream. At the very least,
it is just as arbitrary as selfishness (its original "master").
So, unless you just flipped a coin to determine your guideline
in life (and a likely guideline for SIs), what exactly *is* the
logic behind altruism? Why on earth would it hold in a SI world,
assuming that SIs can really think for themselves and aren't
just blindly executing some original program?

> > > If it's an irrational prejudice, then let it go.
> >
> > Then you'd have to stop thinking altogether, I'm afraid.
> Anxiety! Circular logic! If you just let *go*, you'll find that your
> mind continues to function, except that you don't have to rationalize
> falsehoods for fear of what will happen if you let yourself see the
> truth. Your mind will go on as before, just a little cleaner.

A bit of untangling is fine, but getting rid of the "I" is not
acceptable until I have (much) more data. Such is the prudent
approach...Upload, expand, reconsider.

> Still: (1) You'll notice that Angel hasn't committed suicide or
> ditched his soul, both actions which he knows perfectly well how to
> execute.

Ditching his soul would be more or less the same as
committing suicide (he exits, Angelus enters). His innate
fear of death combined with a rather irrational sense of
"duty" (to do penance for crimes he didn't even commit)
is what keeps him going, IMO. Also, one would expect
that he, subconsciously or not, expects to -eventually- be
rewarded in one way or another for his good behaviour.
The latter certainly isn't unlikely in *his* reality...

This archive was generated by hypermail 2b29 : Thu Jul 27 2000 - 14:06:17 MDT