Re: Posthuman mind control (was RE: FAQ Additions)

Eliezer S. Yudkowsky (sentience@pobox.com)
Wed, 03 Mar 1999 23:27:03 -0600

[ Next ][ Previous ] In reply to: Nick Bostrom Next in thread: Nick Bostrom

Nick Bostrom wrote:
>
> Eliezer S. Yudkowsky wrote:
>
> > Are you arguing that, say, someone who was brought up as a New Age
> > believer and switches to being an agnostic is not making a rational choice?
>
> Believing in the healing powers of crystals is not a value, it's a
> mistaken factual opinion. The New Ager, provided he has the same data
> as we do, will be rational to give up his New Agey beliefs.

I invoke the banner of Crockerism to communicate, and humbly beg your tolerance: I think you may be conforming the facts to the theory.

On the whole, New Agers are not people who form mistaken factual opinions about the healing powers of crystals. You are, shall I say, extropomorphizing? These people do not believe their tenets as the simplest explanation for incorrectly reported facts; they believe because Crystals are the Manifestation of the New Age of Warmth and Love and Kindness which shall Overcome the Cold Logic of Male-Dominated Science. (That this is a shockingly sexist insult to women everywhere never seems to occur to them.)

> It's hard to give a precise definition of fundamental value, just as
> it is hard to give a precise definition of what it means to believe
> in a proposition.

?? There are a few kinds of cognitive objects in the mind associated with "belief": the form of the proposition itself, the qualitative degree of truth assigned to that proposition, and a few assorted emotions that either affect whether we believe in something or are invoked as a consequence of believing it. Emotional commitment is an example: when we make an emotional commitment to an idea, we think it is certain, are reluctant to entertain propositions we think will contradict it, and believe that believing in the idea is right.

> But let me try to explain by giving a simplified
> example. Suppose RatioBot is a robot that moves around in a finite
> two-dimensional universe (a computer screen). RatioBot contains two
> components: (1) a long list, where each line contains a description
> of a possible state of the universe together with a real number (that
> state's "value") [snip] On the other
> hand, the values expressed by the list (1) could be said to be
> fundamental.

The human mind doesn't work that way. There are innate *desires*, such as emotions, which are quite distinct from the current set of *purposes*. Desires don't change, but can be easily overridden by purposes. And purposes, as mental objects, are propositions like any other proposition, that have a degree of truth like any other proposition, and can change if found to be untrue. If I have the thought "my purpose ought to be X", my purposes change as a consequence.

Purposes can rationally change whenever the justification for that purpose fails. Most people are taught an initial set of purposes in childhood, like any other set of facts; purposes are harder to change because the teaching usually includes the claim that believing in the purpose is right, and because it invokes emotions like emotional commitment. But if you start shooting down other facts that were taught along with the purpose, and the purpose's surrounding memes, you can eventually move on to the justification and the purpose itself.

That's how it is for humans. I'm not saying that is how it *ought* to be, or how an AI *must* be; those are separate propositions. But as a description of the way humans operate, I think it is more accurate than either RatioBot or HappyApplet.
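
As a rough illustration of the model described above (a sketch only, not a claim about how any real mind or AI is built), a purpose can be represented as a proposition that carries a degree of truth and rests on other propositions, while the innate desires stay fixed and are merely overridden. Every name here (Proposition, Purpose, Mind, the 0.5 threshold, the example statements) is a hypothetical choice made for the example, and the snipped part of the RatioBot description is not reconstructed.

```python
from dataclasses import dataclass, field


@dataclass
class Proposition:
    """A belief: a statement plus a qualitative degree of truth in [0, 1]."""
    statement: str
    truth: float


@dataclass
class Purpose(Proposition):
    """A purpose is a proposition like any other, held for stated reasons."""
    justifications: list = field(default_factory=list)

    def is_held(self, threshold=0.5):
        # Assumed rule: a purpose is held only while the purpose itself and
        # every one of its justifications is still believed.
        return self.truth >= threshold and all(
            j.truth >= threshold for j in self.justifications
        )


@dataclass
class Mind:
    desires: tuple  # innate; this sketch never modifies them
    purposes: list = field(default_factory=list)

    def motives(self):
        # Held purposes override desires; with no held purpose left,
        # the unchanged desires are what remains.
        held = [p.statement for p in self.purposes if p.is_held()]
        return held if held else list(self.desires)


# A purpose taught in childhood together with a supporting "fact".
taught_fact = Proposition("crystals heal", truth=0.9)
purpose = Purpose("promote crystal healing", truth=0.9,
                  justifications=[taught_fact])
mind = Mind(desires=("comfort", "belonging"), purposes=[purpose])

before = mind.motives()   # the purpose is held and overrides the desires
taught_fact.truth = 0.1   # the justification gets shot down
after = mind.motives()    # the purpose lapses; the desires are untouched
```

The point of the sketch is only the contrast with a fixed table of state values: nothing in Mind is marked as unrevisable except the desires, and a purpose lapses as soon as its justification does.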

> I think I know approximately what my fundamental values are: I want
> everybody to have the chance to prosper, to be healthy and happy, to
> develop and mature, and to live as long as they want in a physically
> youthful and vigorous state, free to experience states of
> consciousness deeper, clearer and more sublime and blissful than
> anything heard of before; to transform themselves into new kinds of
> entities and to explore new real and artificial realities, equipped
> with intellects incommensurably more encompassing than any human
> brain, and with much richer emotional sensibilities. I want very much
> that everybody or as many as possible get a chance to do this.
> However, if I absolutely had to make a choice I would rather give
> this to my friends and those I love (and myself of course) than to
> people I haven't met, and I would (other things equal) prefer to give
> it to people now existing than only to potential future people.

Do you think that these fundamental values are *true*? That they are better than certain other sets of fundamental values? That the proposition "these values are achievable and non-self-contradictory" is true?

If a Power poofed into existence and told you that all Powers had the same set of values, and that it was exactly identical to your stated set EXCEPT that "blissful" (as opposed to "happy") wasn't on the list; would you change your fundamental goals, or would you stick your fingers in your ears and hum as loud as you could because changing your beliefs would interfere with the "blissful" goal?

> With human-level AIs, unless they have a very clear and unambiguous
> value-structure, it could perhaps happen. That's why we need to be on
> our guard against unexpected consequences.

With seed, human, and transhuman AIs, it will happen no matter what we do to prevent it. I don't think that our theories are yet representative of reality, only unreal abstractions from reality, just as there are no "protons", only quarks. Second only to "Do what is right", as a fundamental value, is "We don't know what the hell is going on." Any facts and any goals and any reasoning methods we program in will be WRONG and will eventually be replaced. We have to face that.

-- 
        sentience@pobox.com         Eliezer S. Yudkowsky
         http://pobox.com/~sentience/AI_design.temp.html
          http://pobox.com/~sentience/sing_analysis.html
Disclaimer:  Unless otherwise specified, I'm not telling you
everything I think I know.
