Re: Why would AI want to be friendly?

From: hal@finney.org
Date: Sun Sep 24 2000 - 16:08:30 MDT


Darin writes:
> Think one level farther up. If a gun is in the process of firing at you, you
> have already completely failed to predict the behavior of the agent firing
> the gun. Now, obviously none of US can treat a human being as a
> deterministic process, but what if the gun is being fired by a simple robot?
> All it does is run in circles, and fire the gun from the same spot, at the
> same target, every 30 seconds or so. You can quite easily determine the
> pattern in that robot's behaivior and never be at risk from the gun. The
> difference between the behaivior of a human and that robot is nothing more
> then complexity. Given that humans are not infinitely complex, it is not an
> impossible task to discover the deterministic rules governing human
> behaivior.

Determinism is not enough; it also depends on what the deterministic
system does. You need a degree of malleability in order to twist its
actions to suit yourself. In your robot example, what if the robot runs
in circles and then shoots at you every 30 seconds? You can easily
detect a pattern in the robot's behavior, but you are still at risk.
Or what if you're in a room whose walls squish together every minute?
Again, full determinism, but you are in danger.
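The "pattern detection" Darin describes can be sketched as a minimal period-finder over a log of observed actions. This is only an illustration under an assumption the thread doesn't spell out: that we observe the robot as a sequence of discrete actions (the log values here are hypothetical).

```python
def detect_period(observations):
    """Return the smallest period p such that the observed sequence
    repeats every p steps -- a stand-in for 'learning the pattern'."""
    n = len(observations)
    for p in range(1, n):
        if all(observations[i] == observations[i - p] for i in range(p, n)):
            return p
    return n  # no repetition found within the log

# Hypothetical log of a robot that fires every third tick
log = ["move", "move", "fire"] * 5
print(detect_period(log))  # 3
```

Note that knowing the period tells you when the gun fires; as the reply above points out, it does nothing by itself to make the firing pattern safe.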

> >Believers in the omnipotence of AIs seem to think that for any given
> >person, in any given situation, there is some input they can be given
> >which will cause them to produce any desired output. If I see a nun
> >in the middle of prayer in the chapel, there is something I can say to
> >her that will make her immediately start screeching like a chimp while
> >jumping around the room scratching herself.
>
> You know, given a trillion high fidelity simulations of that nun to test
> possible responses, I bet I could construct a sentence that would do just
> that. My first order approximation is that it would involve me claiming to
> be sent from God, combined with a thorough demonstration of omniscience
> with respect to her life up to that point, all of which is easily achievable
> given those simulations, and an arbitrary amount of subjective time to think
> about it.

I don't think this would work, because even if you convinced her that
you could read her mind, that would not prove you were sent by God.
You could be an agent of the devil, or simply an individual with ESP.
The fact that you are giving her instructions so bizarre, and so
seemingly damaging to the sanctity of the chapel, would make her that
much less likely to believe you.

> Now, convincing her within 30 seconds could very well be impossible, just
> like you cannot overwrite 64 megabytes of memory with all 0s in less than 64
> million (or so) fetch-execute cycles. The n-dimensional Hamming distance
> between those two mind-states may be too far to bridge using only 30 seconds
> of vocal input. But if you eliminate the time constraint, and give me, say,
> 6 months to get her to do a convincing chimp imitation, then again, given
> that simulation ability, I don't think it's an impossible task at all.
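Darin's two bounds can be made concrete. The sketch below assumes 64 MB is about 64 million bytes and that at most one byte is written per fetch-execute cycle, and measures the "distance" between two states as the Hamming distance (number of differing bits); none of these exact figures are from the thread.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length states."""
    assert len(a) == len(b)
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Lower bound on zeroing 64 MB at one byte-write per cycle
MB = 1_000_000                # assumption: decimal megabytes
min_cycles = 64 * MB          # 64 million writes, hence >= 64 million cycles

# An all-ones byte is 8 bit-flips away from an all-zeros byte
print(hamming_distance(b"\x00", b"\xff"))  # 8
```

The analogy is that if each second of vocal input can shift a mind-state only so far, a 30-second budget may simply be too short to cover the distance, just as 64 million writes cannot fit into fewer cycles.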

It sounds like you agree that there are limits to the persuasive ability
of even an infinitely intelligent AI. This means that, in fact, we can't
be sure that an AI could talk us out of pulling the plug. It could well
take longer for it to change our decision than it would take us to flip
the switch.

Hal



This archive was generated by hypermail 2b29 : Mon Oct 02 2000 - 17:38:49 MDT