Re: Singularity: AI Morality (AI containment)

Eliezer S. Yudkowsky
Wed, 09 Dec 1998 14:31:40 -0600

Brian Atkins wrote:
> I'm curious if there has been a previous discussion on this
> list regarding the secure containment of an AI (let's say a
> SI AI for kicks)? Many people on the list seem to be saying
> that no matter what you do, it will manage to break out of
> the containment. I think that seems a little far-fetched...

There are several issues here. The first issue is that even the best programs aren't absolutely secure. Remember when Sun announced a flaw in the Java language? I don't recall offhand whether the flaw was such that an AI would have been able to break out, but the point is that here's a language designed for nothing BUT security, and it isn't secure enough to bet the planet on.

The second issue is mental manipulation. Persuasion is an art to which the brain devotes an astonishing amount of complexity. I think it quite likely that an SI with a full understanding of that system would be able to dangle us on puppet strings; that system is optimized against humans attacking in particular ways, and that optimization creates certain vulnerabilities. I hope that, what with the recent debate on this very list, nobody is going to claim that we are towers of memetic invulnerability; flaws that were evolutionary advantages have actually been widened. Remember, the SI only needs one human to create an opening that it can widen by other methods.

The third issue is physical manipulation. It is more than conceivable to me that even perfectly designed software may not pose an obstacle. There are cryptanalytic attacks on hardened chips that center around causing bit errors via radiation and exploiting the resulting arithmetical errors, so this is already an actual discipline. Low-level physics appears chaotic to us, but sufficiently high intelligence may perceive strange attractors that can be combined to yield useful results.

The fourth issue is the added cost in processing power. Java is slower than C++ by a factor of ten. Even if reasonable-against-SI security can be achieved by running code interpreted by Java, itself running on virtual hardware, the resulting thousandfold slowdown is probably enough to ensure that less scrupulous groups will have an AI ten years earlier. If network computing is off-limits due to security issues, that adds a further slowdown of thousands or millions. Running hundreds of SIs in a Java-like sandbox on a single computer will not be feasible until around 2070 at the current rate of progress, so perhaps I shouldn't worry.
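
As a rough check on the arithmetic in the paragraph above: a slowdown factor of a thousand is about ten doublings of processing power, so it costs roughly ten years only if hardware speed doubles about once a year. The sketch below makes that explicit. The doubling period is an assumption (the post does not state one), and the function name delay_years is made up for illustration.

```python
import math


def delay_years(slowdown: float, doubling_years: float = 1.0) -> float:
    """Years of hardware progress needed to absorb a given slowdown factor.

    Assumes processing power doubles every `doubling_years` years.
    That doubling period is an assumption, not a figure from the post.
    """
    if slowdown < 1:
        raise ValueError("slowdown must be at least 1")
    if doubling_years <= 0:
        raise ValueError("doubling_years must be positive")
    # Number of doublings needed, times the length of one doubling.
    return math.log2(slowdown) * doubling_years


if __name__ == "__main__":
    # Sandbox slowdown of a thousand, under two assumed doubling periods.
    for period in (1.0, 1.5):
        years = delay_years(1000, period)
        print(f"1000x slowdown, doubling every {period} years: {years:.1f} years")

    # Adding the "thousands or millions" penalty for giving up networking.
    for extra in (1_000, 1_000_000):
        years = delay_years(1000 * extra)
        print(f"1000x sandbox plus {extra}x no-network: {years:.1f} years")
```

With a one-year doubling period the thousandfold slowdown works out to about ten years, which matches the figure in the post; with an 18-month period the same slowdown would cost closer to fifteen.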

You can claim that maybe one particular problem probably won't happen, but good luck convincing any sane person to bet the planet on everything going right. Maybe "no matter what you do" is extreme. Maybe there's actually a 5% chance that Murphy's Law will go on holiday. Is anybody here that desperate?

Anybody who respects computational complexity (such as Billy Brown) or higher intelligence (John K Clark) knows better than to assume we can predict or manipulate either. I mean, we can't even push on the economy and get it to do something predictable, and the chief lesson of this entire century is that we should just leave it alone. Let's not learn this the hard way with respect to the Singularity, shall we?

--         Eliezer S. Yudkowsky

Disclaimer:  Unless otherwise specified, I'm not telling you
everything I think I know.