RE: Preventing AI Breakout

Billy Brown (bbrown@transcient.com)
Wed, 27 Oct 1999 14:05:16 -0500

Joseph Sterlynne wrote:
> If it were raised entirely in even a defective environment it might never
> know what would constitute evidence that it was constructed.
and
> Not necessarily. You could (given, we are assuming, adequate resources)
> generate a universe for it which is not dissimilar to ours. The simulation
> contains the same target problem as the higher-level universe. The
> difference is that the AI is just not directly connected to our world,
> which means that it may not know "real" people, places, et cetera;
> therefore it will not and cannot attempt to interfere with those things.
> The lines out are blocked.

In principle, perhaps. In practice, no way. The problem is that it is utterly impractical to run even a very simple simulation based purely on the fundamental laws of nature - the computational demands are simply too high. To get something you can actually run, you have to introduce layer after layer of simplification and approximation, which necessarily degrades the fidelity of the simulation.
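To put a rough number on it (every figure below is an illustrative assumption, not a measurement), consider simulating just one gram of ordinary matter atom by atom:

    # Back-of-the-envelope estimate: cost of simulating one gram of matter
    # from first principles. All numbers are rough, assumed values.

    atoms_per_gram = 6e23        # on the order of Avogadro's number
    timestep = 1e-15             # seconds; femtosecond steps are typical
                                 # for atomic-scale dynamics
    ops_per_atom_per_step = 1    # absurdly generous lower bound

    steps_per_simulated_second = 1 / timestep
    ops_per_simulated_second = (atoms_per_gram * steps_per_simulated_second
                                * ops_per_atom_per_step)

    print("%.1e operations per simulated second" % ops_per_simulated_second)
    # ~6e38 operations - and that ignores quantum effects, which make the
    # cost grow even faster as the number of interacting particles rises.

And that is one gram, not a whole world, which is why the approximations have to come in.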

In the 2005 - 2020 time frame, which is when this question is likely to become relevant, a high-quality simulation of the real world will still be far out of reach. A really good VR might be able to fool a casual observer, but if you start taking measurements and doing calculations it will very quickly become obvious that something is up. To a careful human observer, the world would seem obviously artificial.

Now, perhaps you could run an AI there without it realizing what was going on. But the price is that it will believe the laws of nature vary in extremely complex ways. It will look at the defects in the sim and say "OK, light refracts like this most of the time, but sometimes it looks like this, or this, or this..." Whatever physical laws it ends up believing in will have only a rough resemblance to those of the real world - which means the AI can't address most of the problems we would like it to work on.

> If we are gods of the simulation, we should be more or less omniscient. We
> could observe the AI's thinking and results and place agents (uploaded
> humans, VR-mediated humans, ostensibly-inanimate objects, and so on) within
> the simulation which would guide the AI's projects. That aside from more
> direct manipulation, which is really what should be available if you had
> someone's code right in front of you. But I guess that if a very clever AI
> suspects that it is in a simulation and tries to sneak something into a
> design there could be undesirable effects.

We would be all-seeing. That is very different from actually understanding what we are looking at.

We already have lots of examples of software that is too complex to be completely understood by any one person. The problem will be much worse in the situation we are discussing. The VR alone will require dozens of different kinds of specialists (in optics, mechanics, chemistry, biology, and so on), none of whom will really understand each other's work. The AI is probably even more complex, and therefore even harder to thoroughly monitor.

If the AI is IQ 100, it runs at normal human speed, and you have an easy way to get a readout of its conscious thought processes, a dedicated team of 3-5 specialists could probably keep up with it. If the AI is IQ 200 and thinks ten times faster than a normal human, you're going to be hard-pressed to find enough people who can understand what it is thinking. If the AI is IQ 500 and thinks at 100x speed, the best specialists in the world are going to be sitting around scratching their heads over a maze of nonsensical semantic activation trails. Since we aren't really worried about the AI unless it is far more intelligent than we are, I don't see that there is much hope for monitoring it no matter what the situation is.
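Just to make the scaling concrete, here is a toy model. The constants come from the guesses in the paragraph above and mean nothing beyond illustration:

    # Toy model of how many specialists it takes to monitor an AI in real
    # time. Assumes the monitoring load grows linearly with the AI's
    # speedup and with how far its intelligence exceeds the watchers'.
    # All constants are guesses for illustration only.

    def monitors_needed(iq, speedup, base_team=4):
        """base_team people can follow an IQ-100, real-time AI."""
        intelligence_factor = iq / 100.0
        return base_team * speedup * intelligence_factor

    for iq, speedup in [(100, 1), (200, 10), (500, 100)]:
        print("IQ %d at %dx speed: ~%d monitors" %
              (iq, speedup, monitors_needed(iq, speedup)))
    # IQ 100 at 1x speed:   ~4 monitors
    # IQ 200 at 10x speed:  ~80 monitors
    # IQ 500 at 100x speed: ~2000 monitors

And linear scaling is almost certainly too optimistic - past some point no number of slower minds adds up to one faster one.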

> I'm not sure I understand the concern in this context. Why is it the case
> that "[t]he longer containment works the harder it is to maintain"? You
> seem to imply that the AIs have the capability to outthink the containment
> technology. And that is in a way just what we were originally debating.
> It could be that an AI will never realize its situation or do anything
> about it if it did regardless of its intelligence.

The fact that we are having this argument at all is a good indication that VR encapsulation is not automatically foolproof. Roughly speaking, your chance of fooling any given AI is proportional to the effort you invest in the project, and inversely proportional to the AI's intelligence. A really poor VR isn't going to fool anyone, and a really brilliant AI might see through even a good VR. The more AIs there are in the world, the more likely it is that one of them will realize its true situation - especially if you get to the point where the whole VR prison is supposed to run automatically.
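Stated as a toy formula (entirely illustrative; the constants are arbitrary), the containment only holds if every AI stays fooled, and that probability erodes fast as the AI population grows:

    # Toy model: probability that a VR containment holds. Assumes the
    # chance of fooling one AI rises with effort and falls with its
    # intelligence, and that each of N AIs gets an independent chance to
    # see through it. Every constant here is an arbitrary illustration.

    def p_fooled(effort, intelligence):
        """Chance one AI stays fooled; capped at 1.0."""
        return min(1.0, effort / intelligence)

    def p_all_fooled(effort, intelligence, n_ais):
        return p_fooled(effort, intelligence) ** n_ais

    # Even a 95% chance per AI erodes quickly as the AI population grows.
    for n in (1, 10, 50):
        print("%2d AIs: containment holds with p = %.2f" %
              (n, p_all_fooled(0.95, 1.0, n)))
    # 1 AI: 0.95, 10 AIs: ~0.60, 50 AIs: ~0.08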

And yes, it is perfectly possible that an AI that actually figures out it is in a prison might find a way to free itself. There is no such thing as perfect security. We can't even make the Java VM fully secure, and that is a far less complex environment. A full-blown VR world is going to have security holes all over the place, simply because it is too complex not to. Trying to secure it against a human attacker or a low-end AI might be feasible, but something with even moderately superhuman intelligence isn't going to have any trouble breaking out.

Billy Brown, MCSE+I
bbrown@transcient.com