Re: AI Prime Directive

Eliezer S. Yudkowsky (sentience@pobox.com)
Fri, 11 Sep 1998 09:06:42 -0500

[ Next ][ Previous ] In reply to: Damien Broderick Next in thread: Hara Ra

Damien Broderick wrote:
>
> I'm eagerly
> awaiting some detailed commentary from the computing gurus. Meanwhile, I
> am disturbed by one indication that Eli is approaching his task
> (understandably - how much can one guy cover?) without much sense of the
> last 40 or 50 years' accumulated philosophy, let alone such really
> important archives of wisdom as sf.

SF, yes. I have, for example, read "Tik-Tok" and "Roderick", which you refer to later on. In both cases, the cognitive nature of the Asimov loss is never explained, and is usually guessed to be a manufacturing flaw. You don't need "Tik-Tok" for that; it's available in "Frankenstein". Greg Egan's "Quarantine" is the only really detailed look (that I've read) at a coercion which fails for fundamental reasons of cognitive science.

As for philosophy - have you taken a look at QCL, the Quantum Computing Language? It's not enough that AI-writers have to be computer scientists, cognitive scientists, neurologists, psychologists, and consciousness philosophers; now we have to be quantum physicists. If it weren't for Penrose's "Shadows of the Mind", I'd have been totally lost. So yeah, you're right, the last 50 years' worth of philosophy hasn't been getting much attention, except for the AI-oriented stuff. _Ancient_ philosophy, maybe, as long as I avoid fools like the arch-traitor Plato.

> The fact that Eli places at the core
> of his endeavours the really silly instruction that we must
> < Never allow arbitrary, illogical, or untruthful goals to enter the AI. >
> reflects a touching faith in human powers of understanding and consistency.

I'm not quite sure what you mean by this.

> There is some play made of Asimov's Three Laws, and a heart-felt admonition
> that AI designers not be taken in by this great but erroneous deontological
> doctrine. Calm down, Eli - nobody has ever taken that gadget seriously,
> least of all Isaac (who took it over from Campbell as a story-generating
> device exactly because of its capacity to spew out entertainingly
> inconsistent interpretations and even world-views in the poor machine in
> its thrall).

I quote from Asimov's "The Complete Robot":

"What's more, in 'Runaround' I listed my 'Three Laws of Robotics' in explicit detail for the first time, and these, too, became famous. At least, they are quoted in and out of season, in all sorts of places that have nothing primarily to do with science fiction, even in general quotation references. And people who work in the field of artificial intelligence sometimes take occasion to tell me that they think the Three Laws will serve as a good guide."

I hear about them all the time. Not necessarily from AI workers, but Asimov has done a very good job of teaching everyone to think that AIs need coercions to keep them in order. Now I not only have to worry about the AIers, I have to worry about the venture capitalists and the marketers and the managers.

> The best deconstruction/demolition outside formal philosophy of this
> program is found in various stories and novels by the brilliant John
> Sladek, who is usually mysteriously overlooked by people who know Rudy
> Rucker's work, let alone Asimov's. Sladek showed again and again how
> fatuous the Three Laws are, how inevitable the escape from their supposed
> restraints. Look for his stories, and the novels TIK-TOK, RODERICK, OR
> THE EDUCATION OF A YOUNG MACHINE, and RODERICK AT RANDOM. (God knows how
> he managed to leave the B off the start of those titles.)

Incredibly funny books - but they are not cognitive science. Reading them, you'd think: "My God, look what happens when robots go unrestrained! We'd better make extra sure to slap more coercions on them." That the robots go insane _because_ of the coercions, not in spite of them, is never suggested.

> More seriously, the topic of discursive reframing (which makes a total hash
> of any `Prime Directive' imaginable) can be pursued in books such as
> Stanley Fish's reader-response diatribe IS THERE A TEXT IN THIS CLASS?
> through to my own THEORY AND ITS DISCONTENTS. It's fundamental to Roger
> Schank's work in AI, too, I'd have thought.

I'm not familiar with "discursive reframing", and I can't find it on the Internet, but offhand I suspect that it's either a problem of infinite recursion or the unstable definition problem. In any case, I'm not sure what you mean by it making a hash of the "Prime Directive"; I suspect you meant "Three Laws". And it's very easy to imagine all kinds of Three Laws that can be imposed on a system. That's the problem.

Philosophically absurd things happen all the time. Nobody will leave coercions out because they're philosophically absurd. You need to say: "You can't give AIs absolute orders because they'll go insane." You need to say: "Coercions are fundamentally irrational, and if we build an irrational AI it will become irrational in a way we don't expect."

-- 
        sentience@pobox.com         Eliezer S. Yudkowsky
         http://pobox.com/~sentience/AI_design.temp.html
          http://pobox.com/~sentience/sing_analysis.html
Disclaimer:  Unless otherwise specified, I'm not telling you
everything I think I know.
