Re: FAQ Additions (Posthuman mind control)

Nick Bostrom
Wed, 24 Feb 1999 20:34:53 +0000

Eliezer S. Yudkowsky wrote:

> Your posthumans will find their own goals. In any formal goal system
> that uses first-order probabilistic logic, there are lines of logic that
> will crank them out, totally independent of what goals they start with.
> I'm not talking theory; I'm talking a specific formal result I've
> produced by manipulating a formal system.


> It's like a heat engine. Choices are powered by differential
> desirabilities. If you think the real, factual landscape is flat, you
> can impose a set of arbitrary (or even inconsistent) choices without
> objection. But we don't *know* what the real landscape is, and the
> probabilistic landscape *isn't flat*. The qualia of joy have a higher
> probability of being "good" than the qualia of pain. Higher
> intelligence is more likely to lead to an optimal future.

Well, if we allow the SIs to have completely unfettered intellects, then it should be all right with you if we require that they have respect for human rights as a fundamental value, right? For if there is some "objective morality" then they should discover that and change their fundamental values despite the default values we have given them. Would you be happy as long as we allow them full capacity to think about moral issues and (once we think they are intelligent enough not to make stupid mistakes) even allow them full control over their internal structure and motivations (i.e. make them autopotent)?

> Can we at least agree that you won't hedge the initial goals with
> forty-seven coercions, or put in any safeguards against changing the
> goals?

As indicated, yes, we could allow them to change their goals (though only after they are intelligent and knowledgeable enough that they know precisely what they are doing -- just as you wouldn't allow a small child to experiment with dangerous drugs).

The "coercions" would only be necessary for early generations of AI that don't have a full understanding of what they are doing. Take a robot that is cleaning your house: it could be useful for it to have an internal program that monitors its actions and "coerces" it to pull back and shut down if it perceives that it is about to fall down a staircase or if it hears a human distress cry. (In fact, this kind of internal "coercion" is even useful in humans. Instinctive fear sometimes saves us from the consequences of some really stupid decisions.)
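To make the cleaning-robot case concrete, here is a minimal sketch in Python of a monitor that sits between the planner and the actuators and overrides whatever was planned. It is only an illustration under stated assumptions, not a proposal for how a real robot would be built: the names Percept, SafetyMonitor and CleaningRobot, the two sensor flags, and the action strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Percept:
    """Hypothetical sensor snapshot for one control step."""
    near_stair_edge: bool = False
    distress_cry_heard: bool = False


class SafetyMonitor:
    """Internal 'coercion': decides whether the planned action is vetoed."""

    def veto(self, percept: Percept) -> bool:
        # Either condition alone is enough to override the planner.
        return percept.near_stair_edge or percept.distress_cry_heard


class CleaningRobot:
    def __init__(self) -> None:
        self.monitor = SafetyMonitor()
        self.running = True

    def step(self, percept: Percept, planned_action: str) -> str:
        """Return the action actually taken on this control step."""
        if not self.running:
            # Once shut down, the robot stays shut down.
            return "shut_down"
        if self.monitor.veto(percept):
            # The monitor wins regardless of what the planner wanted.
            self.running = False
            return "pull_back_and_shut_down"
        return planned_action


robot = CleaningRobot()
first = robot.step(Percept(), "vacuum")
second = robot.step(Percept(near_stair_edge=True), "move_forward")
third = robot.step(Percept(), "vacuum")
```

The point of the design is that the monitor does not have to understand the robot's goals; it only checks a few conditions and has the final say while the robot is too simple to be trusted with that judgment itself.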

Nick Bostrom
Department of Philosophy, Logic and Scientific Method
London School of Economics