Eliezer S. Yudkowsky writes:
> It would be the truly dangerous part if we were dealing with corruptible
> humans. The whole principle here is that the AI doesn't *want* to disobey the
Ah, these weak, corruptible humans. Imagine an utterly inflexible SI
tyrant. Uh, rather give me the human one, at least there I have a
slight chance of changing his point of view.
> spirit of the rules.
Show me a rule, and I'll show you 10^3 ways to interpret it in
exactly the opposite way as intended. Show me two rules, and I'll show
you 10^6 ways to turn them around.
Reality doesn't like formal rules. They make for designs that are too brittle.
> "Don't try to be strict, the way you would be with an untrustworthy
> human. Don't worry about "loopholes". If you've succeeded worth
> a damn, your AI is Friendly enough not to *want* to exploit
> loopholes. If the Friendly action is a loophole, then "closing the
> loophole" means *you have told the AI to be unFriendly*, and that's
> much worse."
What, have you started citing yourself now? Without even waiting for historians?
> As for that odd scenario you posted earlier, curiosity - however necessary or
> unnecessary to a functioning mind - is a perfectly reasonable subgoal of
> Friendliness, and therefore doesn't *need* to have independent motive force.
Please translate that into normal language, because that is entirely
This archive was generated by hypermail 2b30 : Mon May 28 2001 - 09:50:14 MDT