Eliezer S. Yudkowsky writes:
 > It would be the truly dangerous part if we were dealing with corruptible
 > humans.

Ah, these weak, corruptible humans. Imagine an utterly inflexible SI
tyrant. Uh, rather give me the human one; at least there I have a
slight chance of changing his point of view.

 > The whole principle here is that the AI doesn't *want* to disobey the
 > spirit of the rules.
 
Show me a rule, and I'll show you 10^3 ways to interpret it in
exactly the opposite way from what was intended. Show me two rules,
and I'll show you 10^6 ways to turn them around.
Reality doesn't like formal rules. They make for overly brittle designs.
 
 >     "Don't try to be strict, the way you would be with an untrustworthy
 >      human.  Don't worry about "loopholes".  If you've succeeded worth
 >      a damn, your AI is Friendly enough not to *want* to exploit
 >      loopholes.  If the Friendly action is a loophole, then "closing the
 >      loophole" means *you have told the AI to be unFriendly*, and that's 
 >      much worse."
 
What, have you started citing yourself now? Without even waiting for historians?
 > As for that odd scenario you posted earlier, curiosity - however necessary or
 > unnecessary to a functioning mind - is a perfectly reasonable subgoal of
 > Friendliness, and therefore doesn't *need* to have independent motive force.
Please translate that into normal language, because that is entirely
incomprehensible.