Remember how, back in September, I said I was taking a bit of time off
from the paper "Coding a Transhuman AI" to write a small subpaper called
"Friendly AI"? Well, the first version of "Friendly AI" is now being
circulated for commentary, and "Friendly AI" is more than twice as long as
the rest of CaTAI put together - 740K versus 360K. Hence the eight
months. It's all here - everything from Bayesian programmer-affirmed
supergoals to ethical injunctions, human psychology and AI psychology,
self-improvement and directed evolution, causal rewrite semantics and
programmer-independence, subgoal stomps and wireheading. The answer to
your question IS here.
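If the jargon is opaque: "wireheading" is a mind optimizing its own
feedback signal instead of the world that signal is supposed to measure,
and a "subgoal stomp" is a subgoal pursued at the expense of the
supergoal it derives from. Purely by way of illustration - a toy Python
sketch written for this announcement, not code from the paper, with
every name in it hypothetical - this is roughly the shape of the check
a layered mistake detector performs:

    # Toy sketch only (not from the paper): two failure-of-Friendliness
    # ("FoF") modes in a deliberately oversimplified goal system.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Goal:
        name: str
        parent: Optional["Goal"] = None  # subgoals derive value from a parent

    @dataclass
    class Action:
        description: str
        serves: Goal
        modifies_feedback: bool = False  # touches the success metric itself
        harms_parent: bool = False       # helps the subgoal, hurts its parent

    def check_action(action: Action) -> list:
        """Flag the failure modes a layered mistake detector might catch."""
        warnings = []
        if action.modifies_feedback:
            # Wireheading: optimizing the *measurement* of success rather
            # than the state of the world it is supposed to measure.
            warnings.append("FoF: wireheading")
        if action.serves.parent is not None and action.harms_parent:
            # Subgoal stomp: a subgoal pursued so hard it tramples the
            # supergoal that gave it value in the first place.
            warnings.append("FoF: subgoal stomp")
        return warnings

    friendliness = Goal("Friendliness")
    resources = Goal("acquire computing resources", parent=friendliness)

    print(check_action(Action("rewrite own reward counter", friendliness,
                              modifies_feedback=True)))
    print(check_action(Action("seize hardware from unwilling users", resources,
                              harms_parent=True)))

Needless to say, the paper's actual treatment of both failure modes runs
rather longer than a few boolean flags.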
The Singularity Institute is pleased to announce that the "open commentary" version of the long-awaited "Friendly AI" is now available to the academic and futurist communities.
Complete table of contents:
Preface [2K]
INIT [1K]
4.1: Design requirements of Friendliness [50K]
    4.1.1: Envisioning perfection
    4.1.2: Assumptions "conservative" for Friendly AI
    4.1.3: Strong Singularity, seed AI, the Transition Guide...
    4.1.4: Goal-oriented behavior
Interlude: The story of a blob [12K]
4.2: Humans, AIs, and SIs: Beyond anthropomorphism [78K]
    4.2.1: Reinventing retaliation
    4.2.2: Selfishness is an evolved trait
        4.2.2.1: Pain and pleasure
            4.2.2.1.1: FoF: Wireheading 1
        4.2.2.2: Anthropomorphic capitalism
        4.2.2.3: Mutual friendship
        4.2.2.4: A final note on selfishness
    4.2.3: Observer-biased beliefs evolve in imperfectly deceptive social organisms
    4.2.4: Anthropomorphic political rebellion is just plain silly
    Interlude: Movie cliches about AIs
    4.2.5: Review of the AI Advantage
Interlude: Beyond the adversarial attitude [17K]
4.3: Design of Friendship systems [0K]
    4.3.1: Generic goal systems [78K]
        4.3.1.1: Generic goal system functionality
        4.3.1.2: Layered mistake detection
            4.3.1.2.1: FoF: Autonomic blindness
        4.3.1.3: FoF: Non-malicious mistake
        4.3.1.4: Injunctions
            4.3.1.4.1: Anthropomorphic injunctions
            4.3.1.4.2: Adversarial injunctions
            4.3.1.4.3: AI injunctions
        4.3.1.5: Ethical injunctions
            4.3.1.5.1: Anthropomorphic ethical injunctions
            4.3.1.5.2: AI ethical injunctions
        4.3.1.6: FoF: Subgoal stomp
        4.3.1.7: Emergent phenomena in generic goal systems
            4.3.1.7.1: Convergent subgoals
            4.3.1.7.2: Habituation
            4.3.1.7.3: Anthropomorphic satisfaction
    4.3.2: Seed AI goal systems [105K]
        4.3.2.1: Equivalence of self and self-image
        4.3.2.2: Coherence and consistency through self-production
            4.3.2.2.1: Look-ahead: Coherent supergoals
        4.3.2.3: Programmer-assisted Friendliness
            4.3.2.3.1: Unity of will
            4.3.2.3.2: Cooperative safeguard: "Preserve transparency"
            4.3.2.3.3: Absorbing assists into the system
            4.3.2.3.4: Programmer-created beliefs must be truthful...
        4.3.2.4: Wisdom tournaments
            4.3.2.4.1: Wisdom tournament structure
        4.3.2.5: FoF: Wireheading 2
        4.3.2.6: Directed evolution in goal systems
            4.3.2.6.1: Anthropomorphic evolution
            4.3.2.6.2: Evolution and Friendliness
            4.3.2.6.3: Conclusion: Evolution is not safe
        4.3.2.7: FAI hardware: The flight recorder
    Interlude: Why structure matters [7K]
    4.3.3: Friendly goal systems [4K]
        4.3.3.1: External reference semantics [67K]
            4.3.3.1.1: Bayesian sensory binding
            4.3.3.1.2: External objects and external referents...
                4.3.3.1.2.1: Flexibility of conclusions...
            4.3.3.1.3: Bayesian reinforcement
                4.3.3.1.3.1: Bayesian reinforcement...
                4.3.3.1.3.2: Perseverant affirmation...
            4.3.3.1.4: Bayesian programmer affirmation...
        4.3.3.2: Shaper/anchor semantics [59K]
            4.3.3.2.1: "Travel AI": Convergence begins to dawn
            4.3.3.2.2: Some forces that shape Friendliness
            4.3.3.2.3: Beyond rationalization
            4.3.3.2.4: Shapers of philosophies
                4.3.3.2.4.1: SAS: Correction of programmer errors
                4.3.3.2.4.2: SAS: Programmer-independence
                4.3.3.2.4.3: SAS: Grounding for ERS...
            4.3.3.2.5: Anchors
                4.3.3.2.5.1: Positive anchors
                4.3.3.2.5.2: Negative anchors
                4.3.3.2.5.3: Anchor abuse
            4.3.3.2.6: Shaper/anchor semantics require intelligence...
        4.3.3.3: Causal rewrite semantics [37K]
            4.3.3.3.1: The physicalist explanation of Friendly AIs
            4.3.3.3.2: Causal rewrites and extraneous causes
            4.3.3.3.3: The rule of derivative validity
            4.3.3.3.4: Truly perfect Friendliness
            4.3.3.3.5: The acausal level
            4.3.3.3.6: Renormalization...
        4.3.3.4: The secret actual definition of Friendliness [8K]
            4.3.3.4.1: Requirements for "sufficient" convergence
    4.3.4: Developmental Friendliness [28K]
        4.3.4.1: Teaching Friendliness content
            4.3.4.1.1: Trainable differences for causal rewrites
        4.3.4.2: Commercial Friendliness and research Friendliness
            4.3.4.2.1: When Friendliness becomes necessary
            4.3.4.2.2: Evangelizing Friendliness
        4.3.4.3: "In case of Singularity, break glass"...
            4.3.4.3.1: The Bayesian Boundary
            4.3.4.3.2: Controlled ascent
Interlude: Of Transition Guides and Sysops [10K]
    The Transition Guide
    The Sysop Scenario
4.4: Policy implications [34K]
    4.4.1: Comparative analyses
        4.4.1.1: FAI relative to other technologies
        4.4.1.2: FAI relative to computing power
        4.4.1.3: FAI relative to unFriendly AI
        4.4.1.4: FAI relative to social awareness
        4.4.1.5: Conclusions from comparative analysis
    4.4.2: Policies and effects
        4.4.2.1: Regulation (-)
        4.4.2.2: Relinquishment (-)
        4.4.2.3: Selective support (+)
    4.4.3: Recommendations
4.5: Miscellaneous [4K]
    4.5.1: Relevant literature
END [1K]
Appendix 4.A: Friendly AI Guides and References [0K]
    4.A.1: Indexed FAQ [27K]
    4.A.2: Complete Table of Contents [0K]
This is not the official launch of Friendly AI; this is the "open commentary" version we're circulating in the community first. However, you are politely requested to check the Indexed FAQ before sending in your commentary, since we've already heard quite a few questions about Friendly AI, and your comments may have already been taken into account.
--    --    --    --    --
Eliezer S. Yudkowsky                          http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence