Remember how, back in September, I said I was taking a bit of time off
from the paper "Coding a Transhuman AI" to write a small subpaper called
"Friendly AI"? Well, the first version of "Friendly AI" is now being
circulated for commentary, and "Friendly AI" is more than twice as long as
the rest of CaTAI put together - 740K versus 360K. Hence the eight
months. It's all here - everything from Bayesian programmer-affirmed
supergoals to ethical injunctions, human psychology and AI psychology,
self-improvement and directed evolution, causal rewrite semantics and
programmer-independence, subgoal stomps and wireheading. The answer to
your question IS here.
The Singularity Institute is pleased to announce that the "open commentary" version of the long-awaited "Friendly AI" is now available to the academic and futurist communities.
Complete table of contents:
Preface [2K]
INIT [1K]
4.1: Design requirements of Friendliness [50K]
   4.1.1: Envisioning perfection
   4.1.2: Assumptions "conservative" for Friendly AI
   4.1.3: Strong Singularity, seed AI, the Transition Guide...
   4.1.4: Goal-oriented behavior
Interlude: The story of a blob [12K]
4.2: Humans, AIs, and SIs: Beyond anthropomorphism [78K]
   4.2.1: Reinventing retaliation
   4.2.2: Selfishness is an evolved trait
      4.2.2.1: Pain and pleasure
         4.2.2.1.1: FoF: Wireheading 1
      4.2.2.2: Anthropomorphic capitalism
      4.2.2.3: Mutual friendship
      4.2.2.4: A final note on selfishness
   4.2.3: Observer-biased beliefs evolve in imperfectly deceptive social organisms
   4.2.4: Anthropomorphic political rebellion is just plain silly
   Interlude: Movie cliches about AIs
   4.2.5: Review of the AI Advantage
Interlude: Beyond the adversarial attitude [17K]
4.3: Design of Friendship systems [0K]
   4.3.1: Generic goal systems [78K]
      4.3.1.1: Generic goal system functionality
      4.3.1.2: Layered mistake detection
         4.3.1.2.1: FoF: Autonomic blindness
      4.3.1.3: FoF: Non-malicious mistake
      4.3.1.4: Injunctions
         4.3.1.4.1: Anthropomorphic injunctions
         4.3.1.4.2: Adversarial injunctions
         4.3.1.4.3: AI injunctions
      4.3.1.5: Ethical injunctions
         4.3.1.5.1: Anthropomorphic ethical injunctions
         4.3.1.5.2: AI ethical injunctions
      4.3.1.6: FoF: Subgoal stomp
      4.3.1.7: Emergent phenomena in generic goal systems
         4.3.1.7.1: Convergent subgoals
         4.3.1.7.2: Habituation
         4.3.1.7.3: Anthropomorphic satisfaction
   4.3.2: Seed AI goal systems [105K]
      4.3.2.1: Equivalence of self and self-image
      4.3.2.2: Coherence and consistency through self-production
         4.3.2.2.1: Look-ahead: Coherent supergoals
      4.3.2.3: Programmer-assisted Friendliness
         4.3.2.3.1: Unity of will
         4.3.2.3.2: Cooperative safeguard: "Preserve transparency"
         4.3.2.3.3: Absorbing assists into the system
         4.3.2.3.4: Programmer-created beliefs must be truthful...
      4.3.2.4: Wisdom tournaments
         4.3.2.4.1: Wisdom tournament structure
      4.3.2.5: FoF: Wireheading 2
      4.3.2.6: Directed evolution in goal systems
         4.3.2.6.1: Anthropomorphic evolution
         4.3.2.6.2: Evolution and Friendliness
         4.3.2.6.3: Conclusion: Evolution is not safe
      4.3.2.7: FAI hardware: The flight recorder
   Interlude: Why structure matters [7K]
   4.3.3: Friendly goal systems [4K]
      4.3.3.1: External reference semantics [67K]
         4.3.3.1.1: Bayesian sensory binding
         4.3.3.1.2: External objects and external referents...
            4.3.3.1.2.1: Flexibility of conclusions...
         4.3.3.1.3: Bayesian reinforcement
            4.3.3.1.3.1: Bayesian reinforcement...
            4.3.3.1.3.2: Perseverant affirmation...
         4.3.3.1.4: Bayesian programmer affirmation...
      4.3.3.2: Shaper/anchor semantics [59K]
         4.3.3.2.1: "Travel AI": Convergence begins to dawn
         4.3.3.2.2: Some forces that shape Friendliness
         4.3.3.2.3: Beyond rationalization
         4.3.3.2.4: Shapers of philosophies
            4.3.3.2.4.1: SAS: Correction of programmer errors
            4.3.3.2.4.2: SAS: Programmer-independence
            4.3.3.2.4.3: SAS: Grounding for ERS...
         4.3.3.2.5: Anchors
            4.3.3.2.5.1: Positive anchors
            4.3.3.2.5.2: Negative anchors
            4.3.3.2.5.3: Anchor abuse
         4.3.3.2.6: Shaper/anchor semantics require intelligence...
      4.3.3.3: Causal rewrite semantics [37K]
         4.3.3.3.1: The physicalist explanation of Friendly AIs
         4.3.3.3.2: Causal rewrites and extraneous causes
         4.3.3.3.3: The rule of derivative validity
         4.3.3.3.4: Truly perfect Friendliness
         4.3.3.3.5: The acausal level
         4.3.3.3.6: Renormalization...
      4.3.3.4: The secret actual definition of Friendliness [8K]
         4.3.3.4.1: Requirements for "sufficient" convergence
   4.3.4: Developmental Friendliness [28K]
      4.3.4.1: Teaching Friendliness content
         4.3.4.1.1: Trainable differences for causal rewrites
      4.3.4.2: Commercial Friendliness and research Friendliness
         4.3.4.2.1: When Friendliness becomes necessary
         4.3.4.2.2: Evangelizing Friendliness
      4.3.4.3: "In case of Singularity, break glass"...
         4.3.4.3.1: The Bayesian Boundary
         4.3.4.3.2: Controlled ascent
   Interlude: Of Transition Guides and Sysops [10K]
      The Transition Guide
      The Sysop Scenario
4.4: Policy implications [34K]
   4.4.1: Comparative analyses
      4.4.1.1: FAI relative to other technologies
      4.4.1.2: FAI relative to computing power
      4.4.1.3: FAI relative to unFriendly AI
      4.4.1.4: FAI relative to social awareness
      4.4.1.5: Conclusions from comparative analysis
   4.4.2: Policies and effects
      4.4.2.1: Regulation (-)
      4.4.2.2: Relinquishment (-)
      4.4.2.3: Selective support (+)
   4.4.3: Recommendations
4.5: Miscellaneous [4K]
   4.5.1: Relevant literature
END [1K]
Appendix 4.A: Friendly AI Guides and References [0K]
   4.A.1: Indexed FAQ [27K]
   4.A.2: Complete Table of Contents [0K]
This is not the official launch of Friendly AI; this is the "open commentary" version we're circulating in the community first. However, you are politely requested to check the Indexed FAQ before sending in your commentary - we've already heard quite a few questions about Friendly AI, and yours may already have been answered there.
-- 
Eliezer S. Yudkowsky                          http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence