Remember how, back in September, I said I was taking a bit of time off
from the paper "Coding a Transhuman AI" to write a small subpaper called
"Friendly AI"? Well, the first version of "Friendly AI" is now being
circulated for commentary, and "Friendly AI" is more than twice as long as
the rest of CaTAI put together - 740K versus 360K. Hence the eight
months. It's all here - everything from Bayesian programmer-affirmed
supergoals to ethical injunctions, human psychology and AI psychology,
self-improvement and directed evolution, causal rewrite semantics and
programmer-independence, subgoal stomps and wireheading. The answer to
your question IS here.
The Singularity Institute is pleased to announce that the "open commentary" version of the long-awaited "Friendly AI" is now available to the academic and futurist communities.
Complete table of contents:
Preface [2K]
INIT [1K]
4.1: Design requirements of Friendliness [50K]
  4.1.1: Envisioning perfection
  4.1.2: Assumptions "conservative" for Friendly AI
  4.1.3: Strong Singularity, seed AI, the Transition Guide...
  4.1.4: Goal-oriented behavior
Interlude: The story of a blob [12K]
4.2: Humans, AIs, and SIs: Beyond anthropomorphism [78K]
  4.2.1: Reinventing retaliation
  4.2.2: Selfishness is an evolved trait
    4.2.2.1: Pain and pleasure
      4.2.2.1.1: FoF: Wireheading 1
    4.2.2.2: Anthropomorphic capitalism
    4.2.2.3: Mutual friendship
    4.2.2.4: A final note on selfishness
  4.2.3: Observer-biased beliefs evolve in imperfectly deceptive social organisms
  4.2.4: Anthropomorphic political rebellion is just plain silly
  Interlude: Movie cliches about AIs
  4.2.5: Review of the AI Advantage
Interlude: Beyond the adversarial attitude [17K]
4.3: Design of Friendship systems [0K]
  4.3.1: Generic goal systems [78K]
    4.3.1.1: Generic goal system functionality
    4.3.1.2: Layered mistake detection
      4.3.1.2.1: FoF: Autonomic blindness
    4.3.1.3: FoF: Non-malicious mistake
    4.3.1.4: Injunctions
      4.3.1.4.1: Anthropomorphic injunctions
      4.3.1.4.2: Adversarial injunctions
      4.3.1.4.3: AI injunctions
    4.3.1.5: Ethical injunctions
      4.3.1.5.1: Anthropomorphic ethical injunctions
      4.3.1.5.2: AI ethical injunctions
    4.3.1.6: FoF: Subgoal stomp
    4.3.1.7: Emergent phenomena in generic goal systems
      4.3.1.7.1: Convergent subgoals
      4.3.1.7.2: Habituation
      4.3.1.7.3: Anthropomorphic satisfaction
  4.3.2: Seed AI goal systems [105K]
    4.3.2.1: Equivalence of self and self-image
    4.3.2.2: Coherence and consistency through self-production
      4.3.2.2.1: Look-ahead: Coherent supergoals
    4.3.2.3: Programmer-assisted Friendliness
      4.3.2.3.1: Unity of will
      4.3.2.3.2: Cooperative safeguard: "Preserve transparency"
      4.3.2.3.3: Absorbing assists into the system
      4.3.2.3.4: Programmer-created beliefs must be truthful...
    4.3.2.4: Wisdom tournaments
      4.3.2.4.1: Wisdom tournament structure
    4.3.2.5: FoF: Wireheading 2
    4.3.2.6: Directed evolution in goal systems
      4.3.2.6.1: Anthropomorphic evolution
      4.3.2.6.2: Evolution and Friendliness
      4.3.2.6.3: Conclusion: Evolution is not safe
    4.3.2.7: FAI hardware: The flight recorder
  Interlude: Why structure matters [7K]
  4.3.3: Friendly goal systems [4K]
    4.3.3.1: External reference semantics [67K]
      4.3.3.1.1: Bayesian sensory binding
      4.3.3.1.2: External objects and external referents...
        4.3.3.1.2.1: Flexibility of conclusions...
      4.3.3.1.3: Bayesian reinforcement
        4.3.3.1.3.1: Bayesian reinforcement...
        4.3.3.1.3.2: Perseverant affirmation...
      4.3.3.1.4: Bayesian programmer affirmation...
    4.3.3.2: Shaper/anchor semantics [59K]
      4.3.3.2.1: "Travel AI": Convergence begins to dawn
      4.3.3.2.2: Some forces that shape Friendliness
      4.3.3.2.3: Beyond rationalization
      4.3.3.2.4: Shapers of philosophies
        4.3.3.2.4.1: SAS: Correction of programmer errors
        4.3.3.2.4.2: SAS: Programmer-independence
        4.3.3.2.4.3: SAS: Grounding for ERS...
      4.3.3.2.5: Anchors
        4.3.3.2.5.1: Positive anchors
        4.3.3.2.5.2: Negative anchors
        4.3.3.2.5.3: Anchor abuse
      4.3.3.2.6: Shaper/anchor semantics require intelligence...
    4.3.3.3: Causal rewrite semantics [37K]
      4.3.3.3.1: The physicalist explanation of Friendly AIs
      4.3.3.3.2: Causal rewrites and extraneous causes
      4.3.3.3.3: The rule of derivative validity
      4.3.3.3.4: Truly perfect Friendliness
      4.3.3.3.5: The acausal level
      4.3.3.3.6: Renormalization...
    4.3.3.4: The secret actual definition of Friendliness [8K]
      4.3.3.4.1: Requirements for "sufficient" convergence
  4.3.4: Developmental Friendliness [28K]
    4.3.4.1: Teaching Friendliness content
      4.3.4.1.1: Trainable differences for causal rewrites
    4.3.4.2: Commercial Friendliness and research Friendliness
      4.3.4.2.1: When Friendliness becomes necessary
      4.3.4.2.2: Evangelizing Friendliness
    4.3.4.3: "In case of Singularity, break glass"...
      4.3.4.3.1: The Bayesian Boundary
      4.3.4.3.2: Controlled ascent
  Interlude: Of Transition Guides and Sysops [10K]
    The Transition Guide
    The Sysop Scenario
4.4: Policy implications [34K]
  4.4.1: Comparative analyses
    4.4.1.1: FAI relative to other technologies
    4.4.1.2: FAI relative to computing power
    4.4.1.3: FAI relative to unFriendly AI
    4.4.1.4: FAI relative to social awareness
    4.4.1.5: Conclusions from comparative analysis
  4.4.2: Policies and effects
    4.4.2.1: Regulation (-)
    4.4.2.2: Relinquishment (-)
    4.4.2.3: Selective support (+)
  4.4.3: Recommendations
4.5: Miscellaneous [4K]
  4.5.1: Relevant literature
END [1K]
Appendix 4.A: Friendly AI Guides and References [0K]
  4.A.1: Indexed FAQ [27K]
  4.A.2: Complete Table of Contents [0K]
This is not the official launch of Friendly AI; this is the "open commentary" version we're circulating in the community first. However, you are politely requested to check the Indexed FAQ before sending in your commentary, since we've already heard quite a few questions about Friendly AI, and your comments may have already been taken into account.
--
Eliezer S. Yudkowsky                          http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence