From: "Alex F. Bokov" <firstname.lastname@example.org>
> Executive summary: corporations and governments (I'll charitably call
> them collective intelligences, or CIs) are a model system for how
> humans can coexist with AI. Want to make AI friendly? Come up with
> a way to make CIs friendly, and you'll have something resembling a
> game plan.
but I'd settle for sane AI.
Speaking of collective intelligence, here's some background:
There are many information-processing problems that can only be solved by the
joint action of large communities of computers each running a sophisticated
machine learning algorithm, where those algorithms are not subject to
centralized, global control. Examples are routing of air traffic, control of
swarms of spacecraft, routing of packets across the internet, and
communication between the multiple processors in a modern computer. There are
also many instances of natural systems that address such problems. Examples
here are ecosystems, economies, and the organelles within a living cell.
Such problems can be addressed with the emerging science of ``COllective
INtelligence'' (COIN), which is concerned with the design of a multi-agent
system with two properties. First, the agents are ``selfish'' in that they act
to try to optimize their own utilities, without explicit regard to cooperation
with other agents. Second, there is a well-specified global objective, and we
are confronted with the inverse problem of how to configure the system to
achieve that objective.
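To see why configuring such a system is a real design problem and not just a
matter of adding resources, here is a minimal sketch of a Braess-style road
network (the paradox is named further down in this post). Everything in it is
an assumption chosen for illustration: the 40 drivers, the 45-minute fixed
legs, the edge names SA/AE/SB/BE/AB, and the use of greedy best-response
updates to find where selfish drivers settle. None of it is taken from the
cited papers.

```python
# Toy Braess-style network. All numbers and names are illustrative assumptions.
FIXED_MINUTES = 45.0   # assumed cost of the two uncongested legs
DRIVERS = 40           # assumed population size

# Routes are tuples of edge names. "SA" and "BE" get slower with load.
ROUTES_WITHOUT = {"via_A": ("SA", "AE"), "via_B": ("SB", "BE")}
ROUTES_WITH = dict(ROUTES_WITHOUT, shortcut=("SA", "AB", "BE"))


def edge_minutes(edge, load):
    """Travel time on one edge given how many drivers use it."""
    if edge in ("SA", "BE"):
        return float(load)      # congestible: one minute per driver
    if edge == "AB":
        return 0.0              # the free shortcut
    return FIXED_MINUTES        # "AE" and "SB" ignore load


def time_if_joined(route, loads):
    """Time a driver would experience after adding itself to this route."""
    return sum(edge_minutes(e, loads.get(e, 0) + 1) for e in route)


def selfish_equilibrium(routes, max_sweeps=100):
    """Let each driver switch to a strictly faster route until nobody moves.

    Returns the average travel time, used here as the global objective.
    """
    names = list(routes)
    choice = [names[0]] * DRIVERS            # everyone starts on the first route
    loads = {e: DRIVERS for e in routes[names[0]]}
    for _ in range(max_sweeps):
        moved = False
        for d in range(DRIVERS):
            for e in routes[choice[d]]:      # take the driver off the road
                loads[e] -= 1
            best = choice[d]
            best_time = time_if_joined(routes[best], loads)
            for name in names:               # purely selfish comparison
                t = time_if_joined(routes[name], loads)
                if t < best_time:
                    best, best_time = name, t
            for e in routes[best]:           # put the driver back
                loads[e] = loads.get(e, 0) + 1
            if best != choice[d]:
                choice[d] = best
                moved = True
        if not moved:
            break
    total = sum(
        sum(edge_minutes(e, loads[e]) for e in routes[c]) for c in choice
    )
    return total / DRIVERS


if __name__ == "__main__":
    print("average minutes without shortcut:", selfish_equilibrium(ROUTES_WITHOUT))
    print("average minutes with shortcut:   ", selfish_equilibrium(ROUTES_WITH))
```

With these assumed numbers, adding the free shortcut makes every selfish
driver worse off, even though each individual switch was an improvement for
the driver who made it. The global objective cannot be met just by handing
the agents more options.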
In particular, we are interested in such collectives in which each agent runs
a reinforcement learning (RL) algorithm. Rather than use a conventional
modeling approach (e.g., model the system dynamics, and hand-tune agents to
cooperate), we aim to solve the collective design problem
implicitly, via the ``adaptive'' character of the RL algorithms of each of the
agents. This approach introduces an entirely new, profound design problem:
Assuming the RL algorithms are able to achieve high rewards, what reward
functions for the individual agents will, when pursued by those agents, result
in high world utility? In other words, what reward functions will best ensure
that we do not have phenomena like the tragedy of the commons, Braess's
paradox, or the liquidity trap?
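The reward-design question can be made concrete with a rough sketch of a bar
problem in the spirit of El Farol. Several things here are assumptions, not
the method of the papers listed below: the crowding curve x * exp(-x / c), the
parameter values, the two candidate rewards (`local_reward` and
`difference_reward` are hypothetical names), and above all the learner. To
keep the example deterministic, greedy best-response updates stand in for the
RL algorithm each agent would actually run.

```python
import math

# Toy bar problem. All parameters and function names are illustrative assumptions.
NIGHTS = 6        # nights an agent can choose from
AGENTS = 60       # population size
CAPACITY = 4.0    # attendance at which a night is most valuable overall


def night_value(attendance):
    """Total enjoyment produced by one night (assumed crowding curve)."""
    return attendance * math.exp(-attendance / CAPACITY)


def world_utility(counts):
    """Global objective: enjoyment summed over all nights."""
    return sum(night_value(x) for x in counts)


def local_reward(counts, night):
    """Candidate 1: the agent's own enjoyment of the night it attends."""
    return math.exp(-counts[night] / CAPACITY)


def difference_reward(counts, night):
    """Candidate 2: world utility minus world utility with this agent absent."""
    without_me = list(counts)
    without_me[night] -= 1
    return world_utility(counts) - world_utility(without_me)


def reward_if_attending(reward, counts, night):
    """Reward the agent would receive after adding itself to `night`."""
    trial = list(counts)
    trial[night] += 1
    return reward(trial, night)


def settle(reward, max_sweeps=100):
    """Agents take turns moving to a strictly better night until none moves.

    This greedy update is a stand-in for a learning algorithm, chosen so the
    outcome is reproducible. Returns the final attendance per night.
    """
    choice = [0] * AGENTS                      # everyone starts on night 0
    counts = [AGENTS] + [0] * (NIGHTS - 1)
    for _ in range(max_sweeps):
        moved = False
        for agent in range(AGENTS):
            current = choice[agent]
            counts[current] -= 1               # counts now exclude this agent
            best = current
            best_reward = reward_if_attending(reward, counts, current)
            for night in range(NIGHTS):
                r = reward_if_attending(reward, counts, night)
                if r > best_reward:
                    best, best_reward = night, r
            counts[best] += 1
            if best != current:
                choice[agent] = best
                moved = True
        if not moved:
            break
    return counts


if __name__ == "__main__":
    for name, reward in (("local", local_reward), ("difference", difference_reward)):
        counts = settle(reward)
        print(name, counts, round(world_utility(counts), 3))
```

Under these assumed numbers the local reward spreads the agents evenly, so
every night ends up overcrowded. The difference-style reward leaves most
agents on one sacrificed night so that the remaining nights sit at capacity,
and that configuration scores higher on the world utility. The sketch says
nothing about how easy each reward is to learn from, which is the harder part
of the design problem.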
Although still very young, research specifically concentrating on this design
problem has already resulted in successes in artificial domains, in particular
in packet-routing, the leader-follower problem, and in variants of Arthur's El
Farol bar problem. It is expected that as it matures and draws upon other
disciplines related to collective design, this research will greatly expand
the range of tasks addressable by human engineers. Moreover, in addition to
drawing on them, such a fully developed field of collective intelligence may
provide insight into other already established scientific fields, such as
mechanism design, economics, game theory, and population biology.
See also: Adaptive Intelligent Systems http://www.ksl.stanford.edu/projects/AIS/
David Wolpert and Kagan Tumer, "Avoiding Braess' Paradox through Collective Intelligence," (in review). Tech Report NASA-ARC-IC-99-124.
David Wolpert, Sergey Kirshner, Chris Merz and Kagan Tumer, "Adaptivity in Agent-Based Routing for Data Networks," Fourth International Conference on Autonomous Agents, Barcelona, Spain, June 2000 (to appear).
David Wolpert and Kagan Tumer, "An Introduction to Collective Intelligence," Tech Report NASA-ARC-IC-99-63. (A shorter version of this paper is to appear in: Jeffrey M. Bradshaw, editor, Handbook of Agent Technology, AAAI Press/MIT Press, 1999.)
David Wolpert, Kevin Wheeler and Kagan Tumer, "Collective Intelligence for Control of Distributed Dynamical Systems," Europhysics Letters, Vol. 49, No. 6, March 2000.
David H. Wolpert, Mike H. New, and Ann M. Bell, "Distorting Reward Functions to Improve Reinforcement Learning," (in review). Tech Report NASA-ARC-IC-99-71.
David Wolpert, Kevin Wheeler and Kagan Tumer, "General Principles of Learning-Based Multi-Agent Systems," Third International Conference on Autonomous Agents, pp. 77-83, Seattle, WA, May 1999.
David H. Wolpert, Kagan Tumer, and Jeremy Frank, "Using Collective Intelligence to Route Internet Traffic," Advances in Neural Information Processing Systems-11, pp. 952-958, Denver, CO, December 1998.
C. Ronald Kube and Hong Zhang, "Collective Robotic Intelligence."
--- --- --- --- ---
Useless hypotheses, etc.: consciousness, phlogiston, philosophy, vitalism, mind, free will, qualia, analog computing, cultural relativism, GAC, Cyc, Eliza, cryonics, individual uniqueness, ego, human values, scientific relinquishment
We move into a better future in proportion as science displaces superstition.