Re: Let's hear Eugene's ideas

From: Eugene Leitl (eugene.leitl@lrz.uni-muenchen.de)
Date: Fri Oct 06 2000 - 05:42:20 MDT


Damien Broderick writes:

> I do see the force of the claim, but it might still be wrong, since
> emergence of effective AI might, after all, depend on ramping up the
> substrate technologies by brute force via Moore's law.

We seem to have essentially two bottlenecks: 1) having fast enough
hardware with enough bits to represent stuff and 2) a way to "program"
that hardware so that it acts intelligently (the two are not
unrelated, since the shape of the hardware has to support the
"programming" paradigm, and vice versa, due to constraints of
computational physics).

So we have to execute a two-stage approach: use the currently
available hardware to search for a general approach to making digital
systems act intelligently (using input from neuroscience and ALife
simulation as a further source of constraints), before embarking on a
self-enhancement process, where the global optimum for the
hardware/"software" pair is found.

As far as I can see, the second part of the bootstrap process can be
handled by the entities themselves, is potentially fraught with
positive feedback loops, and hence is something to be done very, very
carefully.

Let's look at biology first, as a major source of constraints, because
we know it works, and that way we don't have to sample the space of
duds, which must be huge. Seen from a distance, we've got an
anisotropic three-dimensional excitation/inhibition medium. Stepping a
bit closer, we've got an electrochemical spiking network, with
connection density decaying with distance. Lots of local connections,
some mid-range connections, very few long-distance connections.
Unfortunately, there is no clean separation between hardware and
software. You've got system state, with a biological chronone (time
quantum) of about 1 ms. No adaptation occurs at this scale yet. The
adaptive hardware has a number of responses, with characteristic times
in the sec, hour and day range. The adaptive processes are potentially
extremely complex, because they rely on a large number of diverse
diffusible chemicals, expression of genes, cytoskeletal activity, and
whatnot. Figuring it all out in detail will be hard. But necessary,
because without a detailed understanding of it all, there will be no
uploads.
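
To make that picture a bit more concrete, here is a toy sketch in
Python/numpy. Every parameter is an invented round number, nothing is
fitted to real neural data: units on a 3d grid, connection probability
falling off with distance, a 1 ms step for the fast state, and a much
slower weight-adaptation rule standing in for the sec-to-day range
processes.

import numpy as np

rng = np.random.default_rng(0)

# Wiring: units on a 3d grid, connection probability decays with
# distance: lots of local links, some mid-range, few long-range ones.
N_SIDE = 6
coords = np.array([(x, y, z) for x in range(N_SIDE)
                   for y in range(N_SIDE) for z in range(N_SIDE)], float)
N = len(coords)                                  # 216 units
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
p_conn = np.exp(-dist / 2.0)
np.fill_diagonal(p_conn, 0.0)
# Mixed-sign weights: a crude excitation/inhibition medium.
W = (rng.random((N, N)) < p_conn) * rng.normal(0.0, 0.1, (N, N))

# Fast state: 1 ms "chronone", leaky integrate-and-fire style.
DT, TAU_V, V_THRESH = 1e-3, 20e-3, 1.0
v = np.zeros(N)
spikes = np.zeros(N, bool)
total_spikes = 0

# Slow adaptation: seconds-range time constant, crude Hebbian-ish rule.
TAU_W, LEARN_RATE = 10.0, 0.01

for step in range(2000):                         # 2 s of simulated time
    drive = rng.random(N) < 0.05                 # unstructured background input
    v += DT / TAU_V * (-v) + 0.3 * drive + W @ spikes
    spikes = v > V_THRESH
    v[spikes] = 0.0
    total_spikes += spikes.sum()
    W += DT / TAU_W * (LEARN_RATE * np.outer(spikes, spikes) - 1e-3 * W)

print("mean firing rate [Hz]:", total_spikes / (N * 2.0))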

We can try to figure it out bottom-up, by imaging a biological system
at molecular resolution and plugging the results into a numerical
simulation of the system at molecular detail (we can't do that yet) or
into a mesoscale model (eCell, Virtual Cell); we can look at what
morphogenesis does in simple critters, and use in vivo recording with
patch clamping, and whatnot. We can take a middle approach, imaging
the critter at nm resolution, plugging painstakingly digitized
neuroanatomy into the simulation and trying to fit experimental and
"ab initio" data (i.e. results from the molecular level of theory)
into the simulation, say at the compartmental level. We can also try
to culture neurons in vitro and record their activity in real time,
using voltage- and Ca-concentration-sensitive dyes (taking in a large
body of data at a glance) and multielectrode arrays. Partly, this
approach can also work in vivo. At the higher level, you try to map
neuronal pathways, and correlate activity with stimuli using fMRI,
MEG, and the like.
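
As an illustration of what "compartmental level" means here, a
passive-cable toy in Python/numpy; the compartment count and all the
constants are invented round numbers, not fitted to any real neuron.

import numpy as np

# Passive cable as a chain of compartments, forward Euler integration.
N_COMP = 50
DT = 10e-6            # 10 us step [s]
C_M = 100e-12         # membrane capacitance per compartment [F]
G_LEAK = 5e-9         # leak conductance per compartment [S]
G_AXIAL = 50e-9       # axial conductance between neighbours [S]
E_LEAK = -65e-3       # resting potential [V]

v = np.full(N_COMP, E_LEAK)
i_inject = np.zeros(N_COMP)
i_inject[0] = 100e-12  # 100 pA injected into the first compartment

for step in range(int(50e-3 / DT)):          # 50 ms of simulated time
    axial = np.zeros(N_COMP)                 # axial currents, sealed ends
    axial[1:] += G_AXIAL * (v[:-1] - v[1:])
    axial[:-1] += G_AXIAL * (v[1:] - v[:-1])
    dv = (i_inject + axial - G_LEAK * (v - E_LEAK)) / C_M
    v += DT * dv

print("voltage along the first 10 compartments after 50 ms [mV]:")
print(np.round(v[:10] * 1e3, 2))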

This sounds complex, and it is. The ALife approach says: this is a
particular long-term adaptation to a particular rich system, and it's
just too damn complex to figure out, and even if we succeed it will be
of limited relevance to us anyway, since the constraints of our
hardware (whether current or future) will certainly be different. So
they will select a (hopefully sufficiently rich) starter system
already capable of information processing, tweak the parameters with
evolutionary algorithms, and hope they haven't pulled a dud. Iterate
until success.
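
The core loop of that programme is simple enough to sketch; here it is
in Python, with a stand-in fitness function (the real one would of
course come from running the candidate system on some task).

import random

GENOME_LEN, POP_SIZE, GENERATIONS = 16, 64, 200

def fitness(genome):
    # Stand-in objective: no real meaning, just something to climb.
    return -sum((g - 0.7) ** 2 for g in genome)

def mutate(genome, rate=0.1, sigma=0.05):
    # Perturb a few parameters of the starter system.
    return [g + random.gauss(0, sigma) if random.random() < rate else g
            for g in genome]

population = [[random.random() for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[:POP_SIZE // 4]          # truncation selection
    population = parents + [mutate(random.choice(parents))
                            for _ in range(POP_SIZE - len(parents))]

print("best fitness:", round(fitness(max(population, key=fitness)), 4))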

In neuronal biology, a part of a system (a patch of excitable
membrane, a synapse, a neuron) can make its current state impact a
remote part of the system, and vice versa. The impact depends on the
state of self and other, and on how the medium between self and other
is structured, which is a function of history. Part of it is generated
straight from neuronal embryomorphogenesis, part is shaped emergently
by electrochemical activity in later phases (i.e. no longer explicitly
encoded by the genome, which is already true for morphogenesis, since
the system already has a lot of state and "computes"), and part of it
is further shaped by active learning and adaptation outside of the
womb. When an animal enters the world it is anything but randomly
wired. It comes intensely prebiased, having recapitulated all the
former successful stages of design that evolution already came up
with. For lower animals, this is already enough. They're a conserve,
the distilled result of a large number of former learners.

If I were co-evolving critters, I would do the following things. I
would reserve an input vector and an output vector, one for sensors,
one for motorics. This is the system's only interface to reality,
whether real or emulated. I would add an adaptively expanding envelope
of sufficiently rich machinery to map sensorics to motorics, and to
represent state. Lots of state. Huge gobs of state, and then some. The
more state, the less primitive the critter. To encode the above
framework, I would use networks of automata. An automaton has state,
and means to impress that state upon that of distant ones. Which ones,
depends on whether or not it is connected to them. I would leave slack
for the emergence of multiple classes of automata, differing in the
amount of state, the exact way that state is impacted by connected
automata, the number of connections, and the way these connections are
prewired. To reduce the number of mutable bits and hence the search
space, I would add genetically encoded machinery not only encoding the
behaviour of these things (including how new connections are formed
and destroyed, and how the impact is modulated over time), but also
how they are initially prepatterned. Stepping back one step, you also
have to allow the genome to operate on how that pattern-generating
machinery generates a pattern, for maximum flexibility. The mutation
function will also have to be part of the mutable parts: it needs to
learn which parts of the genome need to be modified, when, and how
often. This is a preliminary recipe, likely to have lots of arbitrary
constraints still built in, and it will probably need to be revised
and expanded several times.
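
A rough sketch of the data structures this recipe suggests, in Python.
The class and field names are my invention, and the update rule is
just a placeholder, not a worked-out design.

import random
from dataclasses import dataclass, field

@dataclass
class Genome:
    automaton_rules: list        # per-class update parameters
    wiring_rules: list           # how initial connections get prepatterned
    mutation_rate: float = 0.05  # itself mutable, as argued above

    def mutate(self):
        # The mutation function operates on everything, itself included.
        jitter = lambda xs: [x + random.gauss(0, 0.1)
                             if random.random() < self.mutation_rate else x
                             for x in xs]
        rate = max(1e-3, self.mutation_rate + random.gauss(0, 0.01))
        return Genome(jitter(self.automaton_rules),
                      jitter(self.wiring_rules), rate)

@dataclass
class Automaton:
    state: list                                 # "lots of state"
    neighbours: list = field(default_factory=list)

@dataclass
class Critter:
    genome: Genome
    sensors: list                               # the only interface to reality,
    motors: list                                # whether real or emulated
    automata: list

    def step(self, sensor_input):
        self.sensors = list(sensor_input)
        # Feed sensor values into the first automata.
        for value, a in zip(self.sensors, self.automata):
            a.state[0] += value
        # Update every automaton from its own state and its neighbours',
        # modulated by the genome; just a placeholder mixing rule.
        k = self.genome.automaton_rules[0]
        for a in self.automata:
            pooled = sum(sum(n.state) for n in a.neighbours)
            a.state = [s + k * (pooled - s) for s in a.state]
        self.motors = [a.state[-1] for a in self.automata[-len(self.motors):]]
        return self.motors

# Tiny usage example: four automata wired in a ring, one sensor, one motor.
autos = [Automaton(state=[0.0, 0.0]) for _ in range(4)]
for i, a in enumerate(autos):
    a.neighbours = [autos[(i + 1) % 4]]
critter = Critter(Genome([0.2], [0.5]), sensors=[0.0], motors=[0.0],
                  automata=autos)
print(critter.step([1.0]))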

This has been relatively abstract, but here the hardware-level,
computational-physics-derived constraints start to kick in. We know
that computronium, a 3d hardware implementation of a reversible
cellular automaton, is provably (Nanotechnology 9 (1998) pp. 162-176)
the ultimate non-qubit computer. Luckily, cellular automata can be
written for current hardware that are bottlenecked only by (burst)
memory bandwidth, and, because only the surface needs to be
communicated to neighbour nodes (assuming only locally interconnected
nearest-neighbour nodes arranged on a 3d lattice), processing within
reason scales O(const) with respect to total size as counted in cells,
as long as the time necessary to exchange the surface information is
negligible in comparison to the time it takes the CPU to process the
volume. This indicates the need to go to finer grains and
higher-bandwidth/lower-latency interconnects, preferring large DSP
clusters to conventional Beowulfs. In fact, for such hardware it might
make sense to lift the nearest-neighbour-only automaton constraint,
and allow any node within a simulated volume to instantly send
information to any other automaton in a volume simulated by an
adjacent Beowulf/DSP node. This reduces the minimal stimulus-response
latency, exploiting particularities of a given architecture (memory
accesses within a node have a temporally flat profile, which only
works because the limits of computational physics are still remote).
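
A quick back-of-envelope for the surface-vs-volume argument, in
Python. The per-cell update cost and the per-face link bandwidth are
made-up figures; the point is only how the communication/compute ratio
shrinks as the per-node volume grows.

# Each node owns a side^3 block of cells and ships its 6 faces to its
# nearest neighbours each step.
CELL_BYTES = 1            # one byte of state per cell
UPDATE_NS = 10            # assumed cost to update one cell [ns]
LINK_BYTES_PER_S = 100e6  # assumed per-face link bandwidth [bytes/s]

for side in (32, 64, 128, 256):
    volume = side ** 3                       # cells computed per node
    surface = 6 * side ** 2                  # cells shipped to neighbours
    t_compute = volume * UPDATE_NS * 1e-9
    t_comm = surface * CELL_BYTES / LINK_BYTES_PER_S
    print(f"side={side:4d}  compute={t_compute*1e3:8.2f} ms  "
          f"comm={t_comm*1e3:8.2f} ms  comm/compute={t_comm/t_compute:.3f}")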

It is easy to see that custom architectures based on embedded-RAM
technology can provide a substantial speedup (I have a good paper
sitting on the hard drive somewhere which I can't find right now) over
the above architectures, eventually resulting in 2 1/2 d CA hardware
completely covering the surface of an e.g. 300 mm wafer. The best way
to implement this would probably be asynchronous/clockless analog
CMOS. You could probably fit a complete cell in ~10 um x 10 um or less
of silicon real estate, allowing you to fit ~10 kCells per mm^2, or
almost a billion cells on a wafer. That's a lot of state, and locally
(nearest neighbours) those cells could change state at up to a 0.1 THz
rate, provided we can dissipate that much heat from a surface. Even
the most primitive Langmuir-Blodgett-deposited molecular
implementation of the above should give us an instant two orders of
magnitude, by shrinking the size of the cell to about a micron. Of
course this scales to volume integration, resulting in another three
to four orders of magnitude. We're talking about 10^9 cells in a cubic
millimeter, or ~10^12 cells in a sugar cube. You can do much better
with mechanosynthetically deposited computronium.
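
Re-deriving those density figures in Python; I'm reading the cell
footprint as roughly 10 um on a side, and a sugar cube as ~1 cm^3.

import math

cell_side_um = 10.0
cells_per_mm2 = (1000.0 / cell_side_um) ** 2      # ~1e4 = 10 kCells/mm^2
wafer_area_mm2 = math.pi * (300.0 / 2) ** 2       # 300 mm wafer, ~7.1e4 mm^2
cells_per_wafer = cells_per_mm2 * wafer_area_mm2  # ~7e8, "almost a billion"

# Molecular implementation: ~1 um cells, two orders of magnitude denser.
cells_per_mm2_molecular = (1000.0 / 1.0) ** 2     # 1e6 cells/mm^2

# Volume integration at ~1 um^3 per cell.
cells_per_mm3 = 1000 ** 3                         # 1e9 per cubic millimeter
cells_per_cm3 = cells_per_mm3 * 1000              # ~1e12 in a sugar cube

print(f"{cells_per_mm2:.1e} cells/mm^2, {cells_per_wafer:.1e} per wafer")
print(f"molecular: {cells_per_mm2_molecular:.1e} cells/mm^2")
print(f"{cells_per_mm3:.1e} cells/mm^3, {cells_per_cm3:.1e} per cm^3")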

Clearly, this architecture scales over several implementation
technologies, so it would make sense to do it, even if initially only
in software. By attempting to evolutionarily grow CA circuits, running
on custom hardware (an FPGA-accelerated cellular automaton machine),
that control the behaviour of a small robot, de Garis shows he's
smart. He's far from guaranteed to succeed, partly because he's alone
(the rest of the AI community thinks he's on crack due to his writing
on artilects, hence automagically marking his methods as tainted),
partly because he might be missing one or two critical components, and
also because his hardware is comparatively quite puny (in his place, I
would have ordered a rack full of 4-8 MBit MMX-capable SHARC DSPs,
gluelessly wired on a cubic lattice, and interfaced that to a PC or a
small Beowulf for control and I/O). In volume, these things are so
cheap that he would have been able to afford 10^4..10^5 nodes. A
module of 10^3 of them would collectively pack ~1 GByte of on-die SRAM
memory while fitting into the volume of a big tower, dissipating ~2..5
kW (still manageable with lamellar metal heat dissipators in strong
airflow). Here's your ~billion (~1000^3) or so byte-sized CA cells (a
cube of ~100^3, about a MByte in every node), probably with a refresh
rate of ~0.1 kHz, depending on how fast the CPU interconnect is. This
would have required less development work than the custom FPGA box
he's currently using, and provided considerably more kick -- all-purpose
kick, allowing you to reuse the thing in case this particular approach
blows you a raspberry, plus future opportunities to profit from
large-volume part production (DSPs are commodity and will eventually
move to embedded RAM and more on-die parallelism, as well as possibly
FPGA, which you just can't do by wiring FPGAs to RAM).
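
And the sizing arithmetic for that hypothetical DSP module, again in
Python, with the per-node figures as assumptions (8 Mbit of on-die
SRAM read as ~1 MByte, 2..5 W dissipated per DSP).

NODES = 1000                     # a 10 x 10 x 10 cubic lattice of DSPs
SRAM_BYTES_PER_NODE = 1 << 20    # ~1 MByte of on-die memory per node
CELL_BYTES = 1                   # byte-sized CA cells

cells_per_node = SRAM_BYTES_PER_NODE // CELL_BYTES   # ~1e6, i.e. ~100^3
cells_total = cells_per_node * NODES                 # ~1e9, i.e. ~1000^3
sram_total_gb = NODES * SRAM_BYTES_PER_NODE / 2**30  # ~1 GByte collectively

watts_per_node = (2, 5)          # assumed per-DSP dissipation range [W]
power_kw = tuple(w * NODES / 1000 for w in watts_per_node)

print(f"{cells_per_node:.1e} cells/node, {cells_total:.1e} cells total")
print(f"{sram_total_gb:.2f} GByte SRAM, {power_kw[0]}..{power_kw[1]} kW")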

As long as Eliezer doesn't start packing hardware of this calibre and
beyond (way beyond, actually), and doesn't apply the above or similar
technologies with the intent to breed an SI, he's below my radar
screen ;)


