Re: Singularity optimization [Was: Colossus and the Singularity]

From: Anders Sandberg
Date: Sat Jan 27 2001 - 10:58:11 MST

Thanks Jim for an excellent commentary on the movie!

Here is a longish critique of fast AI bootstrapping inspired by the
movie and discussions on this list. It is a bit unpolished and
handwaving in places, but may be of interest to some of us (I guess
Eliezer will have fun with it :-).

Jim Fehlinger writes:

> An unexpected functional result -- the deduction of
> the existence of the Soviet computer, which not even the
> human intelligence analysts had discovered, and the
> observation by Forbin and his team of the increased
> speed of the computer, are the first indications that
> a positive feedback loop has begun, though Forbin either is not
> aware of or chooses not to mention the implications. Since
> Colossus does not at this point have the capability to alter its
> own hardware, it must be enhancing itself by rewriting its own
> software -- perhaps it was given (inadvertently or deliberately?)
> the rudiments of a "codic cortex" as described by Eliezer Yudkowsky
> in CaTAI!

My problem with this standard singularity scenario is that it is
really based on a kind of rationalist bootstrap model of knowledge
(and skill) acquisition. Essentially the AI is supposed to be sitting
by itself thinking, deducing or testing ways of improving its code
that make it better and better at this process. It is an appealing
idea for any programmer: we need only create the initial snowball to
get the avalanche going. But it is almost exactly the same approach
the rationalists tried for bootstrapping knowledge, and it doesn't
seem to work.

Why? The learning system needs to learn about its "environment" (in
this case the space of algorithms and how it translates into computer
code) and move relevant information (relevance is measured by the
system by its value functions) from this environment into itself. Just
making deductions doesn't seem to be an efficient way of doing this,
since they are based on the pool of knowledge already in the system; a
program cannot increase its algorithmic complexity just by extending
itself with code it writes. So what is left is empirical
experimentation with code (guided by deductions from already learned
stuff). The system needs to learn from its environment about many
things, including what kind of statistical environment it is: if it
is a smooth fitness landscape, code optimization is best done by
hill-climbing or something similar, while on a rugged but regular
landscape other algorithms are needed.
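
The smooth-versus-rugged point can be illustrated with a toy sketch.
Everything here is an assumption made for illustration: the
one-dimensional integer "landscape", the functions smooth and rugged,
and the step parameter are hypothetical and stand in for the far
larger space of programs.

```python
def hill_climb(f, x, lo, hi, step=1):
    """Greedy ascent on the integers lo..hi: move to the better
    neighbour (x - step or x + step) until neither improves f."""
    while True:
        neighbours = [n for n in (x - step, x + step) if lo <= n <= hi]
        if not neighbours:
            return x
        best = max(neighbours, key=f)
        if f(best) <= f(x):
            return x  # local optimum: no neighbour is strictly better
        x = best


def smooth(x):
    """Hypothetical smooth landscape with a single peak at x = 70."""
    return -(x - 70) ** 2


def rugged(x):
    """Same trend plus a bump at every multiple of 7, which creates
    many local optima (rugged, but regular)."""
    return smooth(x) + (500 if x % 7 == 0 else 0)


print(hill_climb(smooth, 3, 0, 100))          # 70: reaches the peak
print(hill_climb(rugged, 3, 0, 100))          # 7: stuck on the first bump
print(hill_climb(rugged, 0, 0, 100, step=7))  # 70: exploits the regularity
```

The last call only works because the period of the bumps was supplied
by hand; a self-improving system would have to learn that kind of
regularity from the landscape itself.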

The problem here with the search is that if the current program is P
(which maps integers to candidate new programs which are then
evaluated), the empirical search process is of the form Pnew = P(i*),
where i* = argmax_i V(P(i)) and V(P(i)) is the estimated value of
candidate P(i). Hill-climbing can be seen as having programs that
generate programs similar to themselves; genetic algorithms would have
'programs' P that are really populations of programs; and the
deductive 'rationalist' approach would be a program that generates a
single successor that has a higher V than itself. Now, where does this
search end up? In the general case the landscape will not be amenable
to *efficient* optimization at all (due to the TANSTAAFL, or "no free
lunch", theorems) - any initial program in the set of all seed
programs will statistically do just as well as any other. This is
simply because most optimization landscapes are completely random.
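
As a minimal sketch of the search step just described (pick the
candidate P(i) with the highest estimated value V), assume that a
"program" is simply a tuple of bits and that V counts the ones. The
names improve, flip_bit, deduce_successor and count_ones are
hypothetical, chosen only for this illustration.

```python
def improve(current, propose, value, n_candidates):
    """One search step: generate candidates P(i), keep the one with
    the highest estimated value V, but only if it beats the current
    program."""
    candidates = [propose(current, i) for i in range(n_candidates)]
    best = max(candidates, key=value)
    return best if value(best) > value(current) else current


def flip_bit(program, i):
    """Hill-climbing style generator: candidate i is the current
    program with one bit flipped, i.e. a program similar to itself."""
    j = i % len(program)
    return program[:j] + (1 - program[j],) + program[j + 1:]


def deduce_successor(program, _i):
    """'Rationalist' style generator: a single successor derived from
    what the program already knows (here: set the first 0 to 1)."""
    if 0 not in program:
        return program
    j = program.index(0)
    return program[:j] + (1,) + program[j + 1:]


def count_ones(program):
    """Toy value function V on a deliberately smooth landscape."""
    return sum(program)


p = (0, 0, 0, 0)
for _ in range(4):
    p = improve(p, flip_bit, count_ones, n_candidates=len(p))
print(p)  # (1, 1, 1, 1)
print(improve((0, 1, 0), deduce_successor, count_ones, n_candidates=1))  # (1, 1, 0)
```

Both generators succeed here only because this toy landscape is
smooth and fully known to them; nothing in the sketch says how they
would fare on a random landscape.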

What about the space of code relevant for AI? Most likely it is nicer
than the general case, so there are some seed programs that are on
average better than others. But we should remember that this space
still contains Gödel-undecidable stuff, so it is by no means
simple. The issue here is whether even the best seed program
converges towards an optimum fast enough to be interesting. To
complicate things, programs can increase the dimensionality of their
search space by becoming more complex, possibly enabling higher values
of V(P) but also making the search problem exponentially harder. So
there is a kind of limit here, where systems can quickly improve their
value in the first steps, but then rapidly get a search space that is
getting more and more unmanageable. I guess the "intelligence" of such
a system would behave like the logarithm of time or worse - definitely
not exponential.
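
A toy calculation may make the logarithm guess concrete. Assume,
purely for illustration, that reaching complexity level k costs
branching**k candidate evaluations, so the search space grows
exponentially with the level. The function level_reached and its
parameters are hypothetical, not a model taken from any of the
sources mentioned here.

```python
def level_reached(budget, branching=2):
    """Highest complexity level reachable within `budget` evaluations,
    assuming level k costs branching**k evaluations to find."""
    level, spent = 0, 0
    while spent + branching ** (level + 1) <= budget:
        level += 1
        spent += branching ** level
    return level


for budget in (10**3, 10**6, 10**9):
    print(budget, level_reached(budget))
# 1000 8
# 1000000 18
# 1000000000 28
```

Under this assumption each thousandfold increase in search effort
buys only about ten more levels, i.e. the level grows like the
logarithm of the time spent.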

I would recommend Christoph Adami's _Introduction to Artificial Life_,
Springer, New York 1998, for a discussion of environment
information gathering in evolving programs (based on his papers on
this; see

Anyway, this is getting a bit rambling, but what I hope I have shown
above is that having a system improve itself along some metric is
likely very time-consuming (or very resource-consuming if done in
parallel). Evolution has only had the time to sample a tiny fraction
of possible terrestrial genomes throughout the last billion years,
even though it has been able to run trillions of entities in parallel.
This doesn't mean it is impossible to create such an intelligence
bootstrapping system, but it may take enormous time to produce
anything.

It might of course turn out that there is some underlying regularity
in all of these fitness landscapes, some trick that makes it simple to
create better programs once you have found it. Then the search would
suddenly go very much faster. But it doesn't seem likely if the
fitness function is not very special, given the results of Chaitin on
randomness and undecidability even in basic arithmetic.

This has of course completely focused on a little box thinking happily
to itself, with no connection to the external world. If we really
want an AI to bootstrap itself, then it would 1) likely benefit from
acquiring all the expensive information we have already acquired over
the last four billion years of evolution about the nature of our world
(never underestimate the effort and time the ancestors of the
procaryotes expended on learning metabolism!), and 2) it would need it
if the goal is anything beyond a box producing programs optimising
some arbitrary function - in the end intelligence is likely best
defined as how well an entity functions in its environment, and I
guess we are more interested in an entity that can thrive in both the
algorithmic and physical worlds than one that just thrives in the
algorithmic one.

It should be noted that the choice of values is not just important,
but likely the most complex problem! There seems to be some general
rule (a bit of experience-based handwaving here) that if you
reformulate an optimisation problem so that the solution algorithm
becomes very simple (like a GA or neural network) then a lot of
effort has to be spent on setting parameters like the fitness
function or network parameters; the total effort of writing the
algorithm, setting parameters and running it appears to be roughly
constant (it would be interesting to see if this can be proven). So
the AI value function can likely be as complex as the finished program
itself if the seed program is simple! I think this is because the
above kinds of systems don't interact much with any external
environment feeding them extra information; all the algorithmic
complexity has to come "from inside". In real life, the harsh fitness
function of a complex reality already holds a tremendous amount of

To sum up, I doubt Colossus, even given its cameras and missiles,
would have been able to bootstrap itself very fast. It seems it had
been given a very simple value function, and that would likely lead to
a very slow evolution. To get a real AI bootstrap we probably need to
connect it a lot with reality, and reality is unfortunately rather
slow even when it is the Internet.

PS. I have still not parsed yet,
but it seems to put some interesting bounds on the AI bootstrapping

Anders Sandberg                                      Towards Ascension!                  
GCS/M/S/O d++ -p+ c++++ !l u+ e++ m++ s+/+ n--- h+/* f+ g+ w++ t+ r+ !y
