Re: Classical AI vs. connectionism

J. Maxwell Legg
Wed, 16 Sep 1998 15:29:54 +1200

Eliezer S. Yudkowsky wrote:

> I don't want to program a perfect AI. I want to program an AI that has the
> capability to consciously direct the execution of low-level algorithms. The
> high-level consciousness is no more perfect than you or I, but it can handle
> inconsistency and incompleteness. The low-level algorithm can't handle
> inconsistency or incompleteness and it's downright stupid, but it's very, very
> fast and it's "perfect" in the sense of not having the capacity to make
> high-level mistakes.
> Again: The _high_level_ is not classical, but it has the capability to _use_
> algorithmic thought, in addition to all the _other_ capabilities. Let me
> explain it your way: The high level is not connectionistic, but it has all
> the magical capabilities you attribute to neural nets, and it can direct evil
> classical AI in the same way that you can write a computer program. It
> doesn't use classical AI for anything other than, say, arithmetic or chess
> programs. And when I say "use", I don't mean it like "the brain uses neural
> networks", I mean it like "Emmanuel uses a pocket calculator".
> I don't know how much of _Coding_ you've read. If you read the whole thing,
> you should hopefully understand that _Coding_ is not remotely near to
> classical AI.
> --

I've read your _Coding_ and refer you to

What I've never described to anyone before is my view that known outcomes aren't needed to train neural nets. In fact, I don't even like the connotation that an associative net using a parallel form of Ingrid should be called a neural net. Then again, I haven't even started to program my vision of how an Ingrid AI system would work. I'm still waiting for the right hardware, but my concept is simply this:

You've seen an Ingrid plot, so I don't need to remind you that insignificant items appear near a 'black hole' in the middle and that, with a flexible significance level, the items in this iris are shown in a different colour. In other words, given an existing grid, if we then add a new feature that bears no relationship to the grid's context, it will score in the not-applicable region on all the bi-polar constructs: on a 1-to-5 scale it will score at the centre in every case. This is crucial to my idea of how Ingrid could work in parallel mode. Now I will try to describe a working parallel model which has already been seeded with a psychological model of an AI, preferably taken from an interactive 'person as scientist' who continues to work closely with Ingrid. I do this myself, and since Ingrid comes from the field of computational cognitive neuroscience and my goal is to upload myself, I find the work therapeutic.
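To make the midpoint idea concrete, here's a rough sketch in Python. It is only my illustration, not Ingrid's actual code; the threshold and function names are invented for the example. A feature with no relationship to the grid's context rates at the scale midpoint on every construct, so its distance from the centre (its significance) is near zero and it falls into the 'black hole':

```python
import numpy as np

# Illustrative sketch: ratings on bi-polar constructs use a 1-to-5 scale,
# so 3 is the not-applicable midpoint. An element's significance is its
# mean distance from that midpoint; elements under a flexible threshold
# fall into the central 'black hole' of the plot.
MIDPOINT = 3.0

def significance(ratings):
    """Mean absolute distance of an element's ratings from the midpoint."""
    return float(np.mean(np.abs(np.asarray(ratings, dtype=float) - MIDPOINT)))

def in_black_hole(ratings, threshold=0.5):
    """True if the element scores in the not-applicable region."""
    return significance(ratings) < threshold

# An element rated at the extremes is highly significant ...
assert significance([1, 5, 1, 5]) == 2.0
# ... while a context-free feature scoring 3 everywhere is not.
assert in_black_hole([3, 3, 3, 3])
assert not in_black_hole([1, 5, 2, 4])
```

The flexible significance level of the plot corresponds to the `threshold` parameter here.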

It is assumed that a superordinate set of constructs and elements has been used to establish a primary repertory grid. This grid is the starting point, and I can show how other grids will evolve from it. The evolution proceeds outwards in all directions from the primary grid, and the primary grid itself may eventually be relegated to a non-primary position. It works like this:

Suppose the primary grid were linked to analog sensory devices that could present new elements and/or constructs from any context, with values being synthesized for the primary grid's existing features. It is then a simple procedure to determine whether this data is significant to the primary grid. If significance is found, the actual trajectory of the new feature is used to call up a grid that deals with a further refinement of the incoming feature, and all other features of the activated grid will in turn use their trajectories to activate outlying grids.
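Here is one way the trajectory step could be sketched, assuming (my assumption, not a description of Ingrid) that a feature's trajectory is its loadings on the grid's principal axes, and that the dominant axis and its sign name which outlying grid to call up. `route` and `linked` are illustrative names:

```python
import numpy as np

# Illustrative sketch: the primary grid is an elements-x-constructs matrix
# of 1-to-5 ratings. A new feature arrives as a row of synthesized ratings;
# its loadings on the grid's principal axes serve as its "trajectory".
def principal_axes(grid):
    """Unit eigenvectors of the grid's covariance, strongest first,
    with signs fixed so each axis's largest component is positive."""
    centred = grid - grid.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    order = np.argsort(vals)[::-1]
    vecs = vecs[:, order]
    idx = np.argmax(np.abs(vecs), axis=0)
    return vecs * np.sign(vecs[idx, np.arange(vecs.shape[1])])

def trajectory(grid, feature):
    """Loadings of a new feature on the grid's principal axes."""
    return (feature - grid.mean(axis=0)) @ principal_axes(grid)

def route(grid, feature, linked, threshold=1.0):
    """Return the linked grid named by the dominant axis, or None
    if the trajectory is too short (the black-hole case)."""
    t = trajectory(grid, feature)
    if np.linalg.norm(t) < threshold:
        return None
    axis = int(np.argmax(np.abs(t)))
    return linked.get((axis, '+' if t[axis] > 0 else '-'))

grid = np.array([[1., 5.], [2., 4.], [5., 1.], [4., 2.]])
linked = {(0, '+'): 'refinement-grid', (0, '-'): 'opposite-grid'}
assert route(grid, np.array([3., 3.]), linked) is None    # no relationship
assert route(grid, np.array([5., 1.]), linked) == 'refinement-grid'
```

A feature at the scale midpoint has a zero-length trajectory and activates nothing, which matches the black-hole behaviour above.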

Each grid that finds the incoming feature significant will remain active for a finite time window (a Finite Difference Time Domain, FDTD?). This is quite easy to do, because all that is needed is a frequency/iris conditioner that doesn't change the eigenvalues but just alters the state. As more and more new inputs arrive, more and more peripheral grids become active. When there is positive feedback, in other words when many grids-in-motion have goal-seeking trajectories that converge on a particular grid, the final 3 eigenvectors activate a conscious response mechanism to "direct the execution of low-level algorithms".
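The activation window and the convergence feedback could be modelled like this. Again this is my own toy sketch under stated assumptions: activation decays each tick (the state changes, the eigenvalues do not), and the response fires only when the summed activation of grids whose trajectories point at one target crosses a threshold. `Arena`, `links`, the decay rate, and the firing threshold are all invented for illustration:

```python
# Illustrative sketch of the finite activation window and positive feedback.
class Arena:
    def __init__(self, decay=0.8, fire_at=1.5):
        self.activation = {}      # grid name -> current activation level
        self.decay = decay
        self.fire_at = fire_at

    def tick(self):
        """One time step: decay every activation, drop the negligible."""
        self.activation = {g: a * self.decay
                           for g, a in self.activation.items()
                           if a * self.decay > 0.05}

    def hit(self, grid, strength=1.0):
        """A significant incoming feature re-activates a grid."""
        self.activation[grid] = self.activation.get(grid, 0.0) + strength

    def converged(self, target, links):
        """Fire when the active grids whose trajectories point at
        `target` sum past the threshold (positive feedback)."""
        total = sum(a for g, a in self.activation.items()
                    if links.get(g) == target)
        return total >= self.fire_at

arena = Arena()
links = {'gridA': 'goal', 'gridB': 'goal', 'gridC': 'other'}
arena.hit('gridA'); arena.hit('gridC')
assert not arena.converged('goal', links)   # one trajectory is not enough
arena.hit('gridB')
assert arena.converged('goal', links)       # two trajectories converge
arena.tick(); arena.tick(); arena.tick()
assert not arena.converged('goal', links)   # the window has passed
```

The point of the decay is exactly that a lone hit fades away, while many converging grids-in-motion reinforce each other fast enough to fire.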

If an incoming feature lacks the significance to activate an adequate response, it is repeatedly passed through the black hole to a pyramid of core grids, and if it belongs to an entirely new context it is stored in a 'subconscious dream processor'. Entirely new grids can be formed by the dream processor using conventional data mining techniques and, when stable enough, these new grids can enter the FDTD conscious response arena at the end of the pyramid of core grids. The super constructs that describe the core grids are also adjusted by the same process.
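A minimal sketch of the dream processor's mining pass, assuming one conventional technique (a tiny k-means, standing in for whatever mining method would actually be chosen): homeless features accumulate, get clustered, and any cluster that is populous enough is promoted as the seed of a new grid. All names and thresholds are illustrative:

```python
import numpy as np

# Illustrative sketch: cluster the accumulated contextless features and
# promote stable (well-populated) clusters as seeds for new grids.
def kmeans(points, k, iters=20):
    """Plain k-means with deterministic init from the first k points."""
    centres = points[:k].copy()
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :],
                               axis=2)
        labels = np.argmin(dists, axis=1)
        centres = np.array([points[labels == j].mean(axis=0)
                            for j in range(k)])
    return labels, centres

def promote_stable(points, k=2, min_members=3):
    """Clusters with enough members become candidate new grids."""
    labels, centres = kmeans(points, k)
    return [centres[j] for j in range(k) if np.sum(labels == j) >= min_members]

dreams = np.array([[1., 1.], [1.2, 0.9], [0.9, 1.1],    # one latent context
                   [5., 5.], [5.1, 4.9], [4.9, 5.2]])   # another
seeds = promote_stable(dreams)
assert len(seeds) == 2
assert np.linalg.norm(seeds[1] - np.array([5., 5.])) < 0.2
```

Each promoted centre would then be elaborated into a full grid before entering the conscious arena at the end of the pyramid.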

To avoid the exponential time required to reconstruct grids in situ for each new incoming feature, linear interpolations are used to find the initial trajectories from amongst the grids in the conscious arena. The dream processor, however, has to create stable grids using full Principal Component Analysis. Fortunately, the more this happens, the bigger the conscious repertoire of grids becomes where linear response times occur. To handle the fine tuning, or evolution, of the conscious repertoire of grids, the 'dream processor' also fully reprocesses the particular peripheral response grid that activated the execution of the low-level algorithm. If an elevator effect is detected, in other words if one of the eigenvectors alters its ranking, it is put on watch and recalibrated up and down the trajectory line. And so it goes.
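The split between the cheap conscious path and the expensive dream-time path, and the elevator-effect check, could be sketched as follows. My assumptions for the example: the conscious arena projects onto cached principal axes in linear time, only the dream processor recomputes the full PCA, and the elevator effect is detected when the variance ranking along the cached axes no longer matches the cached strongest-first order:

```python
import numpy as np

# Illustrative sketch of cached-axes projection vs. full reprocessing.
def full_pca(grid):
    """Eigenvalues and unit eigenvectors of the grid, strongest first."""
    centred = grid - grid.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def fast_trajectory(feature, mean, axes):
    """Linear-time projection onto cached axes; no grid reconstruction."""
    return (feature - mean) @ axes

def elevator_effect(grid, cached_axes):
    """True if the updated grid's variances re-rank the cached axes."""
    centred = grid - grid.mean(axis=0)
    var = np.var(centred @ cached_axes, axis=0)
    return not np.array_equal(np.argsort(var)[::-1],
                              np.arange(cached_axes.shape[1]))

grid = np.array([[0., 0.], [4., 1.], [8., 0.], [12., 1.]])
vals, axes = full_pca(grid)
t = fast_trajectory(np.array([12., 1.]), grid.mean(axis=0), axes)
assert abs(t[0]) > abs(t[1])                  # dominant axis dominates
assert not elevator_effect(grid, axes)        # ranking unchanged
assert elevator_effect(grid[:, ::-1], axes)   # swapped axes re-rank
```

A flagged axis would then be put on watch and recalibrated by the next full dream-time pass.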

You might say that this is just another pattern catcher, but does this make any sense to you?