Re: Moore's law

Michael Nielsen (mnielsen@tangelo.phys.unm.edu)
Tue, 7 Jul 1998 21:47:59 -0600 (MDT)

On Sat, 27 Jun 1998, Dan Clemmensen wrote:

> Michael Nielsen wrote:
> >
> > Playing Devil's Advocate here...
>
> No problem with devil's advocacy. Please take my responses constructively
> and not as an argument. As I see it, we are exploring the problem
> together.

Okay. I'm having fun with it, and your arguments have made me less pessimistic than before.

I may as well state one of my main interests in this: whether we'll ever have enough computational power to set up a good "breeding ground" for AIs -- an artificial environment optimized to produce artificial intelligence by means of selective pressure. Using proctonumerology, I'd guess a figure of about 10^40 operations ought to be enough.

> > On Fri, 26 Jun 1998, Dan Clemmensen wrote:
> >
> > > Michael Nielsen wrote:
>
[memory addressing overhead]
>
> Addressing in a nanomechanical system occurs only as part of a read or write
> operation. When no I/O is occurring, no energy is dissipated. Dissipation per
> access depends on the size of the memory, O(log N) for random access,
> which is negligible and is mitigated further by caching.

I don't know how the proposed nanomechanical schemes work. In the electronic schemes I'm familiar with, the depth of the addressing circuitry is O(log N) for random access, but the total number of switching operations per access is O(N log N). Are you absolutely sure that the number of operations in a nanomechanical addressing system is O(log N)?
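
To make the distinction concrete, here's a toy model in Python (my own illustration, not anything from Nanosystems or a real decoder design). The selected path through a binary address-decoder tree has O(log N) elements on it, but in a naive electronic decoder, where every word line checks the full address, the number of single-bit operations per access is O(N log N):

def decoder_costs(num_words):
    """Return (path_length, naive_gate_ops) for a memory of num_words."""
    address_bits = (num_words - 1).bit_length()  # ceil(log2 N)
    path_length = address_bits                   # elements on the selected path: O(log N)
    naive_gate_ops = num_words * address_bits    # every word line checks every address bit: O(N log N)
    return path_length, naive_gate_ops

for n in (2**10, 2**20, 2**30):
    depth, ops = decoder_costs(n)
    print(f"N = 2^{n.bit_length() - 1}: path length {depth}, naive ops {ops:.1e}")

If the nanomechanical scheme only moves the elements on the selected path, then the energy per access really would go as O(log N); the question is whether the idle branches of the tree can genuinely be left untouched.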

On a side note, I really need to read "Nanosystems" one of these days...

> > > Three-dimensional
> > > storage in a nanomechanical system, using 100 atoms per bit allows
> > > a fair number of extra atoms for "overhead" functions such as support
> > > and heat-conduction. The energy dissipated to read or write a bit
> > > nanomechanically should be very small compared to that needed by current
> > > technology,
> >
> > Well, it would need to be. Suppose (conservatively), that you are going
> > to have 10^6 layers, each storing bits with a density of 100 atoms / bit;
> > a storage density of 1 bit for every few nanometers squared. Such a
> > device will have roughly 10^12 more bits stored on it than current
> > commercial chips. I forget the exact numbers, but the dissipation rate
> > per logical operation is something like 10^6 kT in current chips. That
> > means you have a major problem unless everything is done completely
> > dissipation free.
> >
> I'm not sure that 10^6 layers is conservative. It's nanomechanically conservative
> from a static structural standpoint (i.e., we could build it) but not from a
> Moore's Law standpoint. We started this discussion with Moore's law in 2020.

Okay. I am, I suppose, trying to see how far we can push the 3d architecture idea at this point. We already do it to some extent -- I am told that 20 layers is not that uncommon in a chip -- but I would like to know how much further we can go; can we drop the 2020 date?

> Moore's law (in one form) calls for doubling density every 1.5 years, or roughly
> an increase of 16,000 by 2020. I can get this with an areal decrease from the current
> 10^6 nm^2/bit to 50 nm^2/bit without going into the third dimension at all. It's not
> unreasonable to assume a decrease in energy per I/O on the same order, so I don't
> have to invoke any of the other mechanisms.

Okay.
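
For what it's worth, here's the back-of-envelope check I did on those numbers (Python; the arithmetic is mine, but the assumptions are yours):

doubling_time = 1.5          # years per doubling (your assumption)
years = 2020 - 1998          # about 22 years from now
doublings = years / doubling_time
gain = 2 ** doublings        # ~2.6e4; rounding down to 14 doublings gives the ~16,000 you quote

area_now = 1e6               # nm^2 per bit today (your figure)
area_2020 = area_now / gain  # ~40 nm^2 per bit, the same ballpark as your ~50 nm^2/bit

print(f"density gain by 2020: {gain:.0f}x")
print(f"area per bit in 2020: {area_2020:.0f} nm^2")

So the numbers do hang together without invoking the third dimension at all.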

> > > and diamondoid should be able to operate at much higher
> > > temperatures than silicon-based devices. Diamondoid is a much better conductor
> > > of heat than silicon, also.
> >
> > These are good points, but they only buy you a tiny amount.
>
> As you see from the above, these points can affect the densities by a factor
> of ten to one hundred or so.

Okay. That's quite a non-trivial gain. Of course, those technologies may run into their own problems (remember bubble memories?), but they're certainly possibilities I wasn't aware of.

> By contrast, you raise the issue of error correction.
> However, even very powerful ECC schemes require less than doubling the amount
> of volume needed to store a word.

That's not really true. So far as I know, the best known overhead for fault-tolerant computation grows polylogarithmically in the size of the computation, with a fairly large constant factor in front of the leading term of the polylog.

The error correction schemes you are talking about will buy you a little, but fixing single-bit errors goes only a tiny way towards fixing the error problem. Much more powerful error correction techniques are needed to really solve it, unless, as in today's computers, you have an incredibly low fundamental error rate. Even then, there's a real question of how many operations you want to do. If you "only" want to do 10^{17} operations, then error rates of about 10^{-17} are fine -- which is why today's computers don't use error correction. Presumably, for AI we would like to do far more operations than that -- on the order of 10^{30} does not seem unreasonable.

Assuming a fundamental error rate of about 10^{-5} for reversible computation, that implies a heavy error correction overhead. Calculating exactly how much overhead would be required would take quite a while, but it's safe to say that most of the work going on in the computer would actually be error correction, not computation.
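
To put rough numbers on that, here's a crude sketch in Python. The error model is the simplest possible one (independent errors, one chance of error per operation), and the overhead function is a made-up placeholder for "polylogarithmic with a large constant", not a real fault-tolerance analysis:

import math

def expected_raw_errors(error_rate, n_ops):
    """Expected number of uncorrected errors in n_ops operations."""
    return error_rate * n_ops

def toy_polylog_overhead(n_ops, constant=100.0, power=3):
    """Placeholder: constant * (log N)^power physical ops per logical op.
    Both parameters are invented purely for illustration."""
    return constant * math.log(n_ops) ** power

# Today's regime: error rate ~1e-17 and ~1e17 operations gives about one
# expected error over the whole computation, so no error correction needed.
print(expected_raw_errors(1e-17, 1e17))

# The regime I care about: error rate ~1e-5 and ~1e30 operations gives ~1e25
# raw errors, so the computation is garbage without serious error correction.
print(f"{expected_raw_errors(1e-5, 1e30):.0e}")

# Under the toy model, each logical operation costs a few times 1e7 physical
# operations' worth of error correction.
print(f"{toy_polylog_overhead(1e30):.1e}")

Even with generous assumptions the error-correction machinery dominates the computation, which is the point I was trying to make.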

If I can find the time, I'll try to look up some of the papers analyzing errors in reversible computation. As I recall, there were a few in the early 80s, which I've never read.

Michael Nielsen

http://wwwcas.phys.unm.edu/~mnielsen/index.html