RE: COMP:WARS: RE: Software/Hardware Architectures

Billy Brown
Thu, 15 Jul 1999 13:07:14 -0500

Eugene Leitl wrote:
> I could go on for a long time but (I hope) my
> point is nauseatingly clear: it's a big ball of yarn buried in
> yonder tarpit, and requires a whole lotta muscle to haul it out.

OK, you've convinced me to reserve judgement until someone actually builds one. It still sounds like a lot of arm-waving to me, but I'll admit that designing parallel processing hardware isn't one of my areas of expertise.

> It might be logical, yet there are plenty of reasons to blame the
> widespread "investment protection" attitude for it. Investment
> protection is great for local optimization, but is deleterious
> even in the middle run. And is really really disastrous in the
> long run.

What exactly is the alternative? Write software optimized for something that doesn't exist? If you want to blame Intel and IBM for not innovating faster, feel free, but don't blame the software companies for not inventing new hardware.

> Alas, woefully, free markets seem to fail here miserably. Technical
> excellence has very little to do with market permeation. Deja vu deja
> vu deja vu deja vu.

Technical excellence according to who? A superior product is one that has the combination of features, price and performance that the customers actually want. A competing product that has a really elegant implementation, but lacks basic functionality that virtually all users want, is not superior in any meaningful sense.

Free markets efficiently produce products that people want to buy. If the results don't fit your concept of what good software is, the first thing you should question is your own concept of what the goal should be. Claiming that the market has missed some perfect opportunity that only you can see is a favored tactic of those whose ideas don't work in the real world.

> There is really no fundamental difference between x86 family, PowerPC,
> diverse MIPSen or Alpha. They all suck.

My point was that big software vendors are willing to spend substantial amounts of money re-writing their products for a platform that is only marginally faster than the x86 family. We should therefore expect that they would be even more eager if a machine 100 or 1,000 times faster came along.

> As to Microsoft, I guess all the intelligence cream they've been
> skimming off academia/industry for years & all these expenses in R&D
> will eventually lead somewhere. Right now, what I see doesn't strike
> me as especially innovative or even high-quality, no Sir. Particularly
> regarding ROI in respect to all these research gigabucks pourin'
> in. Administratory hydrocephalus begets administratory hydrocephalus.

Do we really need to do the 'I hate Microsoft' thing here? I suggest we call a truce on the issue, since experience shows that no one ever changes their mind about it as a result of argument.

> > have been writing 100% object-oriented, multithreaded code for several
> > now. They use asynchronous communication anywhere there is a chance that it
> I hear you. It is still difficult to believe.

I've worked with the code. Things have been moving pretty fast in this area lately.

> > towards designing applications to run distributed across multiple
> > on a network, and this seems likely to become the standard approach
> > high-performance software in the near future.
> I know clustering is going to be big, and is eventually going to find
> its way into desktops. It's still a back-assed way of doing things,
> maybe smart RAM will have its say yet. If only Playstation 2 would be
> already available, oh well. Marketplace will sure look different a
> year downstream. Difficult to do any planning when things are so in
> flux.

Distributed applications aren't the same thing as clustering. The idea now is that you run different parts of your app on different machines, so that the load can be distributed across a network of arbitrary size. You end up with an arbitrarily large number of different 'types' of server, each doing a different job, and each of which can be implemented across one or more clusters.
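
A minimal sketch of the distinction drawn above, assuming a toy in-process model rather than real networking. The class names (`ServerType`, `DistributedApp`) and the machine names are hypothetical, invented for illustration. Each server type does one job and spreads its requests round-robin over the machines backing it.

```python
from itertools import cycle


class ServerType:
    """One 'type' of server: a single job, backed by one or more machines."""

    def __init__(self, job, machines):
        if not machines:
            raise ValueError("a server type needs at least one machine")
        self.job = job
        # Round-robin over the machines (the "cluster") that implement this job.
        self._machines = cycle(machines)

    def handle(self, request):
        machine = next(self._machines)
        return f"{self.job}({request}) on {machine}"


class DistributedApp:
    """An application split into parts, each part living on its own server type."""

    def __init__(self):
        self._types = {}

    def add_server_type(self, job, machines):
        self._types[job] = ServerType(job, machines)

    def call(self, job, request):
        # Route by job first; the server type then picks a machine.
        return self._types[job].handle(request)


app = DistributedApp()
app.add_server_type("auth", ["auth-1"])               # one machine is enough
app.add_server_type("catalog", ["cat-1", "cat-2"])    # a two-machine cluster

results = [
    app.call("catalog", "item-7"),
    app.call("catalog", "item-8"),
    app.call("catalog", "item-9"),
    app.call("auth", "alice"),
]
for line in results:
    print(line)
```

Adding load capacity for one job means adding machines to that server type only; the routing by job is what separates this from simply clustering identical machines.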

> > Regarding the Applicability of Parallelism
> > The processes on a normal computer span a vast continuum between the
> > completely serial and the massively parallel, but most of them cluster
> > the serial end of the spectrum. Yes, you have a few hundred process
> Says who.
> > memory on your computer at any given time, but only a few of them are
> > actually doing anything. Once you've allocated two or three fast CPUs (or a
> How would you know? I gave you a list of straightforward jobs my
> machine could be doing right now. Sounds all very parallel to
> me. Remember, there is a reason why I need to build a Beowulf.

Says me. I've worked with the innards of OS software long enough to know what is and isn't going on in there. But if you don't believe me, you might want to take note of the fact that the CPU on a modern machine is usually below 10% utilization when you're running normal apps. Even with everything loaded onto one processor, the system spends almost all of its time sitting around waiting for something to do.

Yes, your simulations are highly parallel. It would therefore make sense to run them on some sort of parallel hardware. However, they are a big exception to the general rule. About 99.9% of all users never run anything that has enough parallelism to be worth the bother of re-coding. Even where parallelism exists (e.g. server apps), the complexity of the operations the software performs is too high for the kind of system you want to see (making the objects 100% independent of each other would require making each of them much too big to fit on the processor nodes you're talking about).
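
The "not worth the bother of re-coding" point can be made concrete with Amdahl's law, which this post does not cite but which is the standard bound on such gains: if a fraction p of the run time can be parallelized across n processors, the speedup is at most 1 / ((1 - p) + p / n). The function name and the sample fractions below are illustrative assumptions, not measurements.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Hypothetical workloads: mostly serial, half and half, almost fully parallel.
for fraction in (0.10, 0.50, 0.99):
    speedup = amdahl_speedup(fraction, 1000)
    print(f"parallel fraction {fraction:.2f}: at most {speedup:.2f}x on 1000 CPUs")
```

For a mostly serial program, even a thousand processors barely move the bound, while a highly parallel simulation benefits enormously. That is the asymmetry the argument above rests on.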

> > 2) You also can't get away from context switching. Any reasonably
> > task is going to have to be broken down into procedures, and each
> > will have to call a whole series of them in order to get any useful
> > done. This isn't just an artifact of the way we currently write
> Untrue. You almost never have to switch context if you have 1 kCPUs to
> burn. You only have to do this if you run out of the allocable CPU
> heap (when the number of your objects exceeds the number of your CPUs).

It sounds like you've written so much 'do a few complex operations on a huge body of data'-type code that you've forgotten that the rest of the world doesn't work that way.

Yes, you could run that genome analysis program this way. You could do the same thing with image filters, simple rendering engines, and a lot of other problems. But that encompasses only a tiny fraction of the programming world.

Most software applies vast amounts of code to relatively small amounts of data. In this case you have whole systems of relatively large objects with lots of internal procedures, all of which must interact with each other to get anything done. Instantiating the objects on different CPUs simply substitutes inter-node messaging for some of your context switching, and doesn't help matters at all. Many big problems, like evolutionary design and AI reasoning, have these characteristics, and they demand a very different kind of architecture than what you are proposing.
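
A minimal sketch of the substitution described above, assuming a toy model in which a program is just a list of (caller, callee) calls between objects. The object names, the `placement` mapping and `count_remote_calls` are hypothetical. A call between objects on the same node is counted as a local switch, and a call between objects on different nodes as an inter-node message.

```python
def count_remote_calls(calls, placement):
    """Count calls whose caller and callee are placed on different nodes."""
    return sum(
        1 for caller, callee in calls
        if placement[caller] != placement[callee]
    )


# A small, tightly coupled system: every step needs another object's help.
calls = [
    ("ui", "model"),
    ("model", "store"),
    ("store", "model"),
    ("model", "ui"),
]

all_on_one_node = {"ui": 0, "model": 0, "store": 0}
one_object_per_node = {"ui": 0, "model": 1, "store": 2}

local_switches = len(calls) - count_remote_calls(calls, all_on_one_node)
remote_messages = count_remote_calls(calls, one_object_per_node)

print("local switches on a single CPU:", local_switches)
print("inter-node messages with one object per CPU:", remote_messages)
```

The number of hand-offs is the same in both placements; spreading the objects out only changes what kind of hand-off each one is. The sketch says nothing about the relative cost of a switch versus a message, which depends on the hardware.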

Billy Brown, MCSE+I