Re: Web stuff (long message)

From: Alex Future Bokov (alexboko@umich.edu)
Date: Sat Oct 14 2000 - 09:59:25 MDT



On Fri, 13 Oct 2000, Samantha Atkins wrote:

> Eugene Leitl wrote:
> >
> >Alex Future Bokov wrote:
> > > Please direct your comments at how ExtroDot specifically can improve. I
>
> One thing that bugs me about most web sites including slash types is
> that most of these systems don't have an obvious way to send comments on
> the site itself like a webmaster email link. To me it is somewhat rude
> and cowardly to not include a feedback button on a site, especially one
> that expects thousands of hits.

Excellent point. This will be done!

> > Nope, I was talking about /. clones in general. I believe it's slick,
> > loads fast, and is about as mean and lean as webpages can
> > get. Unfortunately, it's still a web page.
> >
>
> There isn't anything wrong with web pages per se that a bit of re-think
> wouldn't fix. We need imho something at the browser/client (dumb
> distinction) end that is more like an interpreter and a bag of very
> basic graphic capabilities but little more than that and a fairly simple
> scripting language for invoking more capability. What comes from the
> server is model (data) plus requests on the views to apply to that data
> (more data). The browser constructs the view, loads the model into it
> and provides some controller functionality for interpreting user
> gestures within particular standard widgets. It does what actions it
> can against the data loaded to it from the server (or other peer if you
> will) and only talks to the server (or other peer) when it needs
> services not available in the browser or the local loadable software
> module space. This is much cleaner than having a server that concocts
> the entire view, cramming in the model data as best it can with some
> ungodly ugliness or other and then sends this thing as a dumb, dead file
> that is nearly unreadable to a big fat browser that does its best to
> render the thing locally even though most of the brains (what little
> there were) have been left at the server end. Almost all client
> interaction causes another round-trip to the server for yet another dumb
> page whose data is all collected by and at the server and whose
> presentation is described and painfully built by the server.

You are absolutely right on this one. Of course, this is not currently
possible, but I know for a fact that this sort of client-side
functionality is being developed. Slash is a better-than-nothing
solution, something we have at hand right now. I'm not claiming that
it's a final answer. However, experimenting with Slash will allow those
working on this other project to learn more about what users need and
want.
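
For what it's worth, here's a minimal sketch of the split you describe,
in Python (every name in it is mine, purely illustrative): the server
ships model data plus a view hint, the client owns the rendering, and
re-displaying the same data costs no round-trip.

    # Sketch only: client receives model data plus a view hint and
    # renders locally. All field names here are illustrative.
    model = {                      # what the server actually sends
        "type": "story-list",
        "items": [
            {"title": "Web stuff", "score": 42},
            {"title": "Push vs. pull", "score": 17},
        ],
    }
    view_hint = {"widget": "table", "columns": ["title", "score"]}

    def render(model, hint):
        """Client-side controller: pick a local widget for the data."""
        if hint["widget"] == "table":
            cols = hint["columns"]
            print(" | ".join(cols))
            for item in model["items"]:
                print(" | ".join(str(item[c]) for c in cols))
        else:
            print(model)           # fall back to a dumber, safer view

    render(model, view_hint)       # no server round-trip to re-render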

> Now I know that today there are a lot of people doing things a little
> smarter. Like sending XML data up and having it locally rendered. Some
> of them are even using Java or something better to do the rendering.
> Which gets a bit better. But too many are using yet more layers of
> obfuscation (XSLT, CSS) to get around to doing a half-assed job at
> something Smalltalk programmers did well (pluggable views) a long time
> ago. I used to subscribe to an XSL[T] list. But it was much too
> depressing reading endless posts about people attempting to do simple
> programming tasks in something not at all made for it. Why do people
> buy some half-ass language or hype of the day and try to make it do
> everything?

A question well worth asking, especially from the brain trust that is
this list. IMO, the challenge is to separate the content from the
formatting, just as content and formatting together have been separated
from the protocol and application layers. Have all text be as
no-frills, standardized, and machine readable as possible. Then have
all the bells and whistles rendered by the client, the aggregator, the
portal, whatever intercepts this content... in a manner that is
reversible, so something downstream from it can easily reconstruct the
original content.
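
To illustrate what I mean by reversible (the markup below is
hypothetical, not any real standard):

    # Sketch of a reversible formatting layer: decoration that any
    # downstream consumer can strip to recover the original content.
    # The <em> wrapping is a hypothetical choice, not a standard.
    import re

    def apply_emphasis(text, word):
        """Formatting pass: decorate, but lose no information."""
        return text.replace(word, "<em>" + word + "</em>")

    def strip_formatting(text):
        """Inverse pass: remove tags, reconstructing the original."""
        return re.sub(r"</?em>", "", text)

    original = "separate the content from the formatting"
    decorated = apply_emphasis(original, "content")
    assert strip_formatting(decorated) == original   # reversible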

> And exactly why is it that my 650MHz+ machines with 10s of gigs of
> storage are not being utilized for better UI experience, local caching
> of data and code fragments and so on? Treat me as a "thin client" when
> I'm on a cell phone or my palm pilot but not when I have that kind of
> local power. It is a stupid waste. We clog the internet with page
> after page to be drawn on machines most of which could calculate their
> own pages from data kept in synch with its remote sources with far less
> traffic on the net and a far richer user experience and MUCH faster.

P2P is a good idea. Again, something that's being worked on by a number
of groups, and that I'm dying to see become more commonplace.
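
A sketch of the "don't re-fetch what hasn't changed" piece, using plain
HTTP conditional requests (the cache layout is mine, just for
illustration):

    # Sketch: local cache that only hits the network when the remote
    # copy has actually changed (HTTP If-Modified-Since / 304).
    import os, email.utils
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError

    def fetch_cached(url, path):
        req = Request(url)
        if os.path.exists(path):
            mtime = os.path.getmtime(path)
            req.add_header("If-Modified-Since",
                           email.utils.formatdate(mtime, usegmt=True))
        try:
            with urlopen(req) as resp:
                data = resp.read()
            with open(path, "wb") as f:     # fresh copy: update cache
                f.write(data)
        except HTTPError as e:
            if e.code != 304:               # 304 = Not Modified
                raise                       # real errors still surface
        with open(path, "rb") as f:
            return f.read()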
 
> The Web as such is not the problem. Falling for hype and forgetting to
> think or to dream up better solutions using all the resources at your
> disposal is. I have no problem with something that does every
> functional thing the web does. I have a big problem with its
> dysfunctional aspects and with people and companies literally refusing
> to think outside of the box called the WEB. "The customers won't buy it
> if we do anything but HTTP for our GUI." "The customer won't buy our
> server if it isn't Microsoft." "The customer definitely won't buy our
> server if it isn't written in pure Java." Balderdash. Give a business

Word up! Yes, bravo. Thank you.

> tools it can use to be many times more effective than its competitors at
> a fraction of the cost and they will be lined up at your door. Of
> course you have to find the people in the business actually capable of
> thinking vs. simply repeating what they read.

However, ignoring the inertia and existing technology investment of
your users is a good way to get Macintoshed or Betamaxed. The ideal
solution to the web's problems is something that will seamlessly
integrate into and eventually replace the prevailing paradigm instead
of attempting to singlehandedly blow it to smithereens. The most
crucial (and neglected) part of a technological revolution is a
feasible path from here to there. That's why one of the starting
points is Slash. If people out there actually have applications that
can take Slash articles/postings over an HTTP port in XML format and do
something useful with them, it's certainly doable to expand the range
of Slash's XML offerings (yes, Slash already presents certain content
in XML format... have a look at http://www.extrodot.org/articles.xml).
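
By way of example, a sketch of a client doing something minimal with
that feed (I'm assuming the usual Slash <story> and <title> elements;
check articles.xml itself for the real names):

    # Sketch: pull the Slash XML feed and list story titles locally.
    # Element names are assumptions; inspect articles.xml to be sure.
    from urllib.request import urlopen
    from xml.dom.minidom import parseString

    with urlopen("http://www.extrodot.org/articles.xml") as resp:
        doc = parseString(resp.read())

    for story in doc.getElementsByTagName("story"):
        titles = story.getElementsByTagName("title")
        if titles and titles[0].firstChild:
            print(titles[0].firstChild.data)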

> Let's see. I can pick up IDE space at around 40 gigs for $300 or
> less. My time runs around $80+/hour currently. So about 4 hours net of
> time wasted putting up with someone else's organization of the data and
> waiting for them to render it in the ways their server knows how to and
> to deliver it a piece at a time over the net to me vs. spending that
> four hours of earnings for enough space to store huge libraries of
> information locally. I can see going to other machines (but much more
> efficiently) for new fresh information. But going to them to get stuff
> that hasn't changed? What for? Disks are cheap, time is expensive. A
> remote database/repository is nice for sure but I want my own not little
> links to everyone's own web-ified offerings of information. I'm sitting
> right on top of a fiber installation at work and on 780K bi-directional
> DSL at home. I can afford to download all the original data and updates
> I need. On the road I could link to my own server and see things in the
> form that I want it.

As above, we agree on this issue. The world needs a standardized,
machine readable text format, to which all kinds of arbitrary "layers"
of formatting can be applied and unapplied by arbitrary intermediaries
and end-users. XML is probably fine for the job.

The next step is to design server and client software that can handle
this. Preferably software that will integrate into existing HTTPd
implementations and existing browsers, so that we're not reinventing
the wheel. I have my opinions on the best sequence of steps to take,
but if you can come up with a better one, I'd love to hear it.

> > I'm a digital pack rat. I don't trust content to be there tomorrow
> > when I see it today. Because, like, every third link I click is dead
> > like, totally. Rather frustrating, this. Other people are solving this
> > by running squid in total web trail mode, me, I don't have that much
> > hard drive space. So I grab whatever I deem useful. Because web
> > doesn't allow me to save the entire document, I have to resort to
> > naked .html, txt, .ps.gz and .pdf
>
> Excuse my ignorance but what is a "squid"?

For the record, Squid (http://www.squid-cache.org) is a caching web
proxy: you point your browser at it and it keeps local copies of
whatever you pull through it. Yeah, this squid thing is beginning to
sound like something nice to have.

> > Opera is not quite there yet when it comes to stability (but it is
> > making good progress), nor is Galeon (also almost there). However, the
> > problem is one of principle. You can't render complex content in a
> > sustainably stable fashion, using state of the art software
> > development. On the other hand, I can rely on the average software to
> > render simple content (ASCII) reliably.

ASCII is less simple and reliable than you might think. Have you ever
tried to come up with a really foolproof automated way to even do
something as simple as break up paragraphs, strip out signatures,
identify duplicates based on content alone, and parse out quotes? Have
you ever tried to automatically split up the articles being pushed to
you by several *different* email newsletters?
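
To make the point concrete, here's about the most naive quote-and-
signature stripper one could write (the heuristics are deliberately
simplistic; real mail defeats every one of them):

    # Sketch: "simple" ASCII email cleanup, and why it isn't simple.
    # Both heuristics below are naive on purpose; real mail breaks them.
    import re

    def strip_quotes_and_sig(body):
        out = []
        for line in body.splitlines():
            if line.startswith("-- "):    # RFC-style sig separator...
                break                     # ...which many mailers mangle
            if re.match(r"\s*(>\s*)+", line):  # quoted text? usually.
                continue                  # but ">" can start real prose
            out.append(line)
        return "\n".join(out)

    print(strip_quotes_and_sig("Real text\n> quoted\n-- \nsig here"))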

So yes, push is a good idea (but I still say the distinction between
push and pull is in our own heads). A simple, universal content format
is a good idea (though XML is a better candidate than just raw text).
However, email is a deeply flawed medium. Its threaded nature forces
you to think in terms of a one-to-one relationship between message and
response, whereas it's really many-to-one or even many-to-many. It
discourages forming connections to things outside email-space. Email
is a sedimentary rather than an additive medium. In other words, once
you post an email message you have little incentive to update it, issue
new revisions, incorporate reader feedback. You just let it get buried
under strata of new email, where you and everyone else will forget it.
It's hard to back-track an email post to what it's referring to, unlike
links/backlinks. Finally, there is a lack of categorization. Web pages have
titles, and increasingly META tags. They tend to be linked to related
web pages. Email, however, is at best linked to whatever it's responding
to and tends to inherit the Subject: line of its thread. You'll rarely
see a web-page entitled--

"Yes, but... (Was: [Re: Church of the Subgenius (was Gun Control and Gover)]"

Please don't propose netiquette as a solution. If netiquette worked, we
wouldn't be having this part of the conversation.
 
> Actually you can come a lot closer if you keep the data (model)
> relatively straightforward in terms of known sets of meta protocols it
> employs and have a stable set of widgets and view thingies that you
> might add experimental things to now and then with fallbacks to simpler
> ones plus some decent streamable scripting support (both to run
> interpreters and embed code and support dynamic configuration). The
> rest of the complexity in a multi-user environment is in concurrency
> and transaction type issues and in other goodies like failover,
> replication and security. Some parts of transactional processing
> (especially long transactions and non-standard models of concurrency)
> are research topics. But the simpler things should not be what is
> causing unpredictability as is the case largely today. The simpler
> things also should not be eating up ungodly amounts of bandwidth. And
> most of all the simpler things should be consuming the majority of the
> programming talent. When you see that happen something is obviously
> wrong.

No argument here.

> Well, HTML is a bastardized subset of SGML (or was). SGML does give the
> real presentation and typography stuff. It would have been really good
> if actual display PostScript was used to model a lot of the rendering
> stuff and dependably customize it. At the time it came out HTML was a
> reasonable compromise for getting people sharing a lot of content over
> the internet. But we lost sight of what was and wasn't good about it
> and made it a matter of religion to do all things for all people through
> HTTP.

Uh uh uh! Not to nitpick but HTML != HTTP. Let's leave protocols out of this.

> XML is getting made too complex imho. The basic idea is fairly simple.
> Except it ignored some simple facts in its early design like the fact
> that most data exists in the form of a general graph rather than a
> hierarchy. It is taking a really long time for them to get over that
> and fix it rather than conveniently ignoring it much of the time. Then
> the official standard parsers are braindead in that a 15 meg XML data
> stream will blow up into 200 megs or so in memory using most
> implementations of DOM (well, under Java anyway). And people got so
> hooked on human readability they didn't bother to take XML data
> description or DTD and compress the actual data into some predictable
> binary format that is more efficient to transmit and decipher given the
> meta information and perhaps a bit of indexing at the front. Worst of
> all are the people who want to take perfectly good (well, maybe a bit
> too bloated) RPC and CORBA and COM pipes and turn it all into XML.

To what extent are these examples of XML abuse rather
than XML shortcomings? What would be better than XML? What
characteristics should such a language have? It goes without saying
that it should be machine readable/writable. How important is human
readability/writability?
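
On the parser complaint specifically: the 15-meg-to-200-meg blowup is a
DOM problem more than an XML problem. A streaming (SAX-style) parser
never builds the tree at all, so memory stays constant. A sketch (the
"item" element name is illustrative):

    # Sketch: streaming (SAX) parsing sidesteps the DOM memory blowup
    # by never building the whole tree in memory.
    import xml.sax

    class CountingHandler(xml.sax.ContentHandler):
        def __init__(self):
            super().__init__()
            self.count = 0
        def startElement(self, name, attrs):
            if name == "item":
                self.count += 1    # constant memory, any file size

    handler = CountingHandler()
    xml.sax.parseString(b"<feed><item/><item/></feed>", handler)
    print(handler.count)           # -> 2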

> > > with this point in the first place. Tell me more about your filters.
> > > They may already exist as features, or they may be relatively easy to
> > > implement.
> > >
>
> How about implementing general scripting based filters and let
> half-clueful users provide their own filters written in the script? I
> would vote for Scheme or Python based scripting myself. So you need an
> interpreter and a sandbox and some reasonable API for getting at the
> information on your site as information/data.

Separate questions. We already have everything we need to output content
from Extrodot as information/data-- XML, RDF, ASCII, MySQL dump... you
name it, it can probably be done. If there is enough demand to divert
the necessary brain-cycles from The Goal.

As for filtering, that's a tougher question. The filtering task can be
split between Extrodot and your machine... some coarse filtering
functionality could be implemented "server" side and the finer-grained,
computationally expensive stuff could be implemented at the "client"
side, and I won't have to care what language you use for filtering. For
"server" side, the syntax would probably be either that of Perl or SQL,
since that's what the front-end and back-end run on respectively.
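
As a sketch of the client-side half (the item fields and the predicate
are made up for illustration):

    # Sketch: fine-grained client-side filtering over items already
    # fetched from the site. All item fields here are hypothetical.
    items = [
        {"author": "samantha", "score": 5, "body": "XML rant"},
        {"author": "anon",     "score": 0, "body": "first post"},
    ]

    def my_filter(item):
        """User-supplied predicate; write it in whatever you like."""
        return item["score"] > 2 and item["author"] != "anon"

    for item in filter(my_filter, items):
        print(item["body"])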

> > > I leave everybody with this question to ponder:
> > >
> > > Is there an inherent difference between push and pull media, or is this
> > > difference arbitrary, artifactual, and potentially transparent to the
> > > user in the same way as "static location vs. search query" or "local
> > > volume vs. distributed file system"?
> >
>
> Push and pull aren't terribly relevant to me. What I really want are
> systems that make the walls disappear. I would like to not need to

Same here.

> think much about whether the data, objects, functionality I am dealing
> with is on my machine, your machine, all of our machines or what
> computer language it is in or what database vendor product and on and on
> it goes. I would like to see information/objects/fucntionality helped
> to be rearranged and duplicated and migrated to best balance the load
> dynamically. I think a big mistake "server" or "website" people make is
> assuming they more or less own the information and are the determiners
> of how others access it. At most they give some particular views onto
> information. But this functionality should not make it impossible to
> get to the information and bits of the functionality to use them in
> other and new ways. I'm probably not expressing that well. Even after
> 15 years I haven't found a way to say it that will reach most people.
> And most of my time gets eaten doing something far less clueful to make
> my daily bread.

Actually, this makes perfect sense. This too is part of The Goal.

--

NSA Waco Cherokee
Why are the above words in my signature? Check out:
http://www.echelon.wiretapped.net



