Re: Decentralization and exhaustive searches: mutually exclusive?

From: hal@finney.org
Date: Tue Aug 01 2000 - 21:23:09 MDT


Alex Future Bokov wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
>
> Okay, this is a theoretical question that I'd like to ask without
> getting bogged down in politics. I'm using Napster and Gnutella as
> convenient examples and not because I approve or disapprove of either.
>
> 1. Gnutella--
> It's distributed and therefore robust. You cannot shut down
> gnutellanet by shutting down one, or ten, or a thousand servers.
> However, you cannot do exhaustive searches on it. The content
> you're looking for might be out there and yet not be guaranteed
> to show up on a search. Another disadvantage is that the
> searches make inefficient use of bandwidth.
>
> 2. Napster--
> It is exhaustive. For better or worse, you'll find every single
> instance of Michael_Jackson_Thriller.mp3 that anybody on the
> network is serving. However, the server/s that store the
> content listings and corresponding locations of the content are
> all under the control of one company, which means they are
> vulnerable to legal action, censorship, company-wide technical
> failure, corporate abuse, and attack by hackers.
>
> So, what about combining the best of the two? Decentralization and
> exhaustive searching? Is it a logical impossibility, or is it merely
> something that hasn't been done yet?

Freenet does do this to some extent. It is a decentralized network which
can hold data. Each data item has a name or "key" which controls where
it ends up in the network. If you know the name of a data item you can
then fetch it, and your fetch request will be routed to the same place
so that it finds the data.
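
In rough Python terms the idea looks something like this (a toy sketch
with made-up node ids and a "closest key" routing rule, not Freenet's
actual routing algorithm):

    import hashlib

    def key_for(name):
        # Derive a fixed numeric key from the item's name.
        return int(hashlib.sha1(name.encode()).hexdigest(), 16)

    class Node:
        def __init__(self, node_id):
            self.node_id = node_id
            self.store = {}     # key -> data held by this node

    class Network:
        def __init__(self, node_ids):
            self.nodes = [Node(i) for i in node_ids]

        def closest_node(self, key):
            # "Routing": pick the node whose id is numerically closest to the key.
            return min(self.nodes, key=lambda n: abs(n.node_id - key))

        def insert(self, name, data):
            k = key_for(name)
            self.closest_node(k).store[k] = data

        def fetch(self, name):
            # A fetch for the same name routes to the same node, so it finds the data.
            k = key_for(name)
            return self.closest_node(k).store.get(k)

    net = Network(node_ids=range(0, 2**160, 2**155))   # 32 evenly spaced node ids
    net.insert("some_document.txt", b"hello")
    print(net.fetch("some_document.txt"))              # b'hello'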

At present you have to know the exact name of the item, which makes it
less useful for music, where you might want to search just by artist,
album, or part of a song title. The Freenet team is working on adding
a search mechanism. It will take a different approach from Gnutella's,
closer to fuzzy, approximate matching than exact lookup. However, it
will not be 100% reliable, and it will still have the problem that you
are not guaranteed to find all matching data items in the network.
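
Just to show the flavor of a fuzzy search layered on top of exact keys
(purely illustrative, not the Freenet team's actual design), you could
index each item under per-keyword keys and match queries approximately
against those keywords:

    import difflib, re

    index = {}   # keyword -> set of exact item names known under that keyword

    def publish(name):
        # Index the item under each word of its name.
        for word in re.split(r"[_\s.]+", name.lower()):
            if word:
                index.setdefault(word, set()).add(name)

    def search(query, cutoff=0.7):
        # Fuzzy-match the query against known keywords, then return the exact
        # names, which could in turn be fetched by their exact keys.
        close = difflib.get_close_matches(query.lower(), list(index), n=5, cutoff=cutoff)
        return sorted(set().union(*(index[w] for w in close))) if close else []

    publish("Michael_Jackson_Thriller.mp3")
    print(search("thriler"))   # a close misspelling still finds the exact name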

A simple approach to combining decentralization and reliable searches
is to have a large number of servers, each of which replicates the
entire database (or a substantial fraction of it). An example is
the PGP key server infrastructure, with a dozen or so key servers each
holding PGP public keys and periodically exchanging data to make sure
they all stay up to date. The DNS root servers use a similar concept.
These are not fully decentralized, but the basic idea could probably
scale up to on the order of 100 or even 1000 servers, which would make
it difficult to shut them down one by one.
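
The replication idea is easy to sketch (a toy model, not the real key
server or DNS protocols): each server keeps a full copy, and periodic
pairwise exchanges make every copy converge, so a search against any
one server is exhaustive:

    class KeyServer:
        def __init__(self, name):
            self.name = name
            self.keys = {}          # key id -> public key material

        def submit(self, key_id, key_data):
            self.keys[key_id] = key_data

        def sync_with(self, peer):
            # Merge databases in both directions; a real system would do
            # this incrementally, but a full merge shows the idea.
            merged = {**self.keys, **peer.keys}
            self.keys = dict(merged)
            peer.keys = dict(merged)

    servers = [KeyServer("keyserver%d" % i) for i in range(12)]
    servers[0].submit("0xDEADBEEF", "-----BEGIN PGP PUBLIC KEY BLOCK-----...")

    # Periodic pairwise syncs propagate the key to every replica.
    for i in range(len(servers) - 1):
        servers[i].sync_with(servers[i + 1])

    # True: any single server can now answer the query exhaustively.
    print(all("0xDEADBEEF" in s.keys for s in servers))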

Hal
