How? How does the robot know where to go and look? And does each
robot have to search the entire space? With USENET news distribution,
you only need to talk to one neighbor. And it talks to its neighbors,
and so on, and so on... The whole idea here is to quit doing this
N^2 thing where all clients talk to all servers (or all robots scan
all servers...) and start doing some N log(N) style things.
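
To make that concrete, here's a toy sketch of the flooding idea in Python
(the Site class, the four-node feed graph, and the announcement text are
all made up for illustration; real news transport is more involved, but
the shape is the same: each site only ever talks to its neighbors):

class Site:
    def __init__(self, name):
        self.name = name
        self.neighbors = []    # the only other sites this one talks to
        self.spool = {}        # message-ID -> article text

    def receive(self, msg_id, article):
        if msg_id in self.spool:
            return             # already have it; don't re-flood
        self.spool[msg_id] = article
        for peer in self.neighbors:
            peer.receive(msg_id, article)   # forward to neighbors only

# A small feed graph: a -- b -- c, and b -- d.
a, b, c, d = Site("a"), Site("b"), Site("c"), Site("d")
a.neighbors = [b]
b.neighbors = [a, c, d]
c.neighbors = [b]
d.neighbors = [b]

# Injecting one announcement at 'a' reaches every site, and no site
# ever had to contact (or even know about) all the others.
a.receive("<announce-1@a>", "server announcement: host, topic, format")
print([s.name for s in (a, b, c, d) if "<announce-1@a>" in s.spool])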
>If you want an index of resources related to a particular topic you
>can either use ALIWEB simple or form-based search to do it for you,
>copy the data to your machine and do it locally, or even get the
>list of hosts and go and get the information yourself in a standard
>format.
Each of these is an all-or-nothing proposition: in the first
case, I have to locate an ALIWEB server with all the data in the
world on it (scalability test says: BZZZZT). Or I can copy
all the data to my machine (BZZZZT). Or I can get "the list of
hosts" (BZZZT) and do it myself.
With my broadcast strategy, I just set up a process that gathers new
articles and expires old ones. Server sites wouldn't necessarily renew
their announcements at the same interval, but let's say the maximum
interval for 95% of the sites is one month. Then after two months,
my database reaches steady-state. From then on, it maintains itself.
And it's scalable: everybody has access to everything without anybody
having to do everything.
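
Roughly, that gather-and-expire process is this little loop (another
Python sketch; read_new_announcements() is a hypothetical stand-in for
whatever pulls freshly flooded articles off the local spool, and the
two-month window is just the one-month renewal assumption doubled):

import time

EXPIRE_AFTER = 60 * 24 * 3600     # two months, in seconds

database = {}                     # announcement key -> (timestamp, record)

def gather(announcements):
    # A new or renewed announcement overwrites the old entry,
    # refreshing its timestamp.
    now = time.time()
    for key, record in announcements:
        database[key] = (now, record)

def expire():
    # Anything not renewed within the window drops out, so the database
    # maintains itself: no global rescans, no master host list.
    cutoff = time.time() - EXPIRE_AFTER
    for key in [k for k, (ts, _) in database.items() if ts < cutoff]:
        del database[key]

# Run from cron, say once a day:
#   gather(read_new_announcements())   # read_new_announcements() is hypothetical
#   expire()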
Dan