Re: The future of meta-indices/libraries?

Kevin Altis ([email protected])
Tue, 15 Mar 1994 22:49:03 --100


At 9:16 PM 3/15/94 +0000, John Franks wrote:
>According to Kevin Altis:
>>
>> We can search on a list like this (just the titles or titles and URLs) on
>> our local server which has turned out to be quite simple and fast. Simply
>> extending this list to include <A HREF> items makes it extremely easy to
>> find items on your *local* server or local links to documents on other
>> servers as long as the link name isn't something like "here." For local
>> servers this can be extended to include <A NAME> tags. Document authors can
>> include <A NAME> tags to go along with H1-H6 headers to increase hits.
>
>You only want to index *titles* and only of documents on the local server.
>If I have an HREF on my server to a document on your server that should
>not be in an index associated with my server. Also we don't want to
>index HREFs (or worse NAMEs) that are local to my server. The reason
>is they don't have enough information -- many of them would be "click here"
>or "check this out". If the document referenced is local, it should have
>a title and be in the list elsewhere in any case.

Well, I did suggest separating local references from local documents.
Being able to quickly search your local server is often as important as
searching the whole web, and while searching my local server I certainly
want to pick up local references. My main point, however, is that *just*
indexing TITLEs is not good enough. For one thing, I would like to see
non-HTML documents within the index, so a reference such as <A
HREF="/some_local_dir/install.txt">installation instructions for widget</A>
would be included. Including <A NAME> entries and getting a lot of "click
here" entries is a potential problem, but again those could be filtered. As
authors adopt a better style, those <A NAME> references will have more value,
especially if they put one with each header, major list, etc. If
local references were allowed, this might also help make up for the
problem of most server administrators not providing indexes of their
servers.
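
To make the idea concrete, here is a rough sketch of such an indexer in Python. It is only an illustration under stated assumptions, not a description of any existing tool: the class name, the entry kinds ("title", "local-href", "remote-href", "name") and the stop list of uninformative link texts are all hypothetical. It collects TITLE text plus the text of <A HREF> and <A NAME> anchors, keeps local and remote references apart, and filters out entries such as "here".

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical stop list: link texts that say nothing about the target.
UNINFORMATIVE = {"here", "click here", "check this out"}


class IndexEntryParser(HTMLParser):
    """Collect (kind, target, text) index entries from one HTML document."""

    def __init__(self):
        super().__init__()
        self.entries = []    # finished (kind, target, text) tuples
        self._kind = None    # kind of the element currently open, if any
        self._target = None  # HREF value or "#name" for the open element
        self._text = []      # text fragments seen inside the open element

    def _open(self, kind, target):
        self._kind, self._target, self._text = kind, target, []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "title":
            self._open("title", None)
        elif tag == "a" and attrs.get("href"):
            # A host part in the URL means the link leaves the local server.
            remote = bool(urlparse(attrs["href"]).netloc)
            self._open("remote-href" if remote else "local-href", attrs["href"])
        elif tag == "a" and attrs.get("name"):
            self._open("name", "#" + attrs["name"])

    def handle_data(self, data):
        if self._kind is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._kind is None:
            return
        closing = "title" if self._kind == "title" else "a"
        if tag != closing:
            return
        text = " ".join("".join(self._text).split())  # collapse whitespace
        # Drop empty entries and ones like "here" or "Click here!".
        if text and text.lower().rstrip(".!") not in UNINFORMATIVE:
            self.entries.append((self._kind, self._target, text))
        self._kind = None


def index_entries(html):
    """Return the index entries found in one HTML string."""
    parser = IndexEntryParser()
    parser.feed(html)
    parser.close()
    return parser.entries
```

Keeping the kind in each entry means a server could offer a titles-only index, a local-references index, or both from the same pass, and remote links can be left out of the local index entirely.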

It is also important that no extra work needs to be done in order to
generate decent indexes of the Web server material. If the indexes are
going to contain keywords and other meta-information, then that should be
stored as part of a normal HTML document or in a meta-information file
associated with the document. We are already experimenting with this within
our group at Intel, but I would welcome a standard meta file format and
extension (.meta?) that servers would recognize for Expires: info, among
other things. I think the Aliweb work is headed in the right direction.
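
As a sketch of what a per-document meta file might look like in practice, the Python below assumes a purely hypothetical layout: the file sits next to the document with ".meta" appended to its name and holds header-style "Field: value" lines, such as Expires: or Keywords:. No standard of this kind exists yet, so the file naming, the field names other than Expires:, and the helper functions are all assumptions made for illustration.

```python
from email.parser import HeaderParser
from pathlib import Path


def meta_path(document):
    """Hypothetical convention: 'install.txt' pairs with 'install.txt.meta'."""
    return Path(str(document) + ".meta")


def parse_meta(text):
    """Parse header-style 'Field: value' lines into a dict keyed by lowercase name."""
    message = HeaderParser().parsestr(text)
    return {name.lower(): value for name, value in message.items()}


def load_meta(document):
    """Return the meta-information for a document, or {} if it has no meta file."""
    path = meta_path(document)
    if not path.exists():
        return {}
    return parse_meta(path.read_text())
```

An indexer or server could call load_meta() for every file it serves, so non-HTML documents get keywords and an expiry date without any change to the documents themselves.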

ka