Re: Searchable Web info (was Finding CGI spec...)

[email protected]
Mon, 9 Jan 1995 18:29:27 +0100


Nick Arnett wrote:
>[...]
> > I did a search on "cgi" and got back a doc with a name I didn't
> > recognise. Now although I have several hundreds of HTML files,
> > like my children, I know most of them by name :*) I think you got
> > the href from a file that has a Base tag pointing to another server.
>
> Our spider doesn't follow links to servers other than the one where it
> starts (we trigger each index for each server individually). Documents
> from other servers would have come from distinct indexing sessions.
>
> Having said that, I'm not sure exactly what you're describing here. Can
> you describe it a bit more?
>
OK: as I'm not sure which part is unclear, please excuse me if I
explain the obvious :*). A relative URL is normally resolved
against the location of the file it appears in, i.e. that file's
directory. But the Base tag can make it resolve against any other
directory, on any other server. In the instance I noticed, the
file was adapted from the TOC of Ian Graham's HTML tutorial; I
didn't want to move all the sub-files over, so I just made the
Base tag point to the original TOC, which is not on my server. So
if the spider finds a reference in this file to
"server-cgi-bin.html", it should realise I don't actually *have*
that file: it's where Base says it is, which in this case is some
other server. If the spider doesn't want to go on side trips to
other sites, I guess it will just have to ignore relative URLs in
files whose Base points to another server.
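
To make the point concrete, here is a rough sketch in Python of the
check I have in mind. It is only an illustration, not how any
particular spider works: the class and function names are made up,
and the host names my.example and other.example are placeholders.
It resolves each link against the Base href when one is present,
then keeps only the links that still land on the server the page
was fetched from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the first <base href> and every <a href> in a page."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "base" and self.base_href is None and attrs.get("href"):
            self.base_href = attrs["href"]
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def local_links(page_url, html):
    """Return the links in `html` that resolve to the same host as page_url."""
    parser = LinkCollector()
    parser.feed(html)
    # With a Base tag, relative URLs resolve against it, not the page URL.
    if parser.base_href:
        base = urljoin(page_url, parser.base_href)
    else:
        base = page_url
    home = urlsplit(page_url).netloc.lower()
    resolved = [urljoin(base, href) for href in parser.hrefs]
    return [url for url in resolved if urlsplit(url).netloc.lower() == home]


# Hypothetical page: it lives on my.example, but its Base points elsewhere.
PAGE = """
<html><head><base href="http://other.example/tutorial/toc.html"></head>
<body>
<a href="server-cgi-bin.html">CGI</a>
<a href="http://my.example/local.html">Local</a>
</body></html>
"""

links = local_links("http://my.example/docs/toc.html", PAGE)
```

Here "server-cgi-bin.html" resolves to other.example and is dropped,
while the absolute link back to my.example is kept. Without the Base
tag the same relative link would resolve under my.example/docs/ and
be indexed as a local file.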
Alan.
________________Alan_&_Lucy_Richmond__________________________
CyberWeb / Virtual Library: a wealth of information on   World
SoftWare          http://WWW.Charm.Net/~web/              Wide
WWW Systems Engineering ***** [email protected] *****       Web