Re: converting URLs in .html files

Erik Ostrom ([email protected])
Tue, 31 Aug 93 13:04:45 CDT


The only thing I can think of that would work well for determining whether
or not a link is "internal" or "external" would be to violate the URL,
looking inside the opaque descriptor and determine if the two documents
were in the same directory.

This is how I was originally going to respond:

HTTP paths are not opaque; nor are FTP paths or Gopher paths
(according to the URL spec, although according to the Gopher spec they
should be).

The path is interpreted in a manner dependent on the protocol being
used. However, when it contains slashes, these must imply a
hierarchical structure.

(from page 9 of ftp://info.cern.ch/pub/www/doc/url6.txt)

And on the next page, a set of rules is given for how

certain characters ("/") and certain path elements ("..", ".")
have a significance reserved for representing a hierarchical
space

So, effectively, the path _can_ be used to determine whether or not
two files are in the same `directory'. (Of course, this isn't true in
the JHM web gateway, and it probably isn't true in many other places
where slashes are abused. The point is, paths aren't supposed to be
opaque, and if we indicate a hierarchical structure where there is
none, we deserve to lose.)
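
To make that concrete, here is a rough sketch in Python of the kind of
check a client could do if it takes the slashes at their word. Nothing
here comes from the spec: the function names and the example.org URLs
are made up for illustration, and it assumes a server whose paths
really are hierarchical.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def directory_of(url):
    """Return (scheme, host, directory) for a URL, reading "/" as hierarchy."""
    parts = urlsplit(url)
    # Everything before the last slash of the path counts as the "directory".
    directory = posixpath.dirname(parts.path) or "/"
    return parts.scheme.lower(), parts.netloc.lower(), directory


def same_directory(base_url, link):
    """True if link, resolved against base_url, lands in base_url's directory."""
    # urljoin resolves partial forms, including "." and ".." path elements.
    resolved = urljoin(base_url, link)
    return directory_of(resolved) == directory_of(base_url)


if __name__ == "__main__":
    base = "http://example.org/docs/a.html"  # hypothetical document
    for link in ("b.html", "../c.html", "ftp://example.org/docs/b.html"):
        print(link, same_directory(base, link))
```

On a server that abuses slashes, this will cheerfully give wrong
answers, which is the "deserve to lose" case.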

So that was how I was going to respond. But then, on a whim, I did a
search through the same document, and found that one of the necessary
parts of the structure of a name (huh? I thought this was a
`locator'.) was

. Information to be passed to the server. This may be
private to the server, as all names may be generated
and used by the same server. This part of the name
should be opaque to the client.

This seems pretty clearly to refer to the `path' element. So why are
clients mucking around with the contents of a path given in partial
form?

Maybe I'm unclear on what's meant here by "opaque".