Re: Reliable links [Was: Stab in the dark ]

Daniel W. Connolly ([email protected])
Mon, 21 Mar 1994 19:56:53 --100


In message <[email protected]>, Josh Osborne writes:
>[...discussion of how to deal with links to data that change over time...]
>>For the content-type dimension, the format negotiation algorithm in
>>HTTP works pretty well...
>>
>>But in all these cases, I'd like to be able to put version, format,
>>language, etc. info in the reference itself, if I choose. For example,
>>I may know that
>> ftp://foo.com/lksjfli4jlij43
>>is a postscript file. But there's currently no way to express this.
>>And Mosaic, for example, will assume it's a plain text file.
>
>
>I don't think we want to tie a URI to a format, version or language.

I did not suggest that. Please read carefully. I took the trouble to
construct a concrete example motivating the OPTION to specify a
specific sequence of octets as the referent of my citation.
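
To make the OPTION concrete, here is a minimal sketch in Python. Nothing
like this exists in any current spec: the Reference class, its field names,
and effective_content_type() are all hypothetical, made up for illustration.
The point is only that the hints are optional, and that a client falls back
to its usual guess when the author of the citation supplies none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reference:
    """Hypothetical citation: a locator plus OPTIONAL hints from the citing author."""
    url: str
    content_type: Optional[str] = None  # e.g. "application/postscript"
    version: Optional[str] = None
    language: Optional[str] = None


def effective_content_type(ref: Reference, guessed: str = "text/plain") -> str:
    """Prefer the author's hint; otherwise keep the client's own guess."""
    return ref.content_type if ref.content_type is not None else guessed


# The FTP example from above, with and without the optional hint.
bare = Reference("ftp://foo.com/lksjfli4jlij43")
hinted = Reference("ftp://foo.com/lksjfli4jlij43",
                   content_type="application/postscript")
```

With the bare reference a client is left guessing plain text, as Mosaic
does; with the hinted one it knows the file is PostScript without asking
the server anything.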

>If a document has a link to http://foo.com/lksjfli4jlij43, which is
>some sort of picture and I am at a color workstation with a fast network
>link I want to retrieve whatever image looks best in color, regardless of
>(data) size. If I am at a workstation with a slow link, and a monochrome
>display, I want the smallest image that looks good in monochrome.

I agree. I wrote:
>>For the content-type dimension, the format negotiation algorithm in
>>HTTP works pretty well...

Now, back to my scenario: what about FTP?

>Also http://foo.com/dr.fun/most_recent.jpg should not be constrained to
>pointing at one image, it should be free to change from day to day.

What if I have a complaint about Tuesday's version which is fixed in
Wednesday's version, and then somebody reads my complaint on Thursday?
Then I look silly, because it's not apparent to the Thursday reader that
my complaint was once valid. Consider online document reviews,
contract negotiations, most other CSCW applications...

>If you want to cache at the application level, you can assume whatever
>object you fetched last is still valid, unless the user changed some
>default or other that might affect things ('color/monochrome'). You
>should also time out data when it expires (if the protocol has a TTL or
>expire date like HTTP), or after some fixed length of time (a few hours).
>If you are doing caching _outside_ the application (like a gateway) you
>need to understand enough of the protocol being used to figure out what
>version of the bits is being fetched, and either handle it from the cache,
>or fetch it from the real source.

So much for scalable distributed systems... if I've got my own caching
strategy, and you've got yours, and we don't agree on how to encode
meta-information, then we can't share cache namespaces.
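
As a rough illustration of what agreeing on the encoding would buy us, here
is a sketch in Python. The cache_key() function, its field names, and the
choice of SHA-256 are my assumptions for the example, not part of any
protocol. Two independent caches that canonicalize the meta-information the
same way compute the same key for the same variant, so they can share a
namespace.

```python
import hashlib


def cache_key(url: str, content_type: str = "", version: str = "",
              language: str = "") -> str:
    """Derive a cache key from a reference and its meta-information.

    Hypothetical canonical form: fixed field order, one "name=value" per
    line, case-insensitive fields lowercased. Values are assumed not to
    contain newlines.
    """
    fields = [
        ("url", url),
        ("type", content_type.lower()),
        ("version", version),
        ("lang", language.lower()),
    ]
    canonical = "\n".join(f"{name}={value}" for name, value in fields)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two caches that spell the content type differently still agree on the key.
mine = cache_key("http://foo.com/dr.fun/most_recent.jpg",
                 content_type="image/jpeg", version="tuesday")
yours = cache_key("http://foo.com/dr.fun/most_recent.jpg",
                  content_type="IMAGE/JPEG", version="tuesday")
other_day = cache_key("http://foo.com/dr.fun/most_recent.jpg",
                      content_type="image/jpeg", version="wednesday")
```

Here mine and yours come out equal, while other_day differs, so Tuesday's
and Wednesday's images never collide in a shared cache.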

I really disagree fundamentally with the notion that every resource
has a "home" and you must make a round-trip to that home to access the
resource in any well-defined way. This means that caches are all
heuristic optimizations, doomed to failure in unanticipated cases
(since all the cases aren't specified.)

Some resources are opaque and need to work that way, but many are not.
An RFC822 message is just a sequence of bytes. It doesn't matter where you get
them from, as long as you get the right bytes.
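
One way to pin down "the right bytes", sketched in Python: cite a digest of
the octets and accept a copy from any source that matches it. The helper
names and the use of SHA-256 are assumptions for illustration; the sample
messages are made up.

```python
import hashlib
from typing import Iterable, Optional


def digest(data: bytes) -> str:
    """Hex SHA-256 digest naming one specific sequence of octets."""
    return hashlib.sha256(data).hexdigest()


def first_matching(candidates: Iterable[bytes],
                   expected_digest: str) -> Optional[bytes]:
    """Return the first candidate copy whose digest matches, else None.

    The candidates may come from any cache, mirror, or the origin itself;
    only the bytes are checked, not where they came from.
    """
    for data in candidates:
        if digest(data) == expected_digest:
            return data
    return None


# A made-up message, the digest a citation would carry, and a changed copy.
message = b"From: dan\r\nSubject: test\r\n\r\nhello\r\n"
wanted = digest(message)
stale_copy = b"From: dan\r\nSubject: test\r\n\r\nhello, edited\r\n"
```

Calling first_matching([stale_copy, message], wanted) skips the changed
copy and returns the original message, with no round-trip to any "home".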

Dan