Re: more comments on Robinson's CGI spec.

David Robinson ([email protected])
Thu, 30 Nov 95 15:40 GMT


[Sorry about the delay in replying to this.]
Dave Kristol wrote:
>[email protected] (David Robinson) wrote:
> > >[DMK wrote]
> > >Here are some more comments on David Robinson's (Oct 16 and Nov 15,
> > >1995) CGI 1.1 specification, http://www.ast.cam.ac.uk/~drtr/cgi.html.
> > >
> > >PATH_INFO
> > > I think it's important for the CGI to be able to reconstruct the
> > > original request URI.
> >
> > I agree. If you want relative links, then it is useful. However...
> > >
> > >With NCSA's server, the original request is
> > > http://$SERVER_NAME$SERVER_PORT$SCRIPT_NAME$PATH_INFO{?$QUERY_STRING}
> >
> > Unfortunately not in a few obscure cases.
>could you elaborate, please?

Principally when the CGI invocation is not directly due to the client request,
but is due to an 'internal redirection', e.g.
* CGI scripts for error messages, e.g. a script is called when the server
wants to send a 404 message.
* CGI scripts called by the server when parsing a .shtml file with the
#include or #exec directives.
* CGI scripts being used as 'handlers' for specific document types, e.g.
the server always calls a particular script when it encounters a .fun file.
In fact, what I had in mind when I made the comment was the following 'feature'
of NCSA httpd:
if http://foo.com/cgi-bin/script and http://foo.com/htbin/script both
reference the same script, then the server was likely to set
SCRIPT_NAME to /cgi-bin/script irrespective of which URL was used.
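
To make the caveat concrete, here is a rough Python sketch of the reconstruction a script might attempt from its environment. The function name reconstruct_request_url and the sample values are only illustrative assumptions, as is the choice to omit a port of 80; nothing here is taken from the spec or from NCSA's code.

```python
from urllib.parse import quote

def reconstruct_request_url(env):
    """Best-effort rebuild of a request URL from CGI variables (hypothetical helper)."""
    host = env["SERVER_NAME"]
    port = env.get("SERVER_PORT", "80")
    if port != "80":
        # Assumption: only a non-default port is written into the URL.
        host = f"{host}:{port}"
    # SCRIPT_NAME and PATH_INFO are taken as already decoded, so re-encode them.
    path = quote(env.get("SCRIPT_NAME", "") + env.get("PATH_INFO", ""))
    url = f"http://{host}{path}"
    query = env.get("QUERY_STRING", "")
    if query:
        url += "?" + query
    return url

# What the script would see if the server reports the /cgi-bin/ alias.
env = {
    "SERVER_NAME": "foo.com",
    "SERVER_PORT": "80",
    "SCRIPT_NAME": "/cgi-bin/script",
    "PATH_INFO": "/extra/path",
    "QUERY_STRING": "a=1",
}
print(reconstruct_request_url(env))
```

If the client actually asked for http://foo.com/htbin/script/extra/path?a=1 and the server still set SCRIPT_NAME to /cgi-bin/script, the result above names a URL that works but is not the one that was requested. In the internal-redirection cases it need not resemble the client's URL at all.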

> > >PATH_TRANSLATED
> > > I think the description is wrong. enc-path conventionally encodes
> > > both the name of the script and PATH_INFO, not just PATH_INFO. I
> > > agree that PATH_TRANSLATED is a translation of the PATH_INFO.
> >
> > I don't understand this. What convention do you mean? Are you suggesting
> > that the name is poorly chosen, that it should be enc-path-info?
>
>I think I see what your description is trying to say, but I think it
>needs to be improved. You're saying (yes?) that PATH_TRANSLATED
>corresponds to the file (resource?) you would be referring to if the
>original request had been
> protocol://SERVER_NAME:SERVER_PORT enc-path
>where "enc-path" is the URL-encoded PATH_INFO. (Isn't "protocol"
>always "http"?) Somehow I didn't get that the first few times I read this.

Exactly.

>I think my problem is that here (and in the PATH_INFO section) you
>leave unspecified how it is that PATH_INFO is arrived at, yet under
>"Defining a script URI" you make it clear it's the part after
>enc-script, which is what I would have expected. I would guess you're
>drawing a distinction between CGI's that are called as a result of URL
>processing and CGI's that are called by some other mechanism, in which
>case PATH_INFO would be derived some other way.

How about rewriting it as follows:
* Firstly, define a 'script URL' which identifies the resource output by the
CGI script. This URL is only meaningful in the server-script context.
* State that this _may_ be the URL requested by the client, but then it might
not; for example, in the cases I listed above.
* Define SCRIPT_NAME, PATH_INFO, QUERY_STRING in terms of the 'script URL'.

Do you think that would be clearer? I think it would be saying the same thing.
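
As a rough illustration of the third point, the Python sketch below derives the three variables from a 'script URL'. It assumes the server already knows which leading part of the path names the script (passed in here as script_path); the helper name split_script_url is hypothetical and the code is not meant as the spec's wording.

```python
from urllib.parse import urlsplit, unquote

def split_script_url(script_url, script_path):
    """Derive SCRIPT_NAME, PATH_INFO and QUERY_STRING from a script URL.

    script_path is the part of the URL path that identifies the script;
    how the server determines it is outside the scope of this sketch.
    """
    parts = urlsplit(script_url)
    if not parts.path.startswith(script_path):
        raise ValueError("script_path is not a prefix of the URL path")
    rest = parts.path[len(script_path):]
    if rest and not rest.startswith("/"):
        raise ValueError("script_path does not end on a path segment boundary")
    return {
        "SCRIPT_NAME": script_path,
        "PATH_INFO": unquote(rest),    # passed to the script decoded
        "QUERY_STRING": parts.query,   # left URL-encoded
    }

print(split_script_url("http://foo.com/cgi-bin/script/a%20b/c?x=1",
                       "/cgi-bin/script"))
```

Whether the script URL passed in is the one the client requested or one the server made up during an internal redirection makes no difference to the derivation, which is the point of defining the variables this way.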

[...]
> > >Recommendations for scripts
> > > Why shouldn't the CGI script expect the server to fold PATH_INFO
> > > (PATH_TRANSLATED) information that contains "." or ".." the same
> > > way that the server handles them in a URL? In that case, the CGI
> > > would never actually see such stuff. Likewise for "//".
> > Maybe they should, but some servers don't. NCSA folded "..", but not ".".
> > I now don't think the server should touch "//", because the script may
> > have a use for it; one example would be passing URLs in the PATH_INFO.
>I guess this is another case where either what happens should be spelled
>out (for all systems), or it should be made "implementation-defined".

I've changed it to be more explicit. Have another look.
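
For the sake of discussion, one possible folding policy looks like this in Python: drop ".", resolve "..", and leave "//" untouched so that a URL carried in PATH_INFO survives. The function fold_dot_segments is a hypothetical sketch of that policy only; it is not what NCSA httpd or any other server does, and it does not try to preserve a trailing slash.

```python
def fold_dot_segments(path_info):
    """Fold '.' and '..' in a PATH_INFO value while keeping empty segments ('//')."""
    out = []
    for seg in path_info.split("/"):
        if seg == ".":
            continue  # '.' refers to the current level, so drop it
        if seg == "..":
            # Step up one level, but never pop the leading empty
            # segment that stands for the root '/'.
            if len(out) > 1:
                out.pop()
            continue
        out.append(seg)  # ordinary and empty segments are kept as-is
    return "/".join(out)

print(fold_dot_segments("/a/./b/../c"))
print(fold_dot_segments("/http://example.com/./x/../y"))
```

A script that cannot rely on the server folding these segments could apply something similar itself before mapping PATH_INFO onto the filesystem.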

Thanks for your comments,

David.