Re: Questions and comments

Dan Connolly ([email protected])
Sun, 29 Nov 92 14:34:39 CST


>I'm new to this list, so forgive me if I hit things already dealt with.

Actually, your questions are quite timely.

>I'm implementing yet another browser (text mode, written in perl).
>It's actually basically done. I have implemented the following tags:
>
>TITLE, A, NEXTID (currently ignored), ISINDEX (ignored), PLAINTEXT,
>PRE, LISTING, XMP, P, H1-H6, HP1-HP6 (ignored), DL, UL, MENU, and DIR.
>
>Also, I have done PRE and OL. But along the way I've seen several other
>things in several different places. For instance the following seem to
>be defined in viola:
> COMMENT, XMPA, S, ST, VOBJ, XMPA
>I've also seen references to DOCUMENT, KEYWORDS, DOCTYPE, and perhaps others.
>
>Which brings me to my main question:
>Is there a definitive list somewhere of everything that's been proposed?

The current specification is

http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html

I hope to replace that with a more rigorous specification soon.
I hope to use the same spec to register text/html with the IANA for
MIME purposes.

>Other stuff:
>I'm not sure what the difference is supposed to be between an OL and
>a UL. Should the browser actually sort the list items for an OL?

An OL was never a sorted list. It's just a numbered list, as opposed
to a bulleted list. It's for stuff where the order of the items in
the list is significant; e.g. step 1: do this. step 2: do that...
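
For a text-mode browser, the difference comes down to the marker you
print in front of each item. A minimal sketch in Python, assuming a
hypothetical render_list helper and a "*" bullet (neither is part of
any spec); note that the items are never reordered:

```python
def render_list(items, ordered):
    """Render list items as text lines.

    ordered=True  -> OL: items are numbered in document order.
    ordered=False -> UL: items get a bullet marker.
    The items are never sorted; only the marker differs.
    """
    lines = []
    for index, item in enumerate(items, start=1):
        # Hypothetical marker choice: "1." style for OL, "*" for UL.
        marker = f"{index}." if ordered else "*"
        lines.append(f"  {marker} {item}")
    return lines


if __name__ == "__main__":
    steps = ["do this", "do that"]
    print("\n".join(render_list(steps, ordered=True)))
    print("\n".join(render_list(steps, ordered=False)))
```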

>Also, I was under the impression that PRE was like PLAINTEXT, meaning there
>is no ending tag, just end of file. I hope I've misunderstood, if you
>are proposing to replace XMP with PRE. Another problem with this replacement
>is the quoting problem. With XMP, you don't need to worry about whether
>or not your arbitrary text contains something which looks like an HTML
>tag. This is an important feature, and one which should be kept IMHO.

Well, you have to throw out SGML conformance if you want the current
PLAINTEXT semantics. Even the XMP semantics are no good. In SGML, the
string "</" is recognized as markup iff it's followed by a name start
character (a letter). The above HTML documentation says </ is only
markup if it's followed by XMP, i.e. "</XMP>" is the _only_ string
that ends an XMP section. This is not expressible in SGML.
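
To make the mismatch concrete, here is a rough sketch in Python that
finds where each rule would say the section ends. The function names
and the sample string are made up for illustration, the "letter" test
is simplified to ASCII letters, and matching "</XMP>" without regard
to case is my assumption:

```python
import re

# SGML rule: "</" counts as markup only when a name start
# character (simplified here to an ASCII letter) follows it.
SGML_ETAGO = re.compile(r"</(?=[A-Za-z])")

# Rule in the current HTML documentation: only the literal string
# "</XMP>" ends an XMP section. Ignoring case is an assumption.
XMP_END = re.compile(r"</XMP>", re.IGNORECASE)


def sgml_end(text):
    """Offset of the first "</" that SGML would treat as markup, or None."""
    match = SGML_ETAGO.search(text)
    return match.start() if match else None


def xmp_end(text):
    """Offset of the first "</XMP>", or None."""
    match = XMP_END.search(text)
    return match.start() if match else None


if __name__ == "__main__":
    sample = "x </ y and </B> here</XMP>"
    # The two rules disagree: SGML stops at "</B>", the XMP rule at "</XMP>".
    print(sgml_end(sample), xmp_end(sample))
```

Neither rule trips over the bare "</ " in the sample, but they part
ways at "</B>", and that is exactly the case the XMP semantics depend
on.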

I'm defining HTML in terms of SGML. Period. I'm punting on PLAINTEXT. The
idea is that plaintext data is not part of the HTML data format; it
is governed by the MIME text/plain data format. Any HTTP server
that returns some HTML followed by <PLAINTEXT> followed by more data
is considered to return two MIME entities: a text/html entity, terminated
by the <PLAINTEXT> tag, and a text/plain entity.
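
On the client side, that view could be sketched roughly as follows
(Python). The split_entities name is hypothetical, and I'm assuming
the tag is matched without regard to case and is dropped from both
parts rather than kept in either:

```python
import re

# Assumption: the tag is matched case-insensitively and acts purely
# as a separator, so it appears in neither entity.
PLAINTEXT_TAG = re.compile(r"<PLAINTEXT>", re.IGNORECASE)


def split_entities(body):
    """Split a response body into (content_type, data) pairs.

    Everything before the first <PLAINTEXT> tag is treated as
    text/html; everything after it is treated as text/plain and is
    never scanned for markup.
    """
    match = PLAINTEXT_TAG.search(body)
    if match is None:
        return [("text/html", body)]
    return [
        ("text/html", body[:match.start()]),
        ("text/plain", body[match.end():]),
    ]


if __name__ == "__main__":
    for content_type, data in split_entities(
        "<TITLE>Listing</TITLE><PLAINTEXT>if (a < b) print '</XMP>'"
    ):
        print(content_type, repr(data))
```

The HTML parser then only ever sees the first part, so the quoting
problem never reaches it.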

As for the <PRE> tag, I think I'm going to call it FIXED, and go with
a <p> tag at the end of every line.

Details as they develop...

Dan