Re: What all does html support?

Larry Masinter ([email protected])
Sat, 18 Mar 1995 12:55:30 +0500


Yeeesh! No! Where's that rolled-up newspaper?

Rupesh Kapoor asked:
>> Hi,
>> It's amazing to find more syntax being supported by these browsers
>> each day. For instance, when I converted a MSWord document into text only
>> form today, was shocked to see netscape 1.0 correctly interpreting octal
>> codes like \225, \227 etc as bullets, regd & copyright symbols.
>>
>> Can anyone supply me a pointer to the exact set of such symbols
>> supported by these browsers? Of particular interest are Mosaic & netscape
>> 1.{0,1}

And Bill Perry answered:
> This is generally a font issue - was this in windows netscape? Try it
> with netscape/X, and it might/might not work. I hardcoded in a few
> conversions for emacs-w3 based on the more popular ones people use from
> windows fonts (quotes, copyright, registered, etc).

This isn't a _FONT_ issue, it is a _CHARACTER SET_ issue. HTML is
normatively sent in ISO-8859-1, a character set that has several
special characters in it. If you're building a browser that accepts
HTML, you should do the best you can rendering ISO-8859-1 on the
user's terminal, even if you don't have the right fonts. Similarly, if
a server has documents written using something other than ISO-8859-1
(e.g., Macintosh character set) it should either _translate_ the
document into the right character set or else _label_ it
appropriately:
    Content-Type: text/html; charset="whatever"
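
As a rough sketch of those two options (translate, or label), assuming Python and its standard codecs: the helper names `to_latin1` and `content_type_header` are made up for illustration and are not part of any server. Characters with no ISO-8859-1 equivalent are written as numeric character references, which is one possible choice among several.

```python
def to_latin1(raw: bytes, source_charset: str) -> bytes:
    """Option 1: translate a document into ISO-8859-1.

    Characters that ISO-8859-1 cannot represent become numeric
    character references (e.g. &#8226;) instead of being dropped.
    """
    text = raw.decode(source_charset)
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


def content_type_header(charset: str) -> str:
    """Option 2: leave the bytes alone and label them honestly."""
    return f'Content-Type: text/html; charset="{charset}"'


# Bytes as a Macintosh editor might have saved them:
# 0xA8 is the registered sign and 0xA5 is a bullet in MacRoman.
mac_doc = b"<p>Widget\xa8 \xa5 now shipping</p>"

print(to_latin1(mac_doc, "mac_roman"))
print(content_type_header("macintosh"))
```

Either path works; what does not work is sending the Macintosh bytes unlabeled and hoping the client guesses.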

This is especially true if you want to send, oh, Korean documents to
Korean-capable clients. Don't rely on the happenstance of
client/server accidentally translating the characters the same way!
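
To see how much depends on that happenstance, here is a small sketch, again assuming Python's standard codecs. It takes the \225 byte from the original question and decodes it under three character sets, then does the same with a short Korean string. The choice of `cp1252` as the Windows character set and `euc_kr` as the Korean one is an assumption for illustration.

```python
def decode_as(raw: bytes, charsets):
    """Decode the same bytes under each charset and collect the results."""
    return {name: raw.decode(name) for name in charsets}


# Octal \225 is the single byte 0x95.
bullet_byte = b"\225"
guesses = decode_as(bullet_byte, ["cp1252", "iso-8859-1", "mac_roman"])
for name, text in guesses.items():
    # repr() makes invisible control characters show up as escapes.
    print(f"{name:12} {text!r}")

# The same problem, only worse, for a multi-byte character set.
korean_bytes = "한글".encode("euc_kr")
print(repr(korean_bytes.decode("euc_kr")))       # labeled correctly
print(repr(korean_bytes.decode("iso-8859-1")))   # unlabeled, client guesses
```

Only the Windows code page turns 0x95 into a bullet; under ISO-8859-1 the same byte is a control character, which is why the result varies from one platform to the next.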

We have to head off this particularly nasty barrier to
interoperability. You don't have to all use the same character set,
but please pay attention to the charset and label the things that you
have.

(I'm getting on WMPerry's case because he's the implementor of
the browser that _I_ use most frequently. :))

Larry