RE: SPEC - Text

Gavin Nicol ([email protected])
Tue, 18 Apr 1995 13:49:03 -0400


>I think we can do something that will work by sidestepping the
>character set switching mechanism. Each "string" must be in a single
>character set. We could reuse PEX terminology and refer to these as
>"mono-encoded" strings.

PEX is wrong.

>A "character set" is a set of relations between "code points"
>(integer indices) and generic notions of glyphs. For example, 0x41 is
>an "A" in ASCII (& 8859/1 or ISO Latin-1). The character set has
>nothing to do with what type face (appearance -Helvetica, Old English,
>etc) the "A" is.

A character has nothing to do with the glyph image either.
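
As a rough illustration of the code point side of this (a sketch in Python, which is not part of the original discussion; the codec names are Python's own): 0x41 maps to "A" in both ASCII and ISO 8859-1, while a value such as 0xA1 maps to a different character depending on which character set is assumed. Nothing here involves glyphs or typefaces.

```python
# One byte value interpreted under different character sets.
# decode_byte is a helper invented for this sketch.

def decode_byte(value: int, charset: str) -> str:
    """Map a single byte value to a character under the named charset."""
    return bytes([value]).decode(charset)

# 0x41 is "A" in ASCII and in ISO 8859-1.
a_ascii = decode_byte(0x41, "ascii")
a_latin1 = decode_byte(0x41, "iso8859_1")

# 0xA1 is a different character in ISO 8859-1 than in ISO 8859-2.
x_latin1 = decode_byte(0xA1, "iso8859_1")   # INVERTED EXCLAMATION MARK
x_latin2 = decode_byte(0xA1, "iso8859_2")   # LATIN CAPITAL LETTER A WITH OGONEK
```

The result in each case is an abstract character, identified here by its ISO 10646 code point; how it is drawn is a separate matter.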

>A "font" is a set of specific glyphs which can be indexed.

Sorry. Try messing with ligatures.
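
To make the ligature point concrete, here is a toy sketch in Python. The ligature table and glyph names are invented for illustration and do not describe any real font format. It only shows that several characters can map to one glyph, so a font cannot simply be indexed one glyph per character.

```python
# Toy model only: LIGATURES and the "glyph_*" names are hypothetical.
LIGATURES = {"ffi": "glyph_ffi", "fi": "glyph_fi", "fl": "glyph_fl"}

def shape(text: str) -> list[str]:
    """Map characters to glyph names, preferring the longest ligature match."""
    glyphs = []
    i = 0
    while i < len(text):
        # Try longer character sequences first so "ffi" wins over "fi".
        for seq in sorted(LIGATURES, key=len, reverse=True):
            if text.startswith(seq, i):
                glyphs.append(LIGATURES[seq])
                i += len(seq)
                break
        else:
            # No ligature applies: one character, one glyph.
            glyphs.append("glyph_" + text[i])
            i += 1
    return glyphs

office = shape("office")
```

With this table, the six characters of "office" come out as four glyphs, because "ffi" is rendered as a single glyph.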

>Can anyone summarize what is going on with I18n in the HTML world? I
>saw a link to some info the other day, but did not follow it.

I would guess that my proposal (that the document character set for HTML
be ISO 10646) will be adopted, and that we'll be using the MIME charset
parameter to figure out the encoding and perhaps the character set. Don't
discuss the above if you don't understand the full meaning of "document
character set".
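
A minimal sketch of how the MIME charset parameter might be used, in Python with the standard `email` package. The helper names and the ISO 8859-1 fallback are assumptions made for this example, not something settled in this thread. The charset parameter selects how the bytes are decoded; the decoded characters are then identified by ISO 10646 code points.

```python
from email.message import Message

def charset_of(content_type: str, default: str = "iso-8859-1") -> str:
    """Extract the charset parameter from a Content-Type header value.

    The default used when no charset is given is an assumption of this sketch.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default

def decode_body(raw: bytes, content_type: str) -> str:
    """Decode raw bytes into characters using the declared charset."""
    return raw.decode(charset_of(content_type))

declared = charset_of("text/html; charset=ISO-8859-1")
fallback = charset_of("text/html")
text = decode_body(b"caf\xe9", "text/html; charset=ISO-8859-1")
```

Here the byte 0xE9 decodes to the character with ISO 10646 code point U+00E9, whatever encoding carried it on the wire.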

Your basic ideas are OK though, even if they're not worded well. How
about:

Text3
    Fields
        MFString text
        MFString language
        MFString encoding
        MFString charset