Parsing HTML using SGMLS and HTML.DTD

R J Partington ([email protected])
Mon, 14 Aug 1995 01:01:43 +0100 (BST)


To: [email protected]
Date: Mon, 14 Aug 1995 00:59:11 +0100 (BST)

I'm having trouble parsing HTML with sgmls, using the `html.dtd'
(comes with GF, written by Daniel W. Connolly).

What happens is that sgmls complains:
> Parameter entity name longer than (NAMELEN-1); truncated
> Length of name, number, or token exceeded NAMELEN or LITLEN
> limit

(This also happens with a lot of other DTDs, mainly the snafu ones
that come with GF.)

I know that the GF documentation warns about this, and suggests you
put a header in your documents to increase NAMELEN etc.
However, I have too many HTML documents to do this by hand for each
one (although I could automate it with a shell script), so I would
like to know if

* I can recompile sgmls with a bigger NAMELEN (and whatever else)
* I can force sgmls to include the header each time (I tried the
  -iname option for sgmls, but I can't get that to work as expected;
  I think I'm misunderstanding the manual page)
* there's anything else I can do.
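
For what it's worth, the "automate it" route mentioned above could be
sketched roughly like this. It is only a minimal sketch, in Python
rather than shell. The names with_header and prepend_to_tree, and the
idea of keeping the header in a separate file, are assumptions made up
for illustration; nothing here comes from GF or sgmls. It writes
copies with the header prepended and leaves the originals untouched.

```python
from pathlib import Path


def with_header(header: str, document: str) -> str:
    """Return the document text with the header prepended.

    If the document already starts with the header, it is returned
    unchanged, so running this twice is harmless.
    """
    if document.lstrip().startswith(header.strip()):
        return document
    return header.rstrip("\n") + "\n" + document


def prepend_to_tree(header_file: str, src_dir: str, out_dir: str) -> int:
    """Copy every *.html file in src_dir to out_dir with the header added.

    header_file is a hypothetical file holding whatever header the GF
    documentation suggests; its contents are not invented here.
    Returns the number of files written.
    """
    header = Path(header_file).read_text()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for page in sorted(Path(src_dir).glob("*.html")):
        (out / page.name).write_text(with_header(header, page.read_text()))
        count += 1
    return count
```

The copies in out_dir could then be fed to sgmls one at a time. It
still means keeping a second copy of every document, which is why I'd
prefer one of the options above.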

rjp