http://www.w3.org/hypertext/WWW/MarkUp/old-9508/Connolly/921130/MarkUp.html
The bulk of it is a by-example implementor's guide. I distributed some
ANSI C source code to implement it too. Unfortunately, I changed jobs
just before I got a chance to send my Mosaic patches to NCSA. Some of
the bugs I mentioned in that guide (processing instructions) are still
around in implementations today.
The libHTML.tar.gz distribution mentioned therein isn't available
now, but most of the code is available in an html2mif distribution
I updated recently:
ftp://ftp.w3.org/pub/contrib/html2mif-19950714.tar.gz
Unfortunately, the work of getting all the HTML implementations to
interoperate is now MUCH bigger than it was back in '92.
Dan