A tar archive is available by anonymous FTP:
dri.cornell.edu/pub/davis/print-www.tar
It contains:
README
print-www
print-www.l
html-to-latex
html2latex.sed (modified from the original CERN version)
The hardest part was writing the Perl script to fetch documents
over HTTP; it turns out you can't just run pipes through telnet.
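
For anyone wondering what "fetching over HTTP" involves once telnet is
ruled out, here is a rough sketch of the idea: open a TCP socket to the
server, send a GET request, and read until the server closes the
connection. This is an illustrative Python sketch, not the Perl script
in the archive; the function names (build_request, split_response,
fetch) and the use of HTTP/1.0 with a Host header are assumptions made
for the example.

```python
import socket


def build_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split a raw response into (status_line, body) at the first blank line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("latin-1")
    return status_line, body


def fetch(host, path="/", port=80, timeout=10.0):
    """Connect, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read means the server closed the connection
                break
            chunks.append(data)
    return split_response(b"".join(chunks))
```

Calling fetch("some.host", "/doc.html") would return the status line and
the raw body bytes, which could then be handed to the converter. The
sketch does no redirect handling or error recovery.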
The conversion from HTML to LaTeX is not really robust yet.
This is doubly hard since there is no guarantee that the HTML
is legal. But at least it works for my test cases. No doubt
it will be improved in time.
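
To illustrate why illegal HTML makes the conversion hard, here is a
small sketch of one way to cope with it: keep a stack of open tags,
ignore stray closing tags, and close anything left open at the end so
the LaTeX braces stay balanced. It is written in Python with the
standard html.parser module and is not what html2latex.sed or
html-to-latex actually do; the tag-to-LaTeX mapping and the names
LatexEmitter and html_to_latex are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a few HTML tags to LaTeX openers and closers.
OPEN = {"h1": "\\section*{", "h2": "\\subsection*{", "b": "\\textbf{",
        "i": "\\textit{", "em": "\\emph{", "p": "\n\n"}
CLOSE = {"h1": "}\n", "h2": "}\n", "b": "}", "i": "}", "em": "}"}

# Characters that LaTeX treats specially in running text.
SPECIALS = {"\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
            "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
            "~": r"\textasciitilde{}", "^": r"\textasciicircum{}"}


def escape(text):
    """Escape LaTeX special characters one character at a time."""
    return "".join(SPECIALS.get(ch, ch) for ch in text)


class LatexEmitter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of LaTeX output
        self.stack = []  # tags whose closing brace is still owed

    def handle_starttag(self, tag, attrs):
        if tag in OPEN:
            self.out.append(OPEN[tag])
            if tag in CLOSE:
                self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: ignore it
        # Close any inner tags the author forgot to close, then this one.
        while self.stack:
            top = self.stack.pop()
            self.out.append(CLOSE[top])
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def close(self):
        super().close()  # flush any buffered text first
        while self.stack:  # then balance whatever is still open
            self.out.append(CLOSE[self.stack.pop()])


def html_to_latex(html):
    parser = LatexEmitter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, html_to_latex("<b>bold <i>both</b> tail") closes the
unclosed <i> before closing <b>, so the braces in the output still
match. Unknown tags are simply dropped and their text is kept.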
Best wishes