The upshot is that you can expect compression ratios of roughly 95% with
some simple pre-processing and gzip. Detailed info follows.
As you hopefully know, the IMF put together a largish team of volunteers
to build a sizeable VRML site for this year's Festival. The result of our
labor is at http://vrml.arc.org. The goal was to produce the Variety Arts
Center in VRML, including all the exhibits at the Festival.
The issue of file sizes was definitely one we were concerned with. We
saw many huge files -- the first floor we produced was 2.3M. That's an
intolerable download for many people, whether they have a T-1 or not.
Given this, we examined the files and found several "fatty" areas. For
one thing, the 3DS->VRML process was leaving numbers like 3.9123e-9
around; these are more compactly represented as 0. ;) There were also
many numbers like 3.4567890, which can be truncated to 3.456 with no
noticeable loss of detail. Finally, there was a lot of white space
lying around.
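For the curious, the munging looks roughly like the following. This is
a Python sketch of the idea, not our actual script; the regex, the 1e-4
cutoff, and the three-digit precision are all illustrative choices.

  import re
  import sys

  # Filter VRML text on stdin: zero out tiny scientific-notation
  # values, truncate long decimals, and squeeze out white space.
  # Naive sketch -- it doesn't special-case quoted strings.
  NUMBER = re.compile(r'-?\d+\.\d+(?:[eE][-+]?\d+)?')

  def munge(match):
      value = float(match.group(0))
      if abs(value) < 1e-4:                # e.g. 3.9123e-9 -> 0
          return '0'
      value = int(value * 1000) / 1000.0   # truncate: 3.4567890 -> 3.456
      return ('%.3f' % value).rstrip('0').rstrip('.')

  for line in sys.stdin:
      line = NUMBER.sub(munge, line)
      squeezed = ' '.join(line.split())    # collapse runs of white space
      if squeezed:                         # and drop blank lines
          sys.stdout.write(squeezed + '\n')

You'd run it as a filter, e.g. "python munge.py < in.wrl > out.wrl"
(filenames illustrative).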
Fixing the above consistently reduced the size of our files by 50%.
This held across many real-world examples, so I'm confident the number
is accurate for large files.
Later, I noticed that 3DS was producing a large amount of normals data
that I believed to be unnecessary. I added more code to remove the
normal data, which reduced file size by another 25% with no perceptible
change to the rendered scene. This WILL NOT WORK with output from some
programs, but it works with 3DS; doing the same thing to LVS files, for
example, is a bad idea.
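The normal-stripping is equally simple-minded. Here's a sketch of the
idea (again Python, again not our exact code). It assumes VRML 1.0
syntax, that Normal and NormalBinding nodes contain no nested braces,
and that the normals aren't DEF'd and USE'd elsewhere -- check those
assumptions against your own files before trusting it:

  import re
  import sys

  vrml = sys.stdin.read()

  # Remove precomputed normals so the browser generates its own.
  vrml = re.sub(r'Normal\s*\{[^}]*\}', '', vrml)
  vrml = re.sub(r'NormalBinding\s*\{[^}]*\}', '', vrml)
  # Drop the now-meaningless normalIndex fields as well.
  vrml = re.sub(r'normalIndex\s*\[[^\]]*\]', '', vrml)

  sys.stdout.write(vrml)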
Then, finally, Festival time arrived and we were there with many files
in the 500K range. Our link to the SF office was bad, and this was
unacceptable, so the files were gzipped, since the current version of
WebSpace will decompress gz'd files on the fly. This achieved roughly
another 90% compression on most files (some slightly more, others
slightly less, but 90% is a solid figure).
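The compression step is just "gzip -9 *.wrl", or, in the same Python
vein as the sketches above (filename illustrative):

  import gzip
  import os
  import shutil

  # Compress a munged world file; gzip-aware browsers like WebSpace
  # can then decompress the .gz on the fly.
  src = 'floor1.wrl'
  with open(src, 'rb') as fin:
      with gzip.open(src + '.gz', 'wb', compresslevel=9) as fout:
          shutil.copyfileobj(fin, fout)

  print('%d -> %d bytes' % (os.path.getsize(src),
                            os.path.getsize(src + '.gz')))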
Fat removal and compression together achieved compression ratios
approaching 95%. Here are some real-world numbers:
              ORIGINAL           POST-MUNGE      RATIO
  First       1421711 bytes        87870 bytes   93.9%
  Second       741721 bytes        42472 bytes   94.3%
  Third       1613568 bytes        92119 bytes   94.3%
  Fourth      2273093 bytes       125935 bytes   94.5%
  Theater*    1683979 bytes       151455 bytes   91.1%
* The theater was done with LVS and could not have its normals removed
More important than the compression ratios is the fact that even our
largest, most polygon-glutted file (2.3M) was reduced to 125K -- a
reasonable download even for dialup 14.4K modems. Even with the normals
kept in, compression ratios should stay above 90%.
Given the above, I see no reason why we need to expend our efforts on
a binary file format. I believe this especially when you consider
LOD techniques and the possibility of loading detail on the fly.
James Waldrop (JLW3) \ [email protected] / Ubique, Inc.
Systems Administrator \ [email protected] / 657 Mission #601
http://www.ubique.com/ \ 415.896.2434 / San Francisco, CA