Re: The Superhighway Steamroller

Simon E Spero ([email protected])
Sun, 26 Jun 94 18:38:13 -0400


Dr. Hart.
I am replying to this usenet article, which contained several errors
of fact; once corrected, these should help reassure you and encourage you
to keep up your excellent work on the Gutenberg project.

I do not have time to deal with all the points that need addressing, but
I'll try to answer a few of your concerns, and send you a more detailed
response later.

1) NCSA does not use a Cray as an HTTP server - in fact they use a few
HP workstations. The speed of a computer has little to do with how
much traffic it can place on a network. What does limit the volume of
traffic is the speed of the slowest network link. That's why WAIS
queries run on CMNS.think.com (a Connection Machine) don't swamp the
internet.

2) HTTP and FTP use virtually the same technique for file transfers. FTP does
not transfer all the files you download over a single connection. Instead
the ftp server creates a new connection to the client for each file you
transfer. Once the connection has been created, both protocols just
blast the file out as fast as TCP can carry it.

3) There is no plot to ban ISO 646 characters from the internet. If your
text can be encoded in 646, there's no need to worry; even when the network
does move to UCS, it will likely be in a way that will allow 646 characters
to be read as is. Let a thousand cows bloom.

4) When using FTP, HTTP, or GOPHER over a dialup line, you don't have to
drop the phone line and redial for every file you transfer. Instead you
just create multiple TCP connections over the same link.

5) It is easy to create suites of HTML documents that can be read standalone.
My hypertext version of the National Performance Review (Al Gore's
re-inventing government) was converted to standalone form by putting
a floppy disk into SunSITE(tm) and just copying the directory.
Interestingly enough, this book was originally delivered in serial
text form; using a few simple emacs macros, I marked the document up in
SGML and used that to generate the hypertext version. I have found this
version to be far more useful than the plain ASCII form, and even to be
better than the paperback form. Try following the structure of the
Ash Tray specification without using hyperlinks...
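
As a rough illustration of point 1, here is a minimal sketch in Python. The
function name and the link speeds are hypothetical examples, not measurements
of any real path; the only claim is that the sustained rate of a transfer is
bounded by the slowest link between server and client, however fast the
server itself may be.

```python
def path_throughput(link_speeds_bps):
    """Upper bound on end-to-end throughput: the slowest link on the path."""
    if not link_speeds_bps:
        raise ValueError("path must contain at least one link")
    return min(link_speeds_bps)

# Hypothetical path: workstation on a 10 Mbit/s LAN, a T1 hop, a 14.4k modem.
workstation_path = [10_000_000, 1_544_000, 14_400]
# Same path, but served from a much faster machine on a 100 Mbit/s LAN.
supercomputer_path = [100_000_000, 1_544_000, 14_400]

print(path_throughput(workstation_path))    # 14400
print(path_throughput(supercomputer_path))  # 14400
```

Swapping in a faster server changes only the first entry in the list, so the
bound stays the same.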

Other people may wish to respond to other points; if not, I'll try to
get back to you soon.

Thanks again for all the great work you've been doing with Gutenberg; I hope
I can convince you that your fears are misplaced, and that hypermedia texts
have a place alongside their serial brethren.

Simon

--- ORIGINAL ARTICLE ---

"Michael S. Hart" <[email protected]> writes:
THE SUPERHIGHWAY STEAMROLLER IS COMING TO YOU

These articles are still in progress, and your suggestions are to
be encouraged. Many have already been included and I thank those
who have made them.

Some people will find these entirely offensive, as they feel .asm
files are just so much Jurassic Pork, while others will feel .asm
is the only way to make a computer run efficiently. [Assembly is
the fastest language other than binary we can write in, although,
it would appear it is a vanishing breed.]

Others will find it offensive, as they feel the only way to "Surf
the Nets" is with a GUI [Graphic User Interface] while others can
see surfing the nets with a GUI as comparable to surfing on waves
made of peanut butter.

I have seen good applications of both, and would like to see some
kind of optimization of the combined advantages.

The following is a series of articles I have been working on in a
preparation for changes in computers and networks I was hoping in
desperation would never come; desperation because I would prefer,
in my heart and mind, not to take the time and energy away from a
productive creation and distribution of information on the net to
what I am afraid will be an unproductive non-conversation about a
trend that is sweeping the world.

People are using inexpensive computers that are 100 times more in
capability than the first IBM PCs and XTs, but their performances
are not all that much faster due to the extremely sluggish update
programs and files that have replaced the extremely fast programs
and files of only a few years ago.

A lot of this is because the computer market is no longer for the
educated computer user; an educated computer user would realize a
new program was operating less efficiently than an old one. A lot
of new programs don't tell you how well they are operating in the
areas of speed and storage.

For instance: my old FTP [File Transfer Protocol] program says a
file was transferred in so many seconds, how big it was, and what
the average rate of transfer was. [150000 bytes transferred in 1
second at a rate of 150KB/sec] The new programs do not allow for
such an easy recording of how much data was transferred or length
of time required for the transfer.
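
As an illustration of the kind of summary line described above, here is a
minimal sketch in Python. The function name is hypothetical, the output
format is only modeled on the example figures, and KB is assumed to mean
1000 bytes so that the numbers match.

```python
def transfer_summary(nbytes, seconds):
    """Return a one-line report of size, elapsed time and average rate."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    rate_kb = nbytes / seconds / 1000  # assumption: 1 KB = 1000 bytes
    return f"{nbytes} bytes in {seconds:g} s ({rate_kb:g} KB/sec)"

print(transfer_summary(150000, 1))  # 150000 bytes in 1 s (150 KB/sec)
```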

As the programs transfer more and more files, to get you less and
less information in fancier and fancier manners, you get less for
your time and your money, especially as more "bells and whistles"
are added to these files.

If the Moguls get their way there will no longer be any transfers
in "Plain Vanilla ASCII" such as this message, other than for the
short Email-type messages.

This means that you may have to transfer as much as the "Complete
Works of Shakespeare" at 5 megabytes. . .just to browse through a
set of cute menus on your GUI [Graphic User Interface] netsurfing
program. . .which will make it more like surfing peanut butter in
contrast to the speeds available with Plain Vanilla ASCII.

In addition, your computer has to spend an inordinate amount of a
job just figuring out how to place the information on your screen
when it is presented in a cute graphic manner, as it also does in
programs in which network connections are made and broken several
times to the same machine during the same job.

Some people think this is a better way than to stay connected for
an entire session, others think a single connection is best. The
process is much like what happens if you are running errands in a
car. . .let's suppose you are going to make half a dozen stops in
the next hour. . .and let's further suppose you have someone with
you to run in and out.

The question is: do you turn off the car at every stop or do you
keep it running?

I am one of those people who keeps it running because I know that
it takes a huge amount of energy to start a car, and also creates
an inordinate amount of wear and tear.

I had the same philosophy about making and breaking the networked
connections for transferring files. What you save may not be the
equivalent of what you have spent in the process.

[A detailed explanation is below]
[I understand that since I started writing this series, "caching"
of often used files by these programs is taking place, which I am
certainly going to encourage. It would be even nicer if I should
be able to cache some of these programs myself, so I did not have
to download them and run them every time I logged in, or rather a
program did not have to do it for me. In my latest ventures I am
downloading the local "front page" to my own PC in the hopes that
the programs will not download and format it again, which takes a
huge amount of cpu time, but so far that hasn't stopped the first
download of the front pages.]

Starting a car takes an awfully great amount of energy that comes
from the battery, goes to an electric motor in the starter, turns
over the engine until it starts, and then lets go. During this a
thousand amps of electric current could be expended for a time; I
just made a call and found that the new cars are more efficient--
825 amps was the most powerful battery the neighborhood place has
in stock at the moment. I try to be accurate, so let's say these
cars take 500 amps to start.

To give you a comparison, when you blow a fuse in your house, car
or whatever [or blow a breaker] it is usually because the current
hit 15 amps for a fraction of a second. Thus running a starter a
few seconds can use up the same amount of energy and gasoline the
car would take to keep running for a much longer period of time.

When you are running an automated downloading program that breaks
the connection after every question you ask it, and you spend the
next few seconds or minutes trying to get a new connection. . .it
will become obvious to you just what I am talking about.
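
The trade-off can be put in rough numbers with a minimal sketch in Python.
The function name and the timings are hypothetical (six requests, two
seconds to set up a connection, one second per transfer); real setup and
transfer times vary widely, so this only shows the shape of the comparison.

```python
def total_time(n_requests, setup_s, transfer_s, persistent):
    """Total seconds for n_requests, with or without reconnecting each time."""
    # A persistent session pays the setup cost once; otherwise once per request.
    setups = min(n_requests, 1) if persistent else n_requests
    return setups * setup_s + n_requests * transfer_s

# Hypothetical figures: half a dozen requests, 2 s setup, 1 s per transfer.
print(total_time(6, 2.0, 1.0, persistent=False))  # 18.0
print(total_time(6, 2.0, 1.0, persistent=True))   # 8.0
```

The gap grows with the setup time and shrinks as each transfer gets longer,
which is why the answer depends on the sizes of the files involved.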

Not only have I seen this in my own experience, but most demos of
these kinds of programs have the same kinds of results, except of
course, that tens or hundreds of people get to sit there waiting,
while the operator gets to explain that this happens sometimes.

My own philosophy is that everything possible should be kept here
. . .here being wherever you are. . .because connections cost the
same as the storage to save the material.

Example: let's suppose you buy your floppies four for a dollar--
1.44M floppies are available at that price via mail order: three
for a dollar at the best retail outlet I have seen.

Let's further suppose you want to look at Alice in Wonderland and
that making a network connection costs a nickel [or a penny. . .
around here a phone call from home costs 5.6 cents].

If you dial in with one of these "make connection and break" type
of programs, you may have several connections before you actually
see ANY of Alice in Wonderland. [I know this is hard to imagine,
especially on the part of those of you who call 800 numbers.] It
still costs something to someone every time you make a connection
to any phone or machine.

Alice in Wonderland is a short book, and about 10 copies could be
put on one floppy, not counting compression. If you compress, it
becomes 20 or more copies, depending on your compression program.

The point is. . .it costs MORE to download the file, if that were
the whole point of your connection, than it does to store it. . .
right there with you all the time. . .whether the connections are
open that day or not.

If you got 23 copies of Alice on the floppy you paid 23 cents for
then it only costs you one cent to keep Alice locally.

50 cents for Shakespeare or the Bible.

And the prices are dropping constantly.
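
The arithmetic can be checked with a short sketch in Python. The figures
are the ones quoted in this article (floppies at four for a dollar, about
10 uncompressed or 20 compressed copies of Alice per floppy, 5.6 cents per
phone call); the function name is made up for illustration, and the
comparison ignores the one-time cost of obtaining the file in the first
place.

```python
def storage_cost_per_copy(floppy_cents, copies_per_floppy):
    """Cost in cents of keeping one copy of a text on a floppy."""
    return floppy_cents / copies_per_floppy

FLOPPY_CENTS = 25.0   # four floppies for a dollar
CALL_CENTS = 5.6      # one local phone call, as quoted above

uncompressed = storage_cost_per_copy(FLOPPY_CENTS, 10)
compressed = storage_cost_per_copy(FLOPPY_CENTS, 20)

print(uncompressed)              # 2.5
print(compressed)                # 1.25
print(compressed < CALL_CENTS)   # True
```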

However, if you try to download these in HTML, not only will this
take more space and time to download, it will also take more time
to get it onto your screen, more time to search, etc.

Try it sometime.

Are the bells and whistles and the picture of Shakespeare worthy?

I am trying to keep in perspective that only 10% of the computers
in the world are on the nets and that this figure is not changing
very much. There are now maybe 250 million computers: and maybe
25 million of them are connected in some manner to the nets; when
there were 200 million computers, only 20 million were, etc.

This perspective requires that distributions of anything intended
for more than a few percent of the population be distributed in a
manner that allows these files to be easily used by those without
a network connection, without 16 megabytes of RAM, etc.

I would prefer to keep the compatibility high and the cost low in
terms of both money and time.

My idea of preparing for 2001 is not to create a library which is
likely to take all the resources of the average computer of 2001,
but rather to create a library that would take only a fraction of
that computer's space and processing power. Our test Pentium can
search the Complete Works of Shakespeare in a second or two and I
can predict that the comparable computers of 2001 will be able to
search our whole library in less than a minute. Of course I hope
our library will not be the same size then as it is now. However
to keep the same level of efficiency the library will be in Plain
Vanilla ASCII until computers and networks with enough power will
be available to include the kinds of graphical bells and whistles
the advertising media would like you to think are feasible today.

The truth is that those will cost a lot, and could bring the Nets
grinding to a halt. . .which would allow for the argument that an
Internet free to all is not feasible. . .

while of course the truth is simply that an Internet carrying all
those bells and whistles to all is not feasible. . .yet.

The question is. . .do we deprive the masses of general education
and literacy so that the top few percent can have their bells and
whistles? Of course the total bandwidth available is rising, but
the bandwidth available to each person is not really increasing--
and as most of you well know, on slow days it is not unlike an LA
traffic jam.

Ten years ago there just weren't that many files sized a megabyte
or larger. . .now no one seems to think twice about putting out a
"Front Page" that requires a megabyte every time you log in to an
Internet WWW server. This kind of thing could become a problem a
greater number would recognize if the communication companies get
their way and can charge you more for a line your modem is using,
than for the same line when you are talking on it: especially if
they can charge your calls by the minute or by the kilobyte.

Probably not so many of you remember when some commercial service
networks charged by the minute. . .at my first computer meeting a
single movie review cost us $13 to get, as we had to go through a
maze of menus to get to it; thus many of the $25 free trials were
total rip offs, since before you knew it, you had already run out
of your $25 and were running up a real bill. The new offers of a
free month online are much better, when it is really unlimited.

I hear NSFnet is going to be history this year. Any information?

If everyone on the Nets put in $1, maybe we could save the Nets?

More about all that later, too.

My hope is that three factors will save us from these problems:

1. Wider bandwidth, of course, and hopefully the generic variety
will still be free of charge, or very close to it.

2. Better compression, without having to take huge cpu power for
uncompression. Some uncompression is so fast you will hardly
notice it, others take so long you begin to wonder if it is a
crash or not, unless it has some indication of progress.

3. More judicious usage of bells and whistles.

A. If a picture is really worth a thousand words then we are
paying too much for our pictures. The average word, used
by Shakespeare, is only 4 letters, five, if you count the
spaces between the words; how many good pictures have you
seen that were only 5,000 bytes?

B. Sound is pretty much in the same category, except for the
MIDI type sound files. Using MIDI, we can put 5,000 MIDI
files the size of Beethoven's 5th Symphony on one CDROM--
you can download the 5th and try it out, free of charge.

C. Bells and whistles merely go further in creating what our
media calls the "Information Rich and Information Poor."

If we keep Shakespeare in Plain Vanilla ASCII anyone with
a computer is likely to be able to read it for under $1--
if we add on even one bell or whistle, we eliminate a 90%
majority of the potential audience simply because they do
not have the proper hardware or software to read it.
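
Point A can be put in numbers with a minimal sketch in Python. The five
bytes per word is the figure given above (four letters plus a space); the
picture size is an assumed example (an uncompressed 640x480 image at one
byte per pixel), not a measurement of any particular file.

```python
BYTES_PER_WORD = 5         # 4 letters plus one space, per the text
WORDS_PER_PICTURE = 1000   # "a picture is worth a thousand words"

# Bytes a picture should cost if it really were worth a thousand words.
text_budget = BYTES_PER_WORD * WORDS_PER_PICTURE

# Assumed example: uncompressed 640x480 image, 1 byte per pixel.
picture_bytes = 640 * 480 * 1

# How many words of plain text fit in the same number of bytes.
words_equivalent = picture_bytes // BYTES_PER_WORD

print(text_budget, picture_bytes, words_equivalent)  # 5000 307200 61440
```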

THE SUPERHIGHWAY STEAMROLLER IS COMING TO YOU

Have you heard the rumbling? Besides on television news, I mean.

The Superhighway Steamroller is coming to you, resulting in slow,
then slower, then slowest computer and network performance.

Have you noticed the FTP rates this year are only a fraction of a
year ago? I remember FTP rates of 175k/second, but today's tests
showed an average of:

Have you noticed that the "New Kids On The Block" are now "Power-
Users" who take up more bandwidth and answer less of their Email?
The Fortune 500 types never did answer their mail very well and a
leopard can't really be expected to change its spots. . .but does
that mean we have to change ours?

What will happen when they want so much bandwidth there is hardly
any left for you?

It will be the same as following semi-trailers on the Interstate.

What will happen when your Emailbox is full of messages you can't
send a reply to?

It will be the same as the junk mail you get in the paper mail.

What will happen when it takes an hour to FTP 5M because everyone
is using less efficient programs than FTP, such as Gopher, WWW or
Mosaic, or downloading the Internet Radio Program?

What will happen when they download video, not just audio?

The answer is. . .there will be no room left for you. . .you will
be forced either to become one of THEM and pay the price, or else
forced OFF the Nets, and pay the price that way.

The very programs and "programming" that seem to hold the promise
of the future of the Nets are what are killing the Nets, due to a
very low efficiency. NCSA had to use a CRAY as a SERVER, because
they wanted their Mosaic Home Page to appear every time anyone is
using Mosaic, a Home Page complete with pictures. This is not an
efficient use of either the Nets or the hardware backing them up.

There is something wrong with the software if it takes so much of
new and expensive hardware to run it.

If a Cray is running full-time, just sending out Home Pages, then
just think how much that is loading down the Nets.

It used to be that computer operators were so valuable that their
jobs ran three shifts, 24 hours a day, back when computers had no
power and cost millions of dollars to run.

Now that today's computers can do so much more, for so much less,
and so many more people are depending on them, there is no one on
duty at night anymore. Something is not quite right there.

This is attacking the home market, too. When is the last time an
efficiently optimized program was purchased for YOUR computer?

Do you remember how small and fast DOS 3.X was? WordPerfect 4.2?
Norton's 4.5? WordStar 3.X? PCTALKIII?

Do the new versions really do that much more?

A Research Assistant of mine says that poor programming is meant,
intentionally, to force everyone to buy more hardware in order to
run the sluggish programs.

1. Increased network delay due to mass amounts of low efficiency
"programming" and programs on the Nets.

A. Internet Radio

B. Gopher, WWW, and Mosaic

C. The very programs and programming that appear to promise
the most are the ones that are removing the most.

2. "No Reply Email" . . . probably in your mailbox now.

A. How do you deal with a message you can't answer?

B. What happens when it is a commercial message [ad]?

C. What happened to "postmaster" and "root" as the place to
send your replies and questions?

D. What about people who never answer their Email?

1. If they are just late, it costs them more, why do it?

2. If they never reply, should you read their Email?

3. Remember the first Email Ad Campaigns?

Can commercialization of the Nets be a self-fulfilling prophecy??

******

Is An Information Superhighway Steamroller Coming to Flatten YOU?

Is A Superhighway Steamroller Coming to Flatten YOU?

If you have been keeping up with the way the Internet is going at
all these days, you have probably noticed the following:

1. The media is mentioning the Internet constantly.

2. The big money players have finally decided the Internet is at
the developmental stage at which they would like a monopoly--
i.e. the "Metered Internet" and "Metered Software."

3. The original Internet providers are being more restricted.

These "incidents" are probably not "accidents" and are probably a
factor in each other. . .as follows:

Premise #1. Scarcity, rarity and limited distribution have been,
and some say "should be", part of our society, from prehistory to
the remote recesses of the future.

Answer #1. The Renaissance and the Industrial Revolution and all
other major civilizations are based on non-scarcity and nearly an
unlimited distribution as compared to the severe limitations of a
previous period.

Prediction #1. The Computer Age will be known more for products,
first as electronic words, pictures and sounds, than for actually
computing anything. These products will extend beyond flat, two-
dimensional screens and printouts, to three-dimensional objects a
"3-D Computer Printer" will "print" for you.

Premise #2. The end of scarcity is not always a pleasant event--
as per those who opposed the Industrial Revolution, the Steamboat
and even the Electric Light. You might not be aware that the destruction
of electrical equipment on the eve of lighting New York City nearly
kept Edison from making his deadline, which would have destroyed his
contract negotiation with the city. It appeared as though those companies
who controlled the previous lighting method, gaslights, were none
too pleased with progress beyond their own monopolies.

Footnote: While planning how to keep his dynamos running against
all odds and other efforts against them, Edison is reported using
the term "to keep the bugs out" in reference to keeping the light
system working, a term still in use in today's advanced electronics
and new applications. "We will make electric light so cheap that
only the rich will be able to burn candles," was his thought
on the progress he was making.

National Geographic, Those Inventive Americans, p 140

[Sometimes I feel the same way about electronic books.]

Answer #2. Be prepared for there to be "Information Wars"--about
who controls what information, how soon you get it, how accurate,
and how up to date it is. "Knowledge is Power" and many feel the
giving away of knowledge is the giving away of their own power as
opposed to merely empowering others; why do you think educational
systems have totally fallen apart? Why do you think half of this
adult population is illiterate? [U.S. Report on Adult Literacy.]

Prediction #2. Copyright Law will become even more impossible to
understand, and more impossible for anything to be Public Domain.
The likelihood is that anything copyrighted on the day your children
are born will still be under copyright the day they die. Patents
are still limited to 17 years. Is there something I am missing?

Along with copyright; I predict the Internet will become private,
simply because they cannot seem to watch people become empowered,
without feeling disempowered themselves. In the teaching world,
this should be regarded as:

1. A simple logical contradiction.

2. The barest form of hypocrisy.

3. Breach of contract.

***

Ladies and Gentlemen,

We have before us, in the Internet, a tool to bring
all information to all people, without undue delay;
a tool that is providing ever faster access to ever
increasing amounts of information.

Within a decade it should be able to provide access
to something the size and complexity of the Library
of Congress.

In fact, in certain ways, it already does.

What are the reactions?

Michael Sperberg McQueen, one of the foremost of an
assortment of authorities who spoke recently at the
"31st Annual Clinic on Library Applications of Data
Processing: Literary Texts in an Electronic Age"*1
said that many scholars are guilty of the hypocrisy
of keeping their knowledge to themselves in a business
in which they are supposed to be giving a knowledge
base to the world at large. He said they fall into
a situation in which they are pressured, by various
promotion techniques, not to share this information
in order to gain more fame and glory for themselves
and for their local departments and universities.

This reminded me of a statement, defining different
kinds of professors in the following manner:

"One kind of [Shakespeare] professor believes he/she
is the best possible professor [of Shakespeare] for
knowing something [about Shakespeare] that no other
person knows, and that to share this information to
the masses would immediately result in a fall from,
as it were, the graces of [fill in the blank].

"The other kind of [Shakespeare] professor believes
that she/he is the best professor of [Shakespeare],
only when she/he teaches the greatest amount [of
Shakespeare] to the greatest number of people, regardless
of any other factors."

We have a wide variety of teacher and professor and
librarian and other roles in the Age of Information
Science, and each role includes individuals who are
members of both ends of the spectrum, as well as an
assortment of varieties in between.

I'm sure most of you are at least somewhat familiar
with the "Mrs. Grundy School of Librarianship" type
of librarian or teacher, who is caricatured as this
very defensive old lady who might as well be saying
"You're not going to get any information out of me"
or "You're only going to get what I want you to get
out of this library [or school]."

Hopefully, you are also somewhat familiar with this
opposite type, who would do as much as possible for
you, and might call you back three weeks after your
question, having finally tracked down the answer.

I assure you I have been exposed to both varieties.

Now what does this all have to do with the Internet
. . .everything, as we shall soon see.

More of "What are the reactions?"

When they first get on the Internet, some say:

"We are faced with an insurmountable opportunity."

"It is like trying to drink from a firehose."

"I'm drowning in a sea of information."

While others say:

"Come on in, the water's fine."

"You never need thirst for anything again, it's all
going to be here, and it's free."

"Wow, you could spend your whole life in here."

My own personal favorite is something from Grolier's
Electronic Encyclopedia stating:

=====================================================
| The trend of library policy is clearly toward
| the ideal of making all information available
| without delay to all people.
|
|The Software Toolworks Illustrated Encyclopedia (TM)
|(c) 1990 Grolier Electronic Publishing, Inc.

The question is: will we be allowed to do this?

Read on to find out.

***

What if the Information Superhighway
Becomes The Information Supertollway

At various presentations I have made
about Project Gutenberg I have heard
the question raised as to whether we
are increasing or decreasing logical
distance between the Haves/Have-Nots
with our giving away of the trillion
Etexts that is our goal by December,
in the year 2001.

Their argument is that for people of
little means, that even a phone call
to get Alice in Wonderland is a hard
thing to do, even if no charges were
made on our end of the call.

Recently I had a chance to follow up
on one of these conversations, and I
was pleased to learn that telephones
were finally in service there.

At the time of the question however,
my response was that we could not do
a project such as this any better to
provide information to the masses.

Project Gutenberg is, and always has
been, designed for implementation at
the end of the year 2001, and it has
been based on the kinds of networks,
computers, and other facilities that
time should provide, which should be
such that your local warehouse buyer
outlet "should" be selling computers
with enough space and processing for
the whole Project Gutenberg Library,
10,000 items, to run comfortably.

Project Gutenberg is, and always has
been, designed for implementation in
the majority of computer households,
but has not been tailored to fit the
1% at one end or the other of curves
representing the populations of this
world's countries.

We are doing our best to insure that
our Etexts are readable on hardware,
software and operating system of any
nature one is likely to encounter.

This means we have to avoid any such
particular implementation that would
make them more easily available to a
user of one kind of system but would
make them more difficult to use on a
system[s] of different nature.

However, there is certainly one item
we can all agree on, and that is the
cost of getting such a Public Access
Library, most of which is legally in
the Public Domain, should be free.

Any charges for the creation, using,
or maintaining of such a library are
likely to destroy either the library
or much of its impact on Information
Haves/Have Nots. . .or both.

Most of us manage to stay healthy in
the general aspects of our lives and
need minimal health care. . .yet our
government and also countries around
more of this world say the health of
population is so valuable that these
cares should be taken care of by the
application of universal health care
for everyone.

Yesterday I heard Ms. Hillary Rodham
Clinton say to us for a commencement
speech that 15% of the United States
expenditures were on health care.

Most of us do not manage to learn to
read, write, and otherwise manage an
extraordinary range of knowledge and
skills. . .without intensive school.

Many of us still take it for granted
that we can read and write something
of the complexity of this article.

That is because, even though we were
exposed to it on the news last year,
most of us are still unaware that an
illiteracy rate of just about 50% is
currently obtaining in the U.S.

This is a far greater rate than that
of those without health care, health
being something most of us have been
born with, and prior to the medicine
of the modern era, most of us should
have done without and rarely noticed
it in passing.

Preceding the revolutions for modern
medical revolutions were revolutions
in reading, writing and arithmetic--
upon which the medical revolution or
any other scientific revolution must
depend in a totality beyond which an
argument cannot be made: after all,
can you imagine modern research in a
medical institution in which doctors
could not write down their findings,
or read the findings of others, make
mathematical analyses, confirm those
findings with other doctors?

Nearly everything we build or use is
dependent on a basic ability of this
communicating and reasoning; without
reading, writing and arithmetic, and
all those disciplines that depend on
them entirely or in part, we can say
little about being civilized.

If we don't defend the Internet, who
else is going to?

=====================================================
| The trend of library policy is clearly toward
| the ideal of making all information available
| without delay to all people.
|
|The Software Toolworks Illustrated Encyclopedia (TM)
|(c) 1990, 1991 Grolier Electronic Publishing, Inc.

[Project Gutenberg Director of Communication]

=====================================================

Thank you for your time and consideration,

Michael S. Hart, Professor of Electronic Text
Executive Director of Project Gutenberg Etext
Illinois Benedictine College, Lisle, IL 60532
No official connection to U of Illinois--UIUC
hart @uiucvmd.bitnet or [email protected]

THIS MESSAGE IS A PRIVATE COMMUNICATION, INTENDED ONLY TO BE
READ BY THE PEOPLE TO WHOM IT IS ADDRESSED. RECIPIENTS OF
THIS MESSAGE MAY NOT COPY OR DISTRIBUTE IT IN WHOLE OR IN PART
WITHOUT MICHAEL HART'S WRITTEN PERMISSION, OTHER THAN TO REPLY.

COPYRIGHT 1994 PROF. MICHAEL S. HART, ALL RIGHTS RESERVED
THIS MESSAGE MAY NOT BE COPIED WITHOUT WRITTEN PERMISSION