This is the first draft of "standard.940", which is the short
form of the 1994 Project Gutenberg standards file.  You should
also receive .941 and .942 later, which will be the medium and
long forms.
 
Given that this is an entirely new format, I am sure we have a
few things left out, so your suggestions and corrections are
of great value to us, as always.
 
The standard.gut files for 1994 will be provided in a new
format [as are most Project Gutenberg files this year],
including both a simplified procedure for those working alone
[use a "program editor" to check your work] AND a few mark-up
procedures, reincluded with an eye to the future, when bold,
underline and italics will be a part of UNIcode or whatever
ASCII superset comes over the horizon.
 
Here is the draft, in the simplest possible form:
 
 
 
standard.940
 
 
Table of Contents  [Search for [[X]] to find that section]
 
 
[[1]]  Goals
 
[[2]]  How We Hope To Accomplish These Goals
 
[[3]]  Current Standards For Release
 
[[4]]  Standards For Future Etexts
 
[[5]]  Hyphenation and Margination
 
 
 
[[1]]  Goals
 
The goal of Project Gutenberg is the creation and distribution
of 10,000 Etexts to 100,000,000 people by the end of 2001, and
after that a Public Domain  that will include not just a
LISTING of all the materials that are going into the Public
Domain, but also the CONTENTS of some or all of them.  I would
hope the Public Domain  would eventually include ALL Public
Domain materials, in both an Index and in Content.
 
 
[[2]]  How We Hope To Accomplish These Goals
 
To accomplish this goal, Project Gutenberg encourages
book-to-Etext conversion in a number of ways, and also
maintains an extensive distribution network.  All of this is
done in a "hands off" manner, in which nearly total
independence is granted to all those working with or for
Project Gutenberg.  We create an Etext edition from a variety
of sources that have been cleared of copyright restrictions by
our legal volunteers [we have one of the finest legal teams in
U.S. and International copyrights, which also doubles in
output each year, to keep up with the doubling in the number
of Etexts we distribute each year].  Since copyright is a
vastly important subject to us, we request that you put us in
touch with ANY lawyers you know who might have an interest in
copyrights in any manner.
 
 
[[3]]  Current Standards For Release
 
Currently most of our files are distributed in a Plain Vanilla
ASCII file structure very similar to that used in email.  Each
line ends in a "hard return" [cr/lf=carriage return/line feed]
so that it can be read in DOS, Mac and UNIX programs.  The DOS
programs want both cr and lf, Macs want a cr, UNIX wants an lf.
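
As a minimal sketch of the cr/lf convention described above,
the following Python converts any mix of DOS, Mac and UNIX
line endings to cr/lf.  The function name to_crlf is a
hypothetical one chosen for this illustration; it is not part
of any Project Gutenberg tool.

```python
def to_crlf(text: str) -> str:
    """Return text with every line ending rewritten as cr/lf."""
    # First reduce cr/lf pairs and bare cr to a single lf, so that
    # no ending is counted twice, then expand every lf to cr/lf.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n")


sample = "DOS line\r\nMac line\rUNIX line\n"
print(repr(to_crlf(sample)))
# prints 'DOS line\r\nMac line\r\nUNIX line\r\n'
```

When writing the result to disk from Python, open the file
with newline="" so that the cr/lf pairs are written unchanged.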
 
The reason for the Plain Vanilla ASCII format is simple:
other formats simply won't REACH 95% of the computer
population out there, since NOT MORE THAN 5% of the
200,000,000 computers use any particular form of markup.
WordPerfect is the most popular form of markup at 5%, with
10,000,000 sales recorded.
 
Our goal is to get these Etexts to EVERYONE on and off the net
and on whatever hardware/software combinations they like.
 
 
 
[[4]]  Standards For Future Etexts
 
In the future we hope the common ASCII character [super]set is
going to include ways to represent bold, italics and
underscore.  For the present these methods carry little
meaning for the average reader, other than general emphasis,
which is currently represented by CAPITALIZING the emphasized
words.  The other, more technical uses of these methods, to
indicate names of boats, newspapers, books, etc., are not used
today in Project Gutenberg Etexts because they clutter up the
page with little or no addition of meaning; they are merely
conventions.
 
However, as we do hope that some easy and searchable UNIcodes,
or whatever, will be developed, we still keep some of this for
future use, and you are welcome to include an extra copy of an
Etext with the following forms of markup:
 
~italics~. . .of the NON-emphatic nature:  titles, etc.
~~italics~~. . .of the EMPHATIC nature, now done in CAPS.
*bold*
_underline_ or _underscore_
 
This simple form of markup will allow you [or us] to eliminate
the extra markings after capitalizing the emphatic italics and
still to preserve a second file for the future system.
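
The conversion described above, from the marked-up copy to the
Plain Vanilla copy, might be sketched as follows.  This is only
an illustration:  the function name to_plain_vanilla is
hypothetical, and the sketch assumes that each marked span sits
on one line and that ~, * and _ appear in the Etext only as
markup.

```python
import re


def to_plain_vanilla(marked: str) -> str:
    """Strip the markup, capitalizing the emphatic italics."""
    # Emphatic italics come first, so that ~~word~~ is not
    # mistaken for two empty ~...~ spans around the word.
    text = re.sub(r"~~(.+?)~~", lambda m: m.group(1).upper(), marked)
    # Non-emphatic italics, bold and underline lose their marks.
    text = re.sub(r"~(.+?)~", r"\1", text)
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    text = re.sub(r"_(.+?)_", r"\1", text)
    return text


line = "To *be* or ~~not~~ to be, in ~Hamlet~ by _Shakespeare_"
print(to_plain_vanilla(line))
# prints: To be or NOT to be, in Hamlet by Shakespeare
```

The marked-up file is left untouched, so it can still be kept
as the second copy for the future system.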
 
You can't find "To *be* or *not* to be" with "search programs"
looking for "To be or not to be". . .yet. . .and the ability
to search an Etext for what you are looking for in the space
of a few seconds is one of the prime uses of Etexts.
 
Other than this, the primary differences you may note in Etexts
formatted by Project Gutenberg are:
 
Two spaces after every sentence or after a colon [:].
 
Two hard returns after each paragraph.
 
Three hard returns for wide paragraph separations or between
sections.
 
Four hard returns at the end of a chapter.
 
In a nutshell, this is it.
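
The hard-return rules above can be sketched in a few lines of
Python.  Several things here are assumptions made for the
illustration:  the name assemble is hypothetical, paragraphs
are passed in without trailing returns, a hard return is
written as cr/lf, and the last paragraph of a chapter gets the
four returns in place of the usual two.  The three-return rule
for wide separations is left out.

```python
HARD_RETURN = "\r\n"


def assemble(chapters):
    """Join chapters, each a list of paragraph strings."""
    pieces = []
    for paragraphs in chapters:
        for index, paragraph in enumerate(paragraphs):
            is_last = index == len(paragraphs) - 1
            # Two hard returns after a paragraph, four when the
            # paragraph closes the chapter.
            count = 4 if is_last else 2
            pieces.append(paragraph + HARD_RETURN * count)
    return "".join(pieces)


book = [["First paragraph.", "Second paragraph."], ["Chapter two."]]
print(repr(assemble(book)))
```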
 
We gratefully accept Etexts in ANY format, and are willing to
do the work ourselves to get them into this format if needed.
 
 
[[5]]  Hyphenation and Margination
 
You may also have noted that most Project Gutenberg Etexts
have had most hyphenation at the ends of lines removed.  We
now have an experimental program to do this, and you are
encouraged to see how this works on YOUR Etext by contacting
Rick McGowan to put your Etext through this process.
[[email protected]]
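
To give a rough idea of what removing end-of-line hyphenation
involves, here is a minimal Python sketch.  It is NOT the
experimental program mentioned above, whose workings are not
described here; the name join_hyphenated is hypothetical, and
the sketch assumes lf line endings.  It also cannot tell a
line-break hyphen from a real one, so a word such as "well-
known" split at its own hyphen would be joined wrongly and
still needs a human check.

```python
import re


def join_hyphenated(text: str) -> str:
    """Rejoin words that were hyphenated across a line break."""
    # A lowercase letter, a hyphen and a line break, followed by
    # the rest of the word:  pull the rest of the word up to the
    # first line and start the next line after it.
    pattern = r"([a-z])-\n([a-z]\S*)[ \t]*\n?"
    return re.sub(pattern, r"\1\2\n", text)


sample = "or whatever ASCII super-\nset comes over the horizon"
print(join_hyphenated(sample))
```

The first line grows by the length of the moved fragment, so
the result usually needs remargination afterwards.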
 
You may also have noted that many Project Gutenberg Etexts are
additionally marginated to eliminate "widows and orphans" on a
line by line level, so that phrases or sentences do not have a
word of importance dangling off on another line.  This makes a
Project Gutenberg Etext easier to read AND to search, as many/
most search programs do NOT search from the end of one line to
the beginning of the next.  If you consider that this article,
such as it is, has somewhere just over 10 words on the average
line, searching for a phrase of three or four words should get
a "hit" only about twice as often as a "miss", because a third
of the time some of the phrase would be on a different line.
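
The estimate above, and one way around the problem, can be
sketched in Python.  Both function names are hypothetical.  The
estimate assumes every line holds exactly the same number of
words and that a phrase is equally likely to start at any word;
under those assumptions a phrase of three or four words on
10-word lines is split about 20% to 30% of the time, in the
neighborhood of the third quoted above.

```python
import re


def split_fraction(phrase_words: int, words_per_line: int) -> float:
    """Fraction of phrases that run over onto the next line."""
    # Of the words_per_line places a phrase can start, the last
    # phrase_words - 1 leave part of it on the following line.
    return (phrase_words - 1) / words_per_line


def find_phrase(text: str, phrase: str) -> bool:
    """Search for phrase, letting its words span line breaks."""
    # Any run of whitespace, including a hard return, may
    # separate the words of the phrase.
    words = [re.escape(word) for word in phrase.split()]
    return re.search(r"\s+".join(words), text) is not None


text = "To be or\nnot to be"
print("To be or not to be" in text)             # prints False
print(find_phrase(text, "To be or not to be"))  # prints True
print(split_fraction(4, 10))                    # prints 0.3
```

A search program built this way would make remargination less
critical, but readers cannot be assumed to have one.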
 
This is a VERY important factor in using Etexts, and while few
people enjoy doing the remargination, it can easily make Etext
searching a much more powerful tool for our readers.  We hope
that a new experimental margination program will also help you
in this process.