Project Gutenberg (1971-2008)

The Magna Carta, the first English constitutional text, signed in 1215. From April 2002 to October 2003, in 18 months, the number of books doubled, going from 5,000 to 10,000, with a monthly average of 300 new digitized books.

10,000 books. An impressive number if we think about all the scanned and proofread pages this number represents. This fast growth was thanks to Distributed Proofreaders, a website launched in October 2000 by Charles Franks to share the proofreading of books among many volunteers. Volunteers choose one of the books listed on the site and proofread a given page. They don't have any quota to fulfill, but it is recommended they do a page per day if possible. It doesn't seem like much, but with hundreds of volunteers it really adds up.

Books are also copied on CDs and DVDs. Blank CDs and DVDs cost next to nothing, as does burning them on a CD or DVD writer. Project Gutenberg sends a free CD or DVD to anyone who asks for it, and people are encouraged to make copies for a friend, a library or a school. Released in August 2003, the "Best of Gutenberg" CD contained over 600 books, as a follow-up to other CDs in the past. The first Project Gutenberg DVD was released in December 2003 to celebrate the landmark of 10,000 books, with most of the existing titles (9,400 books).

= 10,000 to 20,000 Books

In December 2003, there were 11,000 books digitized in several formats, most of them in ASCII, and some of them in HTML or XML. This represented 46,000 files and 110 gigabytes. On 13 February 2004, the day of Michael Hart's presentation at UNESCO, in Paris, there were exactly 11,340 books in 25 languages. In May 2004, the 12,581 books represented 100,000 files in 20 different formats, and 135 gigabytes. With more than 300 new books added per month (338 books in 2004), the number of gigabytes was expected to double every year.

The Project Gutenberg Consortia Center (PGCC) was officially affiliated with Project Gutenberg in 2003. Since 1997, PGCC had been working on gathering collections of existing eBooks, as a complement to Project Gutenberg, which was focusing on the production of eBooks.

In December 2003, Distributed Proofreaders Europe (DP Europe) was launched by Project Rastko, followed by Project Gutenberg Europe (PG Europe) in January 2004. Project Gutenberg Europe celebrated its first 100 books in June 2005. These books were in several languages, a reflection of European linguistic diversity, with 100 languages planned for the long term.

In January 2005, Project Gutenberg reached the landmark of 15,000 books. eBook number 15000 was The Life of Reason, by George Santayana (published in 1906). In July 2005, Project Gutenberg of Australia (launched in 2001) reached the landmark of 500 books. New teams were getting ready to launch Project Gutenberg Canada, Project Gutenberg Portugal and Project Gutenberg Philippines over the following years.

What about languages? While there were works in only 25 languages in February 2004, there were works in 42 languages in July 2005, including Iroquoian, Sanskrit and the Mayan languages. On July 27, 2005, out of a total of 16,800 books, the seven "main" languages were: English (with 14,548 books), French (577 books), German (349 books), Finnish (218 books), Dutch (130 books), Spanish (103 books) and Chinese (69 books). There were books in 50 languages in December 2006. On December 16, 2006, out of a total of 19,996 books, the main languages were English (17,377 books), French (966 books), German (412 books), Finnish (344 books), Dutch (244 books), Spanish (140 books), Italian (102 books), Chinese (69 books), Portuguese (68 books) and Tagalog (51 books).

In December 2006, Project Gutenberg reached the landmark of 20,000 books. eBook number 20000 was the audio book of Twenty Thousand Leagues Under the Sea (Vingt mille lieues sous les mers), by Jules Verne (published in 1869). Half of these 20,000 books had been produced by Distributed Proofreaders since October 2000, with a monthly average of 346 new digitized books in 2006. While 32 years were necessary to digitize the first 10,000 books, between July 1971 and October 2003, only 3 years and 2 months were necessary to digitize the following 10,000 books, between October 2003 and December 2006. Project Gutenberg of Australia was about to reach 1,500 books (this goal was achieved in April 2007) and Project Gutenberg Europe reached 500 books.
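The two time spans quoted above can be checked with a little date arithmetic. The following is only a minimal sketch: the helper name months_between is hypothetical, and the per-month rates it prints are simple averages over each whole period, not figures given in the text.

```python
def months_between(start, end):
    """Whole months from start to end, each given as a (year, month) pair."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])

# Periods quoted in the text for each batch of 10,000 books.
first_10k = months_between((1971, 7), (2003, 10))
second_10k = months_between((2003, 10), (2006, 12))

for label, months in (("first 10,000", first_10k), ("second 10,000", second_10k)):
    years, rest = divmod(months, 12)
    rate = 10_000 / months  # plain average over the whole period
    print(f"{label}: {years} years {rest} months, about {rate:.0f} books per month")
```

The second span comes out at 38 months, which matches the "3 years and 2 months" in the text. The first span is 32 years and 3 months, which the text rounds to 32 years.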

The Project Gutenberg PrePrints section was set up in January 2006 to collect items submitted to Project Gutenberg that were interesting enough to be made available online, but not quite ready to be added to the main Project Gutenberg collection, for example because of missing data, low-quality files or inconvenient formats. This new section had 379 files in December 2006.

= 20,000 to 25,000 Books

Project Gutenberg News began in November 2006 with Mike Cook as its editor and webmaster, as a complement to the weekly and monthly newsletters that had existed for a number of years. The website gives, for example, the weekly, monthly and yearly production stats since 2001. The weekly production was 24 books in 2001, 47 books in 2002, 79 books in 2003, 78 books in 2004, 58 books in 2005, 80 books in 2006 and 78 books in 2007. The monthly production was 104 books in 2001, 203 books in 2002, 348 books in 2003, 338 books in 2004, 252 books in 2005, 345 books in 2006 and 338 books in 2007. The yearly production was 1,244 books in 2001, 2,432 books in 2002, 4,176 books in 2003, 4,058 books in 2004, 3,019 books in 2005, 4,141 books in 2006 and 4,049 books in 2007.
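The three series above are closely related, and the relation can be sketched in a few lines. The text does not say how the weekly and monthly figures were derived, so dividing the yearly totals by 12 and by 52 is an assumption here, and the helper name averages is hypothetical. Under that assumption the results land within one book of every quoted figure, without always matching exactly.

```python
# Yearly, monthly and weekly production figures as quoted in the text.
yearly = {2001: 1244, 2002: 2432, 2003: 4176, 2004: 4058,
          2005: 3019, 2006: 4141, 2007: 4049}
quoted_monthly = {2001: 104, 2002: 203, 2003: 348, 2004: 338,
                  2005: 252, 2006: 345, 2007: 338}
quoted_weekly = {2001: 24, 2002: 47, 2003: 79, 2004: 78,
                 2005: 58, 2006: 80, 2007: 78}

def averages(total):
    """Monthly and weekly averages, assuming 12 months and 52 weeks per year."""
    return round(total / 12), round(total / 52)

for year, total in yearly.items():
    monthly, weekly = averages(total)
    print(f"{year}: monthly {monthly} (quoted {quoted_monthly[year]}), "
          f"weekly {weekly} (quoted {quoted_weekly[year]})")
```

The small differences (for example the weekly figure for 2003) suggest the published averages used a slightly different rounding or period length.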

Project Gutenberg of Canada (PGC) was founded on July 1, 2007 (Canada Day) by Michael Shepard and David Jones, and Distributed Proofreaders of Canada (DPC) started production in December 2007. There were 100 books in March 2008, with books in English, French and Italian.

The combined Project Gutenberg projects had produced a total of 26,161 titles as of 2007.

Project Gutenberg sent out 15 million books via snail mail in 2007, in the form of CDs and DVDs. Dated July 2006, the latest DVD included 17,000 books. Since 2005, CD and DVD files have also been periodically generated as ISO files to be downloaded and used to make a CD or DVD using a CD or DVD writer.

As for volunteers, Distributed Proofreaders (DP), which started production in October 2000, had over 52,000 volunteers in January 2008. DP had processed 11,934 books since its beginnings. Distributed Proofreaders of Europe (DP Europe), which started production in December 2003, had over 1,500 volunteers in January 2008. Distributed Proofreaders Canada (DPC), which started production in December 2007, had over 250 volunteers in January 2008.

Project Gutenberg reached the landmark of 25,000 books in April 2008. eBook number 25000 was English Book Collectors, by William Younger Fletcher (published in 1902). On April 21, 2008, out of a total of 25,004 books, the main languages were English (21,475 books), French (1,168 books), German (530 books), Finnish (433 books), Dutch (326 books), Portuguese (217 books), Chinese (196 books), Spanish (180 books), Italian (128 books), Latin (55 books) and Tagalog (54 books). There were also books in Esperanto (45 books), Swedish (40 books), Danish (20 books), Catalan (19 books), Welsh (10 books), Norwegian (10 books), Russian (7 books), Icelandic (7 books), Hungarian (7 books), Middle English (6 books), Greek (6 books) and Bulgarian (6 books).


Whether digitized years ago or now, all the books are digitized in 7-bit plain
ASCII (American Standard Code for Information Interchange), called Plain Vanilla
ASCII. Used since the beginnings of computing, it is the set of unaccented
characters present on a standard