Language: English
Project Gutenberg (1971-2008)

Publisher: Project Gutenberg
Proofreaders became an official Project Gutenberg site. In May 2006, Distributed Proofreaders became a separate entity and continues to maintain a strong relationship with Project Gutenberg.

Volunteers don't have a quota to fill, but it is recommended that they do a page a day if possible. It doesn't seem like much, but with hundreds of volunteers it really adds up. In 2003, about 250-300 people were working each day all over the world, producing a daily total of 2,500-3,000 pages, the equivalent of about two pages a minute. In 2004, an average of 300-400 proofreaders participated each day and finished 4,000-7,000 pages per day, the equivalent of roughly four pages a minute. The number of books processed through Distributed Proofreaders has grown fast, with a total of 3,000 books in February 2004, 5,000 books in October 2004, 7,000 books in May 2005, 8,000 books in February 2006 and 10,000 books in March 2007. By December 2007, five books were being produced per day, with 52,000 volunteers.
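The pages-per-minute equivalents quoted above come from dividing a daily page count by the 1,440 minutes in a day. Here is a minimal Python sketch of that arithmetic, using only the daily ranges given in the text. The function name `pages_per_minute` is ours, chosen for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes


def pages_per_minute(pages_per_day: float) -> float:
    """Convert a daily page count into an average per-minute rate."""
    return pages_per_day / MINUTES_PER_DAY


# Daily page ranges (low, high) as quoted in the text.
daily_ranges = {2003: (2_500, 3_000), 2004: (4_000, 7_000)}

for year, (low, high) in daily_ranges.items():
    print(
        f"{year}: {pages_per_minute(low):.1f} to "
        f"{pages_per_minute(high):.1f} pages per minute"
    )
```

The lower bounds come out a little under the rounded figures (about 1.7 and 2.8 pages a minute), so "two" and "four" pages a minute are best read as round approximations.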

From the website one can access a program that lets several proofreaders work on the same book at the same time, each on different pages. This significantly speeds up the proofreading process. Volunteers register and receive detailed instructions. For example, bold, italic or underlined words and footnotes are always handled the same way in every book. A discussion forum allows them to ask questions or seek help at any time. A project manager oversees the progress of a particular book through its different steps on the website.

The website gives a full list of the books that are: a) completed, i.e. processed through the site and posted to Project Gutenberg; b) in progress, i.e. processed through the site but not yet posted, because they are still going through final proofreading and assembly; c) being proofread, i.e. currently being processed. On August 3, 2005, 7,639 books were completed, 1,250 books were in progress and 831 books were being proofread. On May 1, 2008, 13,039 books were completed, 1,840 books were in progress and 1,000 books were being proofread.

Each time a volunteer (proofreader) goes to the website, he or she chooses a book, any book. One page of the book appears in two forms side by side: the scanned image of the page and the text extracted from that image by OCR software. The proofreader can easily compare both versions, note the differences and fix them. OCR is usually 99% accurate, which makes for about 10 corrections a page. The proofreader saves each page as it is completed and can then either stop or go on to another page. Each book is proofread twice, the second time only by experienced proofreaders. All the pages of the book are then formatted, combined and assembled by post-processors to make an eBook. The eBook is then ready to be posted with an index entry (title, subtitle, author, eBook number and character set) for the database. After the release, indexers continue the cataloging process (author's dates of birth and death, Library of Congress classification, etc.).
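The figure of about 10 corrections a page follows from the 99% accuracy rate only under an assumption about page length, which the text does not give. The sketch below assumes roughly 1,000 characters per page, a hypothetical value chosen because it reproduces the quoted figure. It also assumes that accuracy is measured per character.

```python
def expected_corrections(chars_per_page: int, accuracy: float) -> float:
    """Expected number of misrecognized characters on one page."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return chars_per_page * (1.0 - accuracy)


ASSUMED_CHARS_PER_PAGE = 1_000  # assumption for illustration, not from the text
OCR_ACCURACY = 0.99             # "usually 99% accurate"

corrections = expected_corrections(ASSUMED_CHARS_PER_PAGE, OCR_ACCURACY)
print(f"About {round(corrections)} corrections per page")
```

A denser page would need proportionally more corrections at the same accuracy: under these assumptions, a 2,000-character page would need about 20.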

Volunteers can also work independently, after contacting Project Gutenberg directly, by keying in a book they particularly like using any text editor or word processor. They can also scan it, convert it into text using OCR software, and then make corrections by comparing the text with the original. In each case, someone else will proofread it. They can use ASCII or any other format. Everybody is welcome, whatever the method and whatever the format.

New volunteers are most welcome too at Distributed Proofreaders (DP), Distributed Proofreaders Europe (DP Europe) and Distributed Proofreaders Canada (DPC). Any volunteer anywhere is welcome, for any language. There is a lot to do. As stated on the websites, "Remember that there is no commitment expected on this site. Proofread as often or as seldom as you like, and as many or as few pages as you like. We encourage people to do 'a page a day', but it's entirely up to you! We hope you will join us in our mission of 'preserving the literary history of the world in a freely available form for everyone to use'."


What about languages? First of all, Project Gutenberg's books are mostly in English. As it has been based in the United States since 1971, it has focused on the English-speaking community in the country and worldwide. Multilingualism started in 1997.

In October 1997, Michael Hart expressed his intention to include books in other languages. At the beginning of 1998, the catalog had a few titles in French (10 titles), German, Italian, Spanish and Latin. In July 1999, Michael wrote: "I am publishing in one new language per month right now, and will continue as long as possible."

In February 2004, there were works in 25 languages. In July 2005, there were works in 42 languages, including Iroquoian, Sanskrit and the Mayan languages. The seven main languages (those with more than 50 books) were English, French, German, Finnish, Dutch, Spanish and Chinese. In December 2006, there were books in 50 languages. There were ten main languages: the seven above plus Italian, Portuguese and Tagalog. In April 2008, there were books in 55 languages, with eleven main languages: the ten above plus Latin. Esperanto was not far behind with 45 books, and Swedish followed with 40 books.

French is the second main language after English. On February 13, 2004, there were 181 books in French (out of a total of 11,340 books). On May 16, 2005, there were 547 books in French (out of a total of 15,505 books). The number tripled in 15 months. On July 27, 2005, there were 577 books in French (out of a total of 16,800 books). On December 16, 2006, there were 966 books in French (out of a total of 19,996 books). On April 21, 2008, there were 1,168 books in French (out of a total of 25,004 books). The number of French books is expected to rise significantly in a few years, once Project Gutenberg Europe is running at full speed.
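To put these counts in perspective, the short Python sketch below computes the share of French books at each date and checks the "tripled in 15 months" claim. The figures are the ones quoted above. The names `snapshots` and `french_share` are ours, chosen for illustration.

```python
# (date, books in French, total books), as quoted in the text
snapshots = [
    ("2004-02-13", 181, 11_340),
    ("2005-05-16", 547, 15_505),
    ("2005-07-27", 577, 16_800),
    ("2006-12-16", 966, 19_996),
    ("2008-04-21", 1_168, 25_004),
]


def french_share(french: int, total: int) -> float:
    """Percentage of the collection that is in French."""
    return 100.0 * french / total


for date, french, total in snapshots:
    print(f"{date}: {french} of {total} books ({french_share(french, total):.1f}%)")

# Growth between the first two snapshots (February 2004 to May 2005)
growth = snapshots[1][1] / snapshots[0][1]
print(f"Growth factor over the first 15 months: {growth:.2f}")
```

By this calculation the French share rose from about 1.6% of the collection in February 2004 to about 4.7% in April 2008, and the growth factor over the first 15 months is just over 3.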

What were the first books posted in French? They were six novels by Stendhal and two novels by Jules Verne, all released in early 1997. The six novels by Stendhal were: L'Abbesse de Castro, Les Cenci, La Chartreuse de Parme, La Duchesse de Palliano, Le Rouge et le Noir and Vittoria Accoramboni. The two novels by Jules Verne were: De la terre à la lune and Le tour du monde en quatre-vingts jours. In early 1997, Project Gutenberg did not yet offer an English version of any of Stendhal's writings, whereas three of Jules Verne's novels were available in English: 20,000 Leagues Under the Seas (original title: Vingt mille lieues sous les mers), posted in September 1994; Around the World in 80 Days (original title: Le tour du monde en quatre-vingts jours), posted in January 1994; and From the Earth to the Moon (original title: De la terre à la lune), posted in September 1993. Stendhal and Jules Verne were followed by Edmond Rostand with Cyrano de Bergerac, posted in March 1998.

In late 1999, the "Top 20" (the 20 most downloaded authors) included Jules Verne at number 11 and Emile Zola at number 16. They still rank very well in the present "Top 100".

As a side remark, the first "images" ever made available by Project Gutenberg were French Cave Paintings, posted in April 1995, with an XHTML version posted in November 2000. This book contains four photos of paleolithic paintings found in a cave located in Ardèche, a region of south-eastern France. These photos, which are copyrighted, were made available to Project Gutenberg for everyone to enjoy, thanks to Jean Clottes, a French general curator for cultural heritage (conservateur général du patrimoine).

In 2004, multilingualism became one of the priorities of Project Gutenberg, along with internationalization. Michael Hart went off to Europe, with stops in Paris, Brussels and Belgrade.