Project Gutenberg (1971-2005)

Language: English
Publisher: Project Gutenberg

and the background color. Another eagerly awaited conversion is the translation of a book from one language to another by machine translation software. This may become possible in a few years, once machine translation reaches 99% accuracy.

5. DISTRIBUTED PROOFREADERS, TO HANDLE SHARED PROOFREADING

The main "leap forward" of Project Gutenberg in the last few years is due to Distributed Proofreaders.

Distributed Proofreaders was conceived in 2000 by Charles Franks to help digitize public domain books. Originally meant to assist Project Gutenberg with shared proofreading, Distributed Proofreaders became the main source of Project Gutenberg eBooks. In 2002, Distributed Proofreaders became an official Project Gutenberg site.

The number of eBooks processed through Distributed Proofreaders has grown fast, with a total of 3,000 eBooks in February 2004, 5,000 eBooks in October 2004 and 7,000 eBooks in May 2005. On August 3, 2005, 7,639 books were complete (processed through the site and posted to Project Gutenberg), 1,250 books were in progress (processed through the site but not yet posted, because they were still going through final proofreading and assembly), and 831 books were being proofread (currently being processed).

From the website, volunteers can access a program that lets several proofreaders work on the same book at the same time, each on different pages. This significantly speeds up the proofreading process. Volunteers register and receive detailed instructions. For example, bold, italic and underlined words, as well as footnotes, are always treated the same way in every eBook. A discussion forum allows them to ask questions or seek help at any time. A project manager oversees the progress of a particular book through its different steps on the website.

Each time proofreaders go to the website, they choose the book they want. One page of the book appears in two forms side by side: the scanned image of one page and the text from that image (as produced by OCR software). The proofreader can easily compare both versions, note the differences and fix them. OCR is usually 99% accurate, which makes for about 10 corrections a page. The proofreader saves each page as it is completed and can then either stop work or do another. The books are proofread twice, and the second time only by experienced proofreaders. All the pages of the book are then formatted, combined and assembled by post-processors to make an eBook. (For more detailed information, check the FAQ Central.) The eBook is now ready to be posted with an index entry (title, subtitle, author, eBook number and character set) for the database. Indexers go on with the cataloguing process (author's dates of birth and death, Library of Congress classification, etc.) after the release.
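The figure of about 10 corrections a page can be checked with simple arithmetic. The sketch below assumes that the 99% accuracy is measured per character and that a page holds roughly 1,000 characters; both are assumptions made for illustration, not figures given by Distributed Proofreaders, and the function name is hypothetical.

```python
def expected_corrections(chars_per_page: int, accuracy: float) -> float:
    """Expected number of misread characters on one page.

    Assumes accuracy is a per-character rate (an assumption, not stated
    in the text).
    """
    return chars_per_page * (1.0 - accuracy)

# Hypothetical page size: about 1,000 characters.
print(f"{expected_corrections(1_000, 0.99):.0f} corrections a page")
```

Under the same assumption, a denser page of 2,000 characters would need about twice as many corrections.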

Volunteers don't have a quota to fill, but it is recommended they do a page a day if possible. It doesn't seem like much, but with hundreds of volunteers it really adds up. In 2003, about 250-300 people were working each day all over the world, producing a daily total of 2,500-3,000 pages, the equivalent of two pages a minute. In 2004, an average of 300-400 proofreaders participated each day, finishing 4,000-7,000 pages per day, the equivalent of four pages a minute.
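The per-minute figures follow from dividing the daily page totals by the 1,440 minutes in a day. Here is a minimal sketch of that arithmetic; the function name `pages_per_minute` is illustrative only.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440

def pages_per_minute(pages_per_day: float) -> float:
    """Convert a daily page count into an average per-minute rate."""
    return pages_per_day / MINUTES_PER_DAY

# Daily totals quoted in the text (low and high end of each range).
daily_totals = {2003: (2_500, 3_000), 2004: (4_000, 7_000)}

for year, (low, high) in daily_totals.items():
    print(f"{year}: {pages_per_minute(low):.1f} to {pages_per_minute(high):.1f} pages a minute")
```

The 2003 range works out to roughly two pages a minute. The 2004 range spans roughly three to five, so "four pages a minute" is best read as a midpoint.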

Volunteers can also work independently, after contacting Project Gutenberg directly, by keying in a book they particularly like using any text editor or word processor. They can also scan it and convert it into text using OCR software, and then make corrections by comparing it with the original. In each case, someone else will proofread it. They can use ASCII or any other format. Everybody is welcome, whatever the method and whatever the format.

New volunteers are most welcome too at Distributed Proofreaders (DP-INT) and Distributed Proofreaders Europe (DP Europe). Any volunteer anywhere is welcome, for any language. There is a lot to do. As stated on both websites, "Remember that there is no commitment expected on this site. Proofread as often or as seldom as you like, and as many or as few pages as you like. We encourage people to do 'a page a day', but it's entirely up to you! We hope you will join us in our mission of 'preserving the literary history of the world in a freely available form for everyone to use'."

6. EBOOKS IN MORE AND MORE LANGUAGES

What about languages?

Initially, the eBooks were mostly in English. As Project Gutenberg is based in the United States, it first focused on the English-speaking community in the country and worldwide.

In October 1997, Michael Hart expressed his intention to expand the publishing of eBooks in other languages. At the beginning of 1998, the catalog had a few titles in French (10 titles), German, Italian, Spanish and Latin. In July 1999, Michael wrote: "I am publishing in one new language per month right now, and will continue as long as possible."

In early 2004, there were works in 25 languages. In July 2005, there were works in 42 languages, including Iroquoian, Sanskrit and the Mayan languages. The seven "main" languages were: English (with 14,548 books on July 27, 2005), French (577 books), German (349 books), Finnish (218 books), Dutch (130 books), Spanish (103 books) and Chinese (69 books).

Let us take French as an example. On February 13, 2004, there were 181 eBooks in French (out of a total of 11,340 eBooks). On May 16, 2005, there were 547 eBooks in French (out of 15,505 eBooks). The number tripled in 15 months. This number should rise significantly during the next few years, notably with Project Gutenberg Europe (launched in June 2005).
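The "tripled in 15 months" claim can be verified from the two counts and dates given above. The sketch below also computes the French share of the whole collection, which the text does not state; the helper name `months_between` is illustrative.

```python
from datetime import date

def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months

# (French eBooks, total eBooks) on the two dates given in the text.
start, french_0, total_0 = date(2004, 2, 13), 181, 11_340
end, french_1, total_1 = date(2005, 5, 16), 547, 15_505

growth = french_1 / french_0
print(f"Months elapsed: {months_between(start, end)}")
print(f"Growth factor: {growth:.2f}")
print(f"French share: {french_0 / total_0:.1%} -> {french_1 / total_1:.1%}")
```

The growth factor is just over 3, and the French share of the collection roughly doubled over the same period, from about 1.6% to about 3.5%.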

What were the first eBooks posted in French? They were six novels by Stendhal and two novels by Jules Verne, all released in early 1997. The six novels by Stendhal were: L'Abbesse de Castro, Les Cenci, La Chartreuse de Parme, La Duchesse de Palliano, Le Rouge et le Noir and Vittoria Accoramboni. The two novels by Jules Verne were: De la terre à la lune and Le tour du monde en quatre-vingts jours. In early 1997, Project Gutenberg did not yet offer an English version of any of Stendhal's writings, but three of Jules Verne's novels were available in English: 20,000 Leagues Under the Seas (original title: Vingt mille lieues sous les mers), posted in September 1994; Around the World in 80 Days (original title: Le tour du monde en quatre-vingts jours), posted in January 1994; and From the Earth to the Moon (original title: De la terre à la lune), posted in September 1993. Stendhal and Jules Verne were followed by Edmond Rostand with Cyrano de Bergerac, posted in March 1998.

In late 1999, the "Top 20" (the 20 most downloaded authors) included Jules Verne at number 11 and Emile Zola at number 16. They still rank very well in the present "Top 100".

As a side remark, the first "images" ever made available by Project Gutenberg were French Cave Paintings, posted in April 1995, with an XHTML version posted in November 2000. This eBook contains four photos of paleolithic paintings found in a grotto located in Ardèche, a region of south-eastern France. These photos, which are copyrighted, were made available to Project Gutenberg for everyone to enjoy, thanks to Jean Clottes, a French general curator for cultural heritage (conservateur général du patrimoine).

Multilingualism is now one of the priorities of Project Gutenberg, like internationalization. In early 2004, Michael Hart went off to Europe, with stops in Paris, Brussels and Belgrade. He gave a lecture on February 12, 2004 at UNESCO (United Nations Educational, Scientific and Cultural Organization) headquarters in Paris. He chaired a discussion at the French National
