I’m sorry, I just can’t come up with the great posting titles the rest of you do.
The first book I looked for was Lux Mundi (1890), a collection of Anglo-Catholic theological essays edited by Charles Gore. My reason for doing so was practical, since Travis Brown and I are using scanned images from this book, fed through OCR tools like Tesseract and OCRopus, for the ActiveOCR project at MITH. I won’t say the book was chosen at random, but close to it. Travis wanted something from the late 19th century, and suggested that I search for everything in the Hathi Trust collection published in 1890.
However, the only other collection it appears in is Google Books, which rules it out for the purpose of this assignment.
Deciding to stick with the theme of 19th-century divines, I looked for John Henry Newman’s The Idea of a University, and found it on Project Gutenberg, the Internet Archive, Hathi Trust and Google Books.
As several others have noted, Project Gutenberg provides the most formats and the least provenance information. The book is available in HTML, EPUB, Kindle, PDF, Plucker, QiOO Mobile, Plain Text UTF-8 and TEI, all in addition, of course, to the Online Reader. Some of these formats seem a bit obscure to me: I had to look up Plucker (apparently an e-book reader for PalmOS devices) and QiOO (I’m guessing a reader for Android phones, since it’s Java-based, although they didn’t use the name Android). I fired up the oXygen editor to take a look at the TEI file, and it appears to be TEI (P5?) Lite with a Project Gutenberg-specific modified DTD. Although there are credits for the people responsible for preparing the files for Project Gutenberg, there is no information about which printed text(s) provide the basis for the electronic text.
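For anyone who wants to check the TEI version question without opening an editor, here is a minimal sketch in Python. It relies on one thing I am sure of: a TEI P5 document has a root element named TEI in the namespace http://www.tei-c.org/ns/1.0, while P4 and earlier use a root element named TEI.2 with no namespace. The function name guess_tei_version is my own invention, and the check says nothing about whether a file is TEI Lite or uses a customized DTD. I am also assuming the file parses with the standard library; a file that depends on entities defined only in an external DTD may raise a parse error.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_P5_NS = "http://www.tei-c.org/ns/1.0"


def guess_tei_version(xml_text: str) -> str:
    """Guess the TEI generation from the root element alone.

    Hypothetical helper for illustration; it looks only at the root tag.
    """
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag == f"{{{TEI_P5_NS}}}TEI":
        return "P5"
    if root.tag == "TEI.2":
        return "P4 or earlier"
    return "unknown"
```

To try it on a downloaded file, read the file's contents into a string and pass that string to guess_tei_version.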
I got 26 results when I searched the Internet Archive for The Idea of a University by Newman. One of these results was for the Project Gutenberg record, which offers the book in several formats not immediately visible on Project Gutenberg’s own page, including DAISY Digital Talking Book and DjVu (pronounced “déjà vu”; this is a format for scanned documents that its promoters, although I suspect few others, consider a competitor to image PDFs). There were also at least three results (one may have been a duplicate) from Google Books, digitized from the University of California, Harvard, and New York Public Libraries.
I chose to look in detail at one (26 was way too many) that was contributed by “Kelly – University of Toronto”. While my first reaction was that “Kelly” might be an individual, a Google search indicated that it is a reference to the John M. Kelly Library at the University of St. Michael’s College, a Catholic university that has an institutional relationship with the public University of Toronto. This version was available in Full Text, PDF, EPUB, Kindle, DAISY and DjVu formats. The document is in the public domain. There is no apparent way for users to report or correct errors. This is probably as good a place as any to note that I find the default online reader, which navigates through the text by “turning” pages, incredibly annoying. This is as misguided as the attempts of late 15th-century printers to recreate the look of manuscripts in printed texts.
(This is as far as I’m going to be able to get before class, but I will update the post later with the information on the Hathi Trust and Google Books sites.)