The Library in the New Age

The letters of many other philosophers, from Locke and Bayle to Bentham and Bernardin de Saint-Pierre, will be integrated into this database, so that scholars will be able to trace references to individuals, books, and ideas throughout the entire network of correspondence that undergirded the Enlightenment. Many other such projects, notably American Memory sponsored by the Library of Congress^1 and the Valley of the Shadow created at the University of Virginia,^2 have demonstrated the feasibility and usefulness of databases on this scale. But their success does not prove that Google Book Search, the largest undertaking of them all, will make research libraries obsolete. On the contrary, Google will make them more important than ever. To support this view, I would like to organize my argument around eight points.

  1. According to the most utopian claim of the Googlers, Google can put virtually all printed books on-line. That claim is misleading, and it raises the danger of creating false consciousness, because it may lull us into neglecting our libraries. What percentage of the books in the United States (never mind the rest of the world) will be digitized by Google: 75 percent? 50 percent? 25 percent? Even if the figure is 90 percent, the residual, nondigitized books could be important. I recently discovered an extraordinary libertine novel, Les Bohémiens, by an unknown author, the marquis de Pelleport, who wrote it in the Bastille at the same time that the marquis de Sade was writing his novels in a nearby cell. I think that Pelleport’s book, published in 1790, is far better than anything Sade produced; and whatever its aesthetic merits, it reveals a great deal about the condition of writers in pre-Revolutionary France. Yet only six copies of it exist, as far as I can tell, none of them available on the Internet.^3

    If Google missed this book, and other books like it, the researcher who relied on Google would never be able to locate certain works of great importance. The criteria of importance change from generation to generation, so we cannot know what will matter to our descendants. They may learn a lot from studying our Harlequin novels or computer manuals or telephone books. Literary scholars and historians today depend heavily on research in almanacs, chapbooks, and other kinds of “popular” literature, yet few of those works from the seventeenth and eighteenth centuries have survived. They were printed on cheap paper, sold in flimsy covers, read to pieces, and ignored by collectors and librarians who did not consider them “literature.” A researcher in Trinity College, Dublin recently discovered a drawer full of forgotten ballad books, each one the only copy in existence, each priceless in the eyes of the modern scholar, though it had seemed worthless two centuries ago.

  2. Although Google pursued an intelligent strategy by signing up five great libraries, their combined holdings will not come close to exhausting the stock of books in the United States. Contrary to what one might expect, there is little redundancy in the holdings of the five libraries: 60 percent of the books being digitized by Google exist in only one of them. There are about 543 million volumes in the research libraries of the United States. Google reportedly set its initial goal of digitizing at 15 million. As Google signs up more libraries—at last count, twenty-eight are participating in Google Book Search—the representativeness of its digitized database will improve. But it has not yet ventured into special collections, where the rarest works are to be found. And of course the totality of world literature—all the books in all the languages of the world—lies far beyond Google’s capacity to digitize.

  3. Although it is to be hoped that the publishers, authors, and Google will settle their dispute, it is difficult to see how copyright will cease to pose a problem. According to the copyright law of 1976 and the copyright extension law of 1998, most books published after 1923 are currently covered by copyright, and copyright now extends to the life of the author plus seventy years. For books in the public domain, Google probably will allow readers to view the full text and print every page. For books under copyright, however, Google will probably display only a few lines at a time, which it claims is legal under fair use.

    Google may persuade the publishers and authors to surrender their claims to books published between 1923 and the recent past, but will it get them to modify their copyrights in the present and future? In 2006, 291,920 new titles were published in the United States, and the number of new books in print has increased nearly every year for the last decade, despite the spread of electronic publishing. How can Google keep up with current production while at the same time digitizing all the books accumulated over the centuries? Better to increase the acquisitions of our research libraries than to trust Google to preserve future books for the benefit of future generations. Google defines its mission as the communication of information—right now, today; it does not commit itself to conserving texts indefinitely.

  4. Companies decline rapidly in the fast-changing environment of electronic technology. Google may disappear or be eclipsed by an even greater technology, which could make its database as outdated and inaccessible as many of our old floppy disks and CD-ROMs. Electronic enterprises come and go. Research libraries last for centuries. Better to fortify them than to declare them obsolete, because obsolescence is built into the electronic media.

  5. Google will make mistakes. Despite its concern for quality and quality control, it will miss books, skip pages, blur images, and fail in many ways to reproduce texts perfectly. Once we believed that microfilm would solve the problem of preserving texts. Now we know better.

  6. As in the case of microfilm, there is no guarantee that Google’s copies will last. Bits become degraded over time. Documents may get lost in cyberspace, owing to the obsolescence of the medium in which they are encoded. Hardware and software become extinct at a distressing rate. Unless the vexatious problem of digital preservation is solved, all texts “born digital” belong to an endangered species. The obsession with developing new media has inhibited efforts to preserve the old. We have lost 80 percent of all silent films and 50 percent of all films made before World War II. Nothing preserves texts better than ink embedded in paper, especially paper manufactured before the nineteenth century, except texts written on parchment or engraved in stone. The best preservation system ever invented was the old-fashioned, pre-modern book.

  7. Google plans to digitize many versions of each book, taking whatever it gets as the copies appear, assembly-line fashion, from the shelves; but will it make all of them available? If so, which one will it put at the top of its search list? Ordinary readers could get lost while searching among thousands of different editions of Shakespeare’s plays, so they will depend on the editions that Google makes most easily accessible. Will Google determine its relevance ranking of books in the same way that it ranks references to everything else, from toothpaste to movie stars? It now has a secret algorithm to rank Web pages according to the frequency of use among the pages linked to them, and presumably it will come up with some such algorithm in order to rank the demand for books. But nothing suggests that it will take account of the standards prescribed by bibliographers, such as the first edition to appear in print or the edition that corresponds most closely to the expressed intention of the author.

    Google employs hundreds, perhaps thousands, of engineers but, as far as I know, not a single bibliographer. Its innocence of any visible concern for bibliography is particularly regrettable in that most texts, as I have just argued, were unstable throughout most of the history of printing. No single copy of an eighteenth-century best-seller will do justice to the endless variety of editions. Serious scholars will have to study and compare many editions, in the original versions, not in the digitized reproductions that Google will sort out according to criteria that probably will have nothing to do with bibliographical scholarship.

  8. Even if the digitized image on the computer screen is accurate, it will fail to capture crucial aspects of a book. For example, size. The experience of reading a small duodecimo, designed to be held easily in one hand, differs considerably from that of reading a heavy folio propped up on a book stand. It is important to get the feel of a book—the texture of its paper, the quality of its printing, the nature of its binding. Its physical aspects provide clues about its existence as an element in a social and economic system; and if it contains margin notes, it can reveal a great deal about its place in the intellectual life of its readers.

Books also give off special smells. According to a recent survey of French students, 43 percent consider smell to be one of the most important qualities of printed books—so important that they resist buying odorless electronic books. CaféScribe, a French on-line publisher, is trying to counteract that reaction by giving its customers a sticker that will give off a fusty, bookish smell when it is attached to their computers.

When I read an old book, I hold its pages up to the light and often find among the fibers of the paper little circles made by drops from the hand of the vatman as he made the sheet—or bits of shirts and petticoats that failed to be ground up adequately during the preparation of the pulp. I once found a fingerprint of a pressman enclosed in the binding of an eighteenth-century Encyclopédie—testimony to tricks in the trade of printers, who sometimes spread too much ink on the type in order to make it easier to get an impression by pulling the bar of the press.

I realize, however, that considerations of “feel” and “smell” may seem to undercut my argument. Most readers care about the text, not the physical medium in which it is embedded; and by indulging my fascination with print and paper, I may expose myself to accusations of romanticizing or of reacting like an old-fashioned, ultra-bookish scholar who wants nothing more than to retreat into a rare book room. I plead guilty. I love rare book rooms, even the kind that make you put on gloves before handling their treasures. Rare book rooms are a vital part of research libraries, the part that is most inaccessible to Google. But libraries also provide places for ordinary readers to immerse themselves in books, quiet places in comfortable settings, where the codex can be appreciated in all its individuality.

In fact, the strongest argument for the old-fashioned book is its effectiveness for ordinary readers. Thanks to Google, scholars are able to search, navigate, harvest, mine, deep link, and crawl (the terms vary along with the technology) through millions of Web sites and electronic texts. At the same time, anyone in search of a good read can pick up a printed volume and thumb through it at ease, enjoying the magic of words as ink on paper. No computer screen gives satisfaction like the printed page. But the Internet delivers data that can be transformed into a classical codex. It already has made print-on-demand a thriving industry, and it promises to make books available from computers that will operate like ATM machines: log in, order electronically, and out comes a printed and bound volume. Perhaps someday a text on a hand-held screen will please the eye as thoroughly as a page of a codex produced two thousand years ago.

Meanwhile, I say: shore up the library. Stock it with printed matter. Reinforce its reading rooms. But don’t think of it as a warehouse or a museum. While dispensing books, most research libraries operate as nerve centers for transmitting electronic impulses. They acquire data sets, maintain digital repositories, provide access to e-journals, and orchestrate information systems that reach deep into laboratories as well as studies. Many of them are sharing their intellectual wealth with the rest of the world by permitting Google to digitize their printed collections. Therefore, I also say: long live Google, but don’t count on it living long enough to replace that venerable building with the Corinthian columns. As a citadel of learning and as a platform for adventure on the Internet, the research library still deserves to stand at the center of the campus, preserving the past and accumulating energy for the future.


  1. It is, according to the site, “a digital record of American history and creativity,” including sound recordings, prints, maps, and many other images.

  2. An archive of letters, diaries, official records, periodicals, and images documenting the life of two communities (one Northern, one Southern) two hundred miles apart in the Shenandoah Valley during the years 1859–1870.
