The expanding jumble of art, science, metaphysics, practical knowledge, merchandise, gossip, and other trivia stored electronically on the World Wide Web is directly descended from the unprocessed babble transmitted haphazardly by word of mouth and from place to place from which our ancestors forged the wisdom of our species. For millennia this babble had been held in tribal memory, in languages and cultures long forgotten, until the exigencies of burgeoning commerce some six thousand years ago—a recent event in the long career of Homo sapiens—compelled the invention of written language, the sine qua non of today’s documented world including the Web itself.
The invention of electronic memory during World War II, and of the World Wide Web a mere seventeen years ago, originally as a way for scientists to communicate with distant colleagues, is a further—perhaps the ultimate—evolution of the momentous transition from collective memory dependent largely on mnemonic verse to prosaic inscription on clay, stone, and paper. With these primitive tools human beings were at last able to record, in language of great beauty and profound understanding, the lore and wisdom accumulated during our long prehistory. What further triumphs of the human spirit may be shaped from the World Wide Web, should our species survive its current folly, are beyond imagining.
In 1998 two Stanford graduate students, Larry Page and Sergey Brin, founded Google.com, a search engine that uses a better technology than had previously existed for indexing and retrieving information from the immense miscellany of the World Wide Web and for ranking the Web sites that contain this information according to their relevance to particular queries based on the number of links from the rest of the Internet to a given item. This PageRank system transformed the Web from its original purpose as a scientists’ grapevine, and from the random babble it soon became, into a searchable resource providing factual data of variable quality to millions of users. And once again it was the exigencies of commerce that transformed Google itself from an ingenious search technology without a business plan to a hugely profitable enterprise offering a variety of services including e-mail, news, video, maps, and its current, expensive, and utterly heroic, if not quixotic, effort to digitize the public domain contents of the books and other holdings of major libraries. This new program would provide users wherever in the world Internet connections exist access to millions of titles while enabling libraries themselves to serve millions of users without adding a foot of shelf space or incurring a penny of delivery expense.
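For readers curious how ranking by inbound links might work in practice, the following is a minimal sketch of the idea as it is commonly described in published accounts, not Google's actual system. The four-page toy web, the function name `pagerank`, the damping value of 0.85, and the fixed number of iterations are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by the links pointing at them.

    `links` maps each page to the list of pages it links to.
    Every name and parameter here is illustrative.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Each page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A hypothetical four-page web: three pages link to "library".
toy_web = {
    "library": ["bookstore"],
    "bookstore": ["library"],
    "blog": ["library", "bookstore"],
    "archive": ["library"],
}

ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
print(best)
```

In this toy web the page with the most inbound links comes out on top, which is the intuition the paragraph above describes: a link is treated as a vote, and votes from well-regarded pages count for more.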
Spurred by Google’s initiative and by the lower costs, higher profits, and immense reach of unmediated digital distribution, book publishers and other copyright holders must at last overcome their historic inertia and agree, like music publishers, to market their proprietary titles in digital form either to be read on line or, more likely, to be printed on demand at point of sale, in either case for a fee equal to the publishers’ normal costs and profit and the authors’ contractual royalty, thus for the first time in human history creating the theoretical possibility that every book ever printed in whatever language will be available to everyone on earth with access to the Internet.
Not everyone welcomes the revolution wrought by Google. Jean-Noël Jeanneney, director of the Bibliothèque Nationale, worries in Google and the Myth of Universal Knowledge that national libraries, including his own, will suffer under Google’s worldwide dominance, but nothing prevents the Bibliothèque Nationale and its counterparts from digitizing their own collections or permitting Google to do it for them as Oxford’s Bodleian Library has done. Chris Anderson, the editor of Wired, has expanded his influential essay “The Long Tail” into a best-selling book in which he shows that the vast “shelf space” of the Web permits virtually limitless digital content whose variety creates heretofore unexpected demand for relatively obscure or specialized items in a heterogeneous marketplace whose aggregate audience with its multiform interests far exceeds that for best sellers, whose current dominance reflects today’s highly centralized retail structure dependent on quick turnover of largely undifferentiated items.
The radical decentralization of the digital marketplace has already been demonstrated in the music industry and preliminary evidence suggests that greater choice will, as Chris Anderson foresees, create greater demand for a wide range of books as well. An obvious example is books in Spanish to serve the 40 million Hispanics now living in the United States and poorly served by sparse retailers. According to Mark Sandler of the University of Michigan Library, in an essay in Libraries and Google, an experiment by the library involving the digitization of 10,000 “low use” monographs offered on the Web produced “between 500,000 and one million hits per month. In the past, these works were accessible,” Sandler writes,
to a base population of 40,000 students, faculty, and staff. That’s about four readers for each book included in the project. When electronic versions of these works were made accessible to the entire world, suddenly 40,000 potential readers became 4 billion, and the odds of consumer interest jumped from 4:1 to 400,000:1. Add to that the extent to which Web access overcomes the impediments of physical delivery—request a book (sight unseen) from storage, wait twenty-four hours for delivery, come physically to the library to pick it up, etc. Electronically, we’re talking about instant gratification of a one in a million need. This is a service dream come true for libraries and library users, especially those without immediate access to a great research library collection.
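Sandler's ratios follow from simple division, which the short calculation below reproduces. The figures are his (10,000 monographs, 40,000 campus readers, and a round 4 billion for the world audience); the variable names are merely illustrative.

```python
# Figures quoted from Sandler; variable names are illustrative.
books = 10_000
campus_readers = 40_000
world_readers = 4_000_000_000

# Potential readers per digitized book, before and after Web access.
readers_per_book_campus = campus_readers // books
readers_per_book_world = world_readers // books

print(f"{readers_per_book_campus}:1")   # 4:1
print(f"{readers_per_book_world:,}:1")  # 400,000:1
```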
Fear of a worldwide Google monopoly may therefore be unfounded as rivals add specialized segments to Google’s own long and lengthening tail.
Google was not the first search engine to filter the contents of the Web but its PageRank innovation has become the most popular way to arrange Web sites on a given subject according to their possible relevance to specific queries. Google’s inventors were also not the first to grasp the commercial implications of a technology that brings millions of searchers to specific topics and thus guides self-selected customers to a vast range of goods and services; but Google’s unique technology provides the most efficient means for juxtaposing ads with appropriate search results. Hundreds of thousands of advertisers, most of them small businesses, bidding at auction for placement adjacent to Web sites of interest to their potential customers, now pay Google for each time a searcher clicks through to their site, making Google one of the richest corporations in the world: in effect an interactive yellow pages of infinite variety serving a radically democratized world market.
The self-proclaimed goal of Google’s idealistic founders is to practice virtue, which is reflected in the company’s unofficial motto, “don’t be evil.” The confrontation of founders who wish to do only good with the complex reality of their astonishing commercial achievement is an issue of biblical scope which calls to mind the expulsion, naked and trembling, of our ancestral parents from prelapsarian Eden into a world where choice is obligatory and error inevitable, a blessing and a burden upon themselves and what Milton called, with mixed feelings, their hapless seed.
We may share private information…[if] we conclude that we are required by law or [believe] that access, preservation or disclosure of such information is reasonably necessary to protect the rights, property or safety of Google, its users or the public.
“In other words,” writes John Battelle, the author of The Search, an informative, enthusiastic, but not uncritical account of Google’s extraordinary achievement, “if Google decides that tracking and acting upon your private information is in its best interest, it can and it will.”
According to David Vise in The Google Story, another excellent history to place alongside Battelle’s The Search, the idea for Google Book Search occurred to Larry Page, Google’s co-founder, when he was still a Ph.D. candidate at Stanford and recalled his difficulty as a high school student in finding the manuals he needed for assembling electronic gadgets. In graduate school he encountered a more severe version of the problem. “Right now,” he said, “it is really hard for scholars to work outside their area of expertise because of the physical limitations of libraries.” What he envisioned was an electronic library loan system in which libraries would lend one another titles digitally rather than ship physical copies. From this practical insight grew Google Book Search with its commitment to digitize as many as 20 million public domain titles from the collections of major libraries and to challenge publishers of protected works by copying their authors’ property to permit allowable citations. How money is to be made from this vast project remains unclear and may have been a matter of indifference to the public-spirited Page when he conceived it, but sooner or later Google and its avatars will become not only the world’s multilingual library of libraries but a universal bookstore offering millions of titles to readers worldwide and monetization will follow, raising the theoretical possibility that every book ever printed in whatever language may indeed be accessed wherever Internet connections exist.
Page’s original conception for Google Book Search seems to have been that books, like the manuals he needed in high school, are data mines which users can search as they search the Web. But most books, unlike manuals, dictionaries, almanacs, cookbooks, scholarly journals, student trots, and so on, cannot be adequately represented by Googling such subjects as Achilles/wrath or Othello/jealousy or Ahab/whales. The Iliad, the plays of Shakespeare, Moby-Dick are themselves information to be read and pondered in their entirety. As digitization and its long tail adjust to the norms of human nature this misconception will cure itself as will the related error that books transmitted electronically will necessarily be read on electronic devices. Only those who have not read the Iliad or Moby-Dick, or Bleak House or Swann’s Way or The Origin of Species, will entertain this improbability. Until human beings themselves evolve as electronic receivers, readers will select such books as these—the embodiment of civilizations—as files from the World Wide Web, whence they will be transmitted either to a personal computer and printed out—a cumbersome procedure resulting in a stack of unbound sheets—or, much more satisfactorily, to a nearby machine not much bigger than an ATM which will automatically print, bind, and trim requested titles on demand that are indistinguishable from factory-made books, to be read as books have been read for centuries.
Meanwhile Google, together with Project Gutenberg, the Open Content Alliance, and similar programs, has turned a new page in the history of civilizations leaving to us the privilege and the burden of carrying the story further. As part of this effort, On Demand Books, a company in which I have an interest, has installed in the World Bank bookstore in Washington, D.C., an experimental version of a machine such as I have just described, one that receives a digital file and automatically prints and binds on demand a library-quality paperback at low cost, within minutes and with minimal human intervention—an ATM for books. A second experimental machine has been sent to the Alexandrina Library in Egypt and will soon be printing books in Arabic. A newer version will be installed later this year or early next year in the New York Public Library.[2]
[1] Wikipedia, unlike Google, Yahoo, and Microsoft, refused to restrict its content, and for the last year has been banned in China.
[2] See Jason Epstein, "The Future of Books," Technology Review, January 2005.
‘Books @ Google’ November 30, 2006