
Google & the Future of Books

Fortunately, this picture of the hard facts of life in the world of learning is already going out of date. Biologists, chemists, and physicists no longer live in separate worlds; nor do historians, anthropologists, and literary scholars. The old map of the campus no longer corresponds to the activities of the professors and students. It is being redrawn everywhere, and in many places the interdisciplinary designs are turning into structures. The library remains at the heart of things, but it pumps nutrition throughout the university, and often to the farthest reaches of cyberspace, by means of electronic networks.

The eighteenth-century Republic of Letters had been transformed into a professional Republic of Learning, and it is now open to amateurs—amateurs in the best sense of the word, lovers of learning among the general citizenry. Openness is operating everywhere, thanks to “open access” repositories of digitized articles available free of charge, the Open Content Alliance, the Open Knowledge Commons, OpenCourseWare, the Internet Archive, and openly amateur enterprises like Wikipedia. The democratization of knowledge now seems to be at our fingertips. We can make the Enlightenment ideal come to life in reality.

At this point, you may suspect that I have swung from one American genre, the jeremiad, to another, utopian enthusiasm. It might be possible, I suppose, for the two to work together as a dialectic, were it not for the danger of commercialization. When businesses like Google look at libraries, they do not merely see temples of learning. They see potential assets or what they call “content,” ready to be mined. Built up over centuries at an enormous expenditure of money and labor, library collections can be digitized en masse at relatively little cost—millions of dollars, certainly, but little compared to the investment that went into them.

Libraries exist to promote a public good: “the encouragement of learning,” learning “Free To All.” Businesses exist in order to make money for their shareholders—and a good thing, too, for the public good depends on a profitable economy. Yet if we permit the commercialization of the content of our libraries, there is no getting around a fundamental contradiction. To digitize collections and sell the product in ways that fail to guarantee wide access would be to repeat the mistake that was made when publishers exploited the market for scholarly journals, but on a much greater scale, for it would turn the Internet into an instrument for privatizing knowledge that belongs in the public sphere. No invisible hand would intervene to correct the imbalance between the private and the public welfare. Only the public can do that, but who speaks for the public? Not the legislators of the Mickey Mouse Protection Act.

You cannot legislate Enlightenment, but you can set rules of the game to protect the public interest. Libraries represent the public good. They are not businesses, but they must cover their costs. They need a business plan. Think of the old motto of Con Edison when it had to tear up New York’s streets in order to get at the infrastructure beneath them: “Dig we must.” Libraries say, “Digitize we must.” But not on any terms. We must do it in the interest of the public, and that means holding the digitizers responsible to the citizenry.

It would be naive to identify the Internet with the Enlightenment. It has the potential to diffuse knowledge beyond anything imagined by Jefferson; but while it was being constructed, link by hyperlink, commercial interests did not sit idly on the sidelines. They want to control the game, to take it over, to own it. They compete among themselves, of course, but so ferociously that they kill each other off. Their struggle for survival is leading toward an oligopoly; and whoever may win, the victory could mean a defeat for the public good.

Don’t get me wrong. I know that businesses must be responsible to shareholders. I believe that authors are entitled to payment for their creative labor and that publishers deserve to make money from the value they add to the texts supplied by authors. I admire the wizardry of hardware, software, search engines, digitization, and algorithmic relevance ranking. I acknowledge the importance of copyright, although I think that Congress got it better in 1790 than in 1998.

But we, too, cannot sit on the sidelines, as if the market forces can be trusted to operate for the public good. We need to get engaged, to mix it up, and to win back the public’s rightful domain. When I say “we,” I mean we the people, we who created the Constitution and who should make the Enlightenment principles behind it inform the everyday realities of the information society. Yes, we must digitize. But more important, we must democratize. We must open access to our cultural heritage. How? By rewriting the rules of the game, by subordinating private interests to the public good, and by taking inspiration from the early republic in order to create a Digital Republic of Learning.

What provoked these jeremianic-utopian reflections? Google. Four years ago, Google began digitizing books from research libraries, providing full-text searching and making books in the public domain available on the Internet at no cost to the viewer. For example, it is now possible for anyone, anywhere to view and download a digital copy of the 1871 first edition of Middlemarch that is in the collection of the Bodleian Library at Oxford. Everyone profited, including Google, which collected revenue from some discreet advertising attached to the service, Google Book Search. Google also digitized an ever-increasing number of library books that were protected by copyright in order to provide search services that displayed small snippets of the text. In September and October 2005, a group of authors and publishers brought a class action suit against Google, alleging violation of copyright. Last October 28, after lengthy negotiations, the opposing parties announced agreement on a settlement, which is subject to approval by the US District Court for the Southern District of New York.2

The settlement creates an enterprise known as the Book Rights Registry to represent the interests of the copyright holders. Google will sell access to a gigantic data bank composed primarily of copyrighted, out-of-print books digitized from the research libraries. Colleges, universities, and other organizations will be able to subscribe by paying for an “institutional license” providing access to the data bank. A “public access license” will make this material available to public libraries, where Google will provide free viewing of the digitized books on one computer terminal. And individuals also will be able to access and print out digitized versions of the books by purchasing a “consumer license” from Google, which will cooperate with the registry for the distribution of all the revenue to copyright holders. Google will retain 37 percent, and the registry will distribute 63 percent among the rightsholders.

Meanwhile, Google will continue to make books in the public domain available for users to read, download, and print, free of charge. Of the seven million books that Google reportedly had digitized by November 2008, one million are works in the public domain; one million are in copyright and in print; and five million are in copyright but out of print. It is this last category that will furnish the bulk of the books to be made available through the institutional license.

Many of the in-copyright and in-print books will not be available in the data bank unless the copyright owners opt to include them. They will continue to be sold in the normal fashion as printed books and also could be marketed to individual customers as digitized copies, accessible through the consumer license for downloading and reading, perhaps eventually on e-book readers such as Amazon’s Kindle.

After reading the settlement and letting its terms sink in—no easy task, as it runs to 134 pages and 15 appendices of legalese—one is likely to be dumbfounded: here is a proposal that could result in the world’s largest library. It would, to be sure, be a digital library, but it could dwarf the Library of Congress and all the national libraries of Europe. Moreover, in pursuing the terms of the settlement with the authors and publishers, Google could also become the world’s largest book business—not a chain of stores but an electronic supply service that could out-Amazon Amazon.

An enterprise on such a scale is bound to elicit reactions of the two kinds that I have been discussing: on the one hand, utopian enthusiasm; on the other, jeremiads about the danger of concentrating power to control access to information.

Who could not be moved by the prospect of bringing virtually all the books from America’s greatest research libraries within the reach of all Americans, and perhaps eventually to everyone in the world with access to the Internet? Not only will Google’s technological wizardry bring books to readers, it will also open up extraordinary opportunities for research, a whole gamut of possibilities from straightforward word searches to complex text mining. Under certain conditions, the participating libraries will be able to use the digitized copies of their books to create replacements for books that have been damaged or lost. Google will engineer the texts in ways to help readers with disabilities.

Unfortunately, Google’s commitment to provide free access to its database on one terminal in every public library is hedged with restrictions: readers will not be able to print out any copyrighted text without paying a fee to the copyright holders (though Google has offered to pay them at the outset); and a single terminal will hardly satisfy the demand in large libraries. But Google’s generosity will be a boon to the small-town, Carnegie-library readers, who will have access to more books than are currently available in the New York Public Library. Google can make the Enlightenment dream come true.

But will it? The eighteenth-century philosophers saw monopoly as a main obstacle to the diffusion of knowledge: not merely monopolies in general, which stifled trade according to Adam Smith and the Physiocrats, but specific monopolies such as the Stationers’ Company in London and the booksellers’ guild in Paris, which choked off free trade in books.

Google is not a guild, and it did not set out to create a monopoly. On the contrary, it has pursued a laudable goal: promoting access to information. But the class action character of the settlement makes Google invulnerable to competition. Most book authors and publishers who own US copyrights are automatically covered by the settlement. They can opt out of it; but whatever they do, no new digitizing enterprise can get off the ground without winning their assent one by one, a practical impossibility, or without becoming mired down in another class action suit. If approved by the court—a process that could take as much as two years—the settlement will give Google control over the digitizing of virtually all books covered by copyright in the United States.

This outcome was not anticipated at the outset. Looking back over the course of digitization from the 1990s, we now can see that we missed a great opportunity. Action by Congress and the Library of Congress or a grand alliance of research libraries supported by a coalition of foundations could have done the job at a feasible cost and designed it in a manner that would have put the public interest first. By spreading the cost in various ways—a rental based on the amount of use of a database or a budget line in the National Endowment for the Humanities or the Library of Congress—we could have provided authors and publishers with a legitimate income, while maintaining an open access repository or one in which access was based on reasonable fees. We could have created a National Digital Library—the twenty-first-century equivalent of the Library of Alexandria. It is too late now. Not only have we failed to realize that possibility, but, even worse, we are allowing a question of public policy—the control of access to information—to be determined by private lawsuit.

While the public authorities slept, Google took the initiative. It did not seek to settle its affairs in court. It went about its business, scanning books in libraries; and it scanned them so effectively as to arouse the appetite of others for a share in the potential profits. No one should dispute the claim of authors and publishers to income from rights that properly belong to them; nor should anyone presume to pass quick judgment on the contending parties of the lawsuit. The district court judge will pronounce on the validity of the settlement, but that is primarily a matter of dividing profits, not of promoting the public interest.

As an unintended consequence, Google will enjoy what can only be called a monopoly—a monopoly of a new kind, not of railroads or steel but of access to information. Google has no serious competitors. Microsoft dropped its major program to digitize books several months ago, and other enterprises like the Open Knowledge Commons (formerly the Open Content Alliance) and the Internet Archive are minute and ineffective in comparison with Google. Google alone has the wealth to digitize on a massive scale. And having settled with the authors and publishers, it can exploit its financial power from within a protective legal barrier; for the class action suit covers the entire class of authors and publishers. No new entrepreneurs will be able to digitize books within that fenced-off territory, even if they could afford it, because they would have to fight the copyright battles all over again. If the settlement is upheld by the court, only Google will be protected from copyright liability.

Google’s record suggests that it will not abuse its double-barreled fiscal-legal power. But what will happen if its current leaders sell the company or retire? The public will discover the answer from the prices that the future Google charges, especially the price of the institutional subscription licenses. The settlement leaves Google free to negotiate deals with each of its clients, although it announces two guiding principles: “(1) the realization of revenue at market rates for each Book and license on behalf of the Rightsholders and (2) the realization of broad access to the Books by the public, including institutions of higher education.”

What will happen if Google favors profitability over access? Nothing, if I read the terms of the settlement correctly. Only the registry, acting for the copyright holders, has the power to force a change in the subscription prices charged by Google, and there is no reason to expect the registry to object if the prices are too high. Google may choose to be generous in its pricing, and I have reason to hope it may do so; but it could also employ a strategy comparable to the one that proved to be so effective in pushing up the price of scholarly journals: first, entice subscribers with low initial rates, and then, once they are hooked, ratchet up the rates as high as the traffic will bear.

Free-market advocates may argue that the market will correct itself. If Google charges too much, customers will cancel their subscriptions, and the price will drop. But there is no direct connection between supply and demand in the mechanism for the institutional licenses envisioned by the settlement. Students, faculty, and patrons of public libraries will not pay for the subscriptions. The payment will come from the libraries; and if the libraries fail to find enough money for the subscription renewals, they may arouse ferocious protests from readers who have become accustomed to Google’s service. In the face of the protests, the libraries probably will cut back on other services, including the acquisition of books, just as they did when publishers ratcheted up the price of periodicals.

No one can predict what will happen. We can only read the terms of the settlement and guess about the future. If Google makes available, at a reasonable price, the combined holdings of all the major US libraries, who would not applaud? Would we not prefer a world in which this immense corpus of digitized books is accessible, even at a high price, to one in which it did not exist?

Perhaps, but the settlement creates a fundamental change in the digital world by consolidating power in the hands of one company. Apart from Wikipedia, Google already controls the means of access to information online for most Americans, whether they want to find out about people, goods, places, or almost anything. In addition to the original “Big Google,” we have Google Earth, Google Maps, Google Images, Google Labs, Google Finance, Google Arts, Google Food, Google Sports, Google Health, Google Checkout, Google Alerts, and many more Google enterprises on the way. Now Google Book Search promises to create the largest library and the largest book business that have ever existed.

Whether or not I have understood the settlement correctly, its terms are locked together so tightly that they cannot be pried apart. At this point, neither Google, nor the authors, nor the publishers, nor the district court is likely to modify the settlement substantially. Yet this is also a tipping point in the development of what we call the information society. If we get the balance wrong at this moment, private interests may outweigh the public good for the foreseeable future, and the Enlightenment dream may be as elusive as ever.


2. The full text of the settlement can be found at www.googlebooksettlement.com/agreement.html. For Google’s legal notice concerning the settlement, see page 35 of this issue of The New York Review.
