University libraries have little defense against excessive pricing. If they cancel a subscription, the faculty protest about being cut off from the circulation of knowledge, and the publishers impose drastic cancellation fees. Those fees are written into contracts, which often cover “bundles” of journals, sometimes hundreds of them, over a period of several years. The contracts provide for annual increases in the cost of the bundle, even though a library’s budget may decrease; and the publishers usually insist on keeping the terms secret, so that one library cannot negotiate for cheaper rates by citing an advantage obtained by another library. A recent court case in the state of Washington makes it seem possible that publishers will no longer be able to prevent the circulation of information about their contracts. But they continue to sell subscriptions in bundles. If in negotiating the renewal of a contract a library attempts to unbundle the offer in order to eliminate the least desirable journals, the publishers commonly raise the prices of the other journals so much that the total cost remains the same.
While prices continued to spiral upward, professors became entrapped in another kind of vicious circle, unaware of the unintended consequences. Reduced to essentials, it goes like this: we academics devote ourselves to research; we write up the results as articles for journals; we referee the articles in the process of peer reviewing; we serve on the editorial boards of the journals; we also serve as editors (all of this unpaid, of course); and then we buy back our own work at ruinous prices in the form of journal subscriptions—not that we pay for it ourselves, of course; we expect our library to pay for it, and therefore we have no knowledge of our complicity in a disastrous system.
Professors expect services from their libraries, even if they never set foot in them and consult Tetrahedron or The Journal of Comparative Neurology from computers in their labs. A few, however, have stared the problem in the face and seized it by the horns. In 2001 scientists at Stanford and Berkeley circulated a petition calling for their colleagues to submit articles only to open-access journals—that is, journals that made them available from digital repositories free of charge, either immediately or after a delay.
The effectiveness of such journals had been proven by BioMed Central, a British enterprise, which had been publishing a whole series of them since 1999. Led by Harold Varmus, a Nobel laureate who is now director of the National Cancer Institute, American researchers allied with the Public Library of Science founded their own series, beginning with PLoS Biology in 2003. Foundations provided start-up funding, and ongoing publication costs were covered by the research grants received by the authors of the articles. Thanks to rigorous peer review and the prestige of the authors, the PLoS publications were a great success.
According to citation indexes and statistics of hits, open-access journals were consulted more frequently than most commercial publications. By 2008, when the National Institutes of Health required the recipients of its grants to make their work available through open access—although it permitted an embargo of up to twelve months—cracks were appearing everywhere in the commercial monopoly of publishing in the medical sciences.
But what could be done in all the other disciplines, especially those in the humanities and social sciences, where grants are not so generous, if they exist at all? Several universities passed resolutions in favor of open access and established digital repositories for articles, but the compliance rate of the professors, often 4 percent or less, made them look ineffective. At Harvard we developed a new model. By a unanimous vote on February 12, 2008, professors in the Faculty of Arts and Sciences bound themselves to deposit all of their future scholarly articles in an open-access repository to be established by the library and also granted the university permission to distribute them.
This arrangement had an escape clause: anyone could refuse to comply by obtaining a waiver, which would be granted automatically. In this way, professors retained the liberty to publish in closed-access journals, which might refuse to accept an article available elsewhere on open access or might require an embargo. This model has now spread to other faculties at Harvard and to other universities, but it is not a business model. If the monopolies of price-gouging publishers are to be broken, we need more than open-access repositories. We need open-access journals that will be self-sustaining.
A supplementary program at Harvard now subsidizes publishing fees for articles submitted to open-access journals, up to a yearly limit, for each professor. The idea is to reverse the economics of journal publishing by covering costs, rationally determined, at the production end instead of by paying for an exorbitant profit in addition to the production costs at the consumption end. If other universities adopt the same policy and if professors apply pressure on editorial boards, journals will shift, little by little, one after the other, to open access. The Compact for Open-Access Publishing Equity (COPE), launched this year, is an attempt to create a coalition of universities to push journal publishing in this direction. It also envisages subsidies for authors who cannot expect financial help from grants or their home universities.
If COPE succeeds, it could save billions of dollars in library budgets. But it will only succeed in the long run. Meanwhile, the prices of commercial journals continue to rise, and the balance sheets of university presses continue to sink into the red. In 2003 Walter Lippincott, the director of Princeton University Press, predicted that twenty-five of the eighty-two university presses in the United States would disappear within five years. They are still alive, but they are barely holding on by their fingernails. They may find a second life by publishing online and by taking advantage of technological innovations such as the Espresso Book Machine. This can download an electronic text from a database, print it out within four minutes, and make it available at a moderate price as an instant print-on-demand paperback.
But just when this glimmer of hope appeared on the horizon, it was overshadowed by the most powerful technological innovation of them all: relevance-ranking search engines linked to gigantic databases, as in the case of Google Book Search, which is already providing readers with access to millions of books. This brings me to Jeremiad 3.
Google represents the ultimate in business plans. By controlling access to information, it has made billions, which it is now investing in the control of the information itself. What began as Google Book Search is therefore becoming the largest library and book business in the world. Like all commercial enterprises, Google has as its primary responsibility to make money for its shareholders. Libraries exist to get books to readers—books and other forms of knowledge and entertainment, provided for free. The fundamental incompatibility of purpose between libraries and Google Book Search might be mitigated if Google could offer libraries access to its digitized database of books on reasonable terms. But the terms are embodied in a 368-page document known as the “settlement,” which is meant to resolve another conflict: the suit brought against Google by authors and publishers for alleged infringement of their copyrights.
Despite its enormous complexity, the settlement comes down to an agreement about how to divide a pie—the profits to be produced by Google Book Search: 37 percent will go to Google, 63 percent to the authors and publishers. And the libraries? They are not partners to the agreement, but many of them have provided, free of charge, the books that Google has digitized. They are being asked to buy back access to those books along with those of their sister libraries, in digitized form, for an “institutional subscription” price, which could escalate as disastrously as the price of journals. The subscription price will be set by a Book Rights Registry, which will represent the authors and publishers who have an interest in price increases. Libraries therefore fear what they call “cocaine pricing”—a strategy of beginning at a low rate and then, when customers are hooked, ratcheting up the price as high as it will go.
To become effective, the settlement must be approved by the federal district court for the Southern District of New York. The Department of Justice has filed two memoranda with the court that raise the possibility, indeed the likelihood, that the settlement could give Google such an advantage over potential competitors as to violate antitrust laws. But the most important issue looming over the legal debate is one of public policy. Do we want to settle copyright questions by private litigation? And do we want to commercialize access to knowledge?
I hope that the answer to those questions will lead to my happy ending: a National Digital Library—or a Digital Public Library of America (DPLA), as some prefer to call it. Google demonstrated the possibility of transforming the intellectual riches of our libraries, books lying inert and underused on shelves, into an electronic database that could be tapped by anyone anywhere at any time. Why not adapt its formula for success to the public good—a digital library composed of virtually all the books in our greatest research libraries available free of charge to the entire citizenry, in fact, to everyone in the world?
To dismiss this goal as naive or utopian would be to ignore digital projects that have proven their worth and feasibility throughout the last twenty years. All major research libraries have digitized parts of their collections. Since 1995 the Digital Library Federation has worked to combine their catalogues or “metadata” into a general network. More ambitious enterprises such as the Internet Archive, Knowledge Commons, and Public.Resource.Org have attempted digitization on a larger scale. They may be dwarfed by Google, but several countries are now determined to out-Google Google by scanning the entire contents of their national libraries.
In December 2009 President Nicolas Sarkozy of France announced that he would make €750 million available for digitizing the French cultural “patrimony.” The National Library of the Netherlands aims to digitize within ten years every Dutch book, newspaper, and periodical produced from 1470 to the present. National libraries in Japan, Australia, Norway, and Finland are digitizing virtually all of their holdings; and Europeana, an effort to coordinate digital collections on an international scale, will have made over ten million objects—from libraries, archives, museums, and audiovisual holdings—freely accessible online by the end of 2010.
If these countries can create national digital libraries, why can’t the United States? Because of the cost, some would argue. Far more works exist in English than in Dutch or Japanese, and the Library of Congress alone contains 30 million volumes. Estimates of the cost of digitizing one page vary enormously, from ten cents (the figure cited by Brewster Kahle, who has digitized over a million books for the Internet Archive) to ten dollars, depending on the technology and the required quality. But it should be possible to digitize everything in the Library of Congress for less than Sarkozy’s €750 million—and the cost could be spread out over a decade.