The main impediment to the DPLA’s growth is legal, not financial. Copyright laws could exclude everything published after 1964, most works published after 1923, and some that go back as far as 1873. Court cases during the last few months have opened up the possibility that the fair use provision of the Copyright Act of 1976 could be extended to make more recent books available for certain purposes, such as service to the visually impaired and some forms of teaching. And if, as expected, the DPLA excludes books that are still selling on the market (most exhaust their commercial viability within a few years), authors and publishers might grant the exercise of their rights to the DPLA.
In any case, we cannot wait for courts to untangle legalities before creating an effective administration. The informal secretariat at Harvard is being replaced by a nonprofit corporation organized according to the 501(c)(3) provisions of the tax code. The steering committee has been succeeded by a board of directors. And the six groups will evolve into a committee system with carefully defined functions, such as outreach to public libraries and community colleges. The choice of an executive director, Daniel Cohen, a superb historian and Internet expert from George Mason University, was announced on March 5; the first staff members have already been hired; and administrative headquarters are being set up in Boston.
Those first steps will not lead to the creation of a top-heavy bureaucracy. On the contrary, the “distributed” character of the DPLA means that its operations will be spread across the country. Its growing collection of metadata (Harvard has already made available 12 million openly accessible metadata records) will be stored in computer clouds, and its activities will be funneled through two kinds of “hubs.”
The DPLA’s “content hubs” are large repositories of digital material, usually held in physical locations like the Internet Archive in San Francisco. They will make their data accessible to users directly through the DPLA without passing through any intermediate aggregators. “Service hubs”—centers for collecting material—will aggregate data and provide various services at the state or regional level. The DPLA cannot deal directly with all the libraries, archives, and museums in the United States, because that would require its central administration to become involved in developing hundreds of thousands of interfaces and links. But development among local institutions is now being coordinated at the state level, and the DPLA will work with the states to create an integrated system for the entire country.
Forty states have digital libraries, and the DPLA’s service hubs—seven are already being developed in different parts of the country—will contribute the data those digital libraries have already collected to the national network. Among other activities, these service hubs will help local libraries and historical societies to scan, curate, and preserve local materials—Civil War mementos, high school yearbooks, family correspondence, anything that they have in their collections or that their constituents want to fetch from trunks and attics. As it develops, digital empowerment at the grassroots level will reinforce the building of an integrated collection at the national level, and the national collection will be linked with those of other countries.
The DPLA has designed its infrastructure to be interoperable with that of Europeana, a super aggregator sponsored by the European Union, which coordinates linkages among the collections of twenty-seven European countries. Within a generation, there should be a worldwide network that will bring nearly all the holdings of all libraries and museums within the range of nearly everyone on the globe. To provide a glimpse into this future, Europeana and the DPLA have produced a joint digital exhibition about immigration from Europe to the US, which will be accessible online at the time of the April 18 launch.
Of course, expansion, at the local or global level, depends on the ability of libraries and other institutions to develop their own digital databases—a long-term, uneven process that requires infusions of money and energy. As it takes place, great stockpiles of digital riches will grow up in locations scattered across the map. Many already exist, because the largest research libraries have already digitized enormous sections of their collections, and they will become content hubs in themselves.
For example, in serving as a hub, Harvard plans to make available to the DPLA by the time of its launch 243 medieval manuscripts; 5,741 rare Latin American pamphlets; 3,628 daguerreotypes, along with the first photographs of the moon and of African-born slaves; 502 chapbooks and “penny dreadfuls” about sensational crimes, a popular genre of literature in the eighteenth and nineteenth centuries; and 420 trial narratives from cases involving marriage and sexuality. Harvard expects to provide a great deal more in the following months, notably in fields such as music, cartography, zoology, and colonial history. Other libraries, archives, and museums will contribute still more material from their collections. The total number of items available in all formats on April 18 will be between two and three million.
How will such material be put to use? I would like to end with a final example. About 14 million students are struggling to get an education in community colleges—at least as many as those enrolled in all the country’s four-year colleges and universities. But many of them—and many more students in high schools—do not have access to a decent library. The DPLA can provide them with a spectacular digital collection, and it can tailor its offering to their needs. Many primers and reference works on subjects such as mathematics and agronomy are still valuable, even though their copyrights have expired. With expert editing, they could be adapted to introductory courses and combined in a reference library for beginners.
At one time or another, nearly every student comes in contact with a poem by Emily Dickinson, who probably qualifies as America’s favorite poet. But Dickinson’s poems are especially problematic. Only a few of them, horribly mangled, were published in her lifetime. Nearly all the manuscript copies are stored in Harvard’s Houghton Library, and they pose important puzzles, because they contain quirky punctuation, capitalization, spacing, and other touches that have profound implications for their meaning. Harvard has digitized the originals, combined them with the most important printed editions (one edited by Thomas H. Johnson in 1955 and one edited by Ralph W. Franklin in 1981), and added supplementary documentation in an Emily Dickinson Archive, which it will make available through its own website and the DPLA.
The online archive will enrich the experience of students at every level of the educational system. Teachers will be able to make selections from it and adjust them to the needs of their classes. By paying close attention to different versions of a poem, the students will begin to appreciate the way poetry works. They will sharpen their sensitivity to language in general, and the lessons they learn will help them gain possession of their cultural heritage. It may be a small step, but it will be a pragmatic advance into the world of knowledge, which Jefferson, in a utopian vein, described as “the common property of mankind.”