This article will appear as the second epilogue to the chapter “The Dream of the Human Genome” in the paperback edition of Richard Lewontin’s It Ain’t Necessarily So: The Dream of the Human Genome and Other Illusions, to be published in October by New York Review of Books.
On Monday, February 12, 2001, The New York Times, on its front page above the fold, leaked the news that the two competing projects to sequence the human genome were about to announce on that very day that they had indeed located the Holy Grail. Then, on Thursday and Friday, the scientific papers giving the details appeared, surrounded by a penumbra of commentary, analysis, and promises of a rosy future for human health and self-knowledge. It might seem remarkable that both publicly funded and commercial projects should have independently accomplished their ends of sequencing the three billion nucleotides of the human genome, analyzing the sequence, and publishing their findings within a day of each other, but it was no coincidence. It was, in fact, the carefully prearranged and orchestrated outcome of a truce between the contenders announced at a joint press conference the previous June.
Their decision that the human DNA sequence was now definitively determined was an arbitrary one, since gaps amounting to about 6 percent of the sequence are admitted to remain unfilled. As in long-standing political struggles, the exhausted parties simply decide that enough is enough, but, as in political cases, occasional sniper fire is still heard. Thus Celera Genomics’ commercial project claims that its sequence is more accurate than the publicly funded one, while the International Human Genome Sequencing Consortium claims that only by use of its publicly available intermediate results could Celera have assembled its sequence in the first place.
A small irony of the simultaneous publication is that the public project, supported in large part by American government funds, used as its vehicle the English commercial scientific journal Nature, owned by Macmillan, while the commercial Celera project used Science, the organ of the American Association for the Advancement of Science, a nonprofit professional society. Some of the details of publication are immensely revealing of the sociology of science and scientific writing. Modern natural scientific work often requires the joint efforts of several professional participants, all of whom depend on publication for their career advancement and the acquisition of further research funding. The result has been the dominance of the jointly authored scientific report. The most recent issue of Genetics, the major international publication in the field, contains forty-one research papers, none of which was the product of only a single author.
As befits the monster human genome sequencing projects, their author lists are monsters: 275 authors for the commercial project and 250 for the International Consortium (both lists being characterized as “partial”). The order of authors is generally also of great import in the acquisition of scientific credit and the decoding of these lists would be an interesting exercise for sociologists. Aside from the predictable first authorship of Craig Venter, the head of Celera, on the commercial publication and of Eric Lander, the director of the Whitehead Institute, which was responsible for more sequencing than the other cooperators in the public project, it is not obvious how we are to understand the order of authors. Nor is there any hint of who, among hundreds of “authors,” actually wrote the papers. This too is a revelation of the assumptions of scientific work. Scientists, by their practices, seem to place little importance on the actual composition of their communications. For example, they never read written papers aloud when they give talks about their work, but speak ex tempore. For other intellectuals the words are the matter, but scientists think of themselves as simply reporting objectively the facts of nature. Like the Delphic Oracle, they sit perched on their tripods, with upturned eyeballs, and out of their mouths issue nature’s words. But, of course, the long reports on the human genome, like any scientific report, are filled with analysis and interpretation all informed by communal and individual judgments of what is significant and what is to be ignored.
And what is significant in the human genome sequence? The major irony of the sequencing of the human genome is that the result turns out not to provide the answer to the chief question that motivated the project. Now that we have the complete sequence of the human genome we do not, alas, know anything more than we did before about what it is to be human. At the time of the completion of the human genome sequence, scientists already knew the complete DNA sequences of thirty-nine species of bacteria, a yeast, a nematode worm, the fruit fly, Drosophila, and the mustard weed, Arabidopsis. In each case it is possible to estimate how many genes are present in the genome, using two methods. The first is to compare stretches of DNA sequence with sequences of particular genes already known from a variety of organisms. The other, for DNA that does not match already known genes, is to use certain sequence motifs that are common to all genes. When this so-called “annotation” of the human genome was done it was estimated that humans have about 32,000 genes. This seems a rather small number when the comparison is made with the fruit fly (13,000), the nematode worm (18,000), and the mustard weed (26,000). Can human beings really only have 75 percent more genes than a tiny worm and a mere 25 percent more than a weed? If, as the eminent molecular biologist Walter Gilbert wrote, a knowledge of the human genome would cause “a change in our philosophical understanding of ourselves,” that change has not been quite what was hoped for. It appears that we are not much different from vegetables, if we can judge from our genomes.
The reaction to the discovery that human beings do not have much more genomic information than plants and worms has been to call for a new and even more grandiose project. It is now agreed among molecular biologists that the genome was not really the right target and that we now need to study the “proteome,” the complete set of all the proteins manufactured by an organism. Surely the very complex human being must have many more different proteins than a small flowering plant. Although the devotees of the genome project kept assuring us that genes made proteins and therefore when we had all the genes we would know all the proteins, they now say that, of course, they knew all along that genes don’t make proteins. Genes only specify the sequence of amino acids that are linked together in the manufacture of a molecule called a polypeptide, which must then fold up to make a protein. But there are many different ways in which a long polypeptide can fold, resulting in different proteins. The way in which the folding occurs may be different in different cells of different organisms and depends in part on the presence of small molecules, like sugars, and on other proteins.
Moreover, a gene is divided up into several stretches of DNA, each of which specifies only part of the complete sequence in a polypeptide. Each of these partial sequences can then combine with parts specified by other genes, so that from only a few genes, each made up of a few subsections, a very large number of combinations of different amino acid sequences could be made by mixing and matching. So knowing all the genes of a human being doesn’t really tell us what we want to know.
One prominent opponent of the genome sequencing project, William Haseltine, CEO of Human Genome Sciences, has long claimed that the right way to find all the human genes is not to sequence the genome itself, but to go directly to the products that the cell makes when it reads the genome. These products, nucleotide sequences called “messenger RNAs,” are then used by the cell to manufacture the polypeptides. Haseltine claims to have detected 90,000 of these messengers in human cells, but whether that means there are 90,000 different genes or 90,000 different combinations of bits and pieces from approximately 32,000 genes is unclear, given that no detailed accounts of his findings have been published.¹
The call for a proteome project comes just in time to solve the practical problem created by the completion of the genome project. What is Big Science going to do now? A proteome project will be much larger than the genome project and will take much longer to finish. There are, we suppose, a lot more different proteins than there are genes. Moreover, the sequencing of the DNA of a gene is technologically trivial in comparison with the determination of the three-dimensional structure of a protein. In the past it would take a Ph.D. candidate three years to determine the sequence of a single protein. New automated technologies will undoubtedly be developed, but the proteome project will be guaranteed to occupy a large number of scientist-years well into the future.
As interest shifts from genes to proteins, so the promises of cures for all of our ills will shift from genome fixes to protein fixes. The special Human Genome issues of Science and Nature already prefigure this change. Amid the many articles of the standard sort like “Toward Behavioral Genomics” and “Cancer and Genomics” is one called “Proteomics in Genomeland,” and one, “Dissecting Human Disease in the Post-Genomic Era,” which describes the shift from genomics to proteomics as one of the “Paradigm Shifts in Biomedical Research.” As yet the promise that the study of DNA sequences will lead to cures for illness has remained unfulfilled for any human disease, although some gene-based drugs are undergoing clinical trials. Proteomics has arrived in the nick of time to assume the burden, and with more reason. The historical successes of molecular medicine have been precisely in developing drug therapies, dietary regimes, or substitutes for faulty or missing proteins. The provision of insulin to diabetics and the amelioration of at least the most debilitating symptoms of the inherited metabolic disease phenylketonuria (PKU) by dietary restriction are the best-known examples.
The subject of DNA seems filled with ironies. The struggle over the forensic use of DNA profiles to link defendants to crime scenes is now over. The use of such evidence is now routine despite the fact that the problems posed by the presentation of quantitative probability arguments to juries and the lack of uniform rigorous quality control of laboratory work have never been resolved; nor is there any effort being made to deal with these issues. The cessation of that struggle is partly a result of the feeling on the part of those who originally opposed the introduction of DNA evidence that the battle was unwinnable. The second National Academy of Sciences report placed the full weight of the scientific establishment behind the use of DNA profiles, and a series of court decisions has validated their admissibility as evidence.²
But the frustrated opponents of DNA profiles introduced as incriminating evidence have partly made use of the legitimation of the technique to turn it to the opposite purpose. People who, on the basis of eyewitness identification or circumstantial evidence, were convicted of violent crimes like rape and murder and who were given long sentences or threatened with execution are now being released on the basis of their DNA profiles. In many cases physical evidence in the form of blood or semen samples recovered from the crime scene or victim has been preserved. When these samples are subsequently compared with the DNA profile of the convicted person and a mismatch is found, that person is definitively exonerated.
The word of these successful rescues from prison and execution has spread and the demand on the part of prisoners for reopening of their cases has grown enormously. Prosecutorial forces have resisted these demands as strongly as they can, fearful of a deluge of reconsiderations of their successful past prosecutions, and only a few defense attorneys have the resources to take up these old cases again in the face of the strong resistance on the part of the state. But there have been some notable successes.
The leaders of the movement to use DNA evidence for exculpations have been the attorneys Peter Neufeld and Barry Scheck, experts on forensic uses of DNA. Using the resources and fame they acquired in their successful defense of O.J. Simpson, they have organized the Innocence Project, which, together with other efforts inspired by it, has thus far succeeded in obtaining the release of more than ninety prisoners serving long-term sentences or living under the threat of execution. Unfortunately, the necessary physical evidence has often not been preserved, and when it has, considerable effort of time and money is needed to obtain access to it, so the Innocence Project is not likely to lead to a wholesale reconsideration of past convictions. Prosecutors, however, can be counted on to make increasing use of DNA evidence to secure convictions in order to protect those convictions against challenges. Moreover, the demonstration that innocent people have been sentenced to death has given opponents of the death penalty a very powerful argument.
July 19, 2001
1. See the story about Human Genome Sciences in The Financial Times (London), June 12, 2001, p. 15.
2. The first report of the National Academy was DNA Technology in Forensic Science (National Academy Press, 1992). The second was the National Research Council’s The Evaluation of Forensic DNA Evidence (National Academy Press, 1996).