
The Genetic Adventurer

1.

Most science proceeds quietly. Many scientific problems are tackled by one or a few laboratories and the results are published in a journal that the public has never heard of. And even when science does make it into the mainstream press, it’s almost always briefly and after the fact: the study is done and its findings are reported in a few sentences.

The effort to sequence the human genome—to determine the exact order of the “letters” making up the DNA sequence of our species—was never like this. Instead, the press covered the project for years before its completion. And when the project was finally finished, its results were the first ever to be announced from the White House. While much of this attention reflected the possible medical implications of the work—the public was told that the human genome project would cure what ails it—at least some of the attention reflected the drama of the undertaking itself. Sequencing the human genome was no ordinary scientific venture but one that featured big money, big personalities, and a big race.

The race was between two programs, one public (centered at the National Institutes of Health) and the other private (centered at a company called Celera Genomics). As races go, this one was not particularly sportsmanlike. Instead, the contest featured more than its fair share of political intrigue and mudslinging. Now one of the key players—Craig Venter, leader of Celera’s effort—tells his story. (Or, more accurately, his side of the story.) The resulting tale is part autobiography, part popular science, and part an attempt to settle old scores. Apparently, there are plenty to settle.1

Venter’s youth offered little reason to expect success in life. Growing up in Eisenhower-era California, he was a wild child and an abysmal student. By adolescence, he seemed destined for life as a beach bum. Then Vietnam intervened. Joining the navy, he became a medic and endured the Tet Offensive as well as the horrors of performing triage. His experiences during the war, which included an aborted suicide attempt, seemed to transform Venter, and he returned to the US determined to make something of himself.

Attending community college and then the University of California, San Diego, he settled on biology. He remained at San Diego to perform graduate work on how hormones affect cells and found, to his surprise, that he was a formidable researcher. He rapidly racked up an impressive number of published articles. Following several academic positions, in 1984 Venter landed at the NIH in Bethesda, a near paradise for biologists: research money was abundant and grant applications unnecessary.

During his time there, Venter’s work gradually shifted to DNA. Following a series of squabbles with James Watson and other NIH leaders—disputes that foreshadowed his later, bitter feuds with federal scientists—Venter departed in 1992, choosing instead to lead research at a private institute where he was free to pursue his increasingly ambitious projects on DNA sequencing. In the end, of course, Venter set his sights on the human genome, opting to compete directly with the public program. He also decided to decode the genome using a controversial method that differed radically from that favored by scientists in the public effort. The race that ensued makes up much of the story of A Life Decoded. By 2000, with the completion of the project, the former beach bum found himself on the cover of Time magazine, along with Francis Collins, who led the public program.

The portrait of Venter that emerges from A Life Decoded is fascinating, if less than wholly attractive. Confident, domineering, and a risk-taker, Venter enjoys his reputation as the bad boy of biology. Most of all, he seems determined to win, whether sequencing or sailing (his avocation). Though Venter longs to be seen as one of the greats in the history of science, A Life Decoded makes it clear he is only partly a scientist. He is also part entrepreneur and part PR man. (Alas, he is also only part writer. As Venter acknowledges, he wrote text and then hired a reporter to help “trim and reorganize” his work. The result is some annoyingly bumpy prose.)

Though relentlessly self-serving, Venter manages, almost despite himself, to produce a book that’s engaging and, in places, charming. A Life Decoded is a tale told by an ego of epic proportions, but once this fact is accepted, the drama of Venter’s narrative takes hold and even his bravado becomes perversely, if mildly, entertaining. In any case, the personality on display in the book—combative, with healthy doses of chutzpah and showmanship—surely had much to do with Venter’s success in the ruthlessly competitive world of Big Science, a world that looks more like Russian capitalism than it does popular pictures of the noble pursuit of truth. While the partisan spin that runs through A Life Decoded won’t be of much lasting interest, the science that underlies the human genome story will be.

2.

DNA is the material that gets passed from parents to offspring and that partly explains why offspring resemble parents. A long molecule, DNA is a string of chemicals, each of which can be represented by a letter. Unlike English, DNA uses only four letters: A, T, G, and C. Like the information carried by this sentence, the information carried by DNA depends on the precise sequence of letters. Genes are particular regions of DNA that have a special role: they tell a cell to make a certain kind of molecule, namely a protein. One gene sequence (say, AATTCGGTC…) tells the cell to make one kind of protein, while a different sequence (TTCGCTAGC…) tells the cell to make a different kind of protein. But not all DNA functions as genes: much of our DNA is filler that sits between genes.

The sum total of all this DNA, genes and filler, is known as the genome. And genomes are very large. The human genome, for instance, is about three billion letters long and includes around 30,000 genes. These three billion letters of DNA are not, however, all strung together in a single molecule. Instead, the human genome is divided into twenty-three different molecules of DNA and each of these molecules resides on a small cellular body called a chromosome. So chromosome 3, say, carries one particular long DNA molecule and includes a distinct set of genes; chromosome 4 carries a different long DNA molecule and includes a different set of genes, and so on.

Until recently, decoding the sequence of letters in any stretch of DNA required tedious work in the laboratory and biologists typically confined themselves to sequencing only one or a few genes. In the late 1980s a new technology—automated DNA sequencing—promised to change all this, simplifying and speeding the process. Inject DNA from an organism into the machine and, by means of chemistry and lasers, the device would reveal the desired DNA sequence.

As automated sequencing improved, biologists raised their expectations and some began considering sequencing not merely this or that individual gene but entire genomes. In the Nineties, a group of biologists convinced the federal government to fund an extensive effort to sequence the entire human genome, the largest concerted undertaking in the history of biomedical research and one that would ultimately cost about three billion dollars.

To decode the genome, leaders of the public project, which was based at the NIH—but also included scientists at the Department of Energy as well as those funded by the Wellcome Trust in Britain—settled on a two-step strategy. In the first, the genome would be broken into large fragments of DNA and the physical location in the genome of each fragment would be painstakingly ascertained. A particular fragment might, for instance, sit at the tip of chromosome 17. Once a large collection of fragments had been mapped, researchers would choose a subset of fragments that showed little overlap with one another. This would ensure that the project was left with a manageable number of fragments that, taken together, covered the whole genome. The public project would then move to step two: each DNA fragment in this set would be sequenced. This work would be farmed out to an international consortium of laboratories, each running automated sequencers.
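The selection step in this strategy, picking a small set of mapped fragments that together span the genome, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each fragment is reduced to a hypothetical (start, end) position on a single chromosome, and the function name `minimal_tiling` and the greedy rule it uses are choices made for this sketch, not a description of the public project's actual procedure.

```python
def minimal_tiling(fragments: list[tuple[int, int]], genome_len: int) -> list[tuple[int, int]]:
    """Greedily pick few mapped fragments that together cover [0, genome_len).

    Each fragment is a hypothetical (start, end) interval, end exclusive.
    """
    ordered = sorted(fragments)
    chosen = []
    covered_to = 0  # everything left of this position is already covered
    i = 0
    while covered_to < genome_len:
        best = None
        # Among fragments that begin inside the covered region,
        # keep the one that reaches farthest to the right.
        while i < len(ordered) and ordered[i][0] <= covered_to:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] <= covered_to:
            raise ValueError(f"no mapped fragment covers position {covered_to}")
        chosen.append(best)
        covered_to = best[1]
    return chosen


# Six made-up fragments on a chromosome 100 units long.
mapped = [(0, 40), (10, 45), (30, 80), (35, 70), (60, 100), (75, 100)]
tiling = minimal_tiling(mapped, genome_len=100)
print(tiling)  # [(0, 40), (30, 80), (60, 100)]
```

Three of the six fragments suffice here. The point of the sketch is that this selection is only possible because every fragment's position is already known, which is exactly the information the mapping step was meant to supply.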

Venter, aggressive and fond of shortcuts, stood this systematic approach on its head. He advocated an alternative approach called whole genome shotgun sequencing. In shotgun sequencing, many copies of the genome are sheared randomly into small pieces. A huge number of these pieces, many of which overlap, are then immediately sequenced using automated machines. A laboratory thus has no idea where the particular piece of DNA it is sequencing resides in the genome. It might derive from chromosome 2 or from chromosome 17, etc. And because the pieces of DNA are generated randomly, some parts of the genome might get sequenced once, some twice, others three times, and so on. (Worse, some parts might not get sequenced at all. But enough pieces are sequenced that this outcome is rare.)
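The claim that random pieces leave few parts of the genome unsequenced can be checked with a small simulation. The Python sketch below is a toy model, not Celera's procedure: the genome length, piece length, and number of pieces are made-up values, every piece is assumed to be the same length, and the final line uses the standard idealization that coverage at each position is roughly Poisson-distributed.

```python
import math
import random


def simulate_coverage(genome_len: int, read_len: int, n_reads: int, seed: int = 0) -> list[int]:
    """Count how many randomly placed pieces cover each position of a linear genome."""
    rng = random.Random(seed)
    depth = [0] * genome_len
    for _ in range(n_reads):
        # Each piece starts at a uniformly random position where it still fits.
        start = rng.randrange(genome_len - read_len + 1)
        for pos in range(start, start + read_len):
            depth[pos] += 1
    return depth


# Illustrative numbers only: 1,000 pieces of 500 letters from a 100,000-letter genome.
genome_len, read_len, n_reads = 100_000, 500, 1_000
depth = simulate_coverage(genome_len, read_len, n_reads)

mean_depth = sum(depth) / genome_len  # equals n_reads * read_len / genome_len
missed = sum(1 for d in depth if d == 0) / genome_len

print(f"mean times each letter was sequenced: {mean_depth:.1f}")
print(f"fraction of letters never sequenced:  {missed:.4f}")
print(f"Poisson estimate exp(-mean depth):    {math.exp(-mean_depth):.4f}")
```

With these settings each letter is sequenced five times on average, and under the Poisson idealization the share of letters missed entirely falls off exponentially as that average rises. That is why sequencing the genome several times over makes gaps rare, though never impossible.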

After decoding many thousands of such pieces, one faces the daunting task of correctly stitching them all together into a complete genome. This so-called genome assembly step is performed using computers that search for regions of sequence overlap between pieces of DNA: if the right end of one piece of DNA has the same sequence of letters as the left end of another piece of DNA, they can be overlaid to form a single, longer sequence. By repeating this process over and over, sophisticated computer algorithms can stitch together an entire genome—at least in principle.2
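The overlap idea can be made concrete with a toy assembler. The Python sketch below repeatedly merges the two pieces that share the longest end-to-end overlap. It is a deliberately naive illustration: the function names, the three short pieces, and the minimum overlap of three letters are all assumptions made for the example, and real assemblers must also cope with sequencing errors, repeated sequence, and pieces read from either strand of the DNA.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest end of `left` that matches the start of `right`."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of pieces until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        # Compare every ordered pair; this all-against-all search is the
        # part that becomes very expensive as the number of pieces grows.
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up pieces, given in scrambled order, from a 17-letter "genome".
pieces = ["CATGCTAGC", "AATTCGGT", "CGGTCATG"]
assembled = greedy_assemble(pieces)
print(assembled)  # ['AATTCGGTCATGCTAGC']
```

Here the right end of AATTCGGT matches the left end of CGGTCATG (the shared letters are CGGT), and the merged piece in turn overlaps CATGCTAGC, so the three pieces collapse into one sequence. If two pieces share no overlap, the function simply returns them separately, which corresponds to an unfilled gap in the assembly.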

The shotgun method thus required both extensive sequencing and powerful computational capacities. On the upside, it promised faster results than the public approach; one needn’t spend years mapping every piece of DNA to its location in the genome. On the downside, there was no guarantee that the computer algorithms were up to the task of assembling the many pieces of DNA into a seamless whole—potentially leaving Venter with many fragments but no genome.

Venter’s first test of the shotgun method involved not human beings but a tiny microbe, Haemophilus influenzae, which can cause serious infections such as childhood meningitis. And it worked. In 1995, Venter’s private venture, TIGR, unveiled the first complete genome sequence of any free-living species. There could now be no doubt that Venter was a force to be reckoned with in the nascent field of genomics, the sequencing and analysis of whole genomes. Though it provided proof in principle for shotgun sequencing, the Haemophilus genome is small: it includes only 1.8 million letters and about 1,700 genes. It remained unclear, therefore, whether shotgun sequencing could be ramped up to species having far larger genomes, especially as the computational challenges involved in assembling pieces into a whole explode as genome size increases.

Venter, who by the late 1990s was heading a new private venture called Celera (from the Latin for swiftness), thus turned his attention to a far fancier creature, the fruit fly Drosophila melanogaster. Drosophila has a large genome, about 120 million letters long, and encodes over 13,000 genes. Importantly, Drosophila provided a test not only of the shotgun method’s ability to decode a large genome but of its accuracy. Previous work by Drosophila geneticists had generated a great deal of high-quality DNA sequence that could be compared to Celera’s new sequence. Again, Venter’s approach worked and Celera unveiled the complete genome sequence of the fly.

  1. Some of the same story has been told in previous books. See especially James Shreeve, The Genome War: How Craig Venter Tried to Capture the Code of Life and Save the World (Knopf, 2004).

  2. Exactly how these sequencing strategies work is somewhat more complicated than I’ve indicated. For more detail, see Greg Gibson and Spencer V. Muse, A Primer of Genome Science (Sinauer Associates, 2004).
