The Cheshire Cat's DNA | John Maynard Smith | The New York Review of Books


Reviewed:

by Evelyn Fox Keller

Harvard University Press, 186 pp., $22.95

1.

In 1900, three biologists independently rediscovered Mendel’s laws, according to which the characteristics of organisms are determined by hereditary units, each kind being present once in a gamete, sperm or egg, and hence twice in the fertilized egg. In effect, it was an atomic theory of heredity. The term “genetics” was introduced by William Bateson in 1906, and, for the hereditary units themselves, the word “gene” by Wilhelm Johannsen in 1909. By 1930, Thomas Hunt Morgan and his colleagues, working with the fruit fly Drosophila, had shown that genes are arranged linearly along chromosomes.

In 1953, James Watson and Francis Crick elucidated the structure of DNA, the material of which genes are made, and by so doing suggested a mechanism whereby genes could carry genetic information, and could replicate. There followed in rapid succession the discovery that genes produce their effects by determining the sequence of amino acids in proteins, that they do so by means of a “code” in which a triplet of bases in the DNA specifies an amino acid, and that there is a process whereby one gene can regulate the activity of another. During the last twenty years there has been an explosion in our knowledge of how genes influence the development of animals and plants. Finally, in the year 2000, we await the publication of the complete sequence of the human genome.

It is this history that Evelyn Fox Keller celebrates, and criticizes, in her book. A professor of the history and philosophy of science at MIT, she is at the same time enthusiastic about the light that has been shed on the nature of life and critical of the oversimplifications that she feels have been made. Later in this review I shall argue with some of her conclusions, so I must start by emphasizing that she is well qualified to draw them. She has an admirable grasp of recent research in molecular genetics—certainly wider and more detailed than my own—and has read widely in the history of genetics. I was delighted to meet again in her pages biologists who influenced me when I was starting in research, but whose work I imagined had been forgotten. She has also thought hard about both the history and the current state of the subject. Our disagreements are not of the kind that can be settled by specific experiments or observations; they concern differences about the best strategy to pursue when faced by the complexities of living organisms. I know that the world is complicated, but always seek for simple explanations of the complexity. For Keller, living organisms only work because they are complex; to simplify them is to leave out their essence.

The book can be read by those without previous knowledge of molecular genetics. However, it is not the kind of account I would write if I was aiming at a nonprofessional readership; I would leave out a lot of the complications. Clearly, this is not an option for Keller, because for her the complications are crucial. This means that reading the book is hard work in places, but it is worth it. It is commendably short, but if you understand it you will have learned a lot about contemporary biology.

I will illustrate the nature of my disagreement with Keller by discussing how genes are replicated. I remarked earlier that the structure of DNA as revealed by Watson and Crick already suggested how it might replicate; in fact, their paper ends with the memorable last sentence, “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.” As is well known, they showed that DNA consists of two complementary strands, each a string of four kinds of chemical units, or bases: adenine, cytosine, guanine, and thymine, or A, C, G, and T for short. In the double-stranded helix, a C in one strand is always paired with a G in the other, and an A is paired with a T. This structure at once suggests a mechanism of replication. The two strands separate, and each acts as a template for the synthesis of a new complementary strand. The sequence, and hence the information, is preserved by the pairing of complementary bases. For reasons of chemical affinity, C pairs with G, and A with T. And, in a simple account, that is all there is to it.
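The simple account can be put in a few lines of Python. This is only a sketch of the template idea: the function names and the example sequence are invented for illustration, strands are treated as plain strings, and the opposite chemical orientation of the two strands is ignored.

```python
# Pairing rules as stated in the text: C pairs with G, and A with T.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Build the partner strand base by base, using `strand` as the template."""
    return "".join(PAIR[base] for base in strand)


def replicate(strand_a: str, strand_b: str) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate a duplex and let each old strand template a new partner.

    Returns two daughter duplexes, each holding one old and one new strand.
    """
    daughter_1 = (strand_a, complement(strand_a))
    daughter_2 = (complement(strand_b), strand_b)
    return daughter_1, daughter_2


template = "ATGCCGTA"  # arbitrary illustrative sequence, not from the text
duplex = (template, complement(template))
daughter_1, daughter_2 = replicate(*duplex)
```

Both daughters carry the same sequence as the parent duplex, which is the sense in which the information is preserved by complementary pairing.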

Sadly, real life is more complicated. If matters were left just to the chemical affinity of C for G, and A for T, a “wrong” base would frequently be inserted; the error frequency would be at least 1 in 100. There would therefore be no way in which a large genome, with some 1,000 million bases, could be reproduced; errors would accumulate at an enormous rate. In practice, the process is executed by enzymes, that is, by proteins that make chemical reactions more rapid and more precise. Indeed, matters are more complex still. After the first enzyme-mediated pairing, the error rate is still about 1 in 1,000. There are then two further steps of checking and error correction, called, appropriately enough, proof-reading and mismatch-repair.
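The arithmetic behind that claim is easy to check. The sketch below uses only the figures quoted above (a genome of some 1,000 million bases, and error frequencies of 1 in 100 and 1 in 1,000); the names are illustrative, and the rates remaining after proof-reading and mismatch-repair are left out because no figures for them are given here.

```python
GENOME_SIZE = 1_000_000_000  # "some 1,000 million bases"

# Error frequency expressed as "one wrong base in N", per stage named in the text.
ONE_ERROR_IN = {
    "chemical affinity alone (a lower bound)": 100,
    "after the first enzyme-mediated pairing": 1_000,
}


def expected_errors(genome_size: int, one_in: int) -> float:
    """Expected number of wrong bases in a single copy of the genome."""
    return genome_size / one_in


for stage, one_in in ONE_ERROR_IN.items():
    errors = expected_errors(GENOME_SIZE, one_in)
    print(f"{stage}: about {errors:,.0f} wrong bases per copy")
```

Even the better of the two rates leaves about a million wrong bases in every copy, which is why the further checking steps are needed.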

Keller describes this three-stage process in detail. She points out that the notion that DNA “replicates itself” is nonsense; it is replicated by a battery of enzymes. But if DNA cannot replicate without enzymes, it is equally true that enzymes would not exist without DNA, or without the whole machinery of protein synthesis, which starts with DNA. Thus heredity, the property that like begets like, depends not only on complementary base pairing, but on a complex dynamical system, involving both DNA and proteins. In this context she quotes with approval a remark by Max Delbrück, an ex-physicist and early molecular biologist. He pointed out that a system of cross-reacting and inhibiting chemical reactions can lead not just to one steady state, but to multiple steady states.

Delbrück himself did not see this “multiple steady states” model as a general explanation of heredity, but rather as an explanation of “cell heredity”; that is, of the fact that cells of a given type—fibroblasts or lymphocytes, for example—give rise when they divide to more cells of the same type. Keller’s own position is not fully clear to me. However, she is attracted to the idea that heredity depends not just on the copying of genes, but also on the stability of dynamic systems. Discussing the origin of life, she suggests, following the physicist Freeman Dyson, that self-maintaining metabolic systems, initially lacking any replicating molecules, may have played an essential role.

The idea that stable dynamic systems play a role in heredity is one that recurs; I remember being puzzled by it when I was a student. For evolution to be possible, a hereditary system is needed that permits the stable reproduction of an indefinitely large number of different structures; I do not think that a system relying on the alternative steady states of a dynamical system could permit this. I would draw quite a different moral from our present knowledge of DNA replication. It is one that Francis Crick famously called “the central dogma of molecular biology”—that information can pass from nucleic acid (DNA and RNA) to nucleic acid, and from nucleic acid to proteins, but not from proteins to nucleic acids. What he meant was this. If, in a lineage of reproducing cells, a single nucleotide—i.e., an individual component of nucleic acid—in a DNA molecule is altered, that alteration will be transmitted to the DNA in future generations, and may alter an amino acid in a protein; but if an amino acid is altered in a protein, that might interfere with DNA replication, but would not result in proteins appearing in future generations with the same altered amino acid. Changes in DNA are inherited, but changes in proteins (specifically, in their amino acid sequence) are not. Although Crick named this a “dogma,” it does appear to be true, perhaps the only universal truth we biologists have. It explains why geneticists take DNA seriously. Its significance for evolution is obvious.

One thing, however, is clear. The present process whereby DNA is replicated is far too complicated to have been a feature of the first living things. What, then, were the first living things like? In particular, how did it come about that like begot like? Without such heredity, there could be no evolution. Keller prefers Freeman Dyson’s suggestion that life originated as a symbiosis between a self-maintaining metabolic system involving proteins and a population of inaccurately replicating molecules, probably nucleic acids. I prefer the idea that the first living things—that is, the first entities with heredity and so able to evolve—were molecules, perhaps RNA (a molecule resembling DNA but single-stranded), which acted both as inaccurate replicators and as primitive enzymes; the suggestion is supported by the fact that there are RNA catalysts, analogous to enzymes, in existing organisms. My colleague Eörs Szathmáry and I have discussed elsewhere how this primitive RNA system might have evolved gradually into a DNA-protein system, with a genetic code.^* The ideas are necessarily speculative, but I think they make more sense than a system of heredity based on alternative steady states of a dynamical system.

The fundamental difference between Keller and myself is that, for her, dynamic complexity is fundamental. For me, the crucial idea is the one first suggested by the Watson-Crick structure of DNA—that heredity depends on the chemical affinity of G for C and A for T. As Leslie Orgel remarked, as one traces life back to its origins, features are lost one by one, until one is left just with homologous base pairing, like the smile on the face of the Cheshire Cat.

Another situation in which it seems to me that Keller needlessly complicates things concerns the question, what do genes do? The simple answer, foreshadowed by the slogan “one gene, one enzyme” proposed by Beadle and Tatum in 1941, is that a gene codes for a protein. By a well-understood mechanism, different triplets of bases in the DNA specify different amino acids. The DNA that carries the information also has sequences meaning “start translating here” and “end of protein.” Sadly, for me but not for Keller, there are many complications. I have space to discuss only two. First, between the “proper” genes there are long stretches of DNA that are not translated into protein. A small fraction of this DNA has known regulatory functions, but most of it does not. Most of us tend to regard this DNA as “junk,” but it may have functions we do not know about. Second, within the coding genes there are “introns”—intervening sequences—which are spliced out before the gene is translated.

Obviously, anyone working with DNA must be aware of these complications. There are also fascinating questions about how this extra DNA came to be there in the first place. But in practice, given the sequenced genome of a simple animal such as a fruit fly or a nematode worm, it is possible to identify most of the “genes” that code for proteins, and to deduce the amino acid sequence of the proteins coded for. It is harder to identify all the protein-coding genes in the human genome because of the larger proportion of DNA that codes for nothing. But most biologists would accept that the meaningful part of the human genome consists largely of protein-coding genes. Yet Keller thinks that there is a difficulty in defining a gene functionally as a length of DNA that codes for a protein. I can see that there is a real difficulty in providing a philosopher’s definition that is true of all genes (for example, there are genes which code for functional RNA molecules, but not for proteins), but I don’t think this need worry biologists. After reading what Keller has to say in the last chapter about the way biologists use words, I think she might agree.

2.

I find myself agreeing with much that Keller has to say about development. The classic problem, which goes back at least to August Weismann, is as follows: How does it come about that the cells in different parts of an animal’s body are different? Weismann thought that different genes—he called them ids—were directed to different cells, so that cells in the liver, for example, received only the genes needed in the liver. We now know that, typically, all cells receive a complete set of genes, but that different genes are active in different cells, and at different times. How is this done? Keller describes the crucial step toward an answer, contained in a paper by the French biologists François Jacob and Jacques Monod, published in 1961. In effect, they discovered that there are “regulatory” genes. Such a gene produces a regulatory protein, which acts to switch on, or off, a second gene, by binding to a specific DNA sequence which, in Jacob and Monod’s study, is close to the regulated gene. Their study concerned a bacterium, but it is now clear that in animals and plants the activity of any “functional” gene is influenced, positively or negatively, by a number of regulatory genes.

Their discovery led Jacob and Monod to speak of a “genetic program,” an analogy with computer programs that has since become popular. Keller has no difficulty with the analogy between the regulation of development and a computer program, but she is unhappy with the phrase “genetic program” because a lot more is involved in the process than genes; for example, regulatory proteins are involved, as are messages transmitted between cells. One worry I have with the phrase “genetic program” is that it will make us think that the process is a lot simpler than it really is. The programs I write consist of a linear sequence of instructions, with a beginning and an end; in development many interacting messages are being transmitted simultaneously. Today advanced computer programs are coming to resemble developmental programs in complexity, so my worry is perhaps outdated.

A second question that has concerned me for fifty years, and clearly concerns Keller, is the problem of geometry. How does it come about that the right genes are switched on or off in the right places? Sometimes we know the answer. For example, in the egg of a fruit fly, there is a gradient in the concentration of a protein called bicoid. The gradient arises when the egg is still in the ovary, because the female inserts at one pole of the egg the RNA molecule that codes for bicoid. Most developmentally relevant gradients arise de novo during development. However, I chose bicoid because it illustrates a point that Keller would want to emphasize: there is more in an egg than just a bag of genes. The varying concentration of the bicoid protein then acts as a signal to switch on different genes along the antero-posterior axis of the egg.
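How a single gradient could switch on different genes in different places can be shown with a toy model. Everything specific here is an assumption made for illustration: the exponential shape of the gradient, the decay length, the gene names, and the threshold values are invented and are not measurements of bicoid or of any real target gene.

```python
import math


def concentration(x: float, source: float = 1.0, decay_length: float = 0.2) -> float:
    """Assumed concentration at position x along the egg.

    x = 0 is the pole where the RNA was deposited and x = 1 the opposite pole.
    An exponential fall-off is assumed purely for simplicity.
    """
    return source * math.exp(-x / decay_length)


# Hypothetical target genes, each switched on only above its own threshold.
THRESHOLDS = {"gene_high": 0.5, "gene_mid": 0.1, "gene_low": 0.02}


def active_genes(x: float) -> list[str]:
    """Names of the hypothetical genes switched on at position x."""
    level = concentration(x)
    return [gene for gene, threshold in THRESHOLDS.items() if level >= threshold]


for position in (0.0, 0.3, 0.6, 1.0):
    print(position, active_genes(position))
```

With these made-up numbers, all three genes are active at the source pole, progressively fewer further along the axis, and none at the far pole, so one smoothly varying signal yields distinct regions of gene activity.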

The creation of a gradient of concentration by diffusion in this way is a rather simple dynamic process. There may be more interesting ones. It is hard to look at a zebra without thinking that the stripes represent a response to a standing wave, that is, a regular series of peaks and troughs of concentration of some chemical substance; it is absurd to believe that each stripe arises from the activation of a different gene, and even more absurd to think that the same gene responds positively to many different concentrations of the same inducing substance, and negatively to all the intermediate values. As it happens, a dynamical process that could generate a standing wave was suggested almost fifty years ago by the mathematician Alan Turing.

The point of all this is that, in development, genes are switched on or off by other genes, so that different genes are active in different parts of the body. This requires the diffusion of regulatory proteins, and more complex dynamic processes. Keller emphasizes the fact that such processes are remarkably robust. Despite chance events and environmental fluctuations, the outcome of development in different members of a species is remarkably uniform. How can this be? Part of the explanation, Keller suggests, may lie in the fact that many regulatory genes are redundant. This fact emerged rather unexpectedly. Repeatedly, a developmental geneticist will discover a gene in a mouse whose structure suggests that it may play a role in development. Using modern techniques, the gene can be “knocked out,” or rendered inactive. To the frustration of the investigator, the mouse often seems unaffected.

This suggests redundancy—that several genes are available to do the same job. In engineering, this would make good sense; an essential component should always be backed up in case of accidental failure. But there is a difficulty with a similar explanation in biology. A truly redundant gene would rarely be “selected” as part of the process of natural selection; that is, its presence would only affect the survival of its possessor on the rare occasion when its “partner” was inactive. It is therefore hard to see how natural selection could maintain such a redundant gene against recurrent harmful mutation. I think there are ways around this difficulty, but we should be cautious about assuming the presence of redundant genes.

It is already clear that the control of development is very complicated; as the Cambridge biologist Sidney Brenner put it, “the real answer must surely be in the detail.” There is, however, one reason to hope that the details, although numerous, will be comprehensible. The organisms we see are the product of natural selection. Very often, evolutionary adaptation requires that one organ change while another remains unaltered, or changes in a different direction. For example, human evolution required that our arms and legs changed independently. Hence natural selection will often favor genes whose effects are local. The result will be a developmental program that is modular; it will not be the case that every gene will affect every organ. It appears to be the case that, particularly in the evolution of the vertebrates, regulatory genes have been duplicated, and that the two copies have subsequently acquired different functions.

Keller draws an analogy between developmental programs and the work of several groups of computer scientists who are working on the design of “robust” programs, able to behave reliably despite being composed of unreliable components, connected in irregular ways. I am not competent to comment on this analogy—my programs blow up when I type a comma for a full stop. But I am encouraged by the increasing cooperation between biologists and computer scientists.

I am intrigued by one outcome of this cooperation that Keller does not discuss. Computer scientists are attempting to solve difficult control problems by a process analogous to natural selection. A population of programs for solving a particular problem is allowed to evolve, under “mutation” (random changes in the program), “recombination” (combining pieces of the better programs—an analogue of sex), and “selection” of the programs giving the best solutions as parents of the next generation. I first played with this idea in 1944 by evolving—by hand—a program to play the simple board game Fox and Geese. At much the same time Donald Michie, who later became a colleague and friend, evolved a program to play tic-tac-toe. He was rather more successful than I, but his was a simpler game. Things have moved a long way since then. But I think it is still too early to say whether this approach, based on selection between rival genetic algorithms, will prove to be more efficient than other problem-solving methods. It would also be nice if computer scientists could tell biologists something we do not know about the nature of efficient algorithms.
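The loop of mutation, recombination, and selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reconstruction of any program mentioned in the text: a "program" is reduced to a list of bits, the problem is the toy one of maximizing the number of 1s, and the population size, mutation rate, and function names are arbitrary choices.

```python
import random


def fitness(program: list[int]) -> int:
    """Toy score (number of 1s), standing in for how well a program solves a problem."""
    return sum(program)


def mutate(program: list[int], rate: float, rng: random.Random) -> list[int]:
    """Flip each bit independently with probability `rate` (random change)."""
    return [1 - bit if rng.random() < rate else bit for bit in program]


def recombine(a: list[int], b: list[int], rng: random.Random) -> list[int]:
    """Join the front of one parent to the back of the other (the analogue of sex)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(pop_size: int = 30, length: int = 20, generations: int = 40,
           rate: float = 0.02, seed: int = 0) -> list[int]:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the better half survives and supplies the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(recombine(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the best parents are carried over unchanged, the top score can never fall from one generation to the next; whether such a search is efficient on hard problems is exactly the open question raised above.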

Finally, what of the human genome? What will the publication of the complete sequence tell us? At first, very little. I find it hard to believe that any biologist ever thought otherwise. There are two simple reasons why the sequence will be hard to interpret. First, suppose we identify a gene coding for a protein whose amino acid sequence is unfamiliar. At present, we cannot tell how such a linear string of amino acids will “fold up” to form a three-dimensional structure, or what it will do when folded. There is nothing mysterious about this. Most proteins, given the appropriate conditions, will fold themselves up; it is just a matter of chemical forces. The snag is that the equations for any particular protein are too numerous to solve. We can only predict how a sequence will fold if it is similar to one we have met before, whose folded shape is known. There is a second, more serious, difficulty. As I have already explained, development involves complex interaction between regulatory and functional genes. We cannot tell merely from the sequence of a genome which genes will regulate which others, or in what way. It is as if, wishing to learn Hungarian, I am given a grammar and a dictionary, both written in Hungarian. The information is all there, but I can’t read it.

It does not follow that the human genome is useless. It will be an invaluable tool for future research. For example, consider schizophrenia, a condition I choose because it is typical in being difficult to understand; it is not caused by mutations at a single locus, and, for most individual sufferers, there are probably both environmental and genetic predisposing causes. Knowledge of the complete human genome is a valuable, probably essential, aid in identifying the gene loci, certainly more than one and perhaps many, at which mutations may contribute to predisposition. Given such information it will soon be possible for any individual sufferer to discover which of these predisposing mutations, if any, is present. What to do next? That is, at least to me, a much harder question. But when one looks at the progress that has been made in the last fifty years, it is hard to believe that solutions will not be found. The problem, of course, is that as scientific questions are answered, ethical ones will arise. My point, and I think Keller would agree, is that the human genome will not by itself tell us anything—except that it is complicated, and we know that already—but it will help in research directed to answering specific questions.

It will be apparent from this review that I admire the breadth of Keller’s knowledge and her skill in conveying it. I am stimulated by her ideas, but I sometimes find myself disagreeing with her conclusions. Some of our disagreements may be settled by history. Essentially, they are disagreements about the kind of scientific explanation that we expect to be fruitful. I see myself as a reductionist, although my friend the biologist Lewis Wolpert sees me as a woolly-minded holist. I seek simple models of the world whose consequences I can work out. If I have to ignore some of the detail, too bad. My favorite guide to scientific theorizing is a remark by Pete Richerson and Robert Boyd: “To replace a world you do not understand by a model of the world you do not understand is no advance.” Keller, in contrast, thinks that the behavior of a living organism depends on its complexity: if you ignore the complexity, you will not understand anything. If you start by ignoring what you take to be details, or brushing them under the carpet, you will end by throwing away the baby and keeping the bathwater. I suspect that science needs both types of approach. During the past century, geneticists have tended to be reductionists, although there have been exceptions—for example Barbara McClintock and C.H. Waddington. So we need Keller’s voice.

John Maynard Smith

John Maynard Smith, Professor of Biology at the University of Sussex, is the author of On Evolution, The Evolution of Sex, Evolution and the Theory of Games, and, with Eörs Szathmáry, The Major Transitions in Evolution. (December 2000)

This Issue

December 21, 2000