Noam Chomsky; drawing by David Levine


Almost three decades ago I reviewed in these pages a striking development in the study of language that I called “Chomsky’s Revolution in Linguistics.”1 After such a long time it would seem appropriate to assess the results of the revolution. This article is not by itself such an assessment, because to do an adequate job one would require more knowledge of what happened in linguistics in these years than I have, and certainly more than is exhibited by Chomsky’s new book. But this much at least we can say. Judged by the objectives stated in the original manifestoes, the revolution has not succeeded. Something else may have succeeded, or may eventually succeed, but the goals of the original revolution have been altered and in a sense abandoned. I think Chomsky would say that this shows not a failure of the original project but a redefinition of its goals in ways dictated by new discoveries, and that such redefinitions are typical of ongoing scientific research projects.

The research project of the revolution was to work out for each natural language a set of syntactical rules that could “generate” all the sentences of that language. The sense in which the rules could generate the infinite number of sentences of the language is that any speaker, or even a machine, that followed the rules would produce sentences of the language, and if the rules are complete, could produce the potentially infinite number of its sentences. The rules require no interpretation and they do more than just generate patterns. Applied mechanically, they are capable of generating the infinite number of sentences of the language.

Syntax was regarded as the heart of linguistics and the project was supposed to transform linguistics into a rigorous science. A “grammar,” in the technical sense used by linguists, is a theory of a language, and such theories were called “generative grammars.” Stated informally, some rules of English are that a sentence can be composed of a noun phrase plus a verb phrase, that a verb phrase can consist of a verb plus a noun phrase and that a noun phrase can be composed of a “determiner” plus a noun, that nouns can be “woman,” “man,” “ball,” “chair”…; verbs can be “see,” “hit,” “throw”…; determiners can be “the,” “a”…. Such rules can be represented formally in the theory as a set of instructions to rewrite a symbol on the left side as the symbols on the right side. Thus,

S → NP + VP

VP → V + NP

NP → Det + N

N → man, woman, ball…

V → hit, see, throw…

Det → a, the…

This small fragment of an English grammar would be able to generate, for example, the sentence

The man hit the ball.
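
The mechanical character of such rules can be made concrete with a short program. What follows is a minimal sketch in Python, not anything drawn from Chomsky's own work: the names RULES and expand are illustrative assumptions, and the word lists contain only the examples quoted above. Because this fragment has no rule that reintroduces a symbol on its own right side, it yields a finite set of strings; a recursive rule would be needed to make the set unbounded.

```python
from itertools import product

# Rewrite rules from the fragment above. The word lists hold only the
# examples quoted in the text; they are not a complete lexicon.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Det", "N"]],
    "N": [["man"], ["woman"], ["ball"]],
    "V": [["hit"], ["see"], ["throw"]],
    "Det": [["a"], ["the"]],
}


def expand(symbol):
    """Yield every word sequence that can be derived from `symbol`."""
    if symbol not in RULES:
        # A symbol with no rewrite rule is a word of the language.
        yield [symbol]
        return
    for right_side in RULES[symbol]:
        # Combine every expansion of each symbol on the right side.
        for parts in product(*(list(expand(s)) for s in right_side)):
            yield [word for part in parts for word in part]


sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))
print("the man hit the ball" in sentences)
```

Since the fragment says nothing about agreement or tense, the strings it produces also include ones such as "the man see a ball." The rules apply without any regard to what the words mean.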

Such rules are sometimes called “rewrite rules” or “phrase structure rules” because they determine the elementary phrase structure of the sentence. Chomsky argued that such rules are inadequate to account for the complexities of actual human languages like English, because some sentences require that a rule apply to an element not just in virtue of its form, but in virtue of how it got that form, the history of how it was derived. Thus, for example, in the sentence

The chicken is ready to eat

even though the words are not ambiguous, the sentence as a whole is syntactically ambiguous depending on whether “chicken” is the subject or the object of “eat.” The sentence can mean either the chicken is ready to eat something, or the chicken is ready for something to eat it. To account for this ambiguity it seems, Chomsky argued, that we have to suppose that the sentence is the surface expression of two different underlying sentences. The sentence is the result of applying rules that transform two different underlying, or deep, structures. Such rules are called transformational rules, and Chomsky’s version of generative grammar was often called “transformational grammar” because of the argument for the necessity of transformational rules. In the classical versions of the theory, the phrase structure rules determined the “deep structure” of the sentence, the bearer of meaning; the transformational rules converted deep structure into surface structure, something that could be uttered. In the example of the chicken above, there is one surface structure, the sentence I have quoted, and two deep structures, one active, one passive.

It was a beautiful theory. But the effort to obtain sets of such rules that could generate all and only the sentences of a natural language failed. Why? I don’t know, though I will suggest some explanations later. But seen from outside, a striking feature of the failure is that in Chomsky’s later work even the apparently best-substantiated rules, such as the rule for forming passive sentences from active sentences, have been quietly given up. The relation between “John loves Mary” and “Mary is loved by John” seemed elegantly explained by a transformational rule that would convert the first into the second. Apparently nobody thinks that anymore.


Another feature of the early days was the conviction that human beings were born with an innate brain capacity to acquire natural human languages. This traditional view—it goes back at least to the seventeenth century—seemed inescapable, given that a normal infant will acquire a remarkably complex system of rules at a very early age with no systematic teaching and on the basis of impoverished and even defective stimuli. Small children pick up a highly competent knowledge of a language even though they get no formal instruction and the utterances they hear are limited and often not even grammatical.

The traditional objection to this “innateness hypothesis” (Chomsky always objected to this term, but it seems reasonable enough) was that languages were too various to be accounted for by a single brain mechanism. Chomsky’s answer was that the surface variety of languages concealed an underlying structure common to all human languages. This common structure is determined, he wrote, by an innate set of rules of Universal Grammar (UG). The innate mechanism in the brain that enables us to learn language is so constituted that it embodies the rules of UG; and those rules, according to Chomsky, are not rules we can consciously follow when we acquire or use language. I think the official reason for the abandonment of the research program was that the sheer complexity of the different rule systems for the different languages was hard to square with the idea that they are really all variations on a single underlying set of rules of UG.

There were, as might be expected, a number of objections to Chomsky’s proposals. I, for one, argued that the innate mechanism that enables the child to acquire language could not be “constituted” by—i.e., made up of—rules. There are no rules of universal grammar of the sort that Chomsky claimed. I argued this on a number of grounds, the chief being that no sense had been given to the idea that there is a set of rules that no one could possibly consciously follow: if you can’t follow them consciously then you can’t follow them unconsciously either. I also argued that, precisely to the extent that the mechanism was innate and applied automatically, it was empty to suppose that its application consisted in rule-governed behavior. No sense, I wrote, had been given to the idea of rules so deeply buried in unconscious brain processes that they were not even the sort of things that could be consciously followed.

Just as a child does not follow a rule of “Universal Visual Grammar” that prohibits it from seeing the infrared or ultraviolet parts of the electromagnetic spectrum, so the child does not follow rules of Universal Linguistic Grammar that prohibit it from acquiring certain sorts of languages but not others. The possibilities of vision and language are already built into the structure of the brain and the rest of the nervous system. Chomsky attempted to answer my arguments in a number of places, including the book under review. But in the case of UG he has given up the idea that there are rules of universal grammar.

In his recent book, as well as in other works (most importantly, The Minimalist Program2), Chomsky advances the following, much more radical, conception of language: the infant is indeed born with an innate language faculty, but it is not made up of any set of rules; rather it is an organ in the brain that operates according to certain principles. This organ is no longer thought of as a device for acquiring language, because in an important sense it does not so much acquire as produce any possible human language in an appropriate environment. Chomsky writes,

We can think of the initial state of the faculty of language as a fixed network connected to a switch box; the network is constituted of the principles of language, while the switches are the options to be determined by experience. When the switches are set one way, we have Swahili; when they are set another way, we have Japanese. Each possible human language is identified as a particular setting of the switches—a setting of parameters, in technical terminology. If the research program succeeds, we should be able literally to deduce Swahili from one choice of settings, Japanese from another, and so on through the languages that humans can acquire. [my italics]

According to this view, the possibility of all human languages is already in the human brain before birth. The child does not learn English, French, or Chinese; rather, its experiences of English set the switches for English and out comes English. Languages are neither learned nor acquired. In an important sense they are already in the “mind/brain” at birth.


What happens, then, to the rules of grammar? Chomsky writes that

This “Principles and Parameters” approach, as it has been called, rejected the concept of rule and grammatical construction entirely: there are no rules for forming relative clauses in Hindi, verb phrases in Swahili, passives in Japanese, and so on. The familiar grammatical constructions are taken to be taxonomic artifacts, useful for informal description perhaps but with no theoretical standing. They have something like the status of “terrestrial mammal” or “household pet.”

The overall conception of language that emerges is this: a language consists of a lexicon (a list of elements such as words) and a set of computational procedures. The computational procedures map strings of lexical elements onto a sound system at one end and a meaning system at the other. But the procedures themselves don’t represent anything; they are purely formal and syntactical. As Chomsky says,

The computational procedure maps an array of lexical choices into a pair of symbolic objects…. The elements of these symbolic objects can be called “phonetic” and “semantic” features, respectively, but we should bear in mind that all of this is pure syntax and completely internalist.

Chomsky is eager to emphasize that the principles and parameters approach is a tentative research project and not an established result, but it is pretty clear that he thinks the original project of thirty-five years ago has failed. For years he has told us that the interest of the study of language was that it was a “window on the mind” and that from it we could identify a great many of the mind’s properties. One of his favorites: the mind uses “structure-dependent” rules, for example the transformational rules I earlier described.3 Now he has given all that up. Language is a specific faculty with no general mental implications; and there are no rules, hence no structure-dependent rules. In an important sense there aren’t even any languages. All each person has is what he calls an “I-language,” “I” for internal, individual, and intensional.4 A neutral scientist, a “Martian scientist” in Chomsky’s thought experiment, “might reasonably conclude that there is a single human language, with differences only at the margins.”

What about words and their meanings? Well, Chomsky speculates, maybe all possible concepts are also in the brain and what we call learning the meaning of a word is really just learning a label for a concept we have always had. “However surprising the conclusion may be that nature has provided us with an innate stock of concepts, and that the child’s task is to discover their labels, the empirical facts appear to leave open few other possibilities.” So, to take two examples discussed by Chomsky, on this view every human child that ever lived had at birth the concepts of “bureaucrat” and “carburetor”; indeed children born to the cave men twenty thousand years ago had these concepts, and all of us would still have them even if carburetors had never been invented and bureaucrats had never existed. In the face of the sheer implausibility of this claim Chomsky likes to appeal to the example of the immune system. Nature has provided us with the capacity to produce a huge stock, literally millions, of antibodies, even antibodies against antigens that have been artificially synthesized. So why not a huge stock of innate concepts, ready for any word we could conceivably invent? On this view, the only part of language that depends on stored conventions is the sounds of the words used to label the innate concepts.


To people who take the study of language seriously I think all this ought to seem more disquieting than it does to Chomsky. For all those years he was telling us that we had overwhelming evidence that speakers of a language were following certain types of rules, and that we even had evidence about which rules they were following. What happened to all that evidence? If the rules are all thrown out, what was the “evidence” evidence for?

Let us start with Chomsky’s idea of a neutral Martian scientist arriving on Earth and finding our languages an object of study for “natural science.” The point of imagining a Martian, he said, is to free us of our local prejudices. The scientist will find that we all speak the same language, except “at the margins,” and that the I-language with its variations is the proper object of study for natural science. Does that sound right to you? It doesn’t to me. First, any such scientist has to have a language of her, his, or its own. No language, no science. So the scientist’s first step is to compare our languages with her own. How are they like and unlike Martian? The only way I can imagine the scientist doing this is to imagine that she learns one of our languages, say English. She does that as anyone, scientist or otherwise, would, by figuring out how to translate her expressions into English and English expressions into Martian.

Let us suppose she is so good at it that soon she is bilingual. Then she will discover an interesting fact. Knowledge of English is not much use to her when she is confronted with monolingual Finnish speakers. For example she will eventually find out that the Finnish single-word sentence, “Juoksentelisinkohan,” appropriately pronounced, translates into English as “I wonder if I should run around a little bit without a particular destination.” So to learn Finnish she has to start all over again. And the same sequence repeats itself when she tries to converse in Arabic, Swahili, or Japanese. Is there really only one language on earth? Not in her experience.

Worse yet, she will soon discover that language is not an object of “natural” science and could not be. The distinction, rough as it is, between the so-called “natural” sciences and the “social” sciences is based on a more fundamental distinction in ontology, between those features of the world that exist independently of human attitudes, like force, mass, gravitational attraction, and photosynthesis, on the one hand, and, on the other, those whose existence depends on human attitudes, like money, property, marriage, and government. There is a distinction, to put it in very simple terms, between those features of the world that are observer-independent and those that are observer-relative or observer-dependent. Natural sciences like physics, chemistry, and biology are about features of nature that exist regardless of what we think; and social sciences like economics, political science, and sociology are about features of the world that are what they are because we think that is what they are.

Where, then, do language and linguistics fit in? I think it is obvious that a group of letters or sounds can be called a word or a sentence of English or Finnish only relative to the attitudes of English and Finnish speakers. You can see this quite clearly in the case of linguistic changes. Pick up a text of Chaucer and you will find sentences that are no longer a part of English, though they once were, and you can produce English sentences that were not part of Chaucerian English. Of course, Chomsky is right to insist that “English” is not a well-defined notion, that the word has all sorts of looseness both now and historically. I am a native English speaker, yet I cannot understand some currently spoken dialects of English. All the same, the point remains: a group of letters or sounds is a sentence, or a word, or other element of a language only relative to some set of users of the language.

The point has to be stated precisely. There is indeed an object of study for natural science, the human brain with its specific language components. But the actual languages that humans learn and speak are not in that way natural objects. They are creations of human beings. Analogously, humans have a natural capacity to socialize and form social groups with other humans. But the actual social organizations they create, such as governments and corporations, are not natural, observer-independent phenomena; they are human creations and have an observer-dependent existence. As their speakers develop or disappear, languages change or die out.

There is a deep reason why languages like English or Finnish must be rule-governed. The sentences and other elements only exist as part of the language because we regard them as such. Language is in an important sense a matter of convention. But if so, there must be some principles by which we regard some strings as sentences of English and others not. Being a sentence of English is not a natural fact like being a mountain or a waterfall; it is relative to the observer. Functional phenomena that are relative to an observer divide into two kinds, those like knives, chairs, and tables, which can function as such because of their physical structure, and those like money, language, and government, which function the way they do because we assign to them a certain status and with that status a function that can only be performed because of the collective acceptance of the entities as having a certain status and with that status a function.5

The second class, the status functions, require systems of rules (conventions, accepted procedures, principles, etc.). Human languages, like money, property, marriage, baseball games, and government, are constituted by certain sorts of rules that, years ago, I baptized “constitutive rules.”6 Such rules do not merely regulate existing activities, like the rules of driving, but they create the very possibility of such activities. There are no purely physical properties that are sufficient to determine all and only sentences of English (or money, baseball, US congressmen, married couples, or private property). But why not, since all these are physical phenomena? Because the physical phenomena satisfy these descriptions only relative to some set of conventions and of people’s attitudes operating within the conventions. Something is money, property, a sentence of English, etc., only relative to the attitudes people have within systems of rules. That language is constituted by rules cannot be legitimately denied, as now Chomsky tries to do, on the theoretical ground that it is hard to square with a certain conception of the innate language faculty.

But why did the attempt by linguists to get descriptively and explanatorily adequate generative grammars fail? I said I did not know, but here is one hypothesis. They wanted rules of a very unrealistic kind. They wanted rules for sentence formation that could be stated without any reference to the meanings of words or sentences and they wanted rules that generated sentences algorithmically, i.e., according to a set of precisely statable steps, without any need for interpretation and without any “other things being equal” conditions. The model was based on the formation rules for artificially created logical and mathematical systems. But human social rules are almost never like that. The history of the passive transformation is illustrative. You can formulate a transformational rule that converts sentences of the form

NP1 verbs NP2

into

NP2 is verbed by NP1.

Thus it converts

John loves Mary

into

Mary is loved by John.

But what about sentences like

John weighs one hundred and sixty pounds

or

John resembles Eisenhower.

These do not yield

One hundred and sixty pounds is weighed by John

or

Eisenhower is resembled by John.

Why not? I think any child recognizes that the passive does not work in these cases because of the meanings of the words. Resembling and weighing are not things that can be done by someone to someone or something in the way that loving, seeing, hitting, and desiring can be. So you can passivize sentences with “loves,” “sees,” “hits,” and “desires” but you can’t turn sentences into the passive voice with “weighs” and “resembles.” Perhaps in other languages sentences with verbs synonymous to these permit conversion into the passive, but not in English. The point is not that I have given a correct explanation, but rather that this sort of explanation was not permissible in generative grammar. The proponents of generative grammar required explanations using only syntactical rules—no meanings allowed—operating on syntactical entities.
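
The overgeneration just described can be reproduced in a few lines. The sketch below is a toy illustration, not a reconstruction of any rule actually proposed in generative grammar: the function name passivize and the small PARTICIPLES table are assumptions introduced here. The function looks only at the form of the sentence, NP1 verbs NP2, and never at what the verb means.

```python
# Hypothetical toy lexicon: past participles for the verbs used in the
# examples above. Nothing here records what any verb means.
PARTICIPLES = {
    "loves": "loved",
    "weighs": "weighed",
    "resembles": "resembled",
}


def passivize(sentence):
    """Rewrite 'NP1 verbs NP2' as 'NP2 is verbed by NP1', by form alone."""
    words = sentence.split()
    for i, word in enumerate(words):
        # The first known verb with material on both sides splits the sentence.
        if word in PARTICIPLES and 0 < i < len(words) - 1:
            np1 = " ".join(words[:i])
            np2 = " ".join(words[i + 1:])
            np2 = np2[0].upper() + np2[1:]  # capitalize the new subject
            return f"{np2} is {PARTICIPLES[word]} by {np1}"
    raise ValueError("no known verb between two noun phrases")


for s in [
    "John loves Mary",
    "John weighs one hundred and sixty pounds",
    "John resembles Eisenhower",
]:
    print(passivize(s))
```

Applied to the three examples, the function returns "Mary is loved by John" but also the two strings that are not acceptable English. Blocking them would require a list of exceptions or some appeal to meaning, which is the kind of information a purely syntactical rule was not allowed to use.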

The correct picture seems to me this. There are indeed innate mechanisms in the human brain for acquiring and using language. That is why we have languages and our close relatives, the chimpanzees, do not. The mechanisms work according to certain principles, like any other organ. But it is not a matter of rules, and learning a language is not a matter of following rules of Universal Grammar, any more than seeing something is a matter of following rules of Universal Visual Grammar.

There are indeed rules of specific languages, but the effort to find generative grammars for these languages is bound to fail, precisely because the aim was to obtain rigorous, strict, exceptionless rules of the sort that you get for constructing formal systems such as the predicate calculus, or axiomatic set theory, and such rules make no reference to what the entities were to be used for. The rules were to be stated without any reference to the meanings or the uses of the sentences generated. Natural human phenomena almost never have rules like that. There will often be exceptions to a rule; there will typically be semantic considerations in the formulation and application of the rule; and there will in general be an “other things being equal” clause understood in the application of the rule.

When Chomsky suggests that the concepts expressed by words like “carburetor” and “bureaucrat” must be innately known by every child, and that learning the meanings of the words is just a matter of applying labels to concepts the child already has, you know that something has gone radically wrong. He has a very unrealistic conception of learning. It is as if he supposed that learning the meanings of these words would have to consist in having one’s nerve endings stimulated by passing bureaucrats and carburetors, and because there is no way such passing stimuli could ever give us the meanings of these words, it looks like the meanings must be innate.

This argument is called the argument from the “poverty of the stimulus” and it occurs over and over in Chomsky’s work. But a more realistic conception is the following: in order to understand, for example, the word “bureaucrat,” a child has to be introduced to a culture, a culture that includes governments, bureaus, departments, powers, employment, and a host of other things. A child does not learn a set of discrete concepts, but learns to master a culture, and once that culture is mastered, it is not difficult for him to understand the word “bureaucrat.” Similar remarks could be made about “carburetor.” This concept only makes sense within the context of some knowledge of internal combustion engines. Once you have the basic understanding of how such engines work it is not hard to understand that a carburetor is a device for mixing air and fuel.

Furthermore, one often has a partial or imperfect knowledge of a concept. Chomsky’s analogy with the immune system thus seems grossly inadequate. Concepts are seldom all or nothing, and they are almost always systematically related to other concepts. You cannot have the concept of “carburetor” or “bureaucrat” without having a great many other logically related concepts. But chemical compounds are both all-or-nothing and discrete. Each antibody is distinct from every other antibody, and for any antibody you either have it or you don’t. For concepts you can have a partial grasp of the concept, and there is no way you can have a concept without having many other concepts.


I do not wish to give the impression that Chomsky’s entire book is concerned with these issues. On the contrary, most of the book is concerned with debates about current issues in philosophy. I will discuss one of them, the question of unconscious rules of human cognition, which is related to the question of language. A standard explanatory device in Chomsky’s earlier work, and in cognitive science in general, is to claim that we are unconsciously following rules. The importance of this can hardly be overestimated. Once we have the possibility of explaining particular forms of human behavior as following rules, we have a very rich explanatory apparatus that differs dramatically from the explanatory apparatus of the natural sciences. When we say we are following rules, we are accepting the notion of mental causation and the attendant notions of rationality and existence of norms.

So, for example, if we explain my driving behavior by saying that I am following the rule “Drive on the right-hand side of the road,” even when I am following this rule unconsciously, we have a mode of explanation that is quite different from saying that the car follows the rule “Force equals mass times acceleration.” Both “rules” describe what is happening, but only the first actually is a case of following a rule. The content of the rule does not just describe what is happening but plays a part in making it happen. In order to make an explanation of behavior as following rules work, we need to be able to distinguish cases which are guided by a rule from cases which are merely described by a rule. One condition of rule-guided explanations is that the rules have to be the sorts of things that one could actually follow. If you spell out those conditions, you find that unconscious rules have to be the sort of things that at least could be conscious. So, for example, I can follow the rule “Drive on the right” unconsciously, but it is the sort of rule I could bring to consciousness. For a number of reasons rules may be unconscious, and in some cases, such as brain damage or repression, a person may be unable to bring the rule to consciousness. But an unconscious rule has to have the kind of content which could be consciously understood, interpreted, followed, or violated.

Chomsky’s rules do not meet that condition. For him the rules of language are “computational” rules, but what exactly is the definition of computation, according to which these rules are computational? On the standard definition of computation, we are to think of computations as reducing to vast sets of zeroes and ones zapping through the computer. Is that how we are to think of unconscious rule-following on Chomsky’s model? Lots of zeroes and ones in the child’s head? That can hardly be right because the zeroes and ones are in the mind of the programmer. In actual commercial computers, the only reality independent of the observer consists of rapid—millions per second—transitions in complex electrical circuits. Commercial computers don’t literally follow rules because they do not have the mental apparatus necessary for rule-following. We program the computers to behave automatically as if they were following rules, and thus we can get the same results as human rule-following behavior.

So there is a dilemma: if we are to think of computational rule-following in the technical sense of reducing to binary symbols, then there is literally no rule-following independent of an observer and we have lost the explanatory power of psychological explanation. If we are to think of computation in the common-sense meaning, according to which, when we say, for example, that the child computed the meaning of the sentence, we just mean that he figured it out, then the unconscious rules do not meet the condition of being thinkable.

Chomsky has now given up on the idea that there are rules of particular languages, but the difficulty about computation remains. This is an absolutely crucial point at issue and I want to make it completely clear. Chomsky insists that the study of language is a branch of natural science and the key notion in his new conception of language is computation. On his current view, a language consists of a lexicon plus computations. But my objection to this is that computation is not a notion of natural science like force, mass, or photosynthesis. Computation is an abstract mathematical notion that we have found ways to implement in hardware. As such it is entirely relative to the observer. And so defined, in this observer-relative sense, any system whatever can be described as performing computations. The stone falling off a cliff computes the function “The distance I fall has to equal half of gravity multiplied by the square of the time I fall”: S = ½gt². The water flowing over a dam at the rate of one gallon per second computes the addition function 2 + 2 = 4 every four seconds, and so on with everything else in the world. Unlike, say, electrical charge, computation is not discovered in nature, rather it is assigned to physical processes. Natural processes can be interpreted or described computationally. In this observer-relative sense there cannot be a natural science of computation.
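
The description attributed to the falling stone can be written out as an ordinary function. The sketch below is illustrative only: the name distance_fallen and the default value of 9.81 meters per second squared for g are assumptions added here, not figures from the text. It uses the standard free-fall relation, in which the distance equals one half of g multiplied by the square of the elapsed time.

```python
def distance_fallen(seconds, g=9.81):
    """Distance in meters covered after `seconds` of free fall from rest.

    g is the assumed gravitational acceleration in meters per second squared.
    """
    return 0.5 * g * seconds ** 2


for t in (1.0, 2.0, 3.0):
    print(t, distance_fallen(t))
```

The program produces the numbers. On the argument above, whether the stone itself counts as computing them depends on the observer who assigns the function to its fall.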

In the original definition of computation, before the invention of “computing machinery” by Alan Turing and others,7 “computing” meant figuring something out in arithmetic or mathematics, and a “computer” was a person who computed. In the sense in which I solve mathematical problems I really am intrinsically computing and the attribution of computational features to my conscious thought processes is not relative to an observer. Now, when Chomsky says that language is a matter of computation, which is it? Is it the observer-relative sense of zeroes and ones? If so, the project is no longer natural science and the computations do not explain the phenomena but merely describe processes whose causal explanation has to be found in neurobiology. Is it the observer-independent sense in which human beings figure things out? If so, then the unconscious rules don’t meet the conditions necessary for rule-following explanations. In neither case do we get an account of language that is at all like natural science. Chomsky says, “John Searle and I have discussed these issues for some years.” Indeed. And I expect the discussions to continue.

In any case, as I noted above, Chomsky has now given up on the idea that Universal Grammar is a matter of unconscious rule-following. But he also dismisses the idea that real human languages are governed by rules. That, I believe, cannot be right.

I would not wish my criticisms of Chomsky to be misunderstood. At a time when various embarrassingly incompetent accounts of language are widespread in university humanities departments under such names as “literary theory,” “deconstruction,” and “postmodernism,” it is worth emphasizing that his work in linguistics is at the highest intellectual level.8

—This is the first of two articles about linguistics.

February 28, 2002