Kant and Chomsky:

What Do We Know Before We Know?



Rationalism and Empiricism


The two main epistemological positions are rationalism and empiricism. Rationalism is the theory that we can know reality a priori, that is, without relying on our experience. In other words, rationalists believe that reason, by itself, can furnish the mind with knowledge. In fact, ratio means reason in Latin. Empiricism is the theory that we can only know reality a posteriori, that is, with the aid of our senses. Reason versus senses is the basic dichotomy in epistemology.


Both rationalism and empiricism have a very long history. There have been rationalists and empiricists since the very beginnings of Western Philosophy, with the Presocratics. Consider Parmenides, for example. As you may remember, Parmenides concluded that change is not possible. Nothing ever changes. He built an argument from two premises: What is is, and what is not is not. We will not go into the details of the argument here, but notice that you could evaluate the truth of the premises inside the Mystery Room, no problem. Once you accept both premises as true, by merely logical steps (that is, without leaving the Room at all), you will have to conclude with Parmenides that change is impossible, and, therefore, all the change we ever experience is just an illusion. For this reason, Parmenides is often referred to as “the First Logician.”


Empiricism, or the belief in only a posteriori truths, also has its roots in Presocratic Philosophy. However, although Parmenides and Plato are bona fide rationalists, we will not be able to find a 100% empiricist until we get to Modern times (remember that “Modern” refers to the 1600s and 1700s.) In fact, it is in the 17th and 18th centuries that we find some of the most important representatives of both epistemological positions. Interestingly enough, the distinction seems to go along geographical lines, too. Continental (as in “continental breakfast”) Modern Philosophers, such as Descartes, Leibniz and Spinoza, tend to be rationalists, whereas British Modern philosophers, such as Locke, Berkeley and Hume, tend to be empiricists (we even call those three the British Empiricists!)


A good way to see the difference between empiricists and rationalists is to consider how they envision the mind before experience. For a rationalist, the mind before experience is full of stuff. Obviously, all those things that the mind knows before (or independently of) experience are considered a priori. Think of Plato, for instance. Not only did he believe that the mind can attain a priori knowledge, but he even believed that anything that deserves the name of knowledge has to be a priori. Not all rationalists are so extreme, though. Descartes, for instance, allowed for a lot of empirical knowledge, but we consider him a rationalist because he believed that there are some things that we know a priori (God’s existence, for instance). So, if you are a rationalist, you will believe that there is at least something in your mind that is a priori (in fact, most rationalists would agree that there are lots of things).


Contrast that with the way empiricists envision the mind without experience. All empiricists subscribe to the tabula rasa or blank slate theory. What this theory says is that, without experience, the mind is empty. In other words, any idea in our mind can be traced back to some impression.


This provides an easy way to distinguish between rationalists and empiricists. You just have to look at their pictures of the mind before experience. If you can find some knowledge, then you know you’ve found a rationalist. If you cannot find any knowledge at all, then you know you’ve found an empiricist.


So can somebody be a rationalist and an empiricist at the same time? If somebody were both, then in their picture of the mind before experience there would be some knowledge and no knowledge. Since these two claims clearly contradict each other, we can safely conclude that, no, somebody cannot be a rationalist and an empiricist at the same time.


That doesn’t mean that there are no common grounds between empiricists and rationalists. Both modern empiricists and rationalists believed in an objective reality that our mind adapts to. Their disagreement was as to when it adapts. For a rationalist, the mind would adapt to reality before experience (remember Plato and the life of the soul among the Forms before incarnation), whereas for an empiricist, the mind would adapt to reality after experience (remember Locke’s Copy Theory, for instance.) So for both rationalists and empiricists the mind was some sort of silly putty that took the shape of the knowledge it had been exposed to.


This appears to be a very intuitive idea. Most people who have never stopped to think about these issues would agree with the claim that our mind is passive when it comes to knowledge, that when we learn something our mind is just a passive receptacle into which knowledge is poured. For many centuries, it was the paradigm, and both rationalists and empiricists accepted it.




And then came Immanuel Kant. You see, Kant was neither strictly a rationalist nor strictly an empiricist. He wasn’t both, either, since, as we have seen above, holding both beliefs would be contradictory. It’s true that he had a soft spot for rationalism, but it is also true that he liked some empiricists, particularly Hume, very much. So what he did was to create a synthesis that included elements from both theories into a new one.


Empiricists believe that all knowledge is a posteriori. Rationalists believe that (at least) some knowledge is a priori. What Kant claimed was that both rationalists and empiricists were wrong, since there is no purely a posteriori knowledge and there is no purely a priori knowledge. Any kind of genuine knowledge, knowledge that conveys some nontrivial information about reality, is a combination of a priori and a posteriori elements. In other words, if you only have the a priori elements, you don’t have knowledge, and if you only have the a posteriori elements, you don’t have knowledge, either. You need to have both.


This was quite a new idea, and it changed the way people thought about knowledge and learning forever, to the point that people sometimes talk about a Kantian Revolution, comparing it to the Copernican Revolution. By “revolution” we do not mean some sort of bloody political upheaval, but a sudden shift of paradigm. Copernicus drastically changed the way we look at the universe by shifting the perspective from a geocentric model, in which the Earth is at the center, to a heliocentric model, in which the Sun is at the center. This was akin to what Kant did regarding knowledge. Before him, both rationalists and empiricists put all the emphasis on reality, whatever they meant by it (Forms, physical objects, etc.). Reality was at the center, and our mind adapted to it (before or after experience). After Kant, our mind will be at the center, and reality will adapt to our mind. But this is all very complicated, so it will be better if we proceed more slowly.


Remember we said above that, for Kant, knowledge always involves a priori and a posteriori elements. So, before experience, our mind is not totally empty. There is something there, although that something does not qualify as knowledge. What is that something? Well, it is the a priori conditions that will make knowledge possible. Since this is probably not helping much, let me give you an example.


Before the semester starts, I get ready for it in many ways. Something I do, for instance, is to create a spreadsheet that will allow me to record my students’ grades. I have a row for each prospective student, and a column for each assignment. Does that mean I already know what grades my students will have? Of course not! I will not know that until I have some real experience of my students (that is, until I grade their work), but now I have at least the a priori conditions that will make knowledge of my students’ grades possible.


Having that sort of spreadsheet has many organizational advantages, but, if you think about it, it also limits the kinds of things I am able to record. If I do not have a column for specific class contributions, for instance, those contributions will go unrecorded. No instructor can record every single thing about their students. It is just not possible, so we all pick and choose what kinds of things we are going to keep records of: absences, class participation, homework, test scores, etc.


Kant claimed that our mind also picks and chooses what kinds of things it is going to keep records of. Our mind knows how to organize our experiences before we have them. How does our mind learn to organize our experiences? It doesn’t learn it, that’s why this is the a priori element of knowledge. Even very young children organize their experiences in a certain way. We would like to believe that, without those a priori principles of organization, we would get to know reality-as-it-really-is (noumena), but Kant hypothesizes that, without some a priori way of organizing our senses’ input we would not have knowledge, just chaos. That doesn’t mean that the way we do it is the only possible way. There may be other ways of organizing the information our senses provide us with, but the point is that there has to be a way.


Kant believed that, by virtue of being humans, we all organize our experiences in a remarkably similar fashion. Not everybody agrees with this statement, though. During several decades of the previous century it was fashionable to claim that we construct our experiences in different ways depending on our cultural background. In other words, there is no totally a priori element in knowledge, since even the way we organize our experiences into knowledge is learned. We will touch on this point later.


Now, let’s regroup for a second. As we have seen, Kant disagreed with the previous epistemologists in that they believed that the mind conforms to reality, whereas Kant claimed that the reality we get to know is such because it conforms to the a priori elements of our mind. This is what has been called “the Kantian Revolution.” We organize our experiences into knowledge based on some a priori principles of organization. If we didn’t have them, the world would present itself to us as some sort of chaotic jumble of colors, sounds and events… or so we think. The truth is that it is impossible to imagine how the world would appear to somebody who didn’t have any way of cataloguing the sensory information. (FOOTNOTE: Maybe that is exactly what happens when somebody reaches nirvana/satori. The problem is that such an experience becomes utterly private, since any attempt at communicating it to an audience will necessarily have to go through language, which imposes a set of constraints at least as stringent as those of the a prioris of the mind.)




In a very often-quoted article, the New York Times called Noam Chomsky “the most important intellectual alive”. That was several years ago, and, as I am writing this, Chomsky is still alive. He will turn 7x on X.


Some of you may know Chomsky for his political activism. Since the Vietnam War, he has been a relentless critic of American foreign policy. (FOOTNOTE: By the way, he is American. Although I once had a student who said “Chomsky? That doesn’t sound like an American last name!” I guess for some people, American means WASP.) He has written many books and articles on the topic, one of the most recent being 9/11. You can check some of his writings regarding activism and other related materials, including an interview by Rage Against the Machine (they are big fans of Chomsky!), on zetamag.com (check reference.)


However, it is not Chomsky’s political ideas that I would like to discuss here, but his theories regarding linguistics. Besides being intensely active when it comes to civic involvement, Chomsky has had such a great influence on the field of linguistics that I do not think there has been any other person in the second half of the 20th century who has revolutionized their field as totally and irrevocably as Chomsky has revolutionized linguistics. Quite simply, if you are going to do linguistics today, you have to do it with reference to his work. That doesn’t mean you have to agree with him; what it means is that you cannot ignore him. And I am not the only one who says so. Check out the following quote by John L. Casti:


In virtually no other area of modern intellectual life that I’m aware of has one man’s work so completely shaped a field as Chomsky’s views have done in linguistics… In a very specific sense, Chomsky’s views define what we mean by much of modern linguistics, in much the same way that Newton’s ideas defined classical particle mechanics.


Here I am, talking about linguistics this and linguistics that, and you probably do not even know what a linguist does, although I am sure you can infer it has something to do with language. And that is exactly what linguists do: they study language. Before Chomsky, most linguists were interested in specific languages. They took an empirical approach to the subject, going to the field (a lost island in the Pacific, say) to collect data (write down all the things that the speakers said, hoping that certain grammatical patterns became apparent.) After Chomsky, in what we can call modern linguistics, the main question is not about what makes languages different (different sounds, different words, etc.), but about what all human languages have in common.


One clear example is the issue of language acquisition. How do children learn how to speak? This question applies to all human languages, since, in each and every human society, children do learn to speak.


So, how do you think children learn how to speak? Probably you think that the main mechanism involved is imitation. This appears to be the most commonsensical explanation, right? In fact, before Chomsky, most linguists would have agreed with you, and claimed that children learn to speak by hearing what grown-ups say and then repeating it. This theory of language acquisition is consistent with empiricism. As I hope you remember, empiricists believe in the Blank Slate Theory. With regard to language acquisition, the theory says that babies are born with no knowledge of language at all, their little brains being totally empty of linguistic structure. They just copy what they hear, and then it is just a matter of recombining the different words to create new sentences.


It cannot be denied that there is a great deal of imitation in learning a language, particularly at the beginning. My daughter Sofia, who is two years old as I write this, mimics everything we tell her. If you offer her a piece of chocolate, asking her “Do you want some?”, she replies “You want some!” instead of “I want some!” When we ask her to bring something from her room, saying “Go get it!”, she runs to her room crying “Go get it!” However, there is more to learning a language than imitation. Parrots imitate English expressions, but that does not mean that they know English. If Sofia is like any other normal healthy toddler, she will grow out of this phase and, at some point, she will stop imitating our sentences and start creating her own.


Another objection to the mere imitation hypothesis is the fact that children say things that they have never heard before, such as broked (as in “She broked her arm”), mouses and bestest (as in “You’re the bestest mother in the world!”). How do kids come up with those words? If you think about it, you will notice that kids have problems with exceptions, that is, with irregular verbs and with plurals that are created in ways other than just adding an s. If learning a language were just a matter of regurgitating words that have been heard before, children should not have more problems with exceptions than with their regular counterparts.
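For readers who like to think in terms of computer programs, here is a toy sketch of the contrast just described. It is only an illustration, not a model of what actually goes on in a child's head, and every name in it (HEARD, imitate, apply_rule) is made up for this example. A pure imitator can only give back forms it has already heard, whereas somebody who has picked up a general rule can produce forms nobody ever said to them.

```python
from typing import Optional

# Hypothetical set of forms the child has actually heard from adults.
HEARD = {"walked", "dogs", "mice", "broke", "best"}

def imitate(form: str) -> Optional[str]:
    """A pure imitator can only repeat a form it has already heard."""
    return form if form in HEARD else None

def apply_rule(word: str, suffix: str) -> str:
    """A rule-user attaches a regular ending to any word, heard or not."""
    # Drop a final 'e' before an ending that starts with 'e' (broke + ed).
    if word.endswith("e") and suffix.startswith("e"):
        word = word[:-1]
    return word + suffix

print(imitate("mice"))            # mice
print(imitate("mouses"))          # None
print(apply_rule("mouse", "s"))   # mouses
print(apply_rule("broke", "ed"))  # broked
print(apply_rule("best", "est"))  # bestest
```

Notice that the "mistakes" only show up in the second function. An imitator never says mouses, because nobody ever said it first.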


I mentioned parrots above, and thinking about non-human animals is also useful when we test our hypotheses of how children learn languages. Parrots are not the only ones that can imitate human linguistic expressions. So can gorillas and chimps. Their ability is obviously limited by their anatomy (their vocal tracts are not suited to producing the sounds of human speech), but they are very good at imitating hand gestures.


Let’s have a little parenthesis here about sign languages. Many people believe that ASL (American Sign Language) and the other languages used by the Deaf community around the world are just basic hand gestures of the kind you would improvise in a foreign country where you do not have the most rudimentary knowledge of the native language. (FOOTNOTE: “Deaf”, with a capital D, is used to refer to the community of speakers that communicate using sign languages and the culture they have developed, as opposed to “deaf”, with a lowercase d, which refers to the medical condition of lacking the sense of hearing. Most Deaf people are extremely proud of their language and their culture, and many find expressions such as “hearing impaired” patronizing and denigrating.) However, that is not what ASL, to use our closest example, is at all. ASL is a bona fide language. It is not English spelled with the hands (spelling with the hands is only used for proper names and other words that are not translatable) and it is not just a collection of signs for things that can be put together in any random way. ASL has the same characteristics as any other human language. ASL speakers communicate in sentences, and they can readily tell whether a sentence is well or ill-formed, just like you can tell that there is something wrong with the sentence “My quickly feet hurts”. ASL speakers can even detect different accents!


I was saying that some apes are very good at imitating hand gestures. In fact, researchers have tried teaching ASL to gorillas and chimpanzees (the most famous being Koko, a gorilla, and Nim Chimpsky, a chimp) with some success. The apes were able to learn lots of words and communicate with the researchers in some basic ways, but they never became fluent in the language. Let me draw a parallel so that you can see what I mean. If Koko said “Me banana big eat”, she might be trying to tell you what she wants to eat or what she just ate, but that string of words would not qualify as an English sentence. That’s exactly what happened with ASL. The apes were able to imitate the words, but they were never able to master the grammar.


So empiricism does not appear to be very promising when it comes to the topic of language acquisition. As we have just seen, the hypothesis that children learn language by mere imitation is contradicted by the fact that children say things they have never heard (or seen, in the case of sign languages) before and by the fact that no non-human animals, including the ones that are perfectly able to imitate our speech, can master any human languages, a feat that is so easily accomplished by any normal human preschooler.


So, if it is not by copying other people, how is it that children learn how to speak? Because the truth is that virtually all of them do, and without the need of fancy flash cards or Baby Einstein videos. And not only that–children will learn the language that is spoken in their community, regardless of whether the language is English, Chinese or ASL (FOOTNOTE: Children whose mother tongue is ASL babble with their hands!)


And this brings us back to Chomsky. Chomsky brought the Kantian revolution to the field of linguistics. You will remember that Kant suggested that there are certain a priori features of our mind to which all our experiences and knowledge conform. That is, he didn’t believe our mind adjusted to reality, as the empiricists and the rationalists claimed, but that the reality we know adjusts to our mind. Chomsky’s hypothesis is that some aspects of language are a priori.


The a priori component of language is what Chomsky calls Universal Grammar. There are two things to notice in that name–one is that it is Universal and the other one is that it is a Grammar. Let’s talk about them.


First, the a priori component is Universal. This means that we all have it. This explains why any normal human baby can learn any human language. Any person born and raised in the US is perfectly fluent in English, regardless of whether their ancestry is Anglo, Spanish, Chinese or Bantu. Kids that are raised in Deaf American families are perfectly fluent in ASL, regardless of whether they can hear or not. This shows that human languages are probably not as different as we may think they are. In fact, they may all be based on the same principles, principles that are consistent with the way our brain works (FOOTNOTE: And only our brains: this would also explain why non-human animals are unable to learn human languages.)


Second, the a priori component is a Grammar. Now you may be ready to ask, “How can grammar be a priori, an innate feature of my mind, if I am terrible at grammar?” But the truth is that you are wrong: your grammar is wonderful! You may not be able to explain what subject-verb agreement means, or when to use present continuous as opposed to simple present, but you do those things all the time quite effortlessly. Grammar refers to the principles that allow you to generate well-formed sentences. You may not be able to spell out the principles (we can call this “theoretical grammar”), but you sure know how to use them (let’s call this “practical grammar”). To illustrate my point: a newborn knows nothing about “theoretical” digestion, but is very proficient at “practical” digestion.


Now, how can grammar be a priori? Don’t you need to be exposed to English in order to know English grammar, both theoretical and practical? Obviously you do, but the Universal Grammar is not about English (or French or Yiddish or Swahili) grammar, but about the basic structure that is common to all human languages. I realize this requires some clarification. Let me explain.


The Universal Grammar determines what kind of structure a human language can have. For instance, any human language consists of sentences, and sentences always consist of an NP (a noun phrase, traditionally called the subject) and a VP (a verb phrase, traditionally called the predicate). Now, in some human languages, the NP goes first, and in some, the VP goes first. In some human languages, such as English, the NP has to be explicit; it cannot hide (FOOTNOTE: Even with verbs where there is not really a subject. When we say, “It rains”, what does “it” stand for?). The Universal Grammar gives us all the possible options, and some human languages activate some options and other human languages activate others. The Universal Grammar provides the framework of possibilities.
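If it helps, the idea of one fixed framework with a few language-specific options can be sketched as a small program. This is a deliberately crude toy, assuming just the two options mentioned above (whether the NP goes first, and whether the NP has to be explicit). The names Settings, build_sentence, ENGLISH_LIKE and so on are invented for this illustration and do not come from Chomsky's own work.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Hypothetical 'switches' that a particular language sets."""
    subject_first: bool     # does the NP come before the VP?
    subject_required: bool  # must the NP be explicit?

def build_sentence(noun_phrase: str, verb_phrase: str,
                   settings: Settings, drop_subject: bool = False) -> str:
    """The shared framework: every sentence is built from an NP and a VP."""
    if drop_subject:
        if settings.subject_required:
            raise ValueError("this language requires an explicit subject")
        return verb_phrase
    if settings.subject_first:
        return f"{noun_phrase} {verb_phrase}"
    return f"{verb_phrase} {noun_phrase}"

# Same framework, different switch positions.
ENGLISH_LIKE = Settings(subject_first=True, subject_required=True)
SPANISH_LIKE = Settings(subject_first=True, subject_required=False)
VERB_FIRST = Settings(subject_first=False, subject_required=True)

print(build_sentence("the dog", "barks", ENGLISH_LIKE))   # the dog barks
print(build_sentence("the dog", "barks", VERB_FIRST))     # barks the dog
print(build_sentence("ella", "canta", SPANISH_LIKE,
                     drop_subject=True))                  # canta
```

The function never changes from one language to the next. Only the settings do, and a sentence that breaks the settings (an English-like sentence with a hidden subject, say) is simply rejected.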


According to Chomsky, the Universal Grammar is innate to humans, or, to use Kantian terminology, a priori. That is why every normal human child is born with the ability to learn human languages. You can visualize the Universal Grammar as some kind of switchboard. Depending on what language you are first exposed to in your infancy, some of the switches will be turned on or off. There are a few years in which the Universal Grammar retains all its flexibility, but, after that, once a switch is on, it is very hard to turn it off, and vice versa. That is why it is so easy for young children to learn a language and so hard for adults. My daughter Alicia, for instance, has been learning English for only eight years, and I have been learning it for almost thirty, but her English is much better than mine. The reason is that she was born in America and started learning it when she was a newborn, when the whole range of possibilities in her Universal Grammar was still open, whereas I started learning it when I was much older, and the switches of my Universal Grammar had already been turned off and on in accordance with the peculiarities of Spanish. (FOOTNOTE: Some children are raised bilingual and seem to have little problem mastering two (or even more) languages. So…)


Chomsky’s Universal Grammar is totally consistent with Kant’s epistemology (although, obviously, Kant never used it as an example, having been dead for 153 years when Chomsky first published his theory):


  1. The Universal Grammar is a priori.


Unlike his predecessors in the field of linguistics, Chomsky does not join the empiricists in claiming that learning a language is just a matter of imitation. There has to be something in our mind, something innate, that makes the knowledge of any human language possible. In fact, Chomsky has postulated the existence of a language organ in our brain that would come equipped with knowledge of the Universal Grammar. Non-human animals, lacking that language organ, can never become fluent in any human languages.


  2. Reality conforms to the mind.


In the context we are discussing, by “reality” we mean “language”. In the empiricist’s view, the mind is just a passive receptacle of information—if you spoke to a child in any sort of language (English or Arabic, but also a computer language or an alien language) from the moment they are born, then the child would become fully proficient in it. In contrast, in Chomsky’s theory, a human child could never acquire a computer language or an alien language as their first language, since those languages do not conform to the Universal Grammar (FOOTNOTE: An alien language would doubtless conform to some sort of Universal Grammar, but it would be the Alien Universal Grammar, as opposed to the Human Universal Grammar.) So the Universal Grammar makes our knowledge of language possible, but also limits the kinds of languages we can know, just like our a priori of space makes our perception of physical objects possible, but also limits the kinds of objects we can perceive—to, for instance, three-dimensional objects.


This is not to say that we cannot learn a computer language. Millions of people do. But they cannot use those languages to communicate naturally. You cannot have a conversation with a friend in, say, C++. A parallel with Kant’s a prioris can prove helpful here, too. A mathematician can very successfully describe a four-dimensional space or a non-Euclidean geometry, but all the objects we’ll ever be able to perceive are three-dimensional and exist in what appears to be Euclidean space. This is just the way our minds work, and there is nothing we can do to change that fact.