What the F


  # $ % !

  In what ways are the 7,000 languages of the world similar, and in what ways are they different? Both questions have fascinated linguists and philosophers for millennia, for different reasons. Universal features found to hold in all languages reveal something about what it is to be human. If all humans do something, whether it’s art, music, math, or some aspect of language, that universal behavior must be due to either some shared common experience or some trait possessed by all humans, transcending cultural idiosyncrasies. Perhaps, sometimes, this stems from our genetic endowment.

  There doesn’t appear to be much about profanity that is truly universal—shared without exception by all languages and cultures. It’s not just that the specific words are different. As we’ve seen, the differences are much deeper than that. Some cultures have rich and deeply codified systems of profanity, like English or Russian. Others, like Japanese, don’t really have anything like the same category of words. Instead of absolute universals, when we look around the globe we find certain common tendencies across languages. The Holy, Fucking, Shit, Nigger Principle takes a first stab at characterizing the types of words that tend to become profane. Languages select from a small pool of semantically constrained candidates for their bad words—if indeed they decide to have bad words. Not only do the specific words differ from language to language, but so do the proportions of words selected from each domain, in ways related to the sociocultural legacy that a given language carries with it.

  But this sort of statistical universal, where features overlap in languages that exhibit family resemblances, is the norm in the languages of the world—not just when it comes to profanity but for language features in general. It’s very hard to find much of anything that all languages do. When you look for universal features of languages, you mostly find tendencies. This makes us think that the way a language will be structured isn’t merely random. Something must be at work making languages similar, but it isn’t some inviolable rule inscribed in our genes. In each case of a cross-linguistic tendency, facts about how people use language—what they want to convey with it, the memory and time constraints imposed on them while using it, and so on—likely shape languages over the course of generations such that they settle on certain similar sorts of solutions. For example, people seem to want to talk about things and events, so it’s not surprising to find nouns and verbs in the world’s languages. Similarly, it can be useful to distinguish who did something from whom they did it to. As a result, languages evolve subjects and objects and ways to encode them. So if profanity is like other cross-linguistic tendencies—languages tend to have it, and it tends to be drawn from certain domains—then what pressures tending to produce similar-seeming profanity could the histories of the world’s languages share?

  The answer probably lies in taboos not about language but about the world. Across cultures, people exhibit taboos about the very things that provide the vocabulary for profanity. There are taboos around the world associated with the supernatural—with gods and demons and prophets. There are taboos about copulation. There are taboos about defecation, micturition, menstruation, and other bodily functions. And there are taboos about people who are not members of our social group (see, for instance, laws against miscegenation that remained on the books in the United States until 1967!).

  The fact that taboos like these erupt around the world, though not universally, suggests an explanation for how profanity comes about and how it comes to have similar contours. People around the world have taken these taboos and extended them from the world to the word. It’s not just defecation that’s taboo in many cultures; nor is it just talking about defecation. Rather, the words that describe defecation themselves are taboo, whether that’s how you happen to be using them in the moment or not.

  There could be different reasons for this. We know that merely hearing or seeing a word stokes an internal mental representation of the things the word refers to.[19] If the word shit causes people to “see” feces in their mind’s eye and “smell” it in their mind’s nose, then the impulse to limit the word’s use is understandable. Or it could be that people hold more metaphysical beliefs about words and their power: that they believe that using words associated with a particular taboo topic will bring bad fortune.

  Whichever of these explanations is ultimately correct—and there’s more work to do to tease them apart—the specific words that are profane across languages are similar because the things that are taboo across cultures are also similar. The pressure to reject words associated with those taboos is the real universal.

  But here’s the catch. The road from taboo things in the world to taboo words is nondeterministic. Even if excretion is culturally taboo, that doesn’t mean that all words describing it will be as well. Shit is more profane than poop. Fuck is profane, but copulate is not. And so cultural taboos only set the stage for profanity. They don’t select specific words. What distinguishes profane cunt from childlike wa-wa? That’s up next.

  2

  What Makes a Four-Letter Word?

  Across the globe, profanity tends to emerge from particular domains of meaning; I refer you to the Holy, Fucking, Shit, Nigger Principle. But for every profane holy, fucking, and shit, there’s a technical and anodyne liturgical, copulation, or excretion. For every cock and cunt, there’s a childlike wee-wee and cha-cha. Many words describing sexual organs, excretory functions, and so on fail to rise to the heights (or, if you prefer, sink to the depths) of profanity. These words are articulated without fear of offending, whether in the classroom or the courtroom or the examination room. They aren’t profane, despite referring to taboo concepts. This means that something beyond what a word denotes (what it refers to) must cement it as profanity.

  What is that thing?

  Why is cunt a dirty word when coochie-snorcher isn’t?

  The most obvious possibility is that some aspect of how profane words are written or sound makes them vulgar. Let’s begin with the eight-hundred-pound gorilla. Many English profane words famously have four letters—not just cunt but fuck, shit, piss, cock, tits, and many others. No matter how you count, a lot of the profane words in English are spelled with four letters. Take just the words from the four lists in the last chapter. These lists aren’t exhaustive. But what’s nice about them is that they weren’t assembled with any particular interest in what the words sound like or how they’re spelled. Admittedly, the people who had to come up with lists of profane words might have been unconsciously swayed by the four-letter word notion, but at least that wasn’t their stated objective. So in that way, they offer as unbiased a sample as we’re likely to find. Those four lists in aggregate give us a total of eighty-four distinct words (I’ve removed multiword expressions like get fucked and Jesus fucking Christ, which include other words already in the list). Of the eighty-four words, twenty-nine are spelled with four letters. By this count, then, just over a third of profane words are four-letter words. This number may be artificially deflated, since many of the longer words (like asshole, motherfucker, and wanker) have shorter four-letter words embedded inside them. But it’s a good start.

  The first thing to notice from this is that having four letters isn’t a necessary prerequisite for profanity. Certainly, we already knew this: words like ass and motherfucker don’t have four letters, and most of the words on the list have some number of letters other than four. Nor is having four letters sufficient, since many four-letter words are not at all profane, like four or word. So we have to reconsider the question we’re asking. The real issue seems to be whether having four letters makes a word more likely to be profane, all other things being equal. That’s still an interesting question. Here’s a way to ask it.

  Given that many (but not all) profane words in English are spelled with four letters, we can try to find out whether the pattern is stronger than you’d expect, given how words in the language are spelled generally. That is, suppose you grabbed a random set of eighty-four English words. What are the odds that twenty-nine of them would have four letters? The accompanying histogram shows how many profane words from our list have each number of letters; the profane words are in the dark bars. As you can see, there’s a sharp spike at four, representing those twenty-nine four-letter profane words. But is twenty-nine a lot? You can tell by comparing the lengths of profane words in dark bars with English words in general, shown in the light bars. (To calculate these values, I counted the English words with each number of letters and normalized these counts to an eighty-four-word language to make them directly comparable to the profane numbers.)[a] As you can see, English has a lot of words with four, five, six, or seven letters. And in general English looks like a smoother version of the profane distribution. But what really sticks out is how many more profane four-letter words there are than expected from English in general. The 29 profane four-letter words in our list are significantly more than you’d expect if profane words were like English words in general, in which case we’d expect only 12.6 profane four-letter words out of 84.[b]

  [a] To calculate the numbers for English in general, I used the lemmatized word list that Adam Kilgarriff generated from the British National Corpus (available from his web page).
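  The normalization described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `length_distribution` and `normalize_to` are made up for this sketch, and `toy_words` is a tiny invented vocabulary, not the British National Corpus list. The only figures taken from the text are 12.6 and 84.

```python
from collections import Counter


def length_distribution(words):
    """Count how many distinct words have each number of letters."""
    return Counter(len(w) for w in set(words))


def normalize_to(counts, target_total):
    """Rescale counts so they sum to target_total, keeping proportions."""
    total = sum(counts.values())
    return {length: n * target_total / total for length, n in counts.items()}


# Hypothetical toy vocabulary, NOT the British National Corpus word list.
toy_words = ["a", "be", "of", "the", "and", "word", "four", "seal",
             "horse", "sheet", "useful", "gravity"]

counts = length_distribution(toy_words)
expected = normalize_to(counts, 84)  # an "eighty-four-word language"

# From the text's own figures: 12.6 expected four-letter words out of 84
# implies that about 15 percent of the reference words have four letters.
share_of_four_letter_words = 12.6 / 84
```

  Scaling to 84 leaves the shape of the distribution untouched; it only puts the reference counts on the same footing as the eighty-four profane words.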

  [b] A chi-squared test of lengths three through twelve reveals that the two samples are significantly different. For the statistically minded, χ²(3) = 38.61, p < 0.0001.
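  For readers who want to see where a number like this comes from, here is a minimal sketch of the chi-squared statistic, assuming the test compares observed counts against the normalized expected counts bin by bin. That form is an assumption; the text does not spell out the exact procedure. The function names are invented for this sketch, and only the four-letter figures (29 observed, 12.6 expected) come from the chapter.

```python
def chi_squared_term(observed, expected):
    """One bin's contribution to a chi-squared statistic: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


def chi_squared(observed_counts, expected_counts):
    """Sum the per-bin contributions over all bins."""
    return sum(chi_squared_term(o, e)
               for o, e in zip(observed_counts, expected_counts))


# Figures stated in the chapter for the four-letter bin.
four_letter_term = chi_squared_term(observed=29, expected=12.6)
print(round(four_letter_term, 2))  # 21.35
```

  Under that assumption, the four-letter bin alone would contribute about 21.3 to the reported 38.61, which fits the impression that the spike at four does most of the work.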

  Profane words (dark bars) are more likely to be three, four, five, or eight letters long than are English words in general (light bars).

  Perhaps more surprising is how many profane three- and five-letter words there are. There are relatively few three-letter words in English overall, and profane words are almost twice as likely to have three letters as you’d expect, all things being equal. We’ll come back to this in a moment, because it’s important. Less important but also notable is the little bump in eight-letter profane words, compared with the language in general. This is due to words composed of two four-letter words, like ballsack, bullshit, buttfuck, and shithead. Four-letter words appear to bend how English words look even when they’re merely parts of other words. But for our present purposes, it’s enough to note that profanity in English is strikingly more likely to have four letters than other words. The take-away is that there’s some truth to the popular notion about four-letter words.

  So this raises the obvious question, why? Why are profane words more likely than other words to have four letters?

  If you were a linguist, and maybe you are, the first thing to occur to you would be that the special length of profane words might be due to their frequency of use. In general, the most common words in a language tend to be shorter (in English, these include the, be, of, and, a, and so on), and as words get less frequent, they also get longer (the one-thousandth most frequent word in English is useful, the five-thousandth is gravity, and so on). The explanation for why this is the case is fascinating (having to do with efficiency of information transmission), but for our purposes it could also possibly account for the aberrant lengths of profane words. Maybe profane words are shorter than words in general because they tend to be among the most common.

  In fact, if you compare profane words with the most frequent words in English, shown below, you can see that they match up a lot better. But there’s still a little bump for profanity at four and eight letters, and the two groups of words are still statistically different.[c] So this can’t be the whole explanation, but it might be part of one.

  [c] I ran a two-by-ten chi-squared test for lengths one to ten comparing profane words with the 626 most frequent words from the British National Corpus: χ²(9) = 19.17, p < 0.05.
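  A two-by-ten test of this kind can be sketched as a standard chi-squared test of independence on a table with one row per word group and one column per word length. The function below is an illustrative implementation written for this sketch, not the author's script, and the small table is invented so that the arithmetic is easy to check by hand. It assumes every row and column has a nonzero total.

```python
def chi_squared_independence(table):
    """Chi-squared statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, for example one row for
    profane words and one for frequent words, one column per word length.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and length were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented 2x2 counts for illustration only; not the book's data.
toy_statistic, toy_dof = chi_squared_independence([[10, 20], [20, 10]])

# Degrees of freedom for a two-by-ten table: (2 - 1) * (10 - 1).
two_by_ten_dof = (2 - 1) * (10 - 1)
```

  The degrees of freedom for a two-by-ten table work out to nine, matching the χ²(9) reported in the note.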

  Profane words are also more likely to be four or eight letters long than the most frequent 10 percent of English words.

  The catch is that it’s hard to know whether profanity really is as frequent as the top 10 percent of English words. The difficulty lies in the fact that sources we usually use to measure word frequency are all written, mostly in a formal language register—things like newspaper archives and great literature. Profanity is vanishingly rare there. But the informal and spoken environments that form its natural habitat and in which it thrives leave no record. So we can’t measure how common it is in those places. Here’s the best I can do. I searched in a place where people do use language relatively casually and that does leave a permanent, searchable record: the website Reddit. Reddit is an interactive news, entertainment, and commentary platform fractured into various topic-related communities. People can post links or comments and often interact with informal language. They also tend to be younger than the population at large and more male. I took the eighty-four profane words in question and computed how frequent they were on all of Reddit, on average, over the course of two years, from August 31, 2013, to August 31, 2015. Profane words were quite frequent—not quite as frequent as those in the top 10 percent, but close.
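  As a rough illustration of this kind of count, the sketch below computes how often each target word occurs per million tokens of text. Everything in it is an assumption made for the example: the function name `per_million`, the crude tokenizer, and the three invented comments, which are not Reddit data. The text does not describe how the actual counts were collected.

```python
import re
from collections import Counter


def per_million(comments, targets):
    """Occurrences of each target word per million tokens of text."""
    tokens = []
    for comment in comments:
        # Crude tokenizer: lowercase runs of letters and apostrophes.
        tokens.extend(re.findall(r"[a-z']+", comment.lower()))
    if not tokens:
        return {word: 0.0 for word in targets}
    counts = Counter(tokens)
    return {word: counts[word] * 1_000_000 / len(tokens) for word in targets}


# Invented comments for illustration only; not real Reddit data.
sample_comments = ["Well damn, that was fast", "What the hell", "damn"]
rates = per_million(sample_comments, ["damn", "hell"])
```

  Expressing counts per million tokens is what makes a comparison with a reference corpus possible, since the two sources differ enormously in size.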

  So the upshot is that frequency might explain part of why profane words tend to be four letters long in English. But it doesn’t tell the whole story. Perhaps there’s something else going on; perhaps something about having this number of letters causes a word to seem particularly taboo. Indeed, in some places in the world, people avoid the number four systematically; you can think of it as the number thirteen of Southeast Asia. More on that later, but the association between four letters and profanity appears to be largely an English-specific phenomenon. Although we can’t do comparable analyses for other languages (because lists of profane words in other languages haven’t been systematically tested), a quick tour around the swearwords of the world reveals that the four-letter rule doesn’t apply in many other languages. Often the most profane words in non-English languages are a different length. For instance, the strongest French words, putain (“whore”) and foutre (“fuck”), have six letters, and there is almost no Mexican Spanish four-letter profanity; strong words are longer, like chingar (“fuck”), concha (“cunt”), and pinche (“fucking”). In some other languages, profane words aren’t spelled with four letters because there’s no spelling at all: in places where tetraphobia (fear of four) is pervasive, the local languages often aren’t spelled with an alphabet. Chinese, for instance, uses logographic characters instead. And more generally, spelling is only relevant to the half of the world’s languages that have a written form.[1] So if spelling is responsible for the four-letter phenomenon, then it would have to be for English-specific reasons.

  And when you sit with the idea for a bit, other considerations might cause you to second-guess whether having four letters could really make words profane. After all, people have been speaking English for a thousand years, and for most of that time many of those people couldn’t read or write. But they could swear. Children can swear before they can read or write (more on that later!). And even within English, some pretty profane words happen not to have four letters but are pretty close, like ass or bitch. So perhaps we’re detecting obliquely, through spelling, another, deeper cause at play. Maybe profane words tend to be four letters long because four-letter words tend to be pronounced a particular way. Maybe shit, cunt, fuck, and the like don’t look profane so much as sound profane.

  # $ % !

  This might seem outlandish, but hear me out. Take a moment to think about profane four-letter words, like cunt, fuck, and shit. Doesn’t something about them just sound dirty? Don’t they sound vulgar? Don’t they sound aggressive?

  If you agree, you’re not the first person to intuit that the words of your language somehow sound appropriate for what they mean. This was noted at least as early as the nineteenth century by German linguist Georg von der Gabelentz,[2] who observed that German speakers consider the French silly for calling a horse cheval rather than the clearly more suitable German word Pferd.[d] Truly, though, cheval? Ridiculous! Obviously it’s a Pferd. Even if you know intellectually that there are different names for horses in other languages, in your heart of hearts, you may still feel like horse in your native language fits the animal best and that equivalent words in other languages are slightly less apt. This is sometimes called the “sound-symbolic feeling”: the sounds of words in your language feel like they suit their meanings.

  [d] In case you’re wondering, yes, it’s really pronounced with a p followed by an f!

  Taboo words often elicit particularly strong sound-symbolic feelings. When you say them—fuck, shit, bitch—when you roll them around in your mouth, they have a certain mouth-feel. And gut-feel. They feel like they sound obscene. One manifestation of this feeling is that it’s hard to imagine them meaning anything else. How could fuck signify anything other than what it does? (For instance, a word that sounds very similar in French, phoque, preposterously means “seal.”) And so we’re baffled when people who are not native speakers of our language accidentally produce profane words. Spanish speakers often confound English sheet and shit or beach and bitch because Spanish doesn’t encode a distinction between the ee and i sounds. Even knowing this, the sound-symbolic feeling makes it almost inconceivable to a monolingual English speaker that you could think that sheet would be pronounced shit. Shit feels dirty. Sheet shouldn’t.

  So could something about how profane words sound make them profane? Does this sound-symbolic feeling index something more than a subjective feeling? Do shit and fuck sound objectively more vulgar than poo-poo and copulate?

  One of the most common reasons words sound appropriate to their meanings is that the things they refer to sound like something, and the word’s pronunciation reflects that sound. This familiar phenomenon is known as onomatopoeia or sound symbolism. Words for sounds or actions that produce sounds often (but not always) imitate those sounds themselves. For example, even if you didn’t know what they meant, with a little context, you might be able to make an educated guess about the meanings of cock-a-doodle-doo and swish.