In English, the word for the sniffing appendage on our face is nose. Japanese also happens to use the consonant n in this word (hana) and so does Turkish (burun). Since the 1900s, linguists have argued that these associations between speech sounds and meanings are purely arbitrary. Yet a new study calls this into question.

Together with his colleagues, Damián Blasi of the University of Zurich analyzed lists of words from 4,298 different languages. In doing so, they discovered that unrelated languages often use the same sounds to refer to the same meaning. For example, the consonant r is often used in words for red: think of French rouge, Spanish rojo, and German rot, but also Turkish kırmızı, Hungarian piros, and Maori kura.

The idea is not new. Previous studies have suggested that sound-meaning associations may not be entirely arbitrary, but these studies were limited by small sample sizes (200 languages or fewer) and highly restricted lists of words (such as animals only). Blasi’s study, published this month in Proceedings of the National Academy of Sciences USA, is notable because it included almost two-thirds of the world’s languages and used lists of diverse words, including pronouns, body parts, verbs, natural phenomena, and adjectives (such as we, tongue, drink, star and small, respectively).

The scope of the study is unprecedented, says Stanka Fitneva, associate professor of psychology at Queen’s University in Canada, who was not involved in the research. And Gary Lupyan, associate professor of psychology at the University of Wisconsin, adds, “Only through this type of large-scale analysis can worldwide patterns be discovered.”

The method involved two key parts. The first step was to estimate how frequently the word for a given concept uses a particular sound by assigning binary values of 0 or 1 to associations in individual languages. For example, in English, the word for red uses the consonant r and is therefore scored a 1, while in Japanese, aka does not contain r and therefore receives a 0. Aggregating these numbers across the thousands of languages studied yields the overall probability that any word for red in any language will contain r: in this case, 0.35.
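The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names score_sound and sound_proportion are hypothetical, the word list is limited to examples quoted in this article, and ordinary spelling stands in for sound, whereas the study worked from transcribed word lists.

```python
# Toy word list for the concept "red", taken from examples in this article.
# Assumption: spelling is used as a stand-in for sound.
red_words = {
    "English": "red",
    "French": "rouge",
    "Spanish": "rojo",
    "German": "rot",
    "Hungarian": "piros",
    "Maori": "kura",
    "Japanese": "aka",
}


def score_sound(words, sound):
    """Return 1 for each language whose word contains the sound, else 0."""
    return {language: int(sound in word) for language, word in words.items()}


def sound_proportion(words, sound):
    """Average the 0/1 scores to get the share of languages using the sound."""
    scores = score_sound(words, sound)
    return sum(scores.values()) / len(scores)


scores = score_sound(red_words, "r")
proportion = sound_proportion(red_words, "r")
print(scores)
print(round(proportion, 2))
```

Because this sample is hand-picked from words that mostly contain r, its share (6 of 7 languages) is far higher than the 0.35 reported across the full set of languages.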

On its own, however, this calculation is not enough. There are thousands of words that use r (road, mural, and waiter, to name only a few English examples). So how do we know that the association between red and r is special? To address this question, the authors performed a second step, this time calculating the probability that any randomly selected word uses r. By comparing the two probabilities, they were able to show that across languages, r is more than twice as likely to occur in words for red than in other words. With this method, the researchers reported 74 robust associations between word sounds and meanings, including l and leaf, l and tongue (English is among the exceptions), and n and nose.
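The comparison between the two probabilities can be illustrated with a short Python sketch. It is a simplified assumption-laden example rather than the study's procedure: the helper names share_with_sound and association_ratio are hypothetical, the words are drawn from examples in this article, spelling stands in for sound, and the statistical controls the authors applied are left out.

```python
def share_with_sound(words, sound):
    """Fraction of the given words that contain the sound."""
    return sum(sound in word for word in words) / len(words)


def association_ratio(concept_words, other_words, sound):
    """How many times more often the sound occurs in words for the concept
    than in the comparison words (a baseline for 'any other word')."""
    baseline = share_with_sound(other_words, sound)
    if baseline == 0:
        raise ValueError("sound never occurs in the comparison words")
    return share_with_sound(concept_words, sound) / baseline


# Toy data only: words for "red" in five languages, and a small English
# baseline made of other meanings mentioned in the article.
red_words = ["red", "rouge", "rojo", "rot", "aka"]
other_words = ["nose", "leaf", "tongue", "star", "small", "soup", "dog", "drink"]

p_red = share_with_sound(red_words, "r")
p_other = share_with_sound(other_words, "r")
ratio = association_ratio(red_words, other_words, "r")
print(p_red, p_other, round(ratio, 2))
```

A ratio above 1 means the sound is over-represented in words for the concept. With a sample this small the number itself means little; the point is only to show how the two probabilities are set against each other.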

One limitation of the study is the relatively small number of meanings that were included in the analysis, points out Eric Holman, professor emeritus of psychology at the University of California, Los Angeles. Despite the diversity of meanings, the typical word list contained only 28 to 40 items. Another limitation concerns the transcription system, which collapsed certain distinctions (such as that between plain and nasal vowels, which are found in French words like non) that are known to play an important role in many languages.

The study raises some big-picture questions. Why, for example, should it be the case that culturally and geographically diverse groups of humans link the same sounds with the same meanings? Blasi and colleagues used statistical analyses to rule out the possibility that people happened to borrow words like red from neighboring languages, or that such words descended from the same ancient proto-language. So the answer to this question remains elusive. Although it’s easy to imagine that the n-sound in nose reflects nasality, this is only a guess, and no comparable relationship can explain the other associations.

Another tough question concerns the relatively small number of associations. Why do a handful of words like red, small and leaf form non-arbitrary links to their speech sounds, while thousands of other words—such as soup and dog—do not? Simon Kirby, professor at the Center for Language Evolution at the University of Edinburgh, thinks this may be the heart of the matter. “The puzzle is really why this is such a marginal phenomenon,” says Kirby. “Why does it take a huge study like this to demonstrate that there is some non-arbitrariness in the lexicon? Blasi and colleagues have shown that non-arbitrary associations are possible—the deeper puzzle about language is why it is nevertheless largely arbitrary.”