Young numeral systems

The basic number words in most languages are opaque. This is certainly the case for English. ‘Seven’ calls to mind nothing in particular other than ‘seven’; ditto ‘two,’ ‘nine,’ and the rest. These are abstract, impenetrable symbols; there is no image behind them, no ghost of meaning to suggest where they might have come from.

But number words very likely did come from somewhere. More often than not, if you dig deep enough into the history of abstract terms, you find they are rooted in images, actions, and objects. Yet time and use have a way of obscuring such roots, and number words have been around for such a long time and have been put to such heavy use that, as one historian of numbers, Karl Menninger, put it, they have “become mere gibberish.” If only we could step back in time and study numeral systems in their youth, we might learn something about how these foundational concepts first emerged. Unfortunately, this is now impossible.

Or is it? English number words do indeed have ancient roots, as do the number words in many other large-scale global languages. But in other parts of the world, especially in small-scale indigenous communities in the Amazon, Australia, and New Guinea, numeral systems remain young. I mean “young” not only in terms of chronological age but also in terms of intensity of use. In many of these communities, numbers have simply never been in heavy rotation; they have nothing like the cultural prominence, ubiquity, and multifariousness of numbers in the globalized, industrialized world. As a result, these indigenous numeral systems are greener and less weathered and, as we’ll see, not yet reduced to “mere gibberish.”

The three-pronged foot of a rhea. The image of a rhea footprint served as the basis for ‘three’ in Xerénte, an Amazonian language. Photo: Frank Vincentz (source).

Of particular interest on this topic is a study by Patience Epps on numeral systems in the Vaupés region of the Amazon (Epps, 2006), as well as a broader survey by Epps and colleagues on numeral systems in small-scale languages around the world (Epps et al., 2012). Part of my interest in these studies is that they resonate with informal field observations my colleagues and I have made about number words in Yupno, a language of Papua New Guinea. Put together, these observations suggest that there may be discernible hallmarks of young numeral systems wherever those systems emerge. Most prominently, such numerals retain a ghost of meaning—sometimes more than a ghost. But they appear to bear other hallmarks as well.

Hallmark 1: Etymological transparency

Old numerals are etymologically opaque, as mentioned. Young numerals, in contrast, exhibit a degree of transparency, and this transparency stems from a few different sources. A first source is that number words are made up of other number words—that is, they exhibit compositionality. It is common in old numeral systems for number words to be compositional in the 10+ range—think ‘twenty-one,’ ‘twenty-two,’ and so on. In young numeral systems, however, even words in the 1-10 range are sometimes composed of other number words. An especially common case appears to be words for 4. In Yupno and other small-scale languages, the word for 4 is simply the word for 2 reduplicated—‘two two.’

A second source of transparency is that young numerals often evoke the body, especially fingers, hands, toes, and feet. (Also quite common is ‘man’ to refer to 20.) Presumably, this is because number words frequently, perhaps universally, originated as descriptions of embodied counting procedures. For instance, Menninger relates a “picturesque” way of expressing 99 in an unspecified New Guinean language: “Four men die (80), two hands come to an end (10), one foot ends (5), and four.” Similarly, in Yupno, once you get beyond 4, the number terms begin to make explicit reference to embodied counting procedures, with phrasings like “from the other [hand] take one.”
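The arithmetic behind such expressions can be made concrete with a minimal Python sketch. It assumes only the unit values implied by the glosses above (a man is 20, a hand or a foot is 5). The names UNIT_VALUES and evaluate are hypothetical, chosen for illustration, and the sketch is not a model of any particular language's grammar.

```python
# Assumed unit values, taken from the glosses in the text:
# 'man' = 20, 'hand' = 5, 'foot' = 5. 'one' is a plain unit.
UNIT_VALUES = {"man": 20, "hand": 5, "foot": 5, "one": 1}


def evaluate(expression):
    """Sum a list of (count, unit) pairs, e.g. (4, "man") contributes 80."""
    return sum(count * UNIT_VALUES[unit] for count, unit in expression)


# "Four men die, two hands come to an end, one foot ends, and four."
ninety_nine = [(4, "man"), (2, "hand"), (1, "foot"), (4, "one")]

# The reduplicated 'two two' pattern for 4, treated as 2 + 2.
two_two = [(2, "one"), (2, "one")]

print(evaluate(ninety_nine))  # 99
print(evaluate(two_two))      # 4
```

The point of the sketch is that the meaning of each expression is recoverable from its parts, which is what transparency amounts to here.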

A third source of transparency is not as widely attested but is especially striking: an imagistic basis for number words in the 1-4 range. This phenomenon is attested in a couple of pockets around the world, but the most vivid evidence comes from the Amazon (see Epps et al., 2012, p. 67). Examples include:

etymological source
deer footprint
rubber seed
rhea footprint
jar support
pronged fishing arrow
has a brother (Nadahup family)

Of course, these are not arbitrary associations. The image of ‘deer footprint’ is used for 2 because it has two salient parts; ‘pronged fishing arrow’ is used for 3 because it has three salient parts. And so on. (The remarkable use of ‘has a brother’ and other kinship expressions in the numeral systems in this region would require a detour; see Epps, 2006 for discussion.) Note that we did not observe an imagistic basis for numerals in the 1-4 range in Yupno, though we did not go searching for one either.

Hallmark 2: Loose lexicalization

A second hallmark of young numerals is what we might call “loose lexicalization.” Not all the terms used to refer to number concepts are fully lexicalized in the sense of a rigidly conventional mapping between form and meaning. In short, these number “words” are perhaps not best described as words at all.

Loose lexicalization is a phenomenon Epps (2006, p. 270) observed in passing in the Amazon. It is also one we noted in Yupno. In preparation for one of our studies, we elicited and recorded the 1-10 count list from a few different speakers in a single village. We were struck by the fact that these count lists were similar but by no means identical. Some of this inconsistency, particularly in the 1-5 range, appears to have been due to dialectal variation; but beyond the 1-5 range it is more likely due to loose lexicalization. Essentially, beyond 5 we weren’t eliciting words so much as “numerical expressions”: quasi-conventional descriptions of how to produce the target number using an embodied procedure.

My hunch is that loose lexicalization may be more common in young numeral systems than has been reported to date. It’s a difficult phenomenon to wrap your head around, after all. Our notion of what it means to be a word assumes a fixed, rigid mapping between form and meaning. And when we don’t find such a fixed mapping, it’s easy to chalk this looseness up to other factors: maybe the speaker doesn’t really know the number words; maybe there are several conventional forms available; maybe there is dialectal variation.

Hallmark 3: Length

A third hallmark is that the terms in young numeral systems tend to be long, sometimes tediously so (see the “word” for 99 reported earlier). In global languages, number words are often decidedly compact. English speakers can recite the words for 1-10 in a mere eleven syllables. Compare this to Yupno, where the word for 1 is three syllables and the word for 4 is six. The Amazonian numerals that Epps describes are similarly long-winded.  
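The syllable comparison can be checked with a small Python sketch. The English counts assume a standard pronunciation; the Yupno entries are only the two figures given above (three syllables for 1, six for 4), and the rest of the Yupno list is left out rather than guessed. The names ENGLISH_SYLLABLES, YUPNO_SYLLABLES, and total_syllables are hypothetical, chosen for illustration.

```python
# Syllables per English number word, assuming a standard pronunciation.
ENGLISH_SYLLABLES = {
    "one": 1, "two": 1, "three": 1, "four": 1, "five": 1,
    "six": 1, "seven": 2, "eight": 1, "nine": 1, "ten": 1,
}

# Yupno: only the two counts reported in the text (for 1 and for 4).
YUPNO_SYLLABLES = {"1": 3, "4": 6}


def total_syllables(counts):
    """Add up the syllable counts in a word-to-count mapping."""
    return sum(counts.values())


english_total = total_syllables(ENGLISH_SYLLABLES)
english_1_and_4 = ENGLISH_SYLLABLES["one"] + ENGLISH_SYLLABLES["four"]
yupno_1_and_4 = total_syllables(YUPNO_SYLLABLES)

print(english_total)                   # 11
print(english_1_and_4, yupno_1_and_4)  # 2 9
```

On these figures, the words for 1 and 4 alone take nine syllables in Yupno against two in English.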

Part of the interest of this observation is that the compactness of one’s numeral system may have cognitive consequences. We use number words not only to communicate but to think, to remember and manipulate quantities. The longer these words get, the more cumbersome they become. Perhaps the clearest demonstration of this comes from a study of Welsh-English bilinguals. By recording bilinguals as they read numbers aloud in these two languages, the researchers found that the Welsh 1-10 number words took longer to articulate than the English ones. And, in turn, participants had a shorter “digit span” when those digits were presented in Welsh than when they were presented in English. Note that the difference in length between Welsh and English number words is actually relatively subtle. Imagine performing a similar experiment with people who command both an old, compact numeral system and a lengthy, young numeral system.

Summing up, despite being culturally distinct and geographically dispersed, young numeral systems appear to bear several hallmarks in common. Old numerals are most often opaque, fixed, and short; young numerals, in contrast, are often transparent, loosely lexicalized, and long. These hallmarks provide vital clues about how humans were able to invent numbers in the first place—in particular, by using embodied procedures, concrete imagery, and grouping strategies.  

One especially fraught issue here is whether old numerals are superior to young ones (that is, more highly evolved, more “civilized,” not only wizened but wise). Certainly, in the early days of anthropology, in the 1800s, there was a good deal of number chauvinism in the air. Consider, for instance, the title of an 1863 paper by John Crawfurd, “On numerals as evidence of the progress of civilization.”

My aim here is certainly not to revive such attitudes. Importantly, any notion of superiority in numeral systems is relative to an assumed function. Short numbers are probably better suited for tasks involving manipulation and memory, such as psychology experiments that involve memorizing arbitrary strings of digits. But hunter-gatherers don’t go around remembering arbitrary strings of digits; indeed, not many humans did until relatively recently. Moreover, the shortness and opacity of number words could come at a price. Speculatively, for children first acquiring number words, opaque pieces of gibberish may be harder to associate with number concepts than transparent expressions would be. For now, the safest assumption to make about young numeral systems is not that they are less “civilized” or less useful, only that they are just that: young.


1. For “gibberish,” see Menninger (1969), Number words and number symbols: A cultural history. MIT Press. p. 125.

2. For the “picturesque” way of expressing 99, see Menninger (1969), Number words and number symbols: A cultural history. MIT Press. p. 36.

3. See also this insightful discussion of the early emergence of number concepts by David Barner.

4. Another hallmark of young numeral systems, which I do not discuss, is that they tend to have low upper bounds. That is, they tend to have a relatively low maximum reportable value. See Epps et al., 2012 (pp. 49-55) for discussion. It is probably this hallmark, more than others, that has inspired “number chauvinism.”

5. The compactness of old numeral systems may be one reason why they are so frequently borrowed in contact situations.