Phoneme Inventory Size and Demography

It has long been established that demography drives evolutionary processes (see Hawks, 2008 for a good overview). Similar attempts are being made to describe cultural (Shennan, 2000; Henrich, 2004; Richerson & Boyd, 2009) and linguistic (Nettle, 1999a; Wichmann & Holman, 2009; Vogt, 2009) processes by considering the effects of population size and other demographic variables. Even though these ideas are hardly new, until recently there was a ceiling on the amount of data any one researcher could draw upon. In linguistics, this paucity of data is being remedied through large-scale projects, such as WALS, Ethnologue and UPSID, that bring together a vast body of linguistic fieldwork from around the world. A recent study by Lupyan & Dale (2010) provides a solid direction for how this might be utilised. Here, the authors compare the structural properties of more than 2000 languages with three demographic variables: a language’s speaker population, its geographic spread and the number of linguistic neighbours. The salient point is that certain differences in structural features correspond to the underlying demographic conditions.

With that said, a few months ago I found myself wondering about a particular feature, phoneme inventory size, and its potential relationship to the underlying demographic conditions of a speech community. What piqued my interest was that two languages I retain a passing interest in, Kayardild and Pirahã, both have small phonological inventories and small speaker communities. The question is: is there a correlation between the population size of a language and its number of phonemes? Despite work hinting at such a relationship (e.g. Trudgill, 2004), there is little in the way of empirical evidence to support such claims. Hay & Bauer (2007) perhaps represent the most comprehensive attempt at an investigation, reporting a statistical correlation between the number of speakers of a language and its phoneme inventory size.

In their paper, Hay & Bauer provide some evidence for the claim that the more speakers a language has, the larger its phoneme inventory. I won’t go into the sub-divisions of vowels (e.g. separating monophthongs, extra monophthongs and diphthongs) and consonants (e.g. obstruents), as it would extend the post by about 1000 words. The key result is that the vowel inventory and the consonant inventory are both correlated with population size, and the authors also rule out the possibility that language families are driving the results. As they note:

That vowel inventory and consonant inventory are both correlated with population size is quite remarkable. This is especially so because consonant inventory and vowel inventory do not correlate with one another at all in this data-set (rho=.01, p=.86). Maddieson (2005) also reports that there is no correlation between vowel and consonant inventory size in his sample of 559 languages. Despite the fact that there is no link between vowel inventory and consonant inventory size, both are significantly correlated with the size of the population of speakers.
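For readers unfamiliar with the statistic in that quote, rho is Spearman's rank correlation: both variables are converted to ranks and the ordinary Pearson correlation is computed on those ranks. Below is a minimal sketch of the calculation in plain Python. The speaker counts and inventory sizes are invented placeholders, not values from Hay & Bauer's data-set, and the function names are my own.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical speaker counts and phoneme inventory sizes (illustration only).
speakers = [150, 900, 12_000, 250_000, 4_000_000, 60_000_000]
phonemes = [11, 19, 17, 28, 31, 40]

rho = spearman_rho(speakers, phonemes)
print(f"rho = {rho:.3f}")
```

Because only the ranks matter, the huge spread in speaker numbers (hundreds versus tens of millions) does not dominate the result, which is presumably why a rank-based measure suits this kind of data. The sketch does not compute a p-value.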

Using their paper as a springboard, I decided to look at how other demographic factors might influence the size of the phoneme inventory, namely: population density and the degree of social interconnectedness.


Population Size and Rates of Language Change

In previous posts, I’ve looked at the relationship between cultural evolution and demography (see here, here and here). As such, it makes sense to see if such methods are applicable to language, which is, after all, a cultural product. So, having spent the last few days looking over the literature on language and demography, I found the following paper on population size and language change (free download). In it, the authors, Søren Wichmann and Eric Holman, use lexical data from WALS to test for an effect of the number of speakers on the rate of language change. Their findings argue against a strong influence of population size. Instead, they opt for a model in which the type of network influences change at a local level, through different degrees of connectivity between individuals. Here is the abstract:

Previous empirical studies of population size and language change have produced equivocal results. We therefore address the question with a new set of lexical data from nearly one-half of the world’s languages. We first show that relative population sizes of modern languages can be extrapolated to ancestral languages, albeit with diminishing accuracy, up to several thousand years into the past. We then test for an effect of population against the null hypothesis that the ultrametric inequality is satisfied by lexical distances among triples of related languages. The test shows mainly negligible effects of population, the exception being an apparently faster rate of change in the larger of two closely related variants. A possible explanation for the exception may be the influence on emerging standard (or cross-regional) variants from speakers who shift from different dialects to the standard. Our results strongly indicate that the sizes of speaker populations do not in and of themselves determine rates of language change. Comparison of this empirical finding with previously published computer simulations suggests that the most plausible model for language change is one in which changes propagate on a local level in a type of network in which the individuals have different degrees of connectivity.
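To make the null hypothesis in the abstract a little more concrete: three pairwise distances satisfy the ultrametric inequality when the two largest are equal. If languages A and B are closer to each other than either is to C, and both have changed at the same rate, then A and B should be equally distant from C. What follows is only a rough sketch of that idea, not the authors' actual procedure. The function names, the tolerance parameter and the distance values are all my own assumptions.

```python
def ultrametric_gap(d_ab, d_ac, d_bc):
    """Return the difference between the two largest pairwise distances.

    A gap of zero means the triple satisfies the ultrametric inequality
    d(x, z) <= max(d(x, y), d(y, z)) for every ordering of the three items.
    """
    _smallest, middle, largest = sorted((d_ab, d_ac, d_bc))
    return largest - middle


def satisfies_ultrametric(d_ab, d_ac, d_bc, tolerance=0.0):
    """Check the inequality, allowing a hypothetical tolerance for noise."""
    return ultrametric_gap(d_ab, d_ac, d_bc) <= tolerance


# Hypothetical lexical distances for a triple where A and B are the closest pair.
# Equal rates of change: A and B are equally far from the outgroup C.
print(satisfies_ultrametric(0.30, 0.62, 0.62))

# Unequal rates: A has drifted further from C than B has.
print(satisfies_ultrametric(0.30, 0.70, 0.62))
print(satisfies_ultrametric(0.30, 0.70, 0.62, tolerance=0.10))
```

On this reading, a population effect would show up as a systematic tendency for the larger of the two close relatives to sit further from the outgroup, rather than as a gap in any single triple.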

As I’m in the middle of several other things at the moment, I don’t really have time to provide a thorough review of this paper. Having said that, I agree with their claim that population size is unlikely to account for rates of language change. I reckon their results would be stronger if they factored in population density: languages whose populations are large and dense should change faster than those whose populations are large and dispersed. The main point is that population size and population density together influence the degree of social interconnectivity. Nettle (1999), for instance, argues that “spreading an innovation over a tribe of 500 people is much easier and takes much less time than spreading one over five million people.” This is fairly reasonable if we are looking at the generation of a single innovation within each of these populations. However, if those 500 people are spread across a large distance, then their transmission chain is going to be stretched, effectively lowering the rate of transmission. The reverse applies to a population of five million individuals who are packed into a small area. Arguably, given the right conditions, we can arrive at a situation where a population of five million shows greater levels of interconnectivity than a population of 500. I think it’s this aspect, the level of social interconnectivity, which may be more relevant to the rate of language change (other things to test for include writing systems/literacy and inter-language contact).

Can linguistic features reveal time depths as deep as 50,000 years ago?

Throughout much of our history language was transitory, existing only briefly within its speech community. The invention of writing systems heralded a way of recording some of its recent history, but for the most part linguists lack the stone tools archaeologists use to explore the early history of ancient technological industries. The question of how far back we can trace the history of languages is therefore an immensely important, and highly difficult, one to answer. However, it’s not impossible. Like biologists, who use highly conserved genes to probe the deepest branches on the tree of life, some linguists argue that highly stable linguistic features hold the promise of tracing ancestral relations between the world’s languages.

Previous attempts to infer the relatedness between languages using cognates are generally limited to predictions within the last 6000-10,000 years. In the present study, Greenhill et al. (2010) decided to examine linguistic features that are more stable than the lexicon.