Tag Archives: biases


Social structure and language evolution: resolving the synthetic/analytic debate

A cultural evolution approach to language suggests that genes encode weak prior biases that can be amplified through cultural transmission to produce strong language universals.  Below is a diagram from Kirby, Dowman & Griffiths (2007).

The link between biological predispositions and language structure, from Kirby, Dowman & Griffiths, 2007.

Note the long-term feedback between language universals and genes.  However, recent research is pointing towards a more complicated picture.

Continue reading

Cultural Evolution and the Impending Singularity

Prof. Alfred Hubler is an actual mad professor who is a danger to life as we know it.  In a talk this evening he went from ball bearings in castor oil to hyper-advanced machine intelligence and from some bits of string to the boundary conditions of the universe.  Hubler suggests that he is building a hyper-intelligent computer.  However, will hyper-intelligent machines actually give us a better scientific understanding of the universe, or will they just spend their time playing Tetris?

Let him take you on a journey…

Continue reading

The end of universals?

Woah, I just read some of the responses to Dunn et al. (2011) “Evolved structure of language shows lineage-specific trends in word-order universals” (Language Log here, Replicated Typo coverage here).  It’s come in for a lot of flak.  One concern raised at the LEC was that, on an extreme interpretation, there may be no effect of universal biases on language structure.  This goes against Generativist approaches, but also against the Evolutionary approach adopted by LEC-types.  For instance, Kirby, Dowman & Griffiths (2007) suggest that there are weak universal biases which are amplified by culture.  On that view, there should be some trace of universality nonetheless.
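To make the "weak bias amplified by culture" idea a bit more concrete, here is a toy sketch of a chain of learners. It is not the model from Kirby, Dowman & Griffiths (2007); everything in it is an assumption made for illustration: two languages A and B, a learner who hears a handful of utterances from one parent, a noise rate at which a parent produces the "wrong" utterance type, and a learner who picks the language with the higher posterior probability (ties split evenly). The function names and parameter values are hypothetical.

```python
from math import comb, log

def p_learner_picks_a(parent_is_a, prior_a, noise, n_utterances):
    """Probability that a learner ends up with language A.

    The parent speaks A or B and produces its own utterance type with
    probability 1 - noise. The learner hears n_utterances and chooses the
    language with the higher posterior; exact ties are split 50/50.
    """
    p_type_a = 1 - noise if parent_is_a else noise
    log_prior_ratio = log(prior_a / (1 - prior_a))
    log_lik_ratio = log((1 - noise) / noise)
    total = 0.0
    for k in range(n_utterances + 1):  # k = number of type-a utterances heard
        p_data = comb(n_utterances, k) * p_type_a**k * (1 - p_type_a)**(n_utterances - k)
        log_odds = log_prior_ratio + (2 * k - n_utterances) * log_lik_ratio
        if abs(log_odds) < 1e-12:
            total += 0.5 * p_data
        elif log_odds > 0:
            total += p_data
    return total

def stationary_share_of_a(prior_a, noise, n_utterances):
    """Long-run share of generations speaking A in the two-state chain."""
    a_from_a = p_learner_picks_a(True, prior_a, noise, n_utterances)
    a_from_b = p_learner_picks_a(False, prior_a, noise, n_utterances)
    # Stationary distribution of a two-state Markov chain.
    return a_from_b / (a_from_b + (1 - a_from_a))

if __name__ == "__main__":
    for prior in (0.5, 0.6):
        share = stationary_share_of_a(prior_a=prior, noise=0.1, n_utterances=2)
        print(f"prior for A = {prior:.2f} -> long-run share of A = {share:.2f}")
```

With these made-up settings, a prior of 0.6 in favour of A ends up as a long-run share of about 0.95, while a neutral prior stays at 0.5. The toy is fragile, though: with three utterances per learner the weak prior never decides anything and the share falls back to 0.5, so treat it as an illustration of the mechanism rather than a result.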

Below is the relationship diagram for Indo-European and Uto-Aztecan feature dependencies from Dunn et al.  Bolder lines indicate stronger dependencies.  The two families appear to have different dependencies: only one is shared (Genitive-Noun and Object-Verb).

However, I looked at the median Bayes Factors for each of the possible dependencies (available in the supplementary materials).  These are the raw numbers that the above diagrams are based on.  If the dependencies’ strengths rank in roughly the same order in two families, those families will have a high Spearman rank correlation.

Spearman Rank Correlation     Indo-European        Austronesian
Uto-Aztecan                   0.39, p = 0.04       0.25, p = 0.19
Indo-European                 n/a                  -0.13, p = 0.49

Spearman rank correlation coefficients and p-values for Bayes Factors for different dependency pairs in different language families.  Bantu was excluded because of missing feature data.
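For anyone wanting to try this kind of comparison, the following is a minimal sketch of the rank correlation step in plain Python. The two lists are made-up placeholder numbers, not the Bayes Factors from the supplementary materials, and the function names are my own. It computes only the coefficient; the p-values in the table would need a permutation test or a statistics library, which is not shown here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

if __name__ == "__main__":
    # Hypothetical median Bayes Factors for the same five dependency pairs
    # in two families (placeholder values for illustration only).
    family_one = [5.1, 0.3, 2.2, 8.0, 1.1]
    family_two = [3.9, 0.5, 1.0, 6.5, 2.4]
    print(round(spearman_rho(family_one, family_two), 2))
```

Because only the ranks matter, two families can have quite different Bayes Factors for a given pair and still correlate highly, so long as the pairs fall in roughly the same order of strength.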

Although the Indo-European and Uto-Aztecan families have different strong dependencies, they have similar rankings of those dependencies.  That is, two features with a weak dependency in an Indo-European language tend to have a weak dependency in a Uto-Aztecan language, and the same is true of strong dependencies.  The same holds to a lesser, non-significant degree for Uto-Aztecan and Austronesian languages (0.25, p = 0.19).  This might suggest that there are, in fact, universal weak biases lurking beneath the surface. Lucky for us.

However, this does not hold between Indo-European and Austronesian language families.  Actually, I have no idea whether a simple correlation between Bayes Factors makes any sense after hundreds of computer hours of advanced phylogenetic statistics, but the differences may be less striking than the diagram suggests.


As Simon Greenhill points out below, the statistics are not at all conclusive.  However, I’m adding the graphs for all Bayes Factors (these are made directly from the Bayes Factors in the Supplementary Material):

Graphs (one per family): Austronesian, Bantu, Indo-European, Uto-Aztecan.

Dunn, M., Greenhill, S. J., Levinson, S. C., & Gray, R. D. (2011). Evolved structure of language shows lineage-specific trends in word-order universals. Nature, 473, 79-82

The Return of the Phoneme Inventories

Right, I already referred to Atkinson’s paper in a previous post, and much of the work he’s presented is essentially part of a potential PhD project I’m hoping to do. Much of this stems back to last summer, when I mentioned how phoneme inventory size correlates with certain demographic features, such as population size and population density. Using the UPSID data I fitted a generalised additive model to demonstrate how area and population size interact in determining the phoneme inventory size:

Interestingly, Atkinson seems to derive much of his thinking, at least in his choice of demographic variables, from work into the transmission of cultural artefacts (see here and here). For me, there are clear uses for these demographic models in testing hypotheses for linguistic transmission and change, as I see language as a cultural product. It appears Atkinson reached the same conclusion. Where we depart, however, is in our overall explanations of the data. My major problem with the claim is theoretical: he hasn’t ruled out other historical-evolutionary explanations for these patterns.

Before we get into the bulk of my criticism, I’ll provide a very brief overview of the paper.

Continue reading

Evolved structure of language shows lineage-specific trends in word-order universals

Via Simon Greenhill:

Dunn M, Greenhill SJ, Levinson SC, & Gray RD (2011). Evolved structure of language shows lineage-specific trends in word-order universals. Nature.

Some colleagues and I have a new paper out in Nature showing that the evolved structure of language shows lineage-specific trends in word-order universals. I’ve written an overview/FAQ on this paper here, and there’s a nice review of it here and here.

The Abstract:

Languages vary widely but not without limit. The central goal of linguistics is to describe the diversity of human languages and explain the constraints on that diversity. Generative linguists following Chomsky have claimed that linguistic diversity must be constrained by innate parameters that are set as a child learns a language. In contrast, other linguists following Greenberg have claimed that there are statistical tendencies for co-occurrence of traits reflecting universal systems biases, rather than absolute constraints or parametric variation. Here we use computational phylogenetic methods to address the nature of constraints on linguistic diversity in an evolutionary framework. First, contrary to the generative account of parameter setting, we show that the evolution of only a few word-order features of languages are strongly correlated. Second, contrary to the Greenbergian generalizations, we show that most observed functional dependencies between traits are lineage-specific rather than universal tendencies. These findings support the view that—at least with respect to word order—cultural evolution is the primary factor that determines linguistic structure, with the current state of a linguistic system shaping and constraining future states.


Mutual Exclusivity in the Naming Game

The Categorisation Game or Naming Game looks at how agents in a population converge on a shared system for referring to continuous stimuli (Steels, 2005; Nowak & Krakauer, 1999). Agents play games with each other, one referring to an object with a word and the other trying to guess what object the first agent was referring to. Through experience with the world and feedback from other agents, agents update their words. Eventually, agents are able to communicate effectively.  The model is usually couched in terms of agents trying to agree on labels for colours (a continuous meaning space).  In this post I’ll show that the algorithms used have implicit mutual exclusivity biases, which favour monolingual viewpoints.  I’ll also show that this bias is not necessary and obscures some interesting insights into the evolutionary dynamics of language.
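As a rough illustration of the game loop described above, here is a stripped-down sketch. It is a simplification and not the algorithm from the cited papers: it assumes a single object rather than a continuous colour space, so there is no guessing step, and the update rule (on success both agents keep only the winning word; on failure the hearer adopts it) is one common minimal variant. All names and parameter values are hypothetical.

```python
import random

def play_naming_game(n_agents=10, n_rounds=5000, seed=0):
    """Run a minimal one-object naming game and return the final state.

    Each agent holds a set of candidate words for the single object.
    Returns (inventories, successes), where successes counts the rounds in
    which the hearer already knew the speaker's word.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    successes = 0
    next_word = 0
    for _ in range(n_rounds):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            # A speaker with no word for the object invents a new one.
            inventories[speaker].add(f"w{next_word}")
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Success: both agents drop every competing word.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            successes += 1
        else:
            # Failure: the hearer adds the speaker's word to its candidates.
            inventories[hearer].add(word)
    return inventories, successes

if __name__ == "__main__":
    final_inventories, n_success = play_naming_game()
    shared = set.intersection(*final_inventories)
    print("words shared by all agents:", shared)
    print("successful rounds:", n_success)
```

The step where both agents discard all competing words after a success is the sort of place where a one-word-per-meaning preference can be built into the update rule. Whether that matches the specific bias the rest of the post discusses is an assumption on my part.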

Continue reading

Genetic Anchoring, Tone and Stable Characteristics of Language

In 2007, Dan Dediu and Bob Ladd published a paper claiming there was a non-spurious link between the non-derived alleles of ASPM and Microcephalin and tonal languages. The key idea emerging from this research is one where certain alleles may bias language acquisition or processing, subsequently shaping the development of a language within a population of learners. Therefore, investigating potential correlations between genetic markers and typological features may open up new avenues of thinking in linguistics, particularly in our understanding of the complex levels at which genetic and cognitive biases operate. Specifically, Dediu & Ladd refer to three necessary components underlying the proposed genetic influence on linguistic tone:

“[...] from interindividual genetic differences to differences in brain structure and function, from these differences in brain structure and function to interindividual differences in language-related capacities, and, finally, to typological differences between languages.”

That the genetic makeup of a population can indirectly influence the trajectory of language change differs from previous hypotheses about genetics and linguistics. First, it is distinct from attempts to correlate genetic features of populations with language families (e.g. Cavalli-Sforza et al., 1994). And second, it differs from Pinker and Bloom’s (1990) assertions of genetic underpinnings leading to a language-specific cognitive module. Furthermore, the authors do not argue that languages act as a selective pressure on ASPM and Microcephalin; rather, this bias is a selectively neutral byproduct. Since then, there have been numerous studies covering these alleles, with the initial claims (Evans et al., 2004) for positive selection being disputed (Yu et al., 2007), as well as any claims for a direct relationship between these alleles and dyslexia, specific language impairment, working memory, IQ, and head-size (Bates et al., 2008).

A new paper by Dediu (2010) delves further into this potential relationship between ASPM/MCPH1 and linguistic tone, by suggesting this typological feature is genetically anchored to the aforementioned alleles. Generally speaking, cultural and linguistic processes proceed on shorter timescales than genetic change; however, in line with other recent studies (see my post on Greenhill et al., 2010), some typological features might be more consistently stable than others. Reasons for this stability are broad and varied. For instance, how frequently a word is used within a population is a good predictor of its rate of lexical evolution (Pagel et al., 2007). Genetic aspects, then, may also be a stabilising factor, with Dediu claiming linguistic tone is one such instance:

From a purely linguistic point of view, tone is just another aspect of language, and there is no a priori linguistic reason to expect that it would be very stable. However, if linguistic tone is indeed under genetic biasing, then it is expected that its dynamics would tend to correlate with that of the biasing genes. This, in turn, would result in tone being more resistant to ‘regular’ language change and more stable than other linguistic features.

Continue reading

Phoneme Inventory Size and Demography

It has long been established that demography drives evolutionary processes (see Hawks, 2008 for a good overview). Similar attempts are also being made to describe cultural (Shennan, 2000; Henrich, 2004; Richerson & Boyd, 2009) and linguistic (Nettle, 1999a; Wichmann & Holman, 2009; Vogt, 2009) processes by considering the effects of population size and other demographic variables. Even though these ideas are hardly new, until recently there was a ceiling on the amount of resources one person could draw upon. In linguistics, this paucity of data is being remedied through the implementation of large-scale projects, such as WALS, Ethnologue and UPSID, that bring together a vast body of linguistic fieldwork from around the world. A recent study by Lupyan & Dale (2010) gives a solid indication of how such data might be used. Here, the authors compare the structural properties of more than 2000 languages with three demographic variables: a language’s speaker population, its geographic spread and the number of linguistic neighbours. The salient point is that certain differences in structural features correspond to the underlying demographic conditions.

With that said, a few months ago I found myself wondering about a particular feature, the phoneme inventory size, and its potential relationship to underlying demographic conditions of a speech community. What piqued my interest was that two languages I retain a passing interest in, Kayardild and Pirahã, both have small phonological inventories and small speaker communities. The question is: is there a correlation between the population size of a language and its number of phonemes? Despite work hinting at such a relationship (e.g. Trudgill, 2004), there is little in the way of empirical evidence to support such claims. Hay & Bauer (2007) perhaps represent the most comprehensive attempt at an investigation, reporting a statistical correlation between the number of speakers of a language and its phoneme inventory size.

In their paper, the authors provide some evidence for the claim that the more speakers a language has, the larger its phoneme inventory. Without going into the sub-divisions of vowels (e.g. separating monophthongs, extra monophthongs and diphthongs) and consonants (e.g. obstruents), which would extend the post by about 1000 words, the key result is that the vowel inventory and consonant inventory are both correlated with population size (the authors also rule out the possibility that language families are driving the results). As they note:

That vowel inventory and consonant inventory are both correlated with population size is quite remarkable. This is especially so because consonant inventory and vowel inventory do not correlate with one another at all in this data-set (rho=.01, p=.86). Maddieson (2005) also reports that there is no correlation between vowel and consonant inventory size in his sample of 559 languages. Despite the fact that there is no link between vowel inventory and consonant inventory size, both are significantly correlated with the size of the population of speakers.

Using their paper as a springboard, I decided to look at how other demographic factors might influence the size of the phoneme inventory, namely: population density and the degree of social interconnectedness.

Continue reading

Evolution of Colour Terms: 10 Universal Patterns are not Evidence for Innate Constraints

In a series of posts, I’ve been discussing constraints on the evolution of colour terms. Here, I discuss the role of drift and also argue that universal patterns are not necessarily good evidence for innate constraints. For the full dissertation and references, go here.


An important point which has not been highlighted in the literature is the drift introduced by cultural transmission.  Perceptual systems are noisy, and change over lifetimes.  Therefore, systems of categorising these perceptions may drift over time.  However, if concepts are shared, this drift is influenced by more than one system.  This may cause a different kind of drift from a stand-alone system for self-thought.  Communication has an additional semantic bottleneck which self-thought does not have.  When using language for self-thought, if you don’t know a label, you can make one up.

However, for communication, this won’t work.  For example, in models of cultural transmission (e.g., Steels & Belpaeme, 2005) agents do create new labels but, importantly, accept the speaker’s label when available.  That is, communicative systems are more flexible than systems for self-thought (communicators must be more willing to change their minds), and so are more subject to drift.  The drift allows the system to move around the possible space of coding efficiency and object categorisation efficiency.  Peaks in these landscapes will attract the drift, hence environmental and perceptual constraints being projected into language.

Although systems of colour categorisation for self-thought may be more efficient if they were constrained by the environment, shared cultural systems are more likely to reflect constraints in the environment because they are more flexible.  That is, perceptual constraints have projected themselves into language because of a communicative pressure, rather than a perceptual or environmental pressure.

I suggest that this drift, together with an ability for categories to warp perceptual spaces, would mean that individuals converge on a shared perceptual system.  If comprehension involves the activation of perceptual representations, then communication involves individuals reaching similar perceptual representations or, in a perfect world, activation of the same neural substrates.  Therefore, a population with a shared perceptual system would be able to communicate much more effectively.  In this sense, Embodied systems improve communicative success, whereas the same effect is not necessarily true of Symbolist systems. Furthermore, this drift means that populations can still converge on similar solutions, without having to assume that Universal biases are the main driving force.  It has been argued that the similarities in colour categorisation between cultures contradict Relativism, which would predict a large variation in colour categorisation between cultures (e.g., Belpaeme & Bleys, 2005).  I argue that this inference is not necessarily valid.


This series of posts has shown that a wide range of factors constrain the categorisation of colour, including the physiology of perception, the environment and cultural transmission.  Why is there evidence for Colour Terms being adapted to so many domains?

This study considered the idea that categorisation acquired by individuals can feed back into perception and itself become a constraint on the development of categorisation, the environment and genetic inheritance.  In this sense, the feedback from categorisation allows Niche Construction dynamics to apply to linguistic categorisations.  It was argued that this dynamic fits with the Cultural implication of an Embodied account of language comprehension.  That is, this study has concluded, similarly to Kirby et al. (2007), that universal patterns across populations do not necessarily imply strong innate biases.  This was done by arguing that Cultural, Embodied systems tend to drift towards better representations of the real world, which involves better coherence with perceptual and environmental constraints, creating cross-cultural patterns.  Furthermore, an Embodied approach to cultural dynamics incorporating a mechanism for perceptual warping predicts that the perceptual spaces of individuals can be synchronised through language to achieve better communication.

Steels, L., & Belpaeme, T. (2005). Coordinating perceptually grounded categories through language: A case study for colour. Behavioral and Brain Sciences, 28 (04). DOI: 10.1017/S0140525X05000087

Belpaeme, T., & Bleys, J. (2005). Explaining Universal Color Categories Through a Constrained Acquisition Process. Adaptive Behavior, 13 (4), 293-310. DOI: 10.1177/105971230501300404

Kirby, S., Dowman, M., & Griffiths, T. (2007). Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences, 104 (12), 5241-5245. DOI: 10.1073/pnas.0608222104

Evolution of Colour Terms: 3 Perceptual Constraints

Continuing my series on the Evolution of Colour terms, this post reviews evidence for perceptual constraints on colour terms. For the full dissertation and for references, go here.

Continue reading