Compositionality and Bilingualism

Last week I put up a link to an online experiment. Here are the results! You can still do the experiment first, if you like, here. Source code and raw results are at the bottom.

Languages evolve over time under a pressure to be learned by a new generation. Does learning two languages at once affect this pressure? My experiment says … maybe.

The pressures on a language include ones for learnability (compression) and expressivity (the ability to express a large variety of meanings; Kirby, Cornish & Smith, 2008). Bilingualism seems like an unlikely ability, since learning an extra language potentially leaves the speaker no more expressive, at the cost of extra learning effort. There is no pressure for one language structure (e.g. English) to adapt to another language (e.g. Mandarin) so that they can become optimally learnable and expressive as a single medium. That is, there’s no reason to assume that expressivity and learnability pressures apply across languages (which are not being used by the same people).

Nevertheless, children display an aptitude and a willingness to learn and use multiple languages simultaneously, and at a similar rate to monolingual children.  Therefore, languages do seem to have adapted to be learnable simultaneously.  Does the compatibility of languages point to a strong innate property of language?  In contrast, it might point to underlying similarity in the structure of languages, brought about by universal principles of communication.

Continue reading “Compositionality and Bilingualism”

The Return of the Phoneme Inventories

Right, I already referred to Atkinson’s paper in a previous post, and much of the work he’s presented is essentially part of a potential PhD project I’m hoping to do. Much of this stems back to last summer, when I mentioned how the phoneme inventory size correlates with certain demographic features, such as population size and population density. Using the UPSID data I generated a generalised additive model to demonstrate how area and population size interact in determining the phoneme inventory size:

Interestingly, Atkinson seems to derive much of his thinking, at least in his choice of demographic variables, from work into the transmission of cultural artefacts (see here and here). For me, there are clear uses for these demographic models in testing hypotheses for linguistic transmission and change, as I see language as a cultural product. It appears Atkinson reached the same conclusion. Where we depart, however, is in our overall explanations of the data. My major problem with the claim is theoretical: he hasn’t ruled out other historical-evolutionary explanations for these patterns.

Before we get into the bulk of my criticism, I’ll provide a very brief overview of the paper.

Continue reading “The Return of the Phoneme Inventories”

More on Phoneme Inventory Size and Demography

On the basis of Sean’s comment, about using a regression to look at how phoneme inventory size improved as geographic spread was incorporated along with population size, I decided to look at the stats a bit more closely (original post is here). It’s fairly easy to perform multiple regression in R, which, in the case of my data, gave highly significant results (p < 0.001) for the intercept, area and population (residual standard error = 9.633 on 393 degrees of freedom; adjusted R-squared = 0.1084). I then plotted all the combinations as scatterplots for each pair of variables. As you can see below, this is fairly useful as a quick summary but it is also messy and confusing. Another problem is that the pairs plot shows the original data and not the linear model.
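For readers who want to see what such a regression involves, here is a minimal sketch in Python rather than R. It is not the analysis from the post: the data are synthetic stand-ins, the variable names (`area`, `population`, `inventory`) and the helper `fit_ols` are hypothetical, and whether the original variables were transformed is not stated, so no transformation is assumed here.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficients (intercept first) and the adjusted R-squared.
    """
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Penalise R-squared for the number of predictors k
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return beta, adj_r2

# Synthetic stand-in data (NOT the UPSID data used in the post)
rng = np.random.default_rng(0)
n_languages = 200
area = rng.normal(10.0, 2.0, n_languages)
population = rng.normal(9.0, 3.0, n_languages)
inventory = 20.0 + 0.5 * area + 1.0 * population + rng.normal(0.0, 9.0, n_languages)

beta, adj_r2 = fit_ols(np.column_stack([area, population]), inventory)
print("intercept, area, population:", beta)
print("adjusted R-squared:", adj_r2)
```

With two predictors plus an intercept, the residual degrees of freedom are the number of languages minus three, which is the quantity reported alongside the residual standard error above.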

Continue reading “More on Phoneme Inventory Size and Demography”

Phoneme Inventory Size and Demography

It has long been established that demography drives evolutionary processes (see Hawks, 2008 for a good overview). Similar attempts are also being made to describe cultural (Shennan, 2000; Henrich, 2004; Richerson & Boyd, 2009) and linguistic (Nettle, 1999a; Wichmann & Homan, 2009; Vogt, 2009) processes by considering the effects of population size and other demographic variables. Even though these ideas are hardly new, until recently there was a ceiling on the amount of resources one person could draw upon. In linguistics, this paucity of data is being remedied through the implementation of large-scale projects, such as WALS, Ethnologue and UPSID, that bring together a vast body of linguistic fieldwork from around the world. Providing a solid direction for how this might be utilised is a recent study by Lupyan & Dale (2010). Here, the authors compare the structural properties of more than 2000 languages with three demographic variables: a language’s speaker population, its geographic spread and the number of linguistic neighbours. The salient point is that certain differences in structural features correspond to the underlying demographic conditions.

With that said, a few months ago I found myself wondering about a particular feature, the phoneme inventory size, and its potential relationship to the underlying demographic conditions of a speech community. What piqued my interest was that two languages I retain a passing interest in, Kayardild and Pirahã, both contain small phonological inventories and have small speaker communities. The question is: is there a correlation between the population size of a language and its number of phonemes? Despite work hinting at such a relationship (e.g. Trudgill, 2004), there is little in the way of empirical evidence to support such claims. Hay & Bauer (2007) perhaps represent the most comprehensive attempt at an investigation, reporting a statistical correlation between the number of speakers of a language and its phoneme inventory size.

In their paper, the authors provide some evidence for the claim that the more speakers a language has, the larger its phoneme inventory. Without going into the sub-divisions of vowels (e.g. separating monophthongs, extra monophthongs and diphthongs) and consonants (e.g. obstruents), as it would extend the post by about 1000 words, the vowel inventory and consonant inventory are both correlated with population size (the authors also rule out that language families are driving the results). As they note:

That vowel inventory and consonant inventory are both correlated with population size is quite remarkable. This is especially so because consonant inventory and vowel inventory do not correlate with one another at all in this data-set (rho=.01, p=.86). Maddieson (2005) also reports that there is no correlation between vowel and consonant inventory size in his sample of 559 languages. Despite the fact that there is no link between vowel inventory and consonant inventory size, both are significantly correlated with the size of the population of speakers.
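The rho in the passage above is a rank correlation. As a rough illustration of how such a value is computed, here is a small Python sketch that treats Spearman's rho as the Pearson correlation of ranks. The function name `spearman_rho` and the data are made up for illustration, and the simple ranking used here assumes no tied values; real inventory counts contain ties, for which a routine such as `scipy.stats.spearmanr` (which assigns average ranks) would be the safer choice.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of ranks.

    Assumes there are no tied values in x or in y.
    """
    rank_x = np.argsort(np.argsort(x))  # rank of each element, starting at 0
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Synthetic stand-in data (NOT the data-set from Hay & Bauer, 2007)
rng = np.random.default_rng(1)
log_population = rng.normal(9.0, 3.0, 150)
inventory_score = 0.3 * log_population + rng.normal(0.0, 2.0, 150)

print("rho:", spearman_rho(log_population, inventory_score))
```

Because only ranks matter, any monotonic relationship gives the same rho whether or not it is linear, which makes the statistic a reasonable fit for heavily skewed variables such as speaker population.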

Using their paper as a springboard, I decided to look at how other demographic factors might influence the size of the phoneme inventory, namely: population density and the degree of social interconnectedness.

Continue reading “Phoneme Inventory Size and Demography”

Experiments in communication pt 2: Human Iterated Learning

In the last post, I discussed some of the literature on experimental communication, with the intention of then following it up by looking at recent experiments done at Edinburgh (and beyond). But as Hannah pipped me to the post, with a great overview of the wide range of experiments into language evolution, I’ll instead limit this to two relatively recent papers on Human Iterated Learning (Kirby et al., 2008; Cornish et al., 2009).

Drawing from experimental approaches found in Diffusion Chain and Artificial Language Learning studies, Kirby et al. (2008) show that as a consequence of intergenerational transmission languages “culturally evolve in such a way as to maximize their own transmissibility: over time, the languages in our experiments become easier to learn and increasingly structured.” In these experiments a subject is exposed to an alien language, made up of two elements within a finite space: meanings (consisting of a picture with three discernible elements: colour, shape and movement) paired with signals (consisting of a string of letters). Importantly, the subject is only exposed to a set number of meanings (SEEN items), after which they are presented with a group of meanings (some SEEN, some UNSEEN) without the corresponding signal; the goal is that they provide a response for each one, whether or not it matches the original. Once the subject has produced a signal for every meaning, the experiment is repeated, except this time the new subject is trained on the data provided by the previous generation. This continues until the experiment is finished, which in this case happened at generation ten.
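The transmission chain described above can be sketched as a short simulation. To be clear, this is a toy under stated assumptions and not the procedure or learner model from Kirby et al. (2008), whose participants were people: the feature values, the syllable set, the number of SEEN items (`n_seen=14`) and the similarity-based learner in `learn_and_produce` are all hypothetical choices made for illustration.

```python
import itertools
import random

# Placeholder feature values: three per dimension gives 27 meanings
COLOURS = ["black", "blue", "red"]
SHAPES = ["circle", "square", "triangle"]
MOVEMENTS = ["bounce", "spiral", "straight"]
MEANINGS = list(itertools.product(COLOURS, SHAPES, MOVEMENTS))

def random_language(rng):
    """Initial language: a random three-syllable signal for every meaning."""
    syllables = ["ka", "po", "li", "mu", "ne", "ti"]
    return {m: "".join(rng.choice(syllables) for _ in range(3)) for m in MEANINGS}

def similarity(a, b):
    """Number of feature values two meanings share."""
    return sum(x == y for x, y in zip(a, b))

def learn_and_produce(seen_pairs, rng):
    """Toy learner (an assumption): recall SEEN signals exactly, and for an
    UNSEEN meaning reuse the signal of a most-similar SEEN meaning."""
    output = {}
    for meaning in MEANINGS:
        if meaning in seen_pairs:
            output[meaning] = seen_pairs[meaning]
        else:
            best = max(similarity(meaning, s) for s in seen_pairs)
            candidates = [s for s in seen_pairs if similarity(meaning, s) == best]
            output[meaning] = seen_pairs[rng.choice(candidates)]
    return output

def run_chain(generations=10, n_seen=14, seed=0):
    """Each generation is trained on a SEEN subset of the previous output."""
    rng = random.Random(seed)
    language = random_language(rng)
    history = [language]
    for _ in range(generations):
        seen = rng.sample(MEANINGS, n_seen)
        training = {m: language[m] for m in seen}
        language = learn_and_produce(training, rng)
        history.append(language)
    return history

history = run_chain()
for generation, language in enumerate(history):
    print(generation, "distinct signals:", len(set(language.values())))
```

Because this toy learner never invents a new signal, the number of distinct signals can only stay the same or fall from one generation to the next. That is a property of the sketch, not a claim about the human results; the point is only to show how the output of one learner becomes the training data of the next.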

Continue reading “Experiments in communication pt 2: Human Iterated Learning”

Language evolution in the laboratory

When talking about language evolution there’s always resistance from people exclaiming: ‘but how do we know?’, ‘surely all of this is conjecture!’ and, because of this, ‘what’s the point?’

Thomas Scott-Phillips and Simon Kirby have written a new article (in press) in ‘Trends in Cognitive Sciences’ which reviews some of the techniques currently used to study language evolution through experiments in the laboratory.

The Problem of language evolution

The problem of language evolution is one which encompasses not only the need to explain biologically how language came about but also how language came to be how it is today through processes of cultural evolution. Because of this, potential ambiguity arises when using the term ‘language evolution’. To resolve this ambiguity the authors put forward the following:

Language evolution researchers are interested in the processes that led to a qualitative change from a non-linguistic state to a linguistic one. In other words, language evolution is concerned with the emergence of language

Continue reading “Language evolution in the laboratory”