Children are better than adults at learning second languages. Children find it easy, can do it implicitly and achieve native-like competence. As we get older, however, we find learning a new language difficult: we need explicit teaching, and some aspects, such as grammar and pronunciation, remain hard to master. What is the reason for this? The foremost theories suggest it is linked to memory constraints (Paradis, 2004; Ullman, 2005). Children find it easy to incorporate knowledge into procedural memory – memory that encodes procedures and motor skills and has been linked to grammar, morphology and pronunciation. Procedural memory atrophies in adults, but they develop good declarative memory – memory that stores facts and is used for retrieving lexical items. This seems to explain the difference between adults and children in second language learning. However, this is a proximate explanation. What about the ultimate explanation for why languages are like this?
In a recent post, James wrote about the Social Sensitivity hypothesis. Given findings that certain genetic variants will make a person more sensitive to social contact and more reliant on social contact under stress, it proposes that certain genetic variants ‘fit’ better with certain social structures. In support of this idea, Way and Lieberman (2010) find a correlation between the prevalence of this variant and the level of collectivism (as opposed to individualism) in a society.
An alternative explanation I’ve been thinking about is migration patterns. If genetic differences make a person less reliant on social networks, they may be more likely to migrate. This would predict that areas settled later in human history will have more ‘non socially sensitive’ individuals.
Following Sean’s comment about using a regression to see how the model of phoneme inventory size improved once geographic spread was incorporated alongside population size, I decided to look at the stats a bit more closely (original post is here). It’s fairly easy to perform multiple regression in R, which, in the case of my data, gave highly significant results (p<0.001) for the intercept, area and population (residual standard error = 9.633 on 393 degrees of freedom; adjusted R-squared = 0.1084, i.e. roughly 11% of the variance explained). I then plotted scatterplots for each pair of variables. The resulting pairs plot is fairly useful as a quick summary, but it is also messy and confusing. Another problem is that the pairs plot shows the original data and not the linear model.
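For readers who want to see what such a fit involves, here is a minimal sketch of a least-squares regression with two predictors. It is written in plain Python rather than R, purely for illustration. The function names (`solve`, `fit_ols`) and all the numbers in the example are invented; they are not the data or code behind the results above, and the sketch reports only coefficients and adjusted R-squared, not p-values.

```python
# Ordinary least squares with an intercept, using only built-in Python.
# All data below are invented for illustration; they are NOT the post's data.

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Return (coefficients, adjusted R-squared); coefficients[0] is the intercept."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend the intercept column
    n, k = len(rows), len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)  # k counts the intercept too
    return beta, adj_r2


# Hypothetical example: (log area, log population) -> phoneme inventory size.
example_predictors = [(3.1, 2.5), (4.0, 3.8), (4.6, 4.1), (5.2, 5.9), (5.9, 6.3), (6.4, 7.7)]
example_phonemes = [19, 24, 31, 28, 40, 37]
coefficients, adjusted_r2 = fit_ols(example_predictors, example_phonemes)
print("intercept, area, population:", coefficients)
print("adjusted R-squared:", adjusted_r2)
```

The adjusted R-squared penalises the plain R-squared for the number of fitted parameters, which is why it is the more sensible figure to quote when comparing a population-only model against one that also includes area.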
Experimental studies (e.g. Jones & Munhall 2000) indicate that humans monitor their own speech through hearing in order to maintain accurate vocal articulation throughout the lifespan. Similarly, songbirds not only rely on song input from tutors and conspecifics in the early stages of song development, but also on the ability to hear and detect production errors in their own song and adjust it accordingly with reference to an internal ‘sensory target’ following the initial song learning phase.
This phenomenon also extends to ‘closed-ended learners’ – birds who do not acquire novel song elements after an initial learning period, but who still demonstrate song variability in adulthood. Experimental studies have shown that in such species, vocal learning is more prolonged and fundamental to song production than originally thought. For example, Okanoya and Yamaguchi (1997) showed that deafening adult Bengalese Finches resulted in the production of abnormal song syntax within a matter of days. This parallels the human case, in which linguistic fidelity, particularly with regard to prosodic aspects such as pitch and intensity, gradually degrades in adults with postlinguistically acquired auditory impairments.
It has long been established that demography drives evolutionary processes (see Hawks, 2008 for a good overview). Similar attempts are also being made to describe cultural (Shennan, 2000; Henrich, 2004; Richerson & Boyd, 2009) and linguistic (Nettle, 1999a; Wichmann & Holman, 2009; Vogt, 2009) processes by considering the effects of population size and other demographic variables. Even though these ideas are hardly new, until recently there was a ceiling on the amount of data one person could draw upon. In linguistics, this paucity of data is being remedied through large-scale projects, such as WALS, Ethnologue and UPSID, that bring together a vast body of linguistic fieldwork from around the world. A recent study by Lupyan & Dale (2010) provides a solid direction for how this might be utilised. Here, the authors compare the structural properties of more than 2000 languages with three demographic variables: a language’s speaker population, its geographic spread and its number of linguistic neighbours. The salient point is that certain differences in structural features correspond to the underlying demographic conditions.
With that said, a few months ago I found myself wondering about a particular feature, phoneme inventory size, and its potential relationship to the underlying demographic conditions of a speech community. What piqued my interest was that two languages I retain a passing interest in, Kayardild and Pirahã, both have small phonological inventories and small speaker communities. The question is: is there a correlation between the population size of a language and its number of phonemes? Despite work hinting at such a relationship (e.g. Trudgill, 2004), there is little in the way of empirical evidence to support such claims. Hay & Bauer (2007) perhaps represents the most comprehensive attempt at an investigation, reporting a statistical correlation between the number of speakers of a language and its phoneme inventory size.
In their paper, the authors provide some evidence for the claim that the more speakers a language has, the larger its phoneme inventory. I won’t go into the sub-divisions of vowels (e.g. separating monophthongs, extra monophthongs and diphthongs) and consonants (e.g. obstruents), as it would extend the post by about 1000 words; the key result is that vowel inventory and consonant inventory are both correlated with population size (and the authors also rule out language families driving the results). As they note:
That vowel inventory and consonant inventory are both correlated with population size is quite remarkable. This is especially so because consonant inventory and vowel inventory do not correlate with one another at all in this data-set (rho=.01, p=.86). Maddieson (2005) also reports that there is no correlation between vowel and consonant inventory size in his sample of 559 languages. Despite the fact that there is no link between vowel inventory and consonant inventory size, both are significantly correlated with the size of the population of speakers.
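The ‘rho’ in the quote is presumably Spearman’s rank correlation, which measures how well the ordering of one variable matches the ordering of another. As a rough sketch of how such a coefficient is computed (the function names and the toy numbers are my own and have nothing to do with Hay & Bauer’s data), one can rank both variables and then take the ordinary Pearson correlation of the ranks:

```python
# Spearman's rank correlation, computed as the Pearson correlation of ranks.
# Function names and example values are hypothetical, for illustration only.

def ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: correlate the ranks rather than the raw values."""
    return pearson(ranks(x), ranks(y))


# Invented toy data: speaker populations and vowel inventory sizes.
toy_population = [150, 2000, 45000, 900000, 12000000]
toy_vowels = [5, 3, 7, 6, 9]
print("rho:", spearman_rho(toy_population, toy_vowels))
```

Because only the ranks matter, the huge range of population sizes (from hundreds to millions of speakers) does not dominate the result in the way it could with a correlation on raw values.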
Using their paper as a springboard, I decided to look at how other demographic factors might influence the size of the phoneme inventory, namely: population density and the degree of social interconnectedness.
In the last post, I discussed some of the literature on experimental communication, with the intention of following it up by looking at recent experiments done at Edinburgh (and beyond). But as Hannah pipped me to the post with a great overview of the wide range of experiments into language evolution, I’ll instead limit this to two relatively recent papers on Human Iterated Learning (Kirby et al., 2008; Cornish et al., 2009).
Drawing on experimental approaches found in Diffusion Chain and Artificial Language Learning studies, Kirby et al. (2008) show that, as a consequence of intergenerational transmission, languages “culturally evolve in such a way as to maximize their own transmissibility: over time, the languages in our experiments become easier to learn and increasingly structured.” In these experiments a subject is exposed to an alien language made up of two elements within a finite space: meanings (each a picture with three discernible elements: colour, shape and movement) paired with signals (each a string of letters). Importantly, the subject is only trained on a subset of the meanings (SEEN items), after which they are presented with a group of meanings (some SEEN, some UNSEEN) without the corresponding signals, the goal being that they provide a signal for each (whether or not it matches the original). Once a subject has produced a full set of meaning-signal pairs, the experiment is repeated, except this time the new subject is trained on the data provided by the previous generation. This continues until the experiment is finished, which in this case happened at generation ten.
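To make the structure of the transmission chain concrete, here is a toy simulation sketch. It is emphatically not the authors’ procedure: the real learners are human participants, whereas `learner_output` below is a crude stand-in that memorises SEEN items and, for UNSEEN meanings, copies the signal of the most similar SEEN meaning. The three values per feature, the size of the SEEN set and all names are my own assumptions.

```python
import itertools
import random

# Assumption: three values per feature; the post only names the three features.
COLOURS = ["black", "blue", "red"]
SHAPES = ["circle", "square", "triangle"]
MOVEMENTS = ["bounce", "slide", "spiral"]
MEANINGS = list(itertools.product(COLOURS, SHAPES, MOVEMENTS))


def random_signal(rng, length=4):
    """Build a random letter string to serve as an initial signal."""
    return "".join(rng.choice("aeiouklmnp") for _ in range(length))


def learner_output(training, meanings):
    """Hypothetical stand-in for a participant: recall SEEN items exactly,
    and label each UNSEEN meaning with the signal of the SEEN meaning that
    shares the most features with it."""
    output = {}
    for meaning in meanings:
        if meaning in training:
            output[meaning] = training[meaning]
        else:
            nearest = max(
                training,
                key=lambda seen: sum(a == b for a, b in zip(seen, meaning)),
            )
            output[meaning] = training[nearest]
    return output


def run_chain(generations=10, n_seen=14, seed=0):
    """Pass a language down a chain; each learner trains on a random SEEN subset
    of the previous learner's output. Returns the language at every generation."""
    rng = random.Random(seed)
    language = {m: random_signal(rng) for m in MEANINGS}  # generation 0
    history = [language]
    for _ in range(generations):
        seen = rng.sample(MEANINGS, n_seen)
        training = {m: language[m] for m in seen}
        language = learner_output(training, MEANINGS)
        history.append(language)
    return history


history = run_chain()
for generation, language in enumerate(history):
    print(generation, "distinct signals:", len(set(language.values())))
```

With this particular stand-in learner, the number of distinct signals can only stay the same or fall from one generation to the next, since every output signal is copied from the training set. That is a property of the toy learner, not a claim about what human participants do.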
Cultural differences are often attributed to events far removed from genetics. This belief usually rests on the assertion that if you take an individual, at birth, from one society and implant them in another, they will generally grow up to become well-adjusted to their adopted culture. Whilst this is more than likely true, even if certain cultural features may disagree with someone of a different ethnic background (e.g. degrees of alcohol tolerance), the situation is not as clear cut as certain political factions may have you believe. Yet, largely due to studies on gene-culture coevolution, we are now starting to understand the complex dynamics through which genes and culture interact.
First, a particular culture may exert selection pressures on genes that confer an advantage in adopting a particular cultural trait. This is evident in the strong selection for the lactose-tolerance allele following the spread of dairy farming. Second, pre-existing gene distributions provide pressures to which culture adapts. Off the top of my head, one proposed example of this is a paper by Dediu and Ladd (2007), which looked at how the distribution of the derived haplotypes of ASPM and Microcephalin may have subtly influenced the development of tonal languages. The paper in question, however, looks more broadly at culture. Specifically, the authors, Baldwin Way and Matthew Lieberman, examine recent genetic association studies and how variation within genes involved in central neurotransmitter systems is associated with differences in social sensitivity. In particular, they highlight a correlation between the relative frequencies of certain gene variants and the relative degree of individualism or collectivism within certain populations.
Most of you in the science blogosphere have probably come across Razib’s recent post on linguistic diversity and poverty. The basic argument is that linguistic homogeneity is good for economic development and general prosperity. I was quite happy to let the debate unfold and limit my stance on the subject to the following few sentences I posted previously:
From the perspective of a linguist, however, I do like the idea of really obscure linguistic communities, ready and waiting to be discovered and documented. On the flip side, it is selfish of me to want these small communities to remain in a bubble, free from the very same benefits I enjoy in belonging to a modern, post-industrialised society. Our goal, then, should probably be more focused on documenting, as opposed to saving, these languages.
Since then, the debate has become a lot more heated, with Neuroanthropology wading in against Razib in a post that, in its second half at least, is worth reading just to get the general flavour of the other side in this debate. Having said that, I wasn’t convinced by the evidence Greg Downey used to dismiss Razib’s hypothesis, so I decided to actually look at the literature on the subject. The first paper I found upon searching was one by Nettle et al., in which they examine the relationship between cultural diversity and societal instability using a large cross-national data set of 212 nations. Importantly, they look at cultural diversity along three dimensions: linguistic, ethnic and religious. They also draw a distinction between within-nation (alpha) diversity and between-nation (beta) diversity. Lastly, unlike other studies on the subject, where simple regression or correlation methods are used, this study employs structural equation modelling (SEM):
Throughout much of our history language was transitory, existing only briefly within its speech community. The invention of writing systems heralded a way of recording some of its recent history, but for the most part linguists lack the stone tools archaeologists use to explore the early history of ancient technological industries. The question of how far back we can trace the history of languages is therefore an immensely important, and highly difficult, one to answer. However, it’s not impossible. Like biologists, who use highly conserved genes to probe the deepest branches on the tree of life, some linguists argue that highly stable linguistic features hold the promise of tracing ancestral relations between the world’s languages.
Previous attempts using cognates to infer the relatedness between languages are generally limited to predictions within the last 6000-10,000 years. In the present study, Greenhill et al (2010) decided to examine more stable linguistic features than the lexicon, arguing: