The Bilingual Paradox in Language Evolution: Top-Down versus Bottom-Up Approaches

When thinking about bilingualism and language evolution, there appears to be a paradox: children are adept at learning more than one language at a time, and there are many bilingual societies in the world. However, pressures on memory and redundancy make it unclear what the adaptive advantage of a cognitive capacity for learning multiple languages would have been at an early stage of language evolution. For instance, Hagen (2008) has argued that a bilingual ability would not have been adaptive in early societies and so could not have been selected for. Furthermore, many models have suggested that bilingualism is an unstable trait in a society (e.g. Castello et al., 2008). How can we account for the evolution of this ability? Would an early population of language users most likely have been monolingual or bilingual? Here, I take a top-down and a bottom-up approach and show that they tend to lead to two different conclusions.

Top-Down Approach

A rational, top-down approach to this problem could ask in what situations it would be rational for a learner to assume multiple languages in its input. We could define a model where learners have to induce which language or languages are being spoken by considering data produced by multiple teachers. Burkett & Griffiths (2010) present a model of Iterated Learning with Bayesian agents that have a prior expectation about the number of languages in their input (also described here). If the learners assume multiple languages in their input, the distribution of languages converges to reflect the prior biases of the learners. Otherwise, if the learners assume that there are only one or a few languages in their input, the distribution tends to reflect the initial conditions. The model could be extended so that this expectation is itself estimated from the data using an overhypothesis.
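To make the kind of inference involved concrete, here is a minimal sketch of a single learner weighing a one-language hypothesis against a two-language hypothesis. This is not Burkett & Griffiths' model: the three hypotheses, the `noise` parameter, the `prior_multi` parameter and the function name are all assumptions made up for illustration, and utterances are pooled across teachers rather than tied to individual speakers.

```python
from math import exp, log


def posterior_over_hypotheses(n_a, n_b, prior_multi=0.5, noise=0.05):
    """Posterior over three toy hypotheses about the learner's input.

    n_a, n_b     -- counts of utterances that look like language A or B
    prior_multi  -- prior probability that two languages are in use
                    (an illustrative stand-in for a prior over the
                    number of languages; must lie strictly in (0, 1))
    noise        -- chance that an utterance from one language looks
                    like the other (hypothetical parameter)
    """
    if not 0 < prior_multi < 1 or not 0 < noise < 1:
        raise ValueError("prior_multi and noise must lie strictly between 0 and 1")

    # Log-likelihood of the pooled data under each hypothesis.
    log_like = {
        "only_A": n_a * log(1 - noise) + n_b * log(noise),
        "only_B": n_a * log(noise) + n_b * log(1 - noise),
        # Under "both", each utterance is equally likely to look like A or B.
        "both": (n_a + n_b) * log(0.5),
    }
    # The single-language hypotheses share the remaining prior mass equally.
    prior = {
        "only_A": (1 - prior_multi) / 2,
        "only_B": (1 - prior_multi) / 2,
        "both": prior_multi,
    }
    log_post = {h: log(prior[h]) + log_like[h] for h in log_like}

    # Normalise in a numerically stable way.
    top = max(log_post.values())
    unnorm = {h: exp(v - top) for h, v in log_post.items()}
    total = sum(unnorm.values())
    return {h: u / total for h, u in unnorm.items()}


# A learner biased towards one language who hears an even mix of A and B.
balanced = posterior_over_hypotheses(10, 10, prior_multi=0.1)
# A learner biased towards two languages who hears only A.
skewed = posterior_over_hypotheses(20, 0, prior_multi=0.9)
```

In this toy version the data can override the prior in either direction. The chain-level results described above (convergence to the prior versus dependence on initial conditions) only arise once learners are linked across generations, which this sketch does not attempt.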

However, we may be no closer to solving the paradox. The model suggests that if learners are sensitive to the distribution of variation over speakers then a bilingual ability would be rational in certain scenarios, most probably where the structure of the society was complex. However, this argument is a little circular: in a situation likely to produce variation between speakers, being sensitive to that variation would be rational. In order to account for a current bilingual ability, we would have to assume that early communities of language speakers had complex structures (e.g. migration, trade, fission/fusion dynamics). That is, we would have to assume that a bilingual ability was adaptive at an early stage of language evolution.

I suggest that there are two reasons to take a different approach. The first is that we would like an argument about an adaptive benefit for communication in general that would lead to the ability to learn multiple languages. There's no reason to believe that there's a specific ability for learning multiple languages. I get the feeling that, outside of the field of bilingualism, the ability to learn two languages at once is seen as a sort of bolt-on feature: an extra ability, on top of the ability to learn languages, that you have when you're young and that then fades away. Instead, learning multiple languages may involve no more mechanisms than are required to learn one - the bilingual ability may be a by-product of more general learning abilities. Certainly there are extra factors to consider - multiple names for things, more variation in phonetics, multiple sets of phonological rules, more demands on executive control and so on. But these seem to be extensions of abilities that are required for language learning in general. More specifically, they may just be differences in the heuristics used by children in different scenarios (e.g. differences in mutual exclusivity assumptions, see my posts here and here). Therefore, a more pertinent question may be 'How do children recognise bilingual situations in order to change their learning parameters, and how do those situations come about?', rather than 'How do children learn two languages at once?'.

The second is the debatable reality of a 'language' as a discrete unit separable from its speakers. Defining criteria for dividing the variation between speakers into separate languages is not a trivial task. Mutual intelligibility can be used, but there are degrees of intelligibility (which are not always bi-directional) and it is unclear where to draw the line in a chain of mutually intelligible varieties where the speakers at either end cannot understand each other. Furthermore, the distinction between languages and dialects is often affected by politics or issues of national identity. Bilingualism complicates the matter by combining multiple varieties in one speaker, which may be used as single 'mediums' through code-switching (Gafaranga, 2000). From a functional perspective, then, it is unclear how to define a language. Therefore, modelling units such as "languages" without an underlying link to their content and their speakers may be unrealistic: they may not have a cognitive or social reality in all situations. It's true that in Burkett & Griffiths' model utterances are diagnostic of languages, rather than actually representing languages, but they are still not linked to individuals. See my post on Bayesian Bilingualism, in which I describe a model that does take account of who says what.

Bottom-Up Approach

One solution to the problems above is to set aside large-scale units such as language and consider a mechanism which would lead to a bilingual ability given a bottom-up approach.  The assumption is that learners are trying to find structure in their input by finding the most efficient way of dividing the variation they perceive into conditioned clusters.  Languages will order conditioning factors differently.  For some languages, contrasts of intonation are important at a high level (questions versus statements), while in tonal languages they would be important at a lower level (words).  Similarly, aspects of semantics such as tense, gender or number could be differently important in different languages.  This ability would optimally be multi-modal in order to condition variation by any aspect of meaning.  This has a clear adaptive advantage.  Furthermore, it would allow variables such as speaker identity to become important conditioning factors.  In a community where different speakers preferentially use different varieties, speaker identity may be an efficient and salient conditioning factor, leading to the learner assuming multiple languages.
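One way to picture "finding the most efficient conditioning factor" is to ask which candidate factor leaves the least uncertainty about which variant will be heard. The sketch below uses conditional entropy for this. That choice of measure, the toy observations, and all function and field names are assumptions for illustration only, not a mechanism proposed in the literature cited here.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of variant frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)


def conditional_entropy(observations, factor):
    """Uncertainty about the variant that remains once `factor` is known."""
    groups = defaultdict(Counter)
    for obs in observations:
        groups[obs[factor]][obs["variant"]] += 1
    total = len(observations)
    # Weight each group's entropy by how often that group occurs.
    return sum(sum(g.values()) / total * entropy(g) for g in groups.values())


def best_conditioning_factor(observations, factors):
    """Pick the factor that leaves the least residual uncertainty."""
    return min(factors, key=lambda f: conditional_entropy(observations, f))


# Hypothetical input: two caregivers who each consistently use one variety.
observations = [
    {"speaker": "mum", "time": "morning", "variant": "dog"},
    {"speaker": "mum", "time": "evening", "variant": "dog"},
    {"speaker": "dad", "time": "morning", "variant": "chien"},
    {"speaker": "dad", "time": "evening", "variant": "chien"},
]

chosen = best_conditioning_factor(observations, ["speaker", "time"])
```

In this toy input, knowing the speaker removes all uncertainty about the variant while knowing the time of day removes none, so speaker identity is selected. If every speaker used both variants interchangeably, speaker identity would carry no such advantage and would not be singled out.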

This approach is similar to that followed by Petitto & Kovelman (2003) who show that children are aware of important separations of their input at very early stages, allowing bilinguals to reach milestones in language learning at the same time as monolinguals.  The early identification of speaker identity as an important conditioning factor may also partially explain some other differences between monolingual and bilingual learners such as the development of perspective taking and meta-linguistic abilities.

(For another discussion of top-down versus bottom-up approaches, see my post here.)

Resolving the paradox

Given a bottom-up approach, the paradox can be resolved. Language learners have an ability to detect structure conditioned on aspects of meaning at multiple levels. In a complex social structure, a salient aspect may be speaker identity, leading to the assumption that there are multiple varieties spoken in the population and therefore to a rational approach to learning that variation. However, this assumption can be reached without positing a dedicated mechanism for rationally assessing the structure of the society and without abstracting to the concept of a "language" unit. Furthermore, bilingualism can emerge whenever the situation is appropriate. In this case, unlike in the top-down approach, early communities need not have been complex to explain our capacity for bilingualism. That is, the ability to acquire multiple languages need not have been an important factor in evolutionary history for humans to have developed the ability to acquire them.

In conclusion, a rational, top-down approach told us the situations in which it was rational to assume multiple languages in the population. Assuming a bilingualism-specific ability (which the Bayesian formulation may trick us into accepting), we must conclude that the structure of early societies was complex, otherwise the bilingual ability would not have emerged. That is, we resolve the paradox by assuming that the second premiss is false. However, a bottom-up approach led to a different conclusion, and a different picture of language evolution. This approach assumes a general learning mechanism for conditioning variation by different aspects of meaning. This may explain how children learn multiple varieties in the appropriate situations. This approach resolves the paradox by suggesting that it is badly formed: what we should be asking is how children approach the task of learning multiple levels of conditioned variation, and how that interacts with social structures to bring about bilingual societies.


Note: Petitto & Kovelman (2003) also talk about a 'bilingual paradox', but theirs is that we marvel at how good children are at learning multiple languages while being worried that it might have negative effects. The paradox above is sort of an evolutionary extension of this.

