There has been a lot of talk round these parts recently about the merits of pluralistic approaches to problems in language evolution, and about the problems with assigning too much explanatory power to statistical correlations taken in isolation from other forms of evidence, such as cultural learning experiments. Sean and James recently published a paper about this here, which includes some commentary on Hay & Bauer (2007), who find that speaker population size and phoneme inventory size correlate (the more speakers a language has, the bigger its phoneme inventory). James has blogged about this extensively here. More recently, Moran, McCloy & Wright presented a critical analysis of Hay & Bauer's (2007) findings here, along with a statistical analysis of their own, which uses more languages than Hay & Bauer (2007) and finds little to no correlation between speaker population and various measures of the phonological system. I hope James, as the resident expert, will blog about this.
As I've just mentioned, doing further statistical analysis is one good way of disputing or confirming the results of large-scale statistical studies. But turning to experimental evidence is also a good way to back up statistical findings and to tease out patterns of causation. I discuss this briefly here.
Recently, I was reading Selten & Warglien (2007) (mentioned by James here and covered by John Hawks here), a study of how simple languages emerge within a coordination task with no initial shared language. The experiment uses pairwise interactions in which participants had to refer to figures that could be distinguished by three features: outer shape, inner shape and colour (see picture). Participants were given a limited repertoire of letters with which to build a code for communicating with one another. However, using letters had a cost within the language game the participants were playing, so the fewer letters they used, the higher their score. Also, the more communicatively successful they were, the higher their score.
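To make the trade-off between letter cost and communicative success concrete, here is a minimal sketch of a per-round score. The function name `round_payoff`, the numbers for `reward` and `letter_cost`, and the choice to charge for letters even when communication fails are all my own assumptions for illustration; they are not the payoff scheme reported by Selten & Warglien (2007).

```python
def round_payoff(signal: str, understood: bool,
                 reward: float = 10.0, letter_cost: float = 1.0) -> float:
    """Hypothetical score for one round of the game.

    signal      -- the string of letters that was sent
    understood  -- True if the partner picked out the intended figure
    reward      -- assumed payoff for successful communication
    letter_cost -- assumed cost charged for each letter used
    """
    # Every letter in the signal is paid for, whether or not it works.
    cost = letter_cost * len(signal)
    # Success earns the reward; failure earns nothing.
    gain = reward if understood else 0.0
    return gain - cost


# Shorter signals score higher, as long as they are still understood.
short_ok = round_payoff("A", understood=True)
long_ok = round_payoff("ABA", understood=True)
long_fail = round_payoff("ABA", understood=False)
```

Under these assumed numbers a short successful signal beats a long successful one, and any failed signal scores below zero, which is the pressure towards short but distinct expressions described above.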
The study was primarily interested in what enhanced the emergence of structure in this code via the communication game. They looked at the effects of two variables: the number of letters available and the variability in the set of figures. I am only going to discuss the effects of the first variable here. Selten & Warglien (2007) start off with an experiment in which only two (and then three) letters were available, and which showed very little convergence to a common code. A common code is defined as a code in which the signals for all figures agree between the two participants. However, when given a larger inventory of letters to play with, participants were much more successful at creating a common code. This is not surprising, as more symbols permit a higher degree of cost efficiency within the language game: you can use more distinct, shorter expressions. Selten & Warglien (2007) also make the point that the human capability to produce a large variety of phonetic signals seems to be at the root of the emergence of most linguistic structure, because if you only have a small inventory of individual units, you have to rely more on positional structure. Positional systems, such as the Arabic number notation, are more likely to be invented rapidly than to be the product of slow emergence via cultural evolution, but they can be used easily once they have emerged.
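The two ideas in this paragraph can be sketched in a few lines of code: the definition of a common code, and the reason a larger inventory permits shorter expressions. This is only an illustration under my own assumptions. The function names, the representation of a code as a dictionary from figure to signal, and the example of 12 figures are hypothetical and not taken from the paper.

```python
def is_common_code(code_a: dict, code_b: dict) -> bool:
    """True if both participants use the same signal for every figure."""
    # Both codes must cover the same figures and agree on each of them.
    return code_a.keys() == code_b.keys() and all(
        code_a[figure] == code_b[figure] for figure in code_a
    )


def longest_signal_needed(n_figures: int, n_letters: int) -> int:
    """Smallest L such that signals of length 1..L can name every figure."""
    if n_figures < 1 or n_letters < 1:
        raise ValueError("need at least one figure and one letter")
    length = 0
    available = 0  # number of distinct signals of length <= `length`
    while available < n_figures:
        length += 1
        available += n_letters ** length
    return length


# Hypothetical example: 12 figures to distinguish.
with_two_letters = longest_signal_needed(12, 2)
with_three_letters = longest_signal_needed(12, 3)
with_twelve_letters = longest_signal_needed(12, 12)
```

With two letters, some of the 12 figures need a three-letter signal; with three letters two is enough; with twelve letters every figure can have a one-letter name. Since each letter costs points, a small inventory forces longer and more expensive signals.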
This is all very interesting in its own right, but the reason I brought it up in this post is that Selten & Warglien (2007) have shown that you can experimentally explore the effects of inventory size on an artificial language in a laboratory setting. I know that the natural assumption about the direction of causation is that demographic structure (e.g. the size of a population) affects linguistic structure (e.g. the size of the phoneme inventory), but it might be possible to test whether a common code is reached more easily within a small language community using only a small number of phonemes than within a larger speaker community. I'm not sure how one might create an experimental proxy for population size in an experiment such as this (perhaps repeated interaction between the same participants compared with interaction within changing pairs). It might also be possible to look at the effects that inventory size can have on other linguistic features that have been hypothesised to correlate with population size, e.g. how regular the compositional structure of an emerging language is given different inventory sizes.
Hay, J., & Bauer, L. (2007). Phoneme inventory size and population size. Language, 83 (2), 388-400. DOI: 10.1353/lan.2007.0071
Roberts, S. & Winters, J. (2012). Social Structure and Language Structure: the New Nomothetic Approach. Psychology of Language and Communication, 16(2), pp. 79-183. Retrieved 12 Feb. 2013, from doi:10.2478/v10057-012-0008-6
Selten, R., & Warglien, M. (2007). The emergence of simple languages in an experimental coordination game. Proceedings of the National Academy of Sciences, 104 (18), 7361-7366. DOI: 10.1073/pnas.0702077104