Spurious correlation bonanza to mark Replicated Typo 2.0 reaching 100,000 hits

Replicated Typo 2.0 has reached 100,000 hits!  The most popular search term that leads visitors here is ‘What makes humans unique?’ and part of the answer has to be our ability to transmit our culture.  But as we’ve shown on this blog, culturally transmitted features can be highly correlated with each other.  This fact is a source of both frustration and fascination, so I’ve roped together some of my favourite investigations of cultural correlations into a correlation super-chain.  In addition, there’s a whole new spurious correlation at the end of the article!

Edit: You can hear me talk about these correlations in an extended EU:Sci podcast.

Let Replicated Typo take you on a trip from acacia trees to traffic accidents…


The Declining Academic Performance of Men

PZ Myers points to a TED video of Philip Zimbardo (see below) that links the declining academic performance of men with arousal addiction: here, the transition from boys to men in our modern society is characterised by “digitally rewired” brains that are in search of constant arousal etc etc. Like Myers, I’m sceptical of these claims, but I think they are certainly worth investigating, just not in the fashion employed by Susan Greenfield (you know, she of pseudo-neuroscientific fame). What I would like to see answered is: Do all Internet-influenced societies see this general trend of declining academic performance in men?

Another research question we might want to test, or control for in our hypothetical study, is whether there is a correlation between the number of female teachers and male academic performance. I haven’t bothered to look into the literature on this, so maybe a study has already been done, but female teachers certainly appear to outnumber their male counterparts in many corners of the globe (especially in primary school education). In Wales, for instance, I was astonished to find that 74.7% of teachers are female. My point: there might be a more obvious underlying cause as to why women are outperforming men, other than the rise of the zombie-generation of internet-addicted gamers. Still, I’m going to go with the cop-out approach and claim there are numerous factors underpinning male achievement (or lack thereof) in academia and beyond. I just wanted to point out that, in any study purporting to provide answers about declining educational attainment, you first really need to look at who is doing the teaching.


Great Andamanese: The key to more than one linguistic puzzle?

Last week we had a lecture from Anvita Abbi on rare linguistic structures in Great Andamanese – a language spoken in the Andaman Islands.  The indigenous populations of the Andaman Islands lived in isolation for tens of thousands of years until the 19th Century, but still exhibit some common features of south-east Asian languages such as retroflex consonants.  This could be evidence for the migration route of humans from India to Australia.  Indeed, recent genetic research has shown that the Andamanese are descendants of the first human migration from Africa in the Palaeolithic, though Abbi suggested that the linguistic evidence is also a strong marker of human migration and an “important repository of our shared human history and civilization”.

Although the similarities are fascinating for studies of cultural evolution, the rarity of some structures in Great Andamanese is even more intriguing.

The Andaman Islands


Linguistic diversity and traffic accidents

This post was chosen as an Editor's Selection for ResearchBlogging.org.

I was thinking about Daniel Nettle’s model of linguistic diversity, which showed that linguistic variation tends to decline even with a small amount of migration between communities.  I wondered if statistics about population movement would correlate with linguistic diversity, as measured by the Greenberg Diversity Index (GDI) for a country (see below).  However, this is a cautionary tale about obsession and the use of statistics.  (See the bottom of the post for a link to the data.)
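For readers who haven’t met the GDI before, here is a minimal sketch of how it is usually computed: one minus the sum of squared speaker shares, i.e. the probability that two randomly chosen people in a country have different mother tongues.  The function name and the speaker counts are made up for illustration; they are not taken from the data used in the post.

```python
def greenberg_diversity_index(speaker_counts):
    """Return 1 - sum(p_i ** 2), where p_i is each language's share of speakers."""
    total = sum(speaker_counts)
    if total <= 0:
        raise ValueError("need at least one speaker")
    return 1.0 - sum((n / total) ** 2 for n in speaker_counts)


# Hypothetical countries (invented speaker counts, illustration only).
monolingual = greenberg_diversity_index([1000])            # one language
two_equal = greenberg_diversity_index([500, 500])          # two equal languages
four_equal = greenberg_diversity_index([250, 250, 250, 250])

print(monolingual, two_equal, four_equal)
```

The index is 0 when everyone shares one language and approaches 1 as speakers are spread evenly over more and more languages.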


A random walk model of linguistic complexity

EDIT: Since writing this post, I have discovered a major flaw with the conclusion which is described here.

One of the problems with large-scale statistical analyses of linguistic typologies is the temporal resolution of the data.  Because we typically only have single measurements for populations, we can’t see the dynamics of the system.  A correlation between two variables that exists now may be an accident of more complex dynamics.  For instance, Lupyan & Dale (2010) find a statistically significant correlation between a linguistic population’s size and its morphological complexity.  One hypothesis is that the languages of larger populations are adapting to adult learners as they come into contact with other languages.  Hay & Bauer (2007) also link demography with phonemic diversity.  However, it’s not clear how robust these relationships are over time, because of a lack of data on these variables in the past.

To test this, a benchmark is needed.  One method is to use careful statistical controls, such as controlling for the area that the language is spoken in, the density of the population etc.  However, these data also tend to be synchronic.  Another method is to compare the results against the predictions of a simple model.  Here, I propose a simple model based on a dynamic where cultural variants in small populations change more rapidly than those in large populations.  This models the stochastic nature of small samples (see the introduction of Atkinson, 2011 for a brief review of this idea).  This model tests whether chaotic dynamics lead to periods of apparent correlation between variables.  Source code for this model is available at the bottom.
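To make the idea concrete, here is a minimal sketch of this kind of model.  It is not the source code mentioned above: the step size of 1/sqrt(population size), the range of population sizes, and the names `run_model` and `pearson` are all assumptions chosen for illustration.  Each population’s “complexity” takes an independent random walk, and at every time step the correlation between log population size and complexity is recorded.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient (0.0 if either variable is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def run_model(n_pops=50, n_steps=200, seed=1):
    """Return the size/complexity correlation at each time step."""
    rng = random.Random(seed)
    # Assumed: population sizes spread log-uniformly between 100 and 1,000,000.
    sizes = [10 ** rng.uniform(2, 6) for _ in range(n_pops)]
    log_sizes = [math.log10(s) for s in sizes]
    complexity = [0.0] * n_pops
    correlations = []
    for _ in range(n_steps):
        for i, size in enumerate(sizes):
            # Assumed: smaller populations take larger random steps.
            complexity[i] += rng.gauss(0.0, 1.0 / math.sqrt(size))
        correlations.append(pearson(log_sizes, complexity))
    return correlations


correlations = run_model()
# Share of time steps where the correlation looks "interesting" (|r| > 0.3),
# even though nothing in the model links mean complexity to population size.
share_strong = sum(abs(r) > 0.3 for r in correlations) / len(correlations)
print(share_strong)
```

Running this with different seeds gives a feel for how often an apparent correlation shows up by chance under these assumptions.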


Laryngeal Air Sacs

So, I got a request from a friend of mine to make an abstract on the fly for a poster for Friday. I stayed up until 3am and banged this out. Tonight, I hope to write the poster justifying it into being. A lot of the work here builds on Bart de Boer’s work, with which I am pretty familiar, but much of it also started with a wonderful series of posts over on Tetrapod Zoology. Rather than describe air sacs here, I’m just going to link to that – I highly suggest the series!

Here’s the abstract I wrote up, once you’ve read that article on air sacs in primates. Any feedback would be greatly appreciated – I’ll try to make a follow-up post with the information that I gather tonight and tomorrow morning on the poster, as well.

Re-dating the loss of laryngeal air sacs in hominins

Laryngeal air sacs are a product of convergent evolution in many different species of primates, cervids, bats, and other mammals. In Homo sapiens, they have been lost. This has been argued to have happened before Homo heidelbergensis, due to a loss of the bulla in the hyoid bone from Australopithecus afarensis (Martínez et al., 2008), at a range of 500kya to 3.3mya (de Boer, to appear). Justifications for the loss of laryngeal air sacs include infection, the ability to modify breathing patterns and reduce the need for an anti-hyperventilating device (Hewitt et al., 2002), and selection against air sacs as they are disadvantageous for subtle, timed, and distinct sounds (de Boer, to appear). Further, it has been suggested that the loss goes against the significant correlation of air sac retention with evolutionary growth in body mass (Hewitt et al., 2002).

I argue that the loss of air sacs may have occurred more recently (less than 600kya), as the loss of the bulla in the hyoid does not exclude the possibility of air sacs: in cervids, laryngeal air sacs can herniate between two muscles (Frey et al., 2007).  Further, I argue that using the weight measurements of living species to justify a loss of air sacs despite a gain in body mass is unfounded given archaeological evidence, which suggests that the laryngeal air sacs may have been lost only after the size reduction from Homo heidelbergensis to Homo sapiens.

Finally, I suggest two further justifications for the loss of the laryngeal air sacs in Homo sapiens. First, the linguistic niche of hunting in the environment in which early hominin hunters have been posited to exist – the savannah – would have been better suited to higher frequency, directional calls as opposed to lower frequency, multidirectional calls. The loss of air sacs would then have been directly advantageous, as lower frequencies produced by air sac vocalisations over bare ground have been shown to favour multidirectional over targeted utterances (Frey and Gebler, 2003). Secondly, the reuse of air stored in air sacs could have been disadvantageous for sustained, regular heavy breathing, as would occur in a similar hunting environment.


Boer, B. de. (to appear). Air sacs and vocal fold vibration: Implications for evolution of speech.

Fitch, T. (2006). Production of Vocalizations in Mammals. Encyclopedia of Language and Linguistics. Elsevier.

Frey, R., & Gebler, A. (2003). The highly specialized vocal tract of the male Mongolian gazelle (Procapra gutturosa Pallas, 1777–Mammalia, Bovidae). Journal of Anatomy, 203(5), 451-71. Retrieved June 1, 2011, from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1571182&tool=pmcentrez&rendertype=abstract.

Frey, R., Gebler, A., Fritsch, G., Nygrén, K., & Weissengruber, G. E. (2007). Nordic rattle: the hoarse vocalization and the inflatable laryngeal air sac of reindeer (Rangifer tarandus). Journal of Anatomy, 210(2), 131-159. doi: 10.1111/j.1469-7580.2006.00684.x.

Martínez, I., Arsuaga, J. L., Quam, R., Carretero, J. M., Gracia, A., & Rodríguez, L. (2008). Human hyoid bones from the middle Pleistocene site of the Sima de los Huesos (Sierra de Atapuerca, Spain). Journal of Human Evolution, 54(1), 118-24. doi: 10.1016/j.jhevol.2007.07.006.

Hewitt, G., MacLarnon, A., & Jones, K. E. (2002). The functions of laryngeal air sacs in primates: a new hypothesis. Folia Primatologica, 73(2-3), 70-94. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/12207055.

Sound good? I hope so! That’s all for now.

The end of universals?

Woah, I just read some of the responses to Dunn et al. (2011) “Evolved structure of language shows lineage-specific trends in word-order universals” (Language Log here, Replicated Typo coverage here).  It’s come in for a lot of flak.  One concern raised at the LEC was that, considering an extreme interpretation, there may be no effect of universal biases on language structure.  This goes against Generativist approaches, but also the Evolutionary approach adopted by LEC-types.  For instance, Kirby, Dowman & Griffiths (2007) suggest that there are weak universal biases which are amplified by culture.  But there should be some trace of universality nonetheless.

Below is the relationship diagram for Indo-European and Uto-Aztecan feature dependencies from Dunn et al.  Bolder lines indicate stronger dependencies.  The two families appear to have different dependencies: only one is shared (Genitive-Noun and Object-Verb).

However, I looked at the median Bayes Factors for each of the possible dependencies (available in the supplementary materials).  These are the raw numbers that the above diagrams are based on.  If the strengths of the dependencies rank in roughly the same order in two families, the two sets of Bayes Factors will have a high Spearman rank correlation.

Spearman rank correlation | Indo-European  | Austronesian
Uto-Aztecan               | 0.39, p = 0.04 | 0.25, p = 0.19
Indo-European             |                | -0.13, p = 0.49

Spearman rank correlation coefficients and p-values for Bayes Factors for different dependency pairs in different language families.  Bantu was excluded because of missing feature data.
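For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of a Spearman rank correlation: rank each list (averaging the ranks of ties), then take the Pearson correlation of the ranks.  The Bayes Factor values below are invented for illustration and are not the figures from the supplementary materials; the helper names are my own assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient (0.0 if either variable is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical median Bayes Factors for the same six dependency pairs
# in two language families (made-up numbers).
family_a = [0.5, 2.1, 7.3, 1.2, 12.0, 3.3]
family_b = [0.8, 1.9, 4.0, 2.5, 9.1, 0.4]
print(spearman(family_a, family_b))
```

This only gives the coefficient; a p-value would need a permutation test or a statistics library.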

Although the Indo-European and Uto-Aztecan families have different strong dependencies, they have similar rankings of those dependencies.  That is, two features with a weak dependency in Indo-European languages tend to have a weak dependency in Uto-Aztecan languages, and the same is true of strong dependencies.  The same is true to some degree for Uto-Aztecan and Austronesian languages.  This might suggest that there are, in fact, universal weak biases lurking beneath the surface. Lucky for us.

However, this does not hold between Indo-European and Austronesian language families.  Actually, I have no idea whether a simple correlation between Bayes Factors makes any sense after hundreds of computer hours of advanced phylogenetic statistics, but the differences may be less striking than the diagram suggests.


As Simon Greenhill points out below, the statistics are not at all conclusive.  However, I’m adding the graphs for all Bayes Factors (these are made directly from the Bayes Factors in the Supplementary Material):

[Graphs of Bayes Factors, one panel per family: Austronesian, Bantu, Indo-European, Uto-Aztecan]

Dunn, M., Greenhill, S. J., Levinson, S. C., & Gray, R. D. (2011). Evolved structure of language shows lineage-specific trends in word-order universals. Nature, 473, 79-82.

Colour terms and national flags

I’m currently writing an article on the relationship between language and social features of the speakers who use it. As studies such as Lupyan & Dale (2010) have discovered, language structure is partially determined by social structure.  However, it’s also probable that many social features of a community are determined by its language.

Today, I wondered whether the number of basic colour terms a language has is reflected in the number of colours on its country’s flag. The idea being that a country’s flag contains colours that are important to its society, and therefore a country with more social tools for discussing colour (colour words) will be more likely to put more colours on its flag. It was a long shot, but here’s what I found:

The World Atlas of Language Structures has data on the number of basic colours in many languages (Kay & Maffi, 2008). Wikipedia has a list of country flags by the number of colours in them.  Languages with large populations (like English, Spanish etc.) were excluded.  It’s known that the number of basic colour terms correlates with latitude, so a partial correlation was carried out.  There was a small but significant relationship between the number of colour terms in a language and the number of colours on the flag where that language is spoken (r = 0.15, τ = 254, p=0.01, partial correlation, 2-tailed using Kendall’s tau).
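Here is a minimal sketch of what a partial correlation with Kendall’s tau involves: compute tau for each pair of variables, then remove the part of the colour-term/flag relationship that both share with latitude.  I’m assuming the simple tau-a variant (ties count as neither concordant nor discordant) and the standard partial rank correlation formula; the analysis in the post may have used a different variant or software, and the numbers below are invented.

```python
import math


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                score += 1      # concordant pair
            elif product < 0:
                score -= 1      # discordant pair
    return score / (n * (n - 1) / 2)


def partial_kendall_tau(xs, ys, zs):
    """Tau between xs and ys, controlling for zs."""
    t_xy = kendall_tau_a(xs, ys)
    t_xz = kendall_tau_a(xs, zs)
    t_yz = kendall_tau_a(ys, zs)
    denominator = math.sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))
    if denominator == 0:
        raise ValueError("control variable perfectly ranks xs or ys")
    return (t_xy - t_xz * t_yz) / denominator


# Hypothetical data for eight languages (made-up numbers).
colour_terms = [3, 4, 5, 6, 7, 9, 10, 11]
flag_colours = [2, 3, 3, 2, 4, 3, 5, 6]
abs_latitude = [5, 12, 40, 8, 25, 33, 17, 50]
print(partial_kendall_tau(colour_terms, flag_colours, abs_latitude))
```

A significance test for the partial tau is a separate step that this sketch leaves out.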

Here’s the flag of Belize, where Garífuna is spoken (9-10 colours in the language, 12 colours on the flag):

Here is the flag of Nigeria, where Ejagham is spoken (3-4 colours in the language, 2 colours on the flag):

Interestingly, the languages with the highest number of colours in their language and flag come from Central America while the majority of the languages with the lowest number of colours in their language and flag come from Africa.  Maybe there’s some cultural influence on neighbouring flags.


Here’s a boxplot, which makes more sense:

Also, I re-ran the analysis taking into account distance from the equator, speaker population and some properties of the nearest neighbour of each language (number of colours on flag and number of basic colours in language).  A multiple regression showed that the number of basic colours in a language is still a significant predictor of the number of colours in its national flag (r = 0.12, F(106,16)=1.8577, p= 0.03).  This analysis was done by removing languages with populations more than 2 standard deviations from the mean (9 languages out of 140).  The relationship is still significant with the whole dataset.
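The outlier-removal step can be sketched as below.  This is only an illustration: I’m assuming the sample standard deviation of raw (not log-transformed) speaker populations, and the record layout and function name are hypothetical.

```python
import statistics


def drop_population_outliers(records, max_sd=2.0):
    """Keep records whose population is within max_sd standard deviations of the mean."""
    populations = [r["population"] for r in records]
    mean = statistics.mean(populations)
    sd = statistics.stdev(populations)  # sample standard deviation (assumed)
    return [r for r in records if abs(r["population"] - mean) <= max_sd * sd]


# Hypothetical languages: nine small ones and one very large one.
languages = [{"name": f"lang{i}", "population": 100} for i in range(9)]
languages.append({"name": "big", "population": 100000})

kept = drop_population_outliers(languages)
print([r["name"] for r in kept])
```

With heavily skewed population figures, a cut-off on raw values mostly trims the very largest languages, which is worth bearing in mind when reading the regression result.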

There are still problems with this analysis, of course.  For example, many of the languages in the data are minority languages which may have little impact on the national identity of a country.  Furthermore, the statistics may be compromised by multiple comparisons, since there may be a single flag for more than one language.  Also, a proper measure of the influence of surrounding languages would be better.  The nearest neighbour was supposed to be an approximation, but could be improved.

Lupyan, G., & Dale, R. (2010). Language structure is partly determined by social structure. PLoS ONE, 5(1). PMID: 20098492

Kay, Paul & Maffi, Luisa. (2008). Number of Basic Colour Categories. In: Haspelmath, Martin & Dryer, Matthew S. & Gil, David & Comrie, Bernard (eds.) The World Atlas of Language Structures Online. Munich: Max Planck Digital Library, chapter 133.

Animal Signalling Theory 101: Handicap, Index… or even a signal? The Case of Fluctuating Asymmetry

The differences between handicaps and indices are usually distinguishable in formal mathematical models or in unambiguous real-world cases. Often though, classifying a trait as a handicap, an index, or even a signal at all, can be quite a difficult task.

For the purposes of illustration I will use Fluctuating Asymmetry (FA for short) as an example.  Fluctuating asymmetry is the term used to refer to deviation from symmetry in paired morphological structures (ranging from birds’ tails to human faces) that should be, all being well, bilaterally symmetric. Deviations from the ideal symmetrical phenotype are caused by inherent genetic perturbations and exposure to environmental disturbances occurring in early development.

Is FA a signal?

In their 2003 book Animal Signals, Maynard Smith and Harper define a signal as:

‘Any act or structure which alters the behaviour of other organisms, which evolved because of that effect, and which is effective because the receiver’s response has also evolved’

They then argue that FA is unlikely to function as a signal because it is difficult to discern whether receivers respond directly to FA and because there appear to be few examples of displays in which signallers actively advertise their symmetry to receivers.



Animal Signalling Theory 101 – The Handicap Principle

One of the most important concepts in animal signalling theory, proposed by Amotz Zahavi in a seminal 1975 paper and in later works (Zahavi 1977; Zahavi & Zahavi 1997), is the handicap principle. A general definition is that females have evolved mating preferences for males who display exaggerated ornaments or behaviours that are costly to maintain and develop, and that this cost ensures an ‘honest’ signal of male genetic quality.

As a student I found it quite difficult to identify a working definition for this important type of signal mainly due to the apparent ‘coining fest’ that has taken place over the years since Zahavi outlined his original idea in 1975. For this reason, I have decided to provide a brief outline of the terminological and conceptual differences that exist in relation to the handicap principle in an attempt to help anyone who might be struggling to navigate the literature.

As Zahavi did not define the handicap principle mathematically, a number of interpretations can be found in the key literature, because scholars disagree as to the true nature of his original idea. Until John Maynard Smith and Harper simplified and clarified things wonderfully in their 2003 publication Animal Signals, to my knowledge at least four different interpretations of the handicap principle were being used and explored empirically and through mathematical modelling, each with distinct differences that aren’t easy to grasp without delving into the maths.
