Following Sean’s comment about using a regression to see how much better phoneme inventory size is predicted once geographic spread is included alongside population size, I decided to look at the stats a bit more closely (original post is here). Multiple regression is fairly easy to perform in R. With my data, the intercept, area and population all came out highly significant (p < 0.001), with a residual standard error of 9.633 on 393 degrees of freedom and an adjusted R-squared of 0.1084. I then plotted scatterplots for every pair of variables. As you can see below, this is fairly useful as a quick summary, but it is also messy and confusing. Another problem is that the pairs plot shows the original data rather than the fitted linear model.
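For anyone wondering where the summary numbers in a multiple regression come from, here is a minimal sketch of a two-predictor least-squares fit. It is written in plain Python rather than R, purely for illustration, and it is not the analysis from the post. The data are synthetic and invented for the example. The names `x_pop`, `x_area`, `phonemes`, `solve` and `ols` are all hypothetical, and the log scale for the predictors is an assumption. The sample size of 396 is inferred from the 393 residual degrees of freedom plus three fitted coefficients.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... by least squares via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    n = len(y)
    mean_y = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)               # total sum of squares
    df = n - k                                              # residual degrees of freedom
    return {
        "coef": beta,
        "df": df,
        "rse": math.sqrt(rss / df),                         # residual standard error
        "adj_r2": 1 - (rss / df) / (tss / (n - 1)),         # adjusted R-squared
    }


# Synthetic data only: these values are made up and are NOT the post's data.
rng = random.Random(0)
n = 396  # inferred: 393 residual df + 3 coefficients
x_pop = [rng.uniform(2, 18) for _ in range(n)]   # assumed log-scale population
x_area = [rng.uniform(0, 15) for _ in range(n)]  # assumed log-scale area
phonemes = [20 + 0.8 * p + 0.4 * a + rng.gauss(0, 9) for p, a in zip(x_pop, x_area)]

fit = ols(zip(x_pop, x_area), phonemes)
print("coefficients (intercept, population, area):", fit["coef"])
print("residual standard error:", fit["rse"], "on", fit["df"], "degrees of freedom")
print("adjusted R-squared:", fit["adj_r2"])
```

In R, the same quantities are reported by calling `summary()` on an `lm` fit. The adjusted R-squared penalises the extra predictor, which is why it is a reasonable number to compare when asking whether adding area improves on population alone.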
A prominent idea in linguistics is that humans have an array of specialised organs geared towards the production, reception and comprehension of language. For some features, particularly the physical capacity to produce and receive multiple vocalizations, there is ample evidence for specialisation: a descended larynx (Lieberman, 2003), thoracic breathing (MacLarnon & Hewitt, 1999), and several distinct hearing organs (Hawks, in press). Given that these features are firmly in the domain of biology, it makes intuitive sense to apply the theory of natural selection to explain them: humans are specially adapted to the production and reception of multiple vocalizations.