Mapping Linguistic Phylogeny to Politics

Note: Most of the content in this post is refuted wonderfully in the comment section by one of the original authors of the paper. I highly recommend reading the comments, if you’re going to read this at all – that’s where the real meat lies. I’m keeping this post up, finally, because it’s good to make mistakes and learn from them. -Richard


I had posted this already on the Edinburgh Language Society blog. I’ve edited it a bit for this blog. I should also state that this is my inaugural post on Replicated Typo; thanks to Wintz’ invitation, I’ll be posting here every now and again. It’s good to be here. Thanks for reading – and thanks for pointing out errors, problems, corrections, and commenting, if you do. Research blogging is relatively new to me, and I relish this unexpected chance to hone my skills and learn from my mistakes. (Who am I, anyway?) But without further ado:


In a recent article covered in NatureNews in Societies Evolve in Steps, Tom Currie of UCL and others, including Russell Gray of Auckland, use quantitative analysis of the Polynesian language group to plot socioanthropological movement and power hierarchies in Polynesia. This builds on previous work, available here, which I saw presented at the Language as an Evolutionary System conference last July. The article claims that the means of change for political complexity can be determined using linguistic evidence in Polynesia, along with various migration theories and archaeological evidence.

I have my doubts. The talk given by Russell Gray suggested that there were still various theories about the migratory patterns of the Polynesians – in particular, where they started from. His work used massive supercomputers to narrow down the possibilities, by taking lexicons and charting their similarities. The most probable outcomes were then recorded, and their statistical probability indicated the most likely course of events. This, however, is where the ability for guessing ends. Remember, this is statistics on a massive scale. If there is a 70% probability that one language is the root of another, that isn’t to say that that language is the root, much less that the organisation of one determines the organisation of another. Statistics are normally unassailable – I only bring up this disclaimer because there isn’t always clear mapping between language usage and migration.
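To make the 70% figure concrete: in a Bayesian tree sample, the support for a grouping is usually read off as the share of sampled trees that contain it. What follows is a minimal Python sketch of that bookkeeping, not the authors' pipeline. The `clade_support` function and the toy sample of groupings are hypothetical, chosen only for illustration.

```python
from collections import Counter


def clade_support(sampled_trees):
    """Return, for each clade, the fraction of sampled trees that contain it."""
    counts = Counter()
    for clades in sampled_trees:
        # Each tree is represented simply as the set of clades it contains.
        counts.update(set(clades))
    n = len(sampled_trees)
    return {clade: count / n for clade, count in counts.items()}


# Hypothetical posterior sample of 10 trees (toy data, not from the paper).
sample = (
    [{frozenset({"Tongan", "Samoan"})}] * 7
    + [{frozenset({"Samoan", "Maori"})}] * 3
)

support = clade_support(sample)
print(support[frozenset({"Tongan", "Samoan"})])
```

In this toy sample the first grouping appears in 7 of the 10 trees, so its support is 0.7. The other 3 trees disagree, and that residue is the uncertainty I am pointing at.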

On a related note, I have my doubts about this sort of quantitative study as well. For one, the lexicons used were in post-hoc standardised orthography (as far as I could gather), and as such didn’t really capture the actual phonetic detail, a point raised by one of the audience members (I believe it was Pullum). For another, given the interconnectivity of the Polynesian languages, it is by no means certain that one language’s proportional derivational split means anything. The use of derivational trees versus a net or web network, in this particular case, may very well be unjustified.

The article in NatureNews, short as it is, does thankfully manage to cast doubt on the research which suggests that this is possible.

“So the only question would be ‘are languages a good way to work out relationships between societies?'” In general, Diamond [the author of this related article] says, languages do fit the bill.
Unfortunately, how is never stated. But a few arguments are raised in NatureNews:
  • Quantitative analyses have not traditionally been used by anthropologists or sociologists in this fashion.
  • The geopolitical situation of such a wide area means that statements cannot be made with certainty, given the likelihood of different sorts of movements and political fallout, such as fractionation, that generally occur in such contexts.
  • The mapping of political evolution to linguistic phylogeny is not clear cut.

I’ve already stated the issues I have with the general use of these structures, anyway: phonetic detail, orthographic convention applied after the data was gathered, and statistical relevance to linguistic phylogeny. But let us also compare the history of the Polynesian Islands (something I won’t pretend to be an expert in, but I’m certainly better placed to talk about it than most, having read more than a little on the subject). To sum it up briefly: some islands were at war only with other island groups; some islands remained historically neutral to others’ blood feuds; there was very good knowledge of where, across thousands of miles, other archipelagos and chains lay; there was travel between them, not always for warfare; there was considerable movement that was occasionally residual and long-standing, and not mere trading; and there were languages thousands of miles apart that were still mutually comprehensible, while there were also languages on the same islands and in the same communities which were incomprehensible to one another (to a monolingual). As well, some islands had different political hierarchies from others, even when within spitting distance of each other, and these political hierarchies by no means fit into the four categories proposed by Currie et al.: acephalous, simple chiefdom, complex chiefdom, and state are the only options considered. (There, oddly, doesn’t seem to be any work given over to the sometimes abrupt change from acephalous to state that occurred with colonisation.) Finally, the historical political evidence that we have is often poorly recorded, and political hierarchisation is not something which is easily traced back archaeologically in an essentially stone-age society (as far as I am aware, and here I may be wrong; the article does substantiate many of its claims with archaeological evidence, which is a major point in its favour).

But let’s look closer: what are they actually trying to say? Merely that political organisation and hierarchies rise and fall in complexity in steps instead of in jumps. They argue that this is a commonly held notion, but that it hasn’t been backed up by quantitative analyses. The actual article states, clearly, that “recently, phylogenetic trees showing the historical relationships between these societies have been inferred using basic vocabulary data, by applying the same techniques that biologists use with genetic data to infer the shared ancestry of a species.” Ok, that’s fine – wait. No, I’m not sure I agree with that statement. Biologists use genes to chart the movement of species, ancestry, sister-species and the like, sure. But they don’t use those genes to mark how a particular tribe of bonobo followed one leader versus another, and it would be ridiculous to do so. There certainly isn’t a one-to-one correspondence between language and political systems. But this may be a poor analogy.

So how did they judge that political systems flow this way? “By mapping data about the characteristics of societies onto the tips of these trees, we can use phylogenetic comparative methods to make inferences about what societies were like in the past and how they have changed over time.” (801) So, they took present-day analyses of the political systems, hard-coded them onto the tips of the language trees, and then saw if the political systems fell out naturally? I’m not sure this is how it works. Again, the mapping of language onto political systems isn’t clear (I feel like I’m repeating myself here). Secondly, though, languages are resistant to political organisation. When a society changes from an acephalous group to a simple chiefdom, one doesn’t expect that the pronoun usage or the word for kava would change. And this seems to be what is being suggested. They also don’t account for different sorts of changes, or how these steps would be realised differently. A voluntary migration is certainly different from a rout, as after a war, which would result in acephalous movements. Finally, the geographical distribution of their data doesn’t reflect the region as well as it could, as it comes from a single projected language group: New Guinea appears to have only four languages, for one. And what about bilingualism?
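For what it’s worth, here is a minimal Python sketch of what “mapping onto the tips” can mean in practice. It is not the paper’s method: it swaps in Fitch parsimony, a much simpler stand-in for the phylogenetic comparative methods quoted above. The tree shape, the tip labels A to D, and the codings are all hypothetical.

```python
def fitch(tree, tip_states):
    """Bottom-up Fitch pass: return the candidate state set at the root of `tree`.

    `tree` is either a tip name (str) or a (left, right) tuple of subtrees.
    """
    if isinstance(tree, str):
        return {tip_states[tree]}
    left, right = (fitch(child, tip_states) for child in tree)
    shared = left & right
    # Keep the states both children agree on; if none, keep all candidates.
    return shared if shared else left | right


# Hypothetical language tree and political codings, for illustration only.
tree = (("A", "B"), ("C", "D"))
tip_states = {
    "A": "simple chiefdom",
    "B": "simple chiefdom",
    "C": "simple chiefdom",
    "D": "complex chiefdom",
}

root_states = fitch(tree, tip_states)
print(root_states)
```

In a setup like this the vocabulary data would only determine the shape of the tree. The political codings enter at the tips, and the ancestral states are inferred from those codings rather than from the words themselves.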

Here I’ll stop. I welcome any comments that anyone has. I’m going to be reading more into the research behind this test, as it is the first of its kind, and very interesting, especially as I am not an anthropologist. If it produces results which are in line with the general consensus, then I hope that it continues to shed light on more information. My problem is that I want to know which is shedding light on which, here, and exactly where the line lies at which the darkness begins to actually be an issue.


  • Smith, Nature News:
  • Diamond, Nature Article:
  • Currie, T. E., Greenhill, S. J., Gray, R. D., Hasegawa, T. & Mace, R. Nature 467, 801-804 (2010).
  • Gray, R. D., Drummond, A. J. & Greenhill, S. J. Science 323, 479-483 (2009).
  • Author: Richard

    I am a computational linguistics student at the University of Saarland; my undergraduate in Linguistics was at the University of Edinburgh. I am interested in evolutionary linguistics, particularly involving Bayesian phylogenetics, typology, and computer simulations. I am also interested in data management, web development, open documentation, and scientific workflows. My undergraduate thesis focused on the evolution and significance of word segmentation.

    3 thoughts on “Mapping Linguistic Phylogeny to Politics”

    1. Hi,

      I’m an author of the study, and the guy who built the language database. Let me respond to some of your criticisms.

      1. A minor point – the migratory patterns of Polynesians and where they came from are very, very well-established: Fiji. Where the Austronesians (of which Polynesian is a subset) came from is also well understood, but the debate is about the origins and the timing of this expansion. Either way, the debate is between a small group of geneticists who argue for an origin around 15,000 BP in Island South-East Asia (rather unspecific) and, on the other side, basically most linguists, archaeologists, and anthropologists, who argue for a recent (~5-6,000 BP) origin in Taiwan.

      2. Regarding probabilities – I’m not quite sure what your argument here is. 70% probability is pretty damn good. Keep in mind the sheer numerical space of probability here: in the original study we used 400 languages. There are something like 10^128 possible ways of grouping 400 languages. Not all of these trees are equally probable. If a branch is supported by 70% of the posterior probability distribution (i.e. the trees that we found that had high probability given the data and model of language change), then we’ve pruned that down from 10^128. This is a big deal.

      3. “there isn’t always clear mapping between language usage and migration.” Maybe not all the time, but historical linguistics as well as some studies of gene/language links show that this is the case in many areas of the globe, on both a fine-scale and a low-level scale. (e.g. Coevolution of languages and genes on the island of Sumba, eastern Indonesia).

      4. Regarding the lack of standard orthography in the database. Yes, there are a lot of unstandardised entries in there, but these are being cleaned up slowly (this is time-consuming). There are many large samples in the database that are standardised. More importantly, the cognate decisions were done by people largely familiar with the languages and their orthography (a short list includes: Bob Blust, Jeff Marck, John Lynch, Laurent Sagart, and Malcolm Ross, as well as cognate information culled from many published works). For these analyses the lack of orthography isn’t a problem. I wouldn’t want to try and work out the phonology of the languages from these lists though.

      5. “The use of derivational trees versus a net or web network…may very well be unjustified”.

      Three points – first of all, web/network methods do not currently exist that allow people to do these analyses. Second, the analyses we used sample the trees in accordance with their probability. If there is conflicting signal caused by e.g. borrowing (which is what a network would be representing), then it will be included in the analyses with an appropriate probability. Third, inferences from trees are actually pretty robust even when there are high levels of conflicting signal (e.g. Does horizontal transmission invalidate cultural phylogenies?).

      6. “But let us also compare the history of the Polynesian Islands”:

      All of the points you raise are discussed in the paper and the supplement.

      7. “Biologists use genes to chart the movement of species, ancestry, sister-species and the like, sure. But they don’t use those genes to mark how a particular tribe of bonobo followed one leader versus another, and it would be ridiculous to do so. ”

      This is incorrect. Phylogenetic comparative methods are very powerful tools and are often used for testing hypotheses about animal behavior.

      8. “mapping language onto political systems isn’t clear”:

      What’s not clear? There’s a table in the paper that shows how we did it.

      Ok, that’ll do – hopefully I’ve cleared up some of those issues for you,


    2. Hey Simon.

      Thank you very much for this response! It was unexpected, but exactly the sort of response which was necessary. It was very kind of you to take time to clear my doubts, and ignorance, up. To be honest, I hadn’t expected my errors to be so glaring – but I am glad that they were, because it means that a correction was necessary. I attended the talk by GK Pullum given at Edinburgh about blogging, which was mentioned here by Sean, and I took to heart the idea that blogs can be a much more useful tool for peer review than journals. Pullum specifically mentioned how it takes a sort of bravery to post things without careful consideration; I see, more clearly, what that means now. Which is to say that correction is a major part of this experience, and I do expect that I’ll spend (much) longer reviewing my posts in the future. So, thank you.

      As for your responses, I’ll go through them now, although I have very little response except to apologise for my poorly-researched and purposefully-contrarian article. To tell the truth, I liked the paper very much, and I overreacted when trying to find flaws which I could use to temper my initial enthusiasm. Thank you again for highlighting where the flaws I was considering lay with me and not in your paper. Given that all of the flaws I had thought of have pretty much been annihilated, I can thankfully go back to enjoying and agreeing with the paper again. Here are my answers to your points:

      1. I must have misinterpreted the lecture that was given in Edinburgh. It was also some time ago, and I misplaced my notes for it. I was unaware.

      2. You are correct. (I don’t think I need to say more. I don’t want to apologise too much, as it might seem insincere.)

      3. I look forward to reading through that article. Thank you for the link! As I said, I am going to be researching this particular area more thoroughly. I recently stumbled upon Fiona Jordan’s work via Twitter, and I am looking forward to reading about that, as well, as it seems relevant.

      4. Thank you for clearing that up. It was a weak argument, on my part, in the first place; but I am glad that they are being standardised, and I do hope that this doesn’t take more time than it has to.

      5. a) I’d be very interested in any future work on webs, if you know of any in the works! I wasn’t aware that there weren’t already analyses being run that integrated them. That’s a genuine gap that I hope is filled soon. b) That sounds like the correct thing to do, and in light of your other comments, I am not surprised that this was done. It was my error for assuming that this was unaccounted for. c) Thank you for this article, I was also unaware of it. It may have become clear by now that I did not research this topic as well as I ought to have, given the caliber of this blog and of your article. I have no excuse for negligence, but I hope the fact that I am an undergraduate could be helpful in explaining this disparity between statements and supporting knowledge. Again, I will be much more cautious in the future – which is how it ought to be.

      6. Yes, they are, and it was foolish of me to try and replicate them here, using poor wording and vagueness to cover the fact that this issue had been dealt with. This seems to me to be the most grievous error in the article, from a reporting standpoint, and I apologise for that.

      7. You are right, this is incorrect, and ought to have been edited out before I wrote it on the page. Thank you for pointing it out in such clear terms.

      8. What wasn’t clear to me was how a language could indicate what sort of political system was in place, and how mapping an indicator of political complexity onto the tips of a phylogenetic tree would be able to show what previous branches would have had as their political systems. However, this lack of clarity wasn’t due to your paper; rather, I suspect it was because I did not take adequate time and research to address your paper correctly (something which has become blatant).

      I think every point I managed to make in my ignorance has now been thoroughly corrected. I apologise (finally) that this was necessary at all. Ideally, I shouldn’t have written this with such a lack of thought. However, because of this experience, I’ll certainly proof my essays and blogs more thoroughly in the future, so I suppose I should say thanks for improving my future grades. I now also understand your paper a considerable deal more, and I do appreciate the work that you and your colleagues are doing. So, thank you (one last time).


    3. Hi Richard,

      No worries! Sorry if I came across as grumpy – we did put a lot of effort into the paper to try and answer some of these types of criticisms. I do really enjoy getting feedback – and criticism (yes, really) – on my work, so thank you for that.

      Regarding “webs” – there are some analyses, but they’re just not very good. There are quite a lot of people working on developing these, but the methods are not quite there yet. For example, I had a paper in Proceedings B recently which did some network analyses of language typology. There are also a few papers coming out in a forthcoming issue of Philosophical Transactions of the Royal Society on language networks.

      I agree that blogging is important (hell, I run one myself) – it’s a great way to engage with things you find interesting. Keep it up!

