Tone and Humidity: FAQ

Everett, Blasi & Roberts (2015) review literature on how inhaling dry air affects phonation, suggesting that lexical tone is harder to produce and perceive in dry environments.  This leads to a prediction that languages should adapt to this pressure, so that lexical tone should not be found in dry climates, and the paper presents statistical evidence in favour of this prediction.

Below are some frequently asked questions about the study (see also the previous blog post explaining the statistics).

Book Review: The Nature and Origin of Language (Bouchard 2013)

This review appeared originally in the LINGUIST List at

Book announced at

AUTHOR: Denis Bouchard
TITLE: The Nature and Origin of Language
SUBTITLE: First Edition
SERIES TITLE: Oxford Studies in the Evolution of Language
PUBLISHER: Oxford University Press
YEAR: 2013

REVIEWER: Hannah Little, Vrije Universiteit Brussel

Review’s Editors: Malgorzata Cavar and Sara Couture


This monograph outlines a new perspective for the origin of language. Its central premise is that language’s arbitrariness was the main innovation causing language to emerge. Bouchard frames this thesis in the context of Saussurean theory and his own Sign Theory of Language (STL). Arbitrariness, then, is the thing that evolutionary explanations of language must seek to explain, and Bouchard proposes humans’ uniquely evolving Offline Brain Systems (OBS) as the main driver in the emergence of language. OBS are proposed by Bouchard to be uniquely human neural systems that are activated even in the absence of direct stimuli, allowing us to represent things not currently present. Throughout the book, Bouchard’s ideas are presented in opposition to ideas of the origins of language from a generative perspective, which I will cover in more detail throughout the review.

The book is quite lengthy and demands to be read from cover to cover in order to follow its arguments. It is not something that can easily be dipped in and out of, not least because most acronyms are glossed only once and then used in every chapter.

Syntax came before phonology?
A new paper has just appeared in the Proceedings of the Royal Society B entitled “Language evolution: syntax before phonology?” by Collier et al.

The abstract is here:

Phonology and syntax represent two layers of sound combination central to language’s expressive power. Comparative animal studies represent one approach to understand the origins of these combinatorial layers. Traditionally, phonology, where meaningless sounds form words, has been considered a simpler combination than syntax, and thus should be more common in animals. A linguistically informed review of animal call sequences demonstrates that phonology in animal vocal systems is rare, whereas syntax is more widespread. In the light of this and the absence of phonology in some languages, we hypothesize that syntax, present in all languages, evolved before phonology.

This is essentially a paper about the distinction between combinatorial and compositional structure and the emergence narrative of duality of patterning. I wrote a post about this a few months ago, see here. The paper focusses on evidence from non-human animals and also evidence from human languages, including Al-Sayyid Bedouin Sign Language, looking at differences and similarities between human abilities and those of other animals.

Peter Marler outlined different types of call combinations found in animal communication by making a distinction between ‘phonological syntax’ (combinatorial structure), which he claims is widespread in animals, and ‘lexical syntax’ (compositional structure), which he claims has yet to be described in animals (I can’t find a copy of the 1998 paper which Collier et al. cite, but he talks about this on his homepage here). Collier et al., however, disagree and review several animal communication systems which they claim fall under a definition of ‘lexical syntax’.

They start by defining what they mean by the different levels of structure within language (I talk about this here).  They present the following relatively uncontroversial table:


Evidence from non-human species

The paper reviews evidence from four species: 1) Winter wrens (though you could arguably lump all birdsong in with their analysis for this one), 2) Campbell monkeys, 3) Putty-nosed monkeys and 4) Banded mongooses.

1) Birdsong is argued to be combinatorial, as whatever the combination of notes or syllables, the songs always have the same purpose and so the “meaning” cannot be argued to be a result of the combination.

2) In contrast to Marler, the authors argue that Campbell monkeys have compositional structure in their calls. The monkeys give a ‘krak’ call when there is a leopard near, and a ‘hok’ call when there is an eagle. Interestingly, they can add an ‘-oo’ to either of these calls to change their meanings. ‘Krak-oo’ denotes any general disturbance and ‘hok-oo’ denotes a disturbance in the canopy. One can argue then that this ‘-oo’ has the same meaning of “disturbance”, no matter what construction it is in, and ‘hok’ generally means “above”, hinting at compositional structure.

3) The authors also discuss putty-nosed monkeys, which were also discussed in this paper by Scott-Phillips and Blythe (again, discussed here). Scott-Phillips and Blythe arrive at the conclusion that the calls of putty-nosed monkeys are combinatorial (i.e. the combined effect of two signals does not amount to the combined meaning of those two signals):


“Applied to the putty-nosed monkey system, the symbols in this figure are: a, presence of eagles; b, presence of leopards; c, absence of food; A, ‘pyow’; B, ‘hack’ call; C = A + B ‘pyow–hack’; X, climb down; Y, climb up; Z ≠ X + Y, move to a new location. Combinatorial communication is rare in nature: many systems have a signal C = A + B with an effect Z = X + Y; very few have a signal C = A + B with an effect Z ≠ X + Y.”

However, Collier et al. argue that this example is not necessarily combinatorial, as the pyow-hack sequences could be interpreted as idiomatic, or as having much more abstract meanings such as ‘move-on-ground’ and ‘move-in-air’. For this analysis to hold weight, one must assume that the monkeys are able to use contextual information to make inferences about meaning, which is a pretty controversial claim. Collier et al. nevertheless argue that it shouldn’t be considered so far-fetched, given the presence of compositionality in the calls of Campbell monkeys.
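The distinction in the Scott-Phillips and Blythe quote can be made concrete with a few lines of Python. This is only an illustrative sketch: the set-based encoding of "effects", the function name `classify` and the label "additive" are my own assumptions, not anything taken from either paper.

```python
# Hypothetical encoding of the putty-nosed monkey example quoted above.
# Each signal maps to the set of behaviours it elicits (illustration only).
EFFECTS = {
    "pyow": {"climb down"},                   # A -> X
    "hack": {"climb up"},                     # B -> Y
    "pyow-hack": {"move to a new location"},  # C = A + B -> Z
}


def classify(combined, parts, effects=EFFECTS):
    """Label a combined signal.

    'additive'      : effect of the whole equals the union of the parts' effects
                      (Z = X + Y in the quote).
    'combinatorial' : effect of the whole differs from that union (Z != X + Y).
    """
    summed = set()
    for part in parts:
        summed |= effects[part]
    return "additive" if effects[combined] == summed else "combinatorial"


print(classify("pyow-hack", ["pyow", "hack"]))
```

Under this encoding the pyow-hack sequence comes out as combinatorial. Collier et al.'s objection amounts to saying that the entries in `EFFECTS` might be mis-described, for example if the calls carry more abstract meanings than the behaviours listed here.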

4) The authors also discuss banded mongooses, which emit close calls while looking for food. Their calls begin with an initial noisy segment that encodes the caller’s identity, which is stable across all contexts. In searching and moving contexts, there is a second tonal harmonic that varies in length consistently with context. So one could argue that identity and context are being systematically encoded into their call sequences with one-to-one mappings between signal and meaning.

(One can’t help but think that a discussion of the possibility of compositionality in bee dances is a missed opportunity here.)

Syntax before phonology?

The authors use the above (very sketchy and controversial) examples of compositional structure to make the case that syntax came before phonology. Indeed, there exist languages where a level of phonological patterning does not exist (the go-to example being Al-Sayyid Bedouin Sign Language). However, I would argue that the emergence of combinatoriality is, in large part, the result of the modality one is using to produce language. My current work is looking at how the size and dimensionality of a signal space, as well as how mappable that signal space is to a meaning space (to enable iconicity), can massively affect the emergence of a combinatorial system, and I don’t think it’s crazy to suggest the modality used will affect the emergence narrative for duality of patterning.

Collier et al. attempt to use some evidence from spoken languages with large inventories, or instances where single phonemes in spoken languages are highly context-dependent meaningful elements, to back up a story where syntax might have come first in spoken language. But given the physical and perceptual constraints of a spoken system, it’s really hard for me to imagine how a productive syntactic system could have existed without a level of phonological patterning. The paper makes the point that it is theoretically possible (which is really interesting), but I’m not convinced that it is likely (though this paper by Juliette Blevins is well worth a read).

Whilst I don’t disagree with Collier et al.’s conclusion that phonological patterning is most likely the product of cultural evolution, I feel like the physical constraints of a linguistic modality will massively affect the emergence of such a system, and arguing for an over-arching emergence story without consideration for non-cognitive factors is an oversight.


Collier, K., Bickel, B., van Schaik, C., Manser, M., & Townsend, S. (2014). Language evolution: syntax before phonology? Proceedings of the Royal Society B: Biological Sciences, 281(1788), 20140263. DOI: 10.1098/rspb.2014.0263

A review of a review on Fitch’s The Evolution of Language

Maggie Tallerman has published a review of Tecumseh Fitch’s 2010 book, “The Evolution of Language”, in the Journal of Linguistics. It is largely very critical, mostly of Fitch’s ideas about a musical protolanguage stage preceding language, and of the fact that the book focuses largely on vocal imitation and the evolution of speech, rather than on linguistic (i.e. cognitive) features such as syntax, semantics and phonology. Tallerman is also very critical of a lack of emphasis on the uniqueness of human language, stating:

The first problem is that there isn’t enough emphasis on the exceptional nature of language as a human faculty. In particular, the putative parallels with animal communication and cognition are at times exaggerated. Take statements like this: ‘[e]ven syntax, at least at a simple level, finds analogs in other species (e.g. bird and whale ‘‘song’’) which can help us to understand both the brain basis for syntactic rules and the evolutionary pressures that can drive them to become more complex’ (18). While there’s SOME truth in the first half, given the existence of both hierarchical structure and simple dependencies in animal ‘syntax’ (see Hurford 2012 for an excellent survey), I fear that a non-linguist reading the claim that analogues of syntax are found in other animals would get entirely the wrong idea. Grammatical systems in language are NOT merely a more complex version of animal communication systems, which are entirely non-compositional, with no duality of patterning, and which do not contain word classes or headed phrases.

I feel like Tallerman is presenting her view that language is exceptional as indisputable fact, rather than as a standpoint. However, the view that language is unique among cognitive processes and is unique to humans is still a very contentious matter, and many linguists, biologists and cognitive scientists hold the legitimate opinion that language may well just be the result of domain-general cognitive processes and that comparative studies of human and animal abilities have a large role to play in the future of language evolution research. This is certainly a very attractive standpoint for biologists. I know Hauser, Chomsky & Fitch (2002), the paper for which Fitch is probably most famous in language evolution, put some emphasis on there being a faculty for language, in both a broad (FLB) and narrow (FLN) sense, and Tallerman mentions that the FLN is, “namely, whatever is both uniquely human and uniquely linguistic”. But Hauser, Chomsky & Fitch (2002) in fact argue that “FLN may have evolved for reasons other than language, hence comparative studies might look for evidence of such computations outside of the domain of communication (for example, number, navigation, and social relations)”, which is completely consistent with shifting the emphasis away from language being so exceptional, and onto the importance of animal studies.

Of course, any good textbook should cover both sides of the argument, and perhaps Fitch doesn’t spend enough time covering the ins and outs of a controversy so central to the field of language evolution, but I don’t think that Tallerman’s criticisms consider the importance of both sides of the argument either. I’m also not sure why she quotes Chomsky at the beginning of the paper. Chomsky said in his famous UCL talk:

There’s a field called ‘evolution of language’ which has a burgeoning literature, most of which in my view is total nonsense … In fact, it isn’t even about evolution of language, it’s almost entirely speculations about evolution of communication, which is a different topic

Fitch has written a couple of papers with Chomsky, but the quote’s presence is confusing in a review of a book where Chomsky is not an author. I can see no reason to include it other than to conflate Fitch’s views with Chomsky’s, which isn’t a very fair way to start a review judging Fitch’s book.

Beyond this, the main bulk of the paper is on Fitch’s treatment of the problem of cheap, honest signals and also of protolanguage and, in particular, musical protolanguage. She raises some excellent points and in light of the fact that I have things to be doing, you should go and read it here if you’re interested:

The Oxford Handbook of Language Evolution – Book Review on Linguist List

My review of Maggie Tallerman and Kathleen R. Gibson’s “Oxford Handbook of Language Evolution” was published on Linguist List yesterday (you can read it here).

Here’s my opinion in a nutshell: This is a great volume and I’ve really learned a lot from reading it. The authors have done a great job trying to be accessible to an interdisciplinary audience. It’s a great place to start if you’re interested in language evolution or want to get a quick overview of a specific topic in language evolution research. I would’ve liked it if the chapters had a “Further Reading” section, however (like Christiansen and Kirby’s 2003 volume). Some chapters felt a bit too short for me (Steven Mithen’s chapter on “Musicality and Language”, for example, is only 3 pages long, and Merlin Donald’s chapter on “the Mimetic Origins of Language” is 4 pages long). I also feel that some topics, like language acquisition, could’ve been dealt with more extensively, but then again, if you compile a handbook, it’s impossible to make everybody happy. Other recent book-length overviews of language evolution (e.g. Fitch’s 2010 book and Hurford’s 2007 and 2012 tomes) are more detailed, but also more technical, and they don’t cover as many topics. To quote my review:

Overall, the Oxford Handbook of Language Evolution is a landmark publication in the field that will serve as a useful guide and reference work through the entanglements and pitfalls of the language evolution jungle for both experienced scholars and newcomers alike.

One last thing I’m particularly unhappy about is that the handbook doesn’t have an Acacia Tree on the cover – which seems like a missed opportunity (kidding).

I’ll try to write about some of my favourite chapters in more detail somewhere down the road/in a couple of weeks.

Visualising language similarities without trees

Gerhard Jäger uses lexicostatistics to demonstrate that language similarities can be computed without using tree-based representations (for why this might be important, see Kevin’s post on reconstructing linguistic phylogenies). On the way, he automatically derives a tree of phoneme similarity directly from word lists. The result is an alternative and intuitive look at how languages are related (see graphs below). I review the method, then suggest one way it could get away from prior categorisations entirely.

Jäger presented this work at the workshop on Visualization of Linguistic Patterns and Uncovering Language History from Multilingual Resources at the EACL conference last month. He uses the Automated Similarity Judgment Program (ASJP) database, which contains 40 words from the Swadesh list (universal concepts) for around 5800 languages (including Klingon!). The words are all transcribed in the same coarse transcription. The task is to calculate the distance between languages based on these lists in a way that reflects the genetic relationships between languages.
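To give a feel for what "distance between languages based on word lists" can mean, here is a minimal Python sketch. It is not Jäger's method: it swaps in plain Levenshtein (edit) distance, normalised by word length and averaged over shared concepts, as the simplest possible baseline. The function names and the three tiny word lists are invented for illustration and are not ASJP data.

```python
def levenshtein(a, b):
    """Classic edit distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def language_distance(words_a, words_b):
    """Mean length-normalised edit distance over concepts both lists share.

    Assumes every word is a non-empty string.
    """
    shared = sorted(set(words_a) & set(words_b))
    if not shared:
        raise ValueError("the two word lists share no concepts")
    total = 0.0
    for concept in shared:
        wa, wb = words_a[concept], words_b[concept]
        total += levenshtein(wa, wb) / max(len(wa), len(wb))
    return total / len(shared)


# Invented toy transcriptions (hypothetical, for illustration only).
lang_a = {"hand": "hand", "water": "wat3r", "stone": "ston"}
lang_b = {"hand": "hant", "water": "vas3r", "stone": "stain"}
lang_c = {"hand": "mano", "water": "agua", "stone": "piedra"}

print(language_distance(lang_a, lang_b))
print(language_distance(lang_a, lang_c))
```

With these toy lists the first pair comes out closer than the second, which is the kind of signal one would hope a distance measure picks up. A plain edit distance treats every sound substitution as equally costly, which is exactly the kind of assumption a more careful method would want to relax.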

Babies know who’s boss, whose boss, and who knows what else.

A forthcoming paper (grateful nod to ICCI) in PNAS from Olivier Mascaro and Gergely Csibra presents a series of experiments investigating the representation of social dominance relations in human infants, and it’s excellent news: we’re special.

Social dominance can be inferred in a couple of ways. Causal cues such as age, physical aggression and size can tell us about the dominance status of an individual quite intuitively, so we can make a sensible decision about whether or not we get into a scrap with them. Another way we can establish this is to look for direct realisations of dominance, such as who gets the banana if two hungry chimps both want it; chances are, little Pan Pipsqueak isn’t going to get a look in. In order to be useful, we also have to use this information to expect certain things from the individuals around us, so those representations have some property of stability across time that allows us to have those expectations. The question being explored in this paper is whether the representations we have are about the relationship between the two agents who want the banana, or the individual properties each of them has.

In a series of experiments using preferential looking time as a dependent measure, human infants (9 and 12 month olds) were exposed to videos of geometric figures exhibiting similar goal-directed behaviour. Then they would watch, say, a dominant triangle picking up the last figurative banana when the nondominant pentagon also wanted it. For expository purposes and posterity’s sake, I have constructed an artist’s impression of a dominant triangle and a subordinate pentagon in MSPaint (below, right):

A dominant triangle and subordinate pentagon (artist’s impression).

I’m not just showing off my extraordinary artistic talent here; the good thing about these agents is that there are none of the cues like size or aggression that can give rise to the assignment of individual dominance properties. The task also doesn’t indicate anything similar; it’s just about who gets the desired object when there’s only one left. In other words, the goal-directed actions of two agents are in opposition. After seeing a triangle beat a pentagon to an object of ‘banana’ status, 12 month olds looked for longer when they were then presented with an incongruent trial where the pentagon gained over the triangle. 9 month olds (understandably?) couldn’t care less. So, on the basis of this social interaction alone, the 12 month olds were able to notice when something unexpected happened.

To rule out the possibility that this was just the result of some simple heuristic such as “when triangle and pentagon are present, triangle gets the object” and make sure the infants really were assigning some dominance, another experiment (with 12 and 15 month olds) showed the same test video of the two agents collecting little objects. This time, however, the preceding video was of the triangle dominating a little walled-in space that the pentagon also wanted to inhabit. The 12 month olds had no idea what was up, but the 15 month olds generalised from the first “get out of my room” interaction to the “I get the last banana” interaction. So, 15 month olds can extract, just from watching a social interaction, the dominance status of agents and can generalise that information to novel situations. So if a 15 month old watches you lose your favourite seat in front of the TV, they’ll also expect you to miss out on the last slice of pizza, because you’re a loser.

What we still don’t know is whether they think your belly is inherently yellow, or if you’re just a pushover when interacting with a particular person. Is it the relationship between the triangle and pentagon that the babies are tracking, or do they just give each agent some sort of dominance score? This was addressed in experiment 4, where infants were presented with two interactions: one between A and B, where A wins, and then another between B and C, where B wins. If the babies are assigning an individual value to each agent, they should have some sort of linear, transitive representation of dominance like A > B > C. If they’re then presented with a novel interaction between A and C, they would have the expectation that A will beat C. So if they stare in surprise at a trial where C wins, we know it’s violated that kind of expectation, and that they’re representing this stuff linearly – i.e. each agent has a dominance value. In contrast, if the infant is tracking the relations between agents, they can’t really have an expectation of what will happen when A and C both want a banana, because they’ve never seen A and C interact before. The results show that the infants look preferentially when they get an incongruent trial using agent pairs they have seen before – as we’d expect from the previous experiment. When they’re presented with a new “I get the last banana” interaction between A and C, however, there’s nothing startling about it when C wins – which means their expectations are not based on something like A > B > C.
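The two hypotheses make different predictions for the novel A-versus-C trial, and that difference is easy to spell out in code. The sketch below is my own toy formalisation in Python, not anything from the paper: `linear_prediction` chains observed wins transitively (the "dominance score" account), while `relational_prediction` only has expectations about pairs it has actually seen.

```python
def linear_prediction(observed, pair):
    """Individual-value account: wins chain transitively (A > B > C)."""
    agents = {agent for win in observed for agent in win}
    beats = {agent: set() for agent in agents}
    for winner, loser in observed:
        beats[winner].add(loser)
    # Transitive closure: if a beats b and b beats c, then a beats c.
    changed = True
    while changed:
        changed = False
        for a in agents:
            for b in list(beats[a]):
                new = beats[b] - beats[a]
                if new:
                    beats[a] |= new
                    changed = True
    x, y = pair
    if y in beats.get(x, set()):
        return x
    if x in beats.get(y, set()):
        return y
    return None  # no expectation


def relational_prediction(observed, pair):
    """Relation-tracking account: expectations only for pairs already seen."""
    x, y = pair
    if (x, y) in observed:
        return x
    if (y, x) in observed:
        return y
    return None  # no expectation, so neither outcome is surprising


# Familiarisation in experiment 4: A beats B, then B beats C.
observed = [("A", "B"), ("B", "C")]

print(linear_prediction(observed, ("A", "C")))      # expects A to win
print(relational_prediction(observed, ("A", "C")))  # no expectation
```

Only the linear account predicts surprise when C beats A, so the absence of longer looking on that trial is what counts against it.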

The only tiny little harrumph I have about this result is that all it does is falsify the linear representation account. Though I think their account is absolutely right, it’d be nice to see something more predictive come out of the relation-representation hypothesis that is a little more falsifiable. But this result is pretty huge, and stands in contrast with what we know about social cognition in other animals like baboons (Cheney et al., 1995; Bergman et al., 2003), lemurs (Maclean et al., 2008) and even pigeons (Lazareva & Wasserman, 2012), who seem to employ this sort of hierarchical, transitive inference when presented with novel interactions. It may also muddy the waters a little when we want to make the appealing claim that, since language surely emerged in order to enable communication as we navigated a social environment, hierarchical social cognition gives rise to the processing of languagey things like hierarchical syntax or our semantic representation (Hamilton, 2005), which can be characterised as hierarchical (e.g. hyperonym > hyponym). If we consider the nature of the human social environment, though, it should seem more intuitive that something more reliable than simple transitive inference is necessary in order to successfully navigate through our interactions. Due to our prolific production of (and reliance on) culture, humans have a much more diverse range of social currencies, which correspond to values for things like money, intelligence, blackmail information, who your friends are, ad infinitum. That means it’s pretty reasonable that our social cognition needs new strategies in order to get by; we have a little more to consider than just who’s big and angry enough to get all the bananas.


Bergman, T., Beehner, J., Cheney, D. & Seyfarth, R. (2003) “Hierarchical Classification by Rank and Kinship in Baboons” Science 302(5648), 1234-1236.

Cheney, D., Seyfarth, R. & Silk, J. (1995) “The response of female baboons (Papio cynocephalus ursinus) to anomalous social interactions: evidence for causal reasoning?” Journal of Comparative Psychology 109(2), 134-141.

Hamilton, D.L. (2005) Social Cognition: Key Readings (p. 104) Psychology Press

Lazareva, O. & Wasserman, E. (2012) “Transitive inference in pigeons: measuring the associative values of stimuli B and D” Behavioural Processes 89(3), 244-255.

Maclean, E., Merritt, D. & Brannon, E.M. (2008) “Social complexity predicts transitive reasoning in prosimian primates” Animal Behaviour 76(2), 479-486.

Mascaro, O. & Csibra, G. (forthcoming) “Representation of stable dominance relations by human infants” Proceedings of the National Academy of Sciences


New Book. New Ideas?

A new book is to be published on May the 24th. Written by John F. Hoffecker, the book is entitled “Landscape of the Mind: Human Evolution and the Archaeology of Thought”. It aims to look at the emergence of human thought and language through archaeological evidence.

Archaeologists often struggle to find fossil evidence pertaining to the evolution of the brain. Thoughts are a hard thing to fossilize. However, John Hoffecker claims that this is not the case and that fossils and archaeological evidence for the evolution of the human mind are abundant.

Hoffecker has developed a concept which he calls the “super-brain”, which he hypothesises emerged in Africa some 75,000 years ago. He claims that humans’ ability to share thoughts between individuals is analogous to the abilities of honey bees, which are able to communicate the location of food in terms of both distance and direction. They do this using a waggle-dance. Humans are able to share thoughts between brains using communicative methods, the most obvious of these being language.

Fossil evidence for the emergence of speech is thin on the ground and, where it does exist, is quite controversial. However, the emergence of symbols in the archaeological record coincides with an increase in evidence of creativity being displayed in many artifacts from the same time. Creative, artistic designs scratched on mineral pigment show up in Africa about 75,000 years ago and are thought to be evidence for symbolism and language.

Hoffecker also hypothesises that his concept of the super-brain is likely to be connected to things like bipedalism and tool making. He claims that it was tool making which helped early humans first develop the ability to represent complex thoughts to others.

He claims that tools were a consequence of bipedalism, as this freed up the hands to make and use tools. Hoffecker pinpoints his “super-brain” as beginning to emerge 1.6 million years ago, when the first hand axes began to appear in the fossil record. This is because hand axes are thought to be an external realisation of human thought, as they bear little resemblance to the natural objects they were made from.

By 75,000 years ago humans were producing perforated shell ornaments, polished bone awls and simple geometric designs incised into lumps of red ochre.

Based on archaeological evidence, humans are known to have emerged from Africa between 60,000 and 50,000 years ago. Hoffecker argues: “Since all languages have basically the same structure, it is inconceivable to me that they could have evolved independently at different times and places.”

Hoffecker also led a study in 2007 that discovered a carved piece of mammoth ivory that appears to be the head of a small figurine dating to more than 40,000 years ago. This is claimed to be the oldest piece of figurative art ever discovered. Finds like this illustrate the creative mind of humans as they spread out of Africa.

Figurative art and musical instruments which date back to before 30,000 years ago have also been discovered in caves in France and Germany.

This looks to be nothing new, but archaeological evidence is something which a lot of people interested in language evolution do not often discuss. I also don’t really know what to think of Hoffecker’s claim that “all languages have basically the same structure”. What do you think?

Dialects in Tweets

A recent study published in the proceedings of the Empirical Methods in Natural Language Processing Conference (EMNLP) in October, and presented at the LSA conference last week, found evidence of geographical lexical variation in Twitter posts. (For news stories on it, see here and here.) Eisenstein, O’Connor, Smith and Xing took a batch of Twitter posts from a released corpus covering 15% of all posts during a week in March. In total, they kept 4.7 million tokens from 380,000 messages by 9,500 users, all geotagged from within the continental US. They cut out messages from over-active users, taking only messages from users with fewer than a thousand followers and followees. (However, the average author published around 40 posts per day, which might be seen by some as excessive.) They also only took messages from iPhones and BlackBerries, which have the geotagging function. Eventually, they ended up with just over 5,000 distinct words, of which a quarter did not appear in the spell-checking lexicon aspell.

The Generative Model

In order to figure out lexical variation accurately, both topic and geographical region had to be ascertained. To do this, they used a generative model (seen above) that jointly accounts for both. Generative models work on the assumption that text is the output of a stochastic process that can be analysed statistically. By looking at large amounts of text, they were able to infer the topics that are being talked about. Basically, I could be thinking of a few topics: dinner, food, eating out. If I am in SF, it is likely that I may end up using the word taco in my tweet, based on those topics. What the model does is take those topics and figure from them which words are chosen, while at the same time figuring in the spatial region of the author. This way, lexical variation is easier to place accurately, whereas before discourse topic would have significantly skewed the results (the median error drops from 650 to 500 km, which isn’t that bad, all in all).

The way it works (in summary and quoting the slide show presented at the LSA annual meeting, since I’m not entirely sure on the details) is that, in order to add a topic, several things must be done. For each author, the model a) picks a region from P( r | ∂ ), b) picks a location from P( y | lambda, v ) and c) picks a distribution over topics from P( Theta | alpha ). For each token, it must a) pick a topic from P( z | Theta ), and then b) pick a word from P( w | nu ). Or something like that (sorry). For more, feel free to download the paper on Eisenstein’s website.
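For anyone who finds code easier to read than conditional probabilities, the per-author and per-token steps above can be sketched as a toy sampler in Python. Everything concrete here is an assumption of mine for illustration: the two regions, two topics, four-word vocabulary, the Gaussian spread around a region centre, and the choice to make the word distribution depend on both region and topic. It shows the shape of the generative story only, not the authors' actual model or their inference procedure.

```python
import random


def sample_categorical(rng, probs):
    """Draw an index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_dirichlet(rng, alpha):
    """Draw a probability vector from a Dirichlet via normalised gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_author(rng, region_probs, region_centres, location_sd,
                    alpha, word_probs, vocab, n_tokens):
    """Sample one author's region, location and tokens, step by step."""
    # Per author: a) pick a region.
    region = sample_categorical(rng, region_probs)
    # Per author: b) pick a location near that region's centre (assumed Gaussian).
    cx, cy = region_centres[region]
    location = (rng.gauss(cx, location_sd), rng.gauss(cy, location_sd))
    # Per author: c) pick a distribution over topics.
    theta = sample_dirichlet(rng, alpha)
    tokens = []
    for _ in range(n_tokens):
        # Per token: a) pick a topic, then b) pick a word for that topic,
        # using a word distribution that also depends on the region (assumed).
        topic = sample_categorical(rng, theta)
        word_index = sample_categorical(rng, word_probs[region][topic])
        tokens.append(vocab[word_index])
    return region, location, theta, tokens


# Hypothetical toy parameters: region 0 = "west", region 1 = "east";
# topic 0 = food/transport talk, topic 1 = intensifiers.
vocab = ["taco", "cab", "hella", "deadass"]
region_probs = [0.5, 0.5]
region_centres = [(-120.0, 37.0), (-74.0, 41.0)]
alpha = [1.0, 1.0]
word_probs = [
    [[0.8, 0.2, 0.0, 0.0], [0.0, 0.0, 0.9, 0.1]],  # west
    [[0.2, 0.8, 0.0, 0.0], [0.0, 0.0, 0.1, 0.9]],  # east
]

rng = random.Random(0)
region, location, theta, tokens = generate_author(
    rng, region_probs, region_centres, 2.0, alpha, word_probs, vocab, 10)
print(region, location, tokens)
```

The real work in the paper runs in the opposite direction: given only the observed tokens and locations, infer the hidden regions and topics that most plausibly produced them.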

Well, what did they find? Basically, Twitter posts do show massive variation based on region. There are geographically-specific proper names, of course, and topics of local prominence, like taco in LA and cab in NY. There’s also variation in foreign language words, with pues in LA but papi in SF. More interestingly, however, there is a major difference in regional slang. ‘uu’, for instance, is pretty much exclusively on the Eastern seaboard, while ‘you’ is stretched across the nation (with ‘yu’ being only slightly smaller). ‘suttin’ for something is used only in NY, as is ‘deadass’ (meaning very) and, on an even smaller scale, ‘odee’, while ‘af’ is used for very in the Southwest, and ‘hella’ is used in most of the Western states.

Dialectical variation for 'very'

More importantly, though, the study shows that we can separate geographical and topical variation, as well as discover geographical variation from text instead of relying solely on geotagging, using this model. Future work from the authors is hoped to cover differences between spoken variation and variation in digital media. And I, for one, think that’s #deadass cool.

Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, & Eric P. Xing (2010). A Latent Variable Model for Geographic Lexical Variation. Proceedings of EMNLP

Mapping Linguistic Phylogeny to Politics

In a recent article covered in NatureNews in Societies Evolve in Steps, Tom Currie of UCL, and others, like Russell Gray of Auckland, use quantitative analysis of the Polynesian language group to plot socioanthropological movement and power hierarchies in Polynesia. This is based on previous work, available here, which I saw presented at the Language as an Evolutionary System conference last July. The article claims that the means of change for political complexity can be determined using linguistic evidence in Polynesia, along with various migration theories and archaeological evidence.

I have my doubts.

Note: Most of the content in this post is refuted wonderfully in the comment section by one of the original authors of the paper. I highly recommend reading the comments, if you’re going to read this at all – that’s where the real meat lies. I’m keeping this post up, finally, because it’s good to make mistakes and learn from them. -Richard


I had posted this already on the Edinburgh Language Society blog. I’ve edited it a bit for this blog. I should also state that this is my inaugural post on Replicated Typo; thanks to Wintz’ invitation, I’ll be posting here every now and again. It’s good to be here. Thanks for reading – and thanks for pointing out errors, problems, corrections, and commenting, if you do. Research blogging is relatively new to me, and I relish this unexpected chance to hone my skills and learn from my mistakes. (Who am I, anyway?) But without further ado:


I have my doubts. The talk given by Russell Gray suggested that there were still various theories about the migratory patterns of the Polynesians, in particular about where they started from. What his work did was use massive supercomputers to narrow down the possibilities by taking lexicons and charting their similarities. The most probable scenarios were then recorded, and their statistical probability indicated the most likely course of events. This, however, is where the ability for guessing ends. Remember, this is large-scale quantitative statistics. If there is a 70% probability of one language being the root of another, that isn’t to say that that language is the root, much less that the organisation of one determines the organisation of another. But statistics are normally unassailable – I only bring up this disclaimer because there isn’t always a clear mapping between language usage and migration.
