Basic word order and Uniform Information Density

This week we had a talk by visiting PhD student Luke Maurits about basic word order. The distribution of basic word orders around the world (Subject-Verb-Object, Subject-Object-Verb, etc.) has been the focus of much attention. The overwhelming majority of languages have SOV or SVO order, with fewer having VSO and very small numbers having VOS, OVS or OSV. In order of frequency, this is:

(SOV, SVO) > VSO > (VOS, OVS) > OSV

A standard approach has been to assume that this ordering reflects an ordering of functionality:  Somehow, SOV order is more functional or efficient or intuitive than OSV.  However, Maurits points out that the literature on diachronic change opposes this view.  Languages often change from SOV to VSO or SVO over time, but rarely the other way around (see diagram below).

Are these languages getting less functional? It seems unlikely. Instead, Maurits assumes that languages started as SOV and are gradually moving away from it towards the other end of the scale. That is, OSV may be functionally better than SOV. The present distribution has arisen because the shift is only halfway through, so there are still proportionately more SOV languages. Although it was not the focus of the talk, Maurits pointed to newly created languages such as Al-Sayyid Bedouin Sign Language, which have SOV order. So now there is a different functionality ordering to explain: the opposite of the frequency distribution.

Maurits uses information theory to explain the differential functionality. The Uniform Information Density (UID) theory suggests that a reliable and efficient code has a similar amount of information in each symbol (here the symbols are Subject, Verb and Object). You don't want the next symbol to be too predictable (inefficient) or too unpredictable (unreliable), so you spread the information out. Because there are relatively fewer verbs than nouns, the different orderings of Subject, Verb and Object spread information more or less evenly, so some orderings are more efficient than others.
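To make "amount of information in each symbol" concrete, here is a sketch in standard information-theoretic notation (the symbols are my own choice, not necessarily those used in the talk or the paper). The information carried by the i-th symbol is its surprisal given everything said so far, and by the chain rule of probability the surprisals of the three symbols always add up to the surprisal of the whole event:

$$
I(w_i) = -\log_2 P(w_i \mid w_1, \dots, w_{i-1}),
\qquad
\sum_{i=1}^{3} I(w_i) = -\log_2 P(w_1, w_2, w_3)
$$

The right-hand side does not depend on the order in which Subject, Verb and Object are spoken. Word order therefore cannot change the total information in an utterance, only how that total is shared out between the three positions. A perfectly uniform ordering would put a third of it in each.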

For example, in an SVO language, if I said "John ate ...", you'd be pretty uncertain about what the Object was going to be. However, if I used SOV order and said "John cake ...", you'd be pretty confident that the verb would be "ate" (or "made", but certainly there's a lot less uncertainty). However, reducing the uncertainty about one word does not necessarily give the most uniform information density across the whole sentence.

Maurits has formalised this intuition using information theory and tested it on corpora. For corpora in English (SVO) and Japanese (SOV), the Agent, Verb and Patient were extracted from sentences to build probability distributions over events. For example, JOHN = Agent, CAKE = Patient and EAT = Verb would be fairly common, but CAKE as the Agent and JOHN as the Patient (as in "The cake ate John") would be fairly rare, and so unpredictable. UID ranked the word orders in roughly the order of the frequency with which they are found in the world's languages, supporting Maurits' theory. Maurits also conducted an experiment using Amazon Mechanical Turk which bore out the same results (details in the paper).
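The kind of calculation described above can be sketched in a few lines of Python. This is a toy illustration and not Maurits' code: the event probabilities in `EVENTS` are invented, the function names are hypothetical, and the deviation score (the mean absolute deviation of per-word surprisal from its average) is just one assumed way of measuring non-uniformity, which may differ from the measure used in the paper.

```python
import itertools
import math

# Invented toy distribution over (agent, verb, patient) events.
# These numbers are assumptions for illustration, not corpus estimates.
EVENTS = {
    ("john", "eat", "cake"): 0.30,
    ("john", "eat", "bread"): 0.20,
    ("john", "make", "cake"): 0.10,
    ("mary", "eat", "cake"): 0.15,
    ("mary", "make", "bread"): 0.20,
    ("cake", "eat", "john"): 0.05,  # "The cake ate John": rare
}

# Position of each role inside an event tuple.
ROLES = {"S": 0, "V": 1, "O": 2}


def surprisal_profile(event, order, events=EVENTS):
    """Bits carried by each word of `event` when spoken in `order`, e.g. "SOV"."""
    profile = []
    seen = []          # role positions revealed so far
    prefix_mass = 1.0  # probability of everything said so far
    for role in order:
        seen.append(ROLES[role])
        # Total probability of events that agree with the words heard so far.
        mass = sum(p for e, p in events.items()
                   if all(e[i] == event[i] for i in seen))
        # Surprisal of the new word given the earlier words.
        profile.append(-math.log2(mass / prefix_mass))
        prefix_mass = mass
    return profile


def uid_deviation(order, events=EVENTS):
    """Expected distance of the surprisal profile from a flat one (0 = uniform)."""
    total = 0.0
    for event, p in events.items():
        profile = surprisal_profile(event, order, events)
        mean = sum(profile) / len(profile)
        total += p * sum(abs(x - mean) for x in profile) / len(profile)
    return total


# All six basic word orders, from most to least uniform under this toy model.
ranking = sorted(("".join(o) for o in itertools.permutations("SVO")),
                 key=uid_deviation)
```

Calling `surprisal_profile(("john", "eat", "cake"), "SVO")` returns three values in bits. Whichever order is chosen, they sum to the same total (minus the base-2 log of the event's probability), so `ranking` reflects only how evenly that total is spread. With invented probabilities the ranking itself means nothing; the interesting result comes from plugging in distributions estimated from real corpora.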

So, word order distributions can be explained by rational approaches to information transmission. Pretty neat.

Luke Maurits will be presenting at NIPS in December, and his paper below (available here) is well worth a look.

Luke Maurits (2010). Why are some word orders more common than others? A uniform information density account. Advances in Neural Information Processing Systems, 23.