EvoLang Preview: Morphological Redundancy and Survivability

This is a preview of the talk “Redundant Features Are Less Likely To Survive: Empirical Evidence From The Slavic Languages” by Aleksandrs Berdicevskis and Hanne Eckhoff. Tuesday 22nd March, 14:30, room D.

One of the methodological trends of this year’s EvoLang seems to be intelligent exaptation. What I mean by this is that people do research on language evolution using tools that were developed for a completely different purpose. Examples include using zombies to observe the emergence of languages under severe phonological constraints, Minecraft to investigate the role of pointing in the emergence of language and EvoLang to study EvoLang. In addition to that, Hanne Eckhoff and I use syntactic parsers to quantify morphological redundancy.

The basic idea is to put to the test the assumption that redundant features are more likely to disappear from languages, especially if social factors favour the loss of excessive complexity. The problem is that nobody really knows what is redundant in real languages and what is not. We can define a feature as redundant if it is not necessary for successful communication, i.e. if hearers can infer the meanings of the messages they receive without using this feature. It is, however, still a long way from this definition to a quantitative measure. In theory, one could run psycholinguistic experiments; in practice, it is a difficult and costly venture (I tried).

In this paper, we replace humans with a dependency parser. For those who are not into computational linguistics: a parser is a program which can automatically identify (well, attempt to identify) the syntactic structure of a given sentence. A typical parser is first trained on a large number of human-annotated sentences. After its learning is over, it can parse non-annotated sentences on its own, relying on the information about the form of every word, its lemma, part of speech, morphological features and the linear order of words — just like a human being. If we remove a certain feature from its input and compare performance before and after the removal, we can estimate how important (=non-redundant) the feature was.

[Figure: redundancy_preview]

If we remove all information about, say, the dative from the parser’s input (to the left), it will have a harder time understanding that the phrase “two masters” is an oblique object.
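For the programmers in the audience, the remove-and-compare idea can be sketched in a few lines of Python. This is only a toy illustration, not our actual pipeline: `toy_parser` is a hypothetical stand-in for a trained dependency parser, the CoNLL-style field names (`form`, `pos`, `feats`), the English placeholder tokens and the use of labelled attachment score as the performance measure are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# A token is a dict of CoNLL-style fields (assumed names: form, pos, feats).
# An analysis is a (head index, relation label) pair; head 0 means "root".
Token = Dict[str, str]
Analysis = Tuple[int, str]
Parser = Callable[[List[Token]], List[Analysis]]


def ablate(sentence: List[Token], feature_value: str) -> List[Token]:
    """Return a copy of the sentence with one feature value removed from FEATS."""
    stripped = []
    for tok in sentence:
        feats = [f for f in tok["feats"].split("|") if f and f != feature_value]
        stripped.append({**tok, "feats": "|".join(feats)})
    return stripped


def las(gold: List[Analysis], predicted: List[Analysis]) -> float:
    """Labelled attachment score: share of tokens with correct head and label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def importance(parser: Parser, sentence: List[Token],
               gold: List[Analysis], feature_value: str) -> float:
    """Drop in score caused by hiding the feature; a larger drop means less redundant."""
    before = las(gold, parser(sentence))
    after = las(gold, parser(ablate(sentence, feature_value)))
    return before - after


def toy_parser(sentence: List[Token]) -> List[Analysis]:
    """Hypothetical rule-based stand-in for a trained parser (illustration only)."""
    analyses = []
    for tok in sentence:
        if tok["pos"] == "VERB":
            analyses.append((0, "root"))
        elif "Case=Dat" in tok["feats"].split("|"):
            analyses.append((1, "obl"))   # dative noun: oblique object of token 1
        else:
            analyses.append((1, "obj"))   # otherwise guess a direct object
    return analyses


# Toy data with English placeholder forms (not real treebank content).
sentence = [
    {"form": "serve", "pos": "VERB", "feats": ""},
    {"form": "masters", "pos": "NOUN", "feats": "Case=Dat|Number=Plur"},
]
gold = [(0, "root"), (1, "obl")]

print(importance(toy_parser, sentence, gold, "Case=Dat"))     # dative matters here
print(importance(toy_parser, sentence, gold, "Number=Plur"))  # number does not
```

In this toy setup, hiding the dative makes the stand-in parser mislabel “masters”, so the score drops, whereas hiding number changes nothing. With a real parser one would compute the same difference over a whole treebank rather than a single sentence.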

We test whether this measure is any good by running a pilot study with the Slavic language group. We estimate the redundancy of morphological features in Common Slavic (Common Slavic itself has left no written legacy, but we happen to have an excellent treebank of Old Church Slavonic, which is often used as a proxy) and try to predict which features are likely to die out in 13 modern Slavic languages. While redundancy is, of course, not the sole determinant of survivability, it turns out to be a fairly good predictor.

Come to the talk to hear about fierce morphological competitions! They are friends, dative and locative, almost brothers, but if only one can stay alive, which will sacrifice itself? The perfect participle is an underdog past tense, its frequency negligible compared to that of its rivals, the aorist and the imperfect, but does its high non-redundancy score give it some hope?

 

Aleksandrs Berdicevskis is a postdoc in computational historical linguistics at an edge of the world (namely The Arctic University of Norway in the city of Tromsø) with a PhD in sociolinguistics from the University of Bergen, an MA in theoretical linguistics from Moscow State University, two years’ experience in science journalism, two kids and a long-standing interest in language evolution.
The first question he usually gets from new acquaintances is about the spelling of his name. The first name is a common Russian name (Aleksandr-) with the obligatory Latvian inflectional marker for nominative masculine singular (-s). The full form is used in formal communication only, otherwise he is usually called Sasha (the Russian hypocorism for Aleksandr) or, for simplicity’s sake, Alex.