Great Andamanese: The key to more than one linguistic puzzle?

Last week we had a lecture from Anvita Abbi on rare linguistic structures in Great Andamanese - a language spoken in the Andaman Islands.  The indigenous populations of the Andaman Islands lived in isolation for tens of thousands of years until the 19th Century, but still exhibit some common features of south-east Asian languages such as retroflex consonants.  This could be evidence for the migration route of humans from India to Australia.  Indeed, recent genetic research has shown that the Andamanese are descendants of the first human migration from Africa in the Palaeolithic, though Abbi suggested that the linguistic evidence is also a strong marker of human migration and an "important repository of our shared human history and civilization".

Although the similarities are fascinating for studies of cultural evolution, the rarity of some structures in Great Andamanese is even more intriguing.

The Andaman Islands

The structures are rare in two senses. To start with, the number of speakers has declined rapidly over the last few decades: there are only around 10 speakers left. Most important for the talk, however, was the highly complex system for marking inalienability (things which cannot be transferred from the individual), based around 7 body-part classes. Here is a table of the markers with the associated parts of the body:

Class  Partonomy of human body                               Body class marker
1      Mouth and its semantic extension                      a-
2      Major external body parts                             ɛr-
3      Extreme ends of the body like toes and fingernails
4      Bodily products and part-whole relationship           ut-
5      Organs inside the body                                e-
6      Parts designating round shape/sexual organs           ara-
7      Parts for legs and related terms                      o- ɔ-

(Adapted from handout)

Although the classifications above are largely coherent, the classes exhibit a great deal of variation, too.  For example, ɛr- marks body parts as diverse as the forearm, brain and urine.  The classes can be used to create lexical distinctions between ot-cala (a scar left by an arrow-head), er-cala (a scar on the head) and oŋ-cala (a scar on the limbs).  However, the distinctions also extend into other parts of the language. For example, the mouth of a vessel receives a class 1 marker, and the branch of a coconut tree receives a class 2 marker while the sound of rain receives a class 4 marker.  The manner of verbs can also be indicated:  ut-ʃile means 'to aim from above', ek-ʃile means 'to aim at' and e-ʃile means 'to aim to pierce'.  In all, there are ten different ways of denoting inalienable possessiveness.  For example, the word for a pig's head is ra ɛr-co, but a pig's head when it has been cut from the body is ra t-ɛr-co.  The class markers are also used in adjectives, adverbs and intransitive verbs.

Abbi claims that there are no known linguistic systems like this anywhere else, which invites the question: how did it emerge?  Isolation must be part of the answer, but rare linguistic structures fit better with a weak-bias plus cultural transmission view of language than with a strong, language-specific bias approach.  Cysouw and Wohlgemuth (2010) argue that in the debate about language universals, relatively little attention has been given to 'the other end of the scale': features that are incredibly rare and do not seem to be explained by migration.  These cases are as valuable as linguistic universals, they argue, because they show us the full limits of human language.  Their book on rare linguistic structures, Rethinking Universals, looks well worth a read (preview available on Google books here).

Abbi argues that there is a correlation between the rarity of a linguistic structure and the endangerment status of its language, so there is an even greater pressure to record these languages before they become extinct.

Looking up the Andaman languages online, I came across a site which has a massive amount of information on the culture, heritage and language of the Andaman Islands.  The section on the descent of the Andamanese is a whole course in human migration and genetics with loads of graphs and charts.  It also contains an article on the languages of the Andaman Islands by Abbi (see here for more publications).  I was also impressed with a completely unrelated article on a quantitative analysis of the influence of global languages.

Abbi also mentioned a tribe called the Sentineli who have been extremely hostile to contact from other cultures.  Virtually nothing is known about their language because they start shooting arrows at any approaching boats and, as Abbi calmly put it, "some of those arrows are quite poisonous" (although there is a fascinating video here of a relatively friendly first-contact style encounter).  It's good to know that linguistics is still an adventurous undertaking.


Abbi, A. (2006) Endangered Languages of the Andaman Islands. Lincom Europa GMBH, Muenchen, Germany.

Cysouw, Michael & Jan Wohlgemuth. 2010. The other end of universals: theory and typology of rara. Introduction to Rethinking Universals: How rarities affect linguistic theory, 1-10.

Note: The featured image is a photograph of some Andamanese islanders which shows "A remarkable photograph of 1875 (by E.H. Man) showing the influence the new masters were having on islanders' daily life and technology only 17 years after the British took over. In the foreground is a traditional Great Andamanese outrigger canoe used mostly for fishing. There was no travel over longer distances in pre-British times. In the background is the much longer, new-style outrigger-less canoe that could transport more people over longer distances. The people standing around their vessels are Great Andamanese (some with their traditional weapons) with a supervising Indian Jemadar at the back."

  • ettlinger

    Interesting post, especially with respect to the main idea, that rara and rarissima can provide insight into the language faculty unavailable through the examination of universals.

    I do have a quibble with characterizing the inalienability markings as rare, however. It may very well be unique, but this is a pretty narrow corner of the grammar we're looking at. That is, if you define your category narrowly enough, anything may be unique, e.g., English is the only language marking both tense and verb agreement and plurality with a coronal consonant wherein the plural and agreement markers are the same. In my experience, if you look at any language closely enough, you'll find myriad things that are unique about it, even in the context of a large tightly-knit family like Bantu.
    So, it's crucial to provide an appropriate context (i.e., the universal that is violated) to understand how rare something really is.

  • Good point (although rare phenomena don't have to violate universals).

    Cysouw and Wohlgemuth use Plank's definition (see his website here):
    “. . . a trait . . . which is so uncommon across languages as not even to occur in all members of a single . . . family or diffusion area . . . Diachronically speaking, a rarum is a trait which has only been retained, or only been innovated, in a few members of a single family or sprachbund or of a few of them.”

    They also mention Fredrick's (2006) quantification of "a threshold of attestations in ≤ 5% of the world’s languages for rara and in 1% of the world's languages for rarissima", although Cysouw and Wohlgemuth discuss problems with this (see page 5).

    However, you're right in saying that, given the vast number of possible ways of expressing yourself, almost every language will have SOME unique trait. It's almost the definition of how to differentiate languages. This links back to my discussion of whether a language is a useful unit of analysis in evolutionary linguistics (see here).

    At any rate, it's clear that there is a lot more to be learned by looking at how universals and rare structures interact.

  • ettlinger

    I think it's a bit more complicated (simpler?) than that. You need an appropriate control, or some stochastic way of determining the rarity of something a priori given probabilities of the different constituents. So, for the English coronal past-tense/plural marking, it only occurs in .02% of the world's languages, far below the constant threshold. What you need to compare it to, however, is how probable that is given your definition of the properties in question, i.e., (making up numbers here:) 40% of languages with tense marking x 40% with agreement x 30% with suffixal morphology etc etc.

    In terms of rara violating universals, they obviously don't have to but I wonder how interesting they are if they don't bear on some question of universality (of cognitive/historical processes). So, "this language has no /s/" is (only) interesting in the context of universals of sound inventory, where /s/ is near-universal. My imagination falls short in how it would be interesting otherwise.
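
    The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the component percentages are the made-up figures from this comment, the thresholds are the ones quoted earlier in the thread (read here as "at most 5%" and "at most 1%"), and treating the component traits as statistically independent is a simplifying assumption, not something established in the discussion.

```python
from math import prod

# Thresholds quoted earlier in the thread; reading both as "at most"
# is an assumption made for this sketch.
RARUM_MAX = 0.05       # attested in at most 5% of languages
RARISSIMUM_MAX = 0.01  # attested in at most 1% of languages


def classify(share):
    """Label a trait by the share of languages in which it is attested."""
    if share <= RARISSIMUM_MAX:
        return "rarissimum"
    if share <= RARUM_MAX:
        return "rarum"
    return "neither"


def expected_share(component_shares):
    """Share expected if the component traits co-occurred independently."""
    return prod(component_shares)


# Made-up placeholder figures from the comment, not real typological data.
components = [0.40, 0.40, 0.30]  # tense marking, agreement, suffixal morphology
observed = 0.0002                # 0.02% of the world's languages

expected = expected_share(components)
ratio = observed / expected      # below 1 means rarer than chance predicts

print(f"fixed-threshold label: {classify(observed)}")
print(f"expected under independence: {expected:.2%}")
print(f"observed / expected: {ratio:.4f}")
```

    The fixed threshold only looks at the observed share, whereas the ratio asks how surprising that share is once the definition of the trait is taken into account. Every extra condition in the definition (the "etc etc" above) multiplies the expected share down further, so a narrowly defined trait can be rare in absolute terms without being rarer than chance would predict.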

  • Regarding the violation of universals, I agree with what you said. I guess I was thinking 'positive' rara rather than 'negative' rara. For example, Great Andamanese doesn't violate any big universals like not having common consonants or not exhibiting recursion (see Dan Everett's claims about Piraha), but it does have this extra system on top of other common systems. That is, it seems like an unnecessary embellishment - something unexpected that's emerged only in this language.

  • Indeed, recent genetic research has shown that the Andamanese are descendants of the first human migration from Africa in the Palaeolithic

    if you are talking about the recent papers which came out on australian ancestry, this is not correct. they are probably part of the second eurasian wave. the aboriginals and melanesians are a compound of wave 1 and wave 2.

  • Sorry - after looking this up it seems like I have a hopelessly simple view of migration patterns!
    Thangaraj et al. (2005) find that the Onge and Great Andamanese belong to haplogroup M (new clades M31 and M32). I had misinterpreted their claim that "Our previous studies have shown that all Eurasian and Oceanian founder haplogroups— mitochondrial M, N, and R and Y-chromosomal C, D, and F—coexist in South Asia, suggesting their comigration along the southern coastal route in one wave after the exit of modern humans from Africa."

    Although the comigration was in one wave, they were not making a claim in this sentence about which wave from Africa they were part of.

  • these papers are the most up to date:

    first one speaks directly to andaman islanders, while second one to the fact that oceanians are compound of first out of africa + second.