Retiring Procrustean Linguistics

Many of you are probably already aware of the Edge 2014 question: what scientific ideas are ready for retirement? The question was derived from the Kuhnian-esque, and somewhat tongue-in-cheek, quote by theoretical physicist Max Planck:

A new scientific theory does not triumph by convincing its opponents and making them see the light, but rather because its opponents die, and a new generation grows up that is familiar with it.

Some of the big themes that jumped out at me were bashing the scientific method, bemoaning our enthusiasm for big data and showing us how we don’t understand and routinely misapply statistics. Other relevant candidates that popped up for retirement were culture, learning, human nature, innateness, and brain plasticity. Lastly, on the language front, we had Benjamin Bergen and Nick Enfield weighing in against universal grammar and linguistic competency, whilst John McWhorter rallied against strong linguistic relativity and Dan Sperber challenged our conventional understanding of meaning.

And just so you’re aware: I’m not necessarily in agreement with all of the perspectives I’ve linked to above, but I do think a lot of them are interesting and definitely worth a read (if only to clarify your own position on the matters). On this note, you should probably go over and read Norbert Hornstein’s post about the flaws of Bergen’s argument, which basically boil down to a conflation between I-languages and E-languages (and where we should expect to observe universal properties).

If I had to offer my own candidate for retirement, then it would be what Anne Buchanan over at the excellent blog, The Mermaid’s Tale, termed Procrustean Science:

In classical Greek mythology, Procrustes was a criminal who produced an iron bed and made his victims fit the bed…by cutting off any parts of their bodies that didn’t fit. The metaphorical use of the word means “enforcing uniformity or conformity without regard to natural variation or individuality.” It is in this spirit that Woese characterized much of modern biology as procrustean, because rather than adapt its explanations to the facts, the facts are forced to lie in a bed of theory that is taken for granted–and thus, the facts must fit!

Procrustean Linguistics

Anne’s post was specifically referring to Carl Woese’s critique of reductionism in molecular biology, but linguistics also faces its own Procrustean problems. What I want to focus on, as my example of Procrustean Linguistics, is the idealised speaker-hearer as articulated by Chomsky (1965: 3-4):

Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance.

Is this a useful way to approach language? Clearly, as is the case in Anne’s example of molecular biology, we have seen some successes from the enterprise, with there being theoretical utility for a limited range of problems. But I have grave doubts about this ivory tower of theoretical abstraction and idealisation; a place where we are safe from the contaminating influences of performance and context. As Nick Enfield alluded to, in his persuasive and spirited argument against merely focusing on competence, we potentially miss the important role performance plays in shaping our linguistic knowledge.

One example of where Procrustean Linguistics has seemingly led us astray is in the pervasive notion that ambiguity is dysfunctional for communication. Ambiguity exists at many layers of language. You have lexical ambiguity, syntactic ambiguity, scope ambiguity and many other types. Broadly conceived, then, ambiguity corresponds to any state in which a linguistic code contains forms that are conventionally associated with more than one meaning (Hoefler, 2009). Why is ambiguity considered dysfunctional? Well, if we take the perspective that language strictly follows a code model of communication, having a language where a signal has multiple meanings increases the uncertainty as to what is intended. It is not too surprising, then, that a one-to-one mapping of forms and meanings would be considered a more optimal communication system than one where ambiguity was rampant (for example, the constructed language Loglan, and its successor Lojban, were specifically designed to eliminate syntactic ambiguity). This leads us to the seemingly logical conclusion that the appearance of ambiguity supports a non-communicative role for language:

The natural approach has always been: Is [language] well designed for use, understood typically as use for communication? I think that’s the wrong question. The use of language for communication might turn out to be a kind of epiphenomenon[...] If you want to make sure that we never misunderstand one another, for that purpose language is not well designed, because you have such properties as ambiguity. If we want to have the property that the things that we usually would like to say come out short and simple, well, it probably doesn’t have that property. (Chomsky, 2002: 107)

Chomsky misses the mark here for a variety of reasons. First, he seems to imply that a communication system well designed for use will have no ambiguity whatsoever. This makes the erroneous assumption that biological design and optimisation are equal to perfection. We know from evolutionary theory 101 that biological systems are the messy result of blind historical tinkering, with the result being solutions that are just good enough. Of course, Chomsky is fully aware of how evolution works, and it should come as no surprise that he gets around this mismatch between ideal speaker and evolutionary theory by claiming language is a natural object; a perfect system similar to those found in physics:

Recent work suggests that language is surprisingly ‘perfect’ in this sense, satisfying in a near-optimal way some rather general conditions imposed at the interface. Insofar as that is true, language seems unlike other objects of the biological world, which are typically a rather messy solution to some class of problems, given the physical constraints and the materials that history and accident have made available. (Chomsky, 1996)

Second, and more importantly, if we take into account the informativeness of context, then ambiguity does not pose an insurmountable problem to communication. As an example, listeners may have a large degree of uncertainty about whether a word like run is intended as a noun or a verb. Out of context, it may be difficult to guess, and so the word run is characterised as being highly ambiguous. But when we make use of syntactic information, as well as discourse context, then a listener is able to infer the intended meaning (Piantadosi, Tily & Gibson, 2012). Under this perspective, context is used as a resource in reducing uncertainty, and when these resources are deployed the problem of ambiguity mostly goes away. It doesn’t matter that a signal is perceived to have more than one meaning: so long as the listener can attend to the intended meaning, the speaker needs only to provide an unambiguous signal in particular contexts of use. Such sentiments echo those of Pinker & Bloom (1990: 713) when they argued language exhibits design for communication because it allows for “minimising ambiguity in context”.
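
To make the idea of context as a resource for reducing uncertainty a little more concrete, here is a minimal sketch in Python. It compares the listener's uncertainty (Shannon entropy, in bits) about whether run is a noun or a verb with and without knowledge of the preceding syntactic context. Everything specific in it is an assumption for illustration: the two context labels and the probabilities are made up, not estimated from any corpus, and the function names are hypothetical.

```python
import math

# Hypothetical joint probabilities P(context, category) for the word "run".
# These numbers are invented for illustration; they are not corpus estimates.
JOINT = {
    ("after_determiner", "noun"): 0.27,  # e.g. "a run"
    ("after_determiner", "verb"): 0.03,
    ("after_pronoun", "noun"): 0.07,
    ("after_pronoun", "verb"): 0.63,     # e.g. "they run"
}


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_without_context(joint):
    """H(category): uncertainty when the word is seen in isolation."""
    totals = {}
    for (_, category), p in joint.items():
        totals[category] = totals.get(category, 0.0) + p
    return entropy(totals.values())


def entropy_with_context(joint):
    """H(category | context): average uncertainty once context is known."""
    context_totals = {}
    for (context, _), p in joint.items():
        context_totals[context] = context_totals.get(context, 0.0) + p
    total = 0.0
    for context, p_context in context_totals.items():
        within = [p / p_context for (c, _), p in joint.items() if c == context]
        total += p_context * entropy(within)
    return total


h_without_context = entropy_without_context(JOINT)
h_with_context = entropy_with_context(JOINT)
print(f"bits of uncertainty without context: {h_without_context:.3f}")
print(f"bits of uncertainty with context:    {h_with_context:.3f}")
```

With these made-up numbers the uncertainty drops from roughly 0.92 bits to roughly 0.47 bits once the context is known. The exact figures mean nothing; the point is only that conditioning on an informative context cannot increase the listener's average uncertainty, and usually lowers it.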

Even in cases where ambiguity in language does cause problems for communication, we need to remember language is a system that emerged from competing constraints. Ambiguity could therefore be a byproduct of tradeoffs for processing efficiency and the fact that any good communication system will be skewed towards hearer inference rather than speaker effort (Levinson, 2000; Piantadosi, Tily & Gibson, 2012). Also, in a communication system that can perform repair operations fairly robustly and efficiently (Steels, 2012; Dingemanse, Torreira & Enfield, 2013), it should come as no surprise that we tolerate deleterious ambiguity without the need to downplay language as a system primarily used for communication.

To push the point home, the examples of ambiguity often provided by the likes of Berwick et al (2011) are in fact great examples of language being used for the purposes of communication, exactly because they convey the intended meaning of the speaker/writer: to demonstrate ambiguity! Telling us that sentence (10) is two-way ambiguous is only true in a superficial sense: it helps the authors communicate ambiguity. However, if I were to ever utter this sentence in day-to-day communication, then it is likely to be supported by resources in the broad context, such as the current state of the conversation and the prior discourse (what has just been said and referred to etc).


Speakers/writers are therefore constantly exploiting information in the broader context to make what they mean evident to the hearer/reader — with these intentions drawing on a whole range of pragmatic and discourse strategies. And you can look at this probabilistically: if a sentence is uttered, or written, then the probability of one interpretation over another is embedded in its context of use. This allows us to make the inference that the man had the binoculars (10a) is the intended meaning and not the boy used the binoculars and saw the man (10b). In short, the likelihood of us ever using this sentence is tied to information contained in the context, with the appearance of dysfunctional ambiguity being prevalent only when we view linguistic phenomena in isolation, as Miller (1951: 111-2) noted:

Why do people tolerate such ambiguity? The answer is that they do not. There is nothing ambiguous about ‘take’ as it is used in everyday speech. The ambiguity appears only when we, quite arbitrarily, call isolated words the unit of meaning.

Ambiguity, then, and all the perceived problems of it being dysfunctional for communication, appears to be an artefact of the ideal speaker-hearer model where linguistic meaning exists in a vacuum, divorced from the contexts in which it is learned and used. Language is more of a tangled bank than a perfect system, and to ignore this fact is to fail in providing an adequate solution to what Simon Kirby has termed the problem of linkage. By making this straightforward link between our individual cognitive machinery (biology) and the features we observe in language (language structure), the ideal speaker-hearer model potentially misses an important dynamical system: socio-cultural transmission. Instead, if we want to understand important structural properties of language, such as ambiguity, then we must appeal to complexity and try to see how these pieces interact with one another over multiple timescales:

Language does not spring directly from our language faculty. Rather, it is inherited and constantly shaped by our membership of a speech community. It is only by taking this point seriously that we can begin to understand how individual properties (e.g. features of an individual’s learning mechanism) end up making their influence felt at the population level in the actual structure of language[...] the assumption that there is a straightforward link between the properties of the language acquisition device and the universal properties of language may arise from the notion of an ideal speaker-listener in a homogeneous speech community (Chomsky, 1965), a foundational idealization of much of generative grammar. Whilst this idealization has its place, and much progress on understanding the structure of language has flowed from it, we need to move beyond it when considering language as an adaptive system. In other words, it has no place in an evolutionary approach to linguistics. Accordingly, we should expand our picture of the causal connections in the evolution of language to include cultural transmission. (Kirby, 2012: 593).

Of course, this isn’t a death blow to Chomsky’s LOT (Language of Thought) hypothesis, and delivering one was not my intention. After all, it is an empirical question whether language appears to be better designed with regard to “the internal system with which it must interact” (Chomsky, 2002: 108), but my rallying against the purported ambiguity problem is more of a cautionary tale about being over-reliant on competence and dismissive of performance. We need to appreciate both in forming a comprehensive understanding of language and its phenomena.

Piantadosi ST, Tily H, & Gibson E (2012). The communicative function of ambiguity in language. Cognition, 122 (3), 280-91 PMID: 22192697

  • Patrick

    Really interesting, thought-provoking post James. Small comment: The Language of Thought hypothesis is due to Jerry Fodor, not Chomsky. I was a bit confused about why you suddenly mention it at the end there, it seems pretty orthogonal to the points you raised. It’s also worth noting that your contention that what you call ‘Procrustean linguistics’ should be retired is considerably stronger than what Kirby says: “…this idealization has its place, and much progress on understanding the structure of language has flowed from it, we need to move beyond it when considering language as an adaptive system.” I interpret this as conveying that abstraction away from performance considerations is useful when looking at specific problems, but it shouldn’t be our exclusive approach to understanding language. I find this more moderate position more appealing.

  • Patrick

    Having thought about this a bit, I guess that my general position is that the job of us Procrustean linguists (I’m totally re-appropriating that term) with respect to ambiguity is to characterise the set of readings that a given surface string can have, and develop a formal model (a grammar) in order to explain why this particular surface string has these and only these readings. I think that this is a worthwhile task, and any work on ambiguity resolution has to start out with something like this as a foundation. Consider something as basic as quantifier scope ambiguities, for example:

    (1) Everyone likes someone

    This has two readings: (a) There is a certain someone, such that everyone likes them, and (b) For each person, there is someone that they like. Why does it have these two readings? How do we characterise them formally? It’s difficult to see how one might go about this without invoking some formal semantic/syntactic machinery. Notice that we’re not saying anything at all about ambiguity resolution here, that’s up to psycholinguists a.o. This seems like a sensible division of labour to me.
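
    As a rough sketch, and assuming plain first-order logic with a two-place predicate like(x, y) ranging over people (one common notation among several, not a commitment to any particular framework), the two readings of (1) differ only in the relative scope of the quantifiers:

    ```latex
    % (a) wide-scope existential: one particular person is liked by everyone
    \exists y \, \forall x \, \mathrm{like}(x, y)

    % (b) wide-scope universal: each person likes some (possibly different) person
    \forall x \, \exists y \, \mathrm{like}(x, y)
    ```

    Reading (a) entails reading (b), but not the other way round, which is part of what a formal account has to capture.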

  • wintz85

    Hi Patrick, thanks for the comment (and history lesson) ;-) . I guess the reason why I raised it was just a point of clarification in case anyone actually thought I was going after the LOT hypothesis (which I wasn’t). On the notion that Procrustean Linguistics should be retired: I guess what I really was getting at with that rhetorical flourish was that I wanted the notion of ambiguity being dysfunctional for communication to be retired.

  • wintz85

    I really have no problem with the method for investigating ambiguity as you described above. My main critique was of the notion that, if you view ambiguity in a certain way, then you’re going to arrive at the conclusion it is counter-functional for communication. This is only true if you have a simplified, strawman model of communication, such as that found in the ideal speaker-hearer model. So, for you Procrustean linguists it can be business as usual :-D (N.B. You’re welcome to the term, Patrick, even though I wouldn’t normally class what you do as Procrustean… At least not based on our pub/blog conversations).

  • darrylmcadams

    If you’re into formal pragmatics (e.g. DRT) then the perspective of things being “unambiguous in context” is just the idea that your linguistic competence is more than just syntactic. That of course collapses the argument entirely. Chomsky’s general fear of heading into the realm of meaning is, afaict, dominated by his experience only with much older (GB-style) conceptions of what meaning is doing. More modern theories aren’t at all waffly in the ways that he fears, and thus ought to fall well within the bounds of competence. No more problem with the procrustes.

  • MarkDing

    @Wintz, you characterise Enfield’s Edge answer as arguing “against linguistic competency”, but as I read it, his answer argues not against competence as such but against the Chomskyan idea that only the idealised speaker’s competence is the proper object of study for linguists. In other words, it seems both of you come round to making the same point: that to really understand language in all its aspects, we need to study language in use (‘performance’).

  • wintz85

    Hi Mark, you’re totally right when you say I am basically adding to the point already made by Enfield. And that was pretty much my intention — to build upon Enfield’s response, albeit by specifically looking at ambiguity.

  • wintz85

    Hi Darryl, thanks for the interesting comment. You make the point that formal pragmatic models capture these problems by basically plugging more into the competence than just syntax. Great. I’m 100% behind these enterprises. I got the impression that, from the comments here and on Reddit, I was attacking the very edifice of formal approaches to linguistics and the idea of competence. That was far from my intention. If I had to rephrase, then I guess my line of thinking was why does Chomsky think ambiguity is dysfunctional for communication and how did he arrive at this conclusion? He does this in two ways. First, there’s his hypothesis that language is well designed with regard to the internal systems with which it must interact. I have no qualms with this and it’s an interesting thought (especially given that we shouldn’t be wedded to the assumption that language evolved for communication). Second, he makes a simplifying assumption about how efficient communication systems should work based on an idealised speaker-hearer model. One of these assumptions being that an efficient communication system will look different when we have context as a resource. Under these perspectives, ambiguity is in fact seen as a desirable property of communication systems, as Piantadosi et al (2012) note:

    We present a general information-theoretic argument that all efficient communication systems will be ambiguous, assuming that context is informative about meaning. We also argue that ambiguity allows for greater ease of processing by permitting efficient linguistic units to be re-used [...] We argue for two beneficial properties of ambiguity: first, where context is informative about meaning, unambiguous language is partly redundant with the context and therefore inefficient; and second, ambiguity allows for the re-use of words and sounds which are more easily produced or understood [...] The first shows that most efficient communication system will not convey information already provided by the context. Such communication systems necessarily appear to be ambiguous when examined out of context. Second, we argue that specifically for human language processing mechanism, ambiguity additionally allows re-use for “easy” linguistic elements — words that are short, frequent, and phonotactically high probability.
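
The first of the two arguments in the passage above can be illustrated with a toy calculation. The sketch below is an illustration under stated assumptions, not a reconstruction of the model in Piantadosi et al (2012): the two contexts, the four meanings, the signal names and the function names are all hypothetical, meanings are treated as equally likely, and signal cost is measured as the bits needed to pick one signal from the inventory with a fixed-length code.

```python
import math

# Hypothetical toy world: two contexts, each making two meanings relevant.
MEANINGS_BY_CONTEXT = {
    "river": ["BANK_SHORE", "CURRENT_FLOW"],
    "finance": ["BANK_MONEY", "CURRENT_ACCOUNT"],
}


def build_unambiguous(meanings_by_context):
    """Give every meaning its own signal, regardless of context."""
    lexicon = {}
    for context, meanings in meanings_by_context.items():
        for meaning in meanings:
            lexicon[(context, meaning)] = f"w{len(lexicon)}"
    return lexicon


def build_ambiguous(meanings_by_context):
    """Reuse the same small set of signals in every context."""
    lexicon = {}
    for context, meanings in meanings_by_context.items():
        for index, meaning in enumerate(meanings):
            lexicon[(context, meaning)] = f"w{index}"
    return lexicon


def bits_per_signal(lexicon):
    """Bits needed to select one signal from the inventory (fixed-length code)."""
    return math.log2(len(set(lexicon.values())))


def decode(lexicon, context, signal):
    """Meanings compatible with a signal once the shared context is known."""
    return [m for (c, m), s in lexicon.items() if c == context and s == signal]


def recoverable_in_context(lexicon):
    """True if every signal picks out exactly one meaning given its context."""
    return all(decode(lexicon, c, s) == [m] for (c, m), s in lexicon.items())


unambiguous = build_unambiguous(MEANINGS_BY_CONTEXT)
ambiguous = build_ambiguous(MEANINGS_BY_CONTEXT)

for name, lexicon in [("unambiguous", unambiguous), ("ambiguous", ambiguous)]:
    print(name, bits_per_signal(lexicon), recoverable_in_context(lexicon))
```

Under these assumptions both lexicons let the listener recover the intended meaning once the context is known, but the unambiguous one spends two bits per signal where the ambiguous one spends one. The extra bit only restates what the context already supplies, and the ambiguous lexicon looks ambiguous only when its signals are inspected out of context.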