PLM2012 Coverage: Dirk Geeraerts: Corpus Evidence for Non-Modularity

The first plenary talk at this year’s Poznań Linguistic Meeting was by Dirk Geeraerts, who is professor of linguistics at the University of Leuven, Belgium.

In his talk, he discussed the possibility that corpus studies could yield evidence against the supposed modularity of language and mind endorsed by, for example, Generative linguists (you can find the abstract here).

Geeraerts began his talk by stating that there seems to be a paradigm shift in linguistics, from analyses of structure based on introspection to analyses of behaviour based on quantitative linguistic studies. More and more researchers are adopting quantified corpus-based analyses, which test hypotheses about language behaviour statistically. As data they use experimental results or large corpora.


One further trend Geeraerts identified in this paradigm shift is that these kinds of analyses are becoming more and more multifactorial, in that they include multiple factors which are both internal and external to language. Importantly, this way of doing linguistics is fundamentally different from the mainstream late 20th century view of linguistics.

What is important to note when comparing this trend to other approaches to studying language is that multifactoriality goes against Chomsky’s idea of grammar as an ideal mental system that can be studied through introspection. In the traditional view, it is supposed that there is some kind of ideal language system which everyone has access to. This line of reasoning then justifies introspection as a method of studying the whole system of language and making valid generalizations about it. However, this goes against the emerging corpus linguistic view of language. On this view, a random speaker is not representative of the linguistic community as a whole. The linguistic system is not homogeneous across all speakers, and therefore introspection doesn’t suffice.


The main thrust of Geeraerts’ talk was that research within this emerging paradigm might also call into question the assumption of the modularity of the mind (as advocated, for example, by Jerry Fodor or Neil Smith), that is, the view of the mind as a compartmentalized system consisting of discrete components or modules (for example, the visual system or language) plus a central processor.

The modularity thesis holds that these separate modules work independently and that each feeds its information to the central processing unit. One of the main kinds of evidence adduced for this position is double dissociation. One example would be the dissociation of language and intelligence. This means that language skills can be intact while intelligence is negatively affected (an example of this +language −intelligence situation would be Down syndrome), or the other way around (e.g. −language +intelligence → aphasia).

By analogy, this view of modularity was also taken to apply to language itself, so that grammar as a mental system is seen as consisting of independently operating modules (syntax, semantics, pragmatics etc.). In addition, most modularists think that there is one module that is more important than the others, namely syntax. This differs from the view of cognitive-functional approaches, which Geeraerts advocates. They don’t assume a hierarchy between the levels of language.

This is the point where it gets quite technical, and I won’t go into detail about how Geeraerts proposed to use corpus evidence to falsify the idea of modularity in grammar. The main idea is that, according to him, a multifactorial corpus analysis can show that a variety of factors influence the way utterances are produced. Importantly, such corpus studies (e.g. Speelman & Geeraerts 2009) indicate that these different factors, which include both language-internal factors (e.g. syntactic patterns, lexical collocations and conceptual closeness) and language-external factors (like speaker characteristics, regional and register variation, dialogues/multilogues vs monologues, private vs public, spontaneous vs prepared speech), seem to interact in a way that is incompatible with the assumption of informationally encapsulated modules which work independently of each other.
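To make "interact" a bit more concrete, here is a minimal Python sketch of what an interaction effect looks like numerically. It is not Geeraerts' analysis: the counts, the factor labels (pattern A/B, region X/Y) and the function names are all invented for illustration. It computes the difference of differences in log-odds for a 2×2 design, which is what the interaction coefficient of a saturated logistic regression with two dummy-coded binary predictors measures. A value near zero means the two factors simply add up; a value far from zero means the effect of one factor depends on the level of the other.

```python
import math

# Invented counts, for illustration only: (doen, laten) choices per combination
# of a language-internal factor (hypothetical syntactic pattern A/B) and a
# language-external factor (hypothetical region X/Y).
interacting = {
    ("A", "X"): (30, 70),
    ("B", "X"): (45, 55),
    ("A", "Y"): (40, 60),
    ("B", "Y"): (80, 20),
}

# A second invented table in which the two factors combine additively on the
# log-odds scale (the odds are 0.5, 1.0, 1.5 and 3.0).
additive = {
    ("A", "X"): (20, 40),
    ("B", "X"): (40, 40),
    ("A", "Y"): (60, 40),
    ("B", "Y"): (75, 25),
}

def log_odds(doen, laten):
    """Log-odds of choosing 'doen' over 'laten' in one cell."""
    return math.log(doen / laten)

def interaction_contrast(table):
    """Difference of differences in log-odds across the 2x2 design."""
    lo = {cell: log_odds(*n) for cell, n in table.items()}
    pattern_effect_in_x = lo[("B", "X")] - lo[("A", "X")]
    pattern_effect_in_y = lo[("B", "Y")] - lo[("A", "Y")]
    return pattern_effect_in_y - pattern_effect_in_x

print(f"interacting table: {interaction_contrast(interacting):.2f}")
print(f"additive table:    {interaction_contrast(additive):.2f}")
```

In a real study one would fit a regression model to the corpus data and test whether such a term differs reliably from zero; this sketch only shows what the quantity is.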

In summary, then, Geeraerts makes the case that the corpus-based quantitative turn in linguistics offers an opportunity for falsifying deep-seated assumptions of mainstream 20th century linguistics, such as the homogeneity of the linguistic system and the modularity of grammar.


Speelman, Dirk and Dirk Geeraerts. 2009. “Causes for causatives: the case of Dutch ‘doen’ and ‘laten’”. In Ted Sanders and Eve Sweetser (eds.), Causal Categories in Discourse and Cognition, 173-204. Berlin/New York: Mouton de Gruyter.

4 thoughts on “PLM2012 Coverage: Dirk Geeraerts: Corpus Evidence for Non-Modularity”

  1. This sounds interesting; it’s a shame there isn’t more detail on how Geeraerts proposes to use corpus data to falsify modularity of mind. It seems to me very unlikely that such data could bear any relevance to that question.

    The idea of a language module that is predicated on the existence of other cognitive packages such as social cognition doesn’t seem too problematic to me. It also means we shouldn’t be surprised when these things work in concert: after all, my computer works in a modular way, but each respective module works in concert with the others. To characterise this in terms of ‘influence’ that somehow disproves modularity doesn’t really follow.

    It seems to me that a more interesting question is what exactly constitutes a module? I’d argue that this is where evolutionary linguistics can make a contribution by couching it in terms of adaptation – i.e. what is the ultimate function that is common to a set of capacities?

  2. Yeah, I had similar concerns about the talk. Geeraerts did, however, ask a similar question to yours, except he framed it in terms of whether encapsulation is a diagnostic criterion of modularity. So his point is that significant effects of various factors alone are not enough to prove the point; this could merely indicate the non-opaqueness of the modules. We will instead have to look for ‘non-compositional’ effects: the effect of two variables working together is different from what can be expected when they work separately. His solution to this is to use a regression analysis (I’m somewhat sceptical as to the utility of this). In short: do we find interaction effects suggestive of non-modularity? For Geeraerts, this would first entail identifying variables belonging to a syntactic module (syntactic pattern), a lexical module (idiomaticity) and a semantic module (animacy, causativizationability). Then, the goal is to look for interaction effects between different types of variables, such as lectal, lexical and syntactic. Lastly, he raises two more points: (1) more studies are needed to chart the various interaction effects that occur, so that we can draw general conclusions; (2) possibly we need to redefine the notion of modularity: if these are not the testable predictions following from modularity, then what are, and how should they be falsified?

    It was an interesting talk, though, like you, I’m not too convinced by the use of corpus data to falsify modularity.

  3. Yeah, I think finding out whether the two variables are working together or working separately speaks more to the question of what constitutes a module. I think he’s right that encapsulation is probably off the mark. That’s why I’d propose more emphasis on the function of the module. To run with my increasingly inappropriate computer analogy, the disparate modules in my computer still coordinate (or “work together”) to make a computer, which is itself a functional module at a higher level.

    Maybe I’m just being unhelpful, because it now sounds like Geeraerts’ level of analysis was concerned with encapsulated modules within the language faculty specifically, where I initially thought he was talking about the language faculty as a module among others. Though I suppose these points apply either way.

    I’m also just assuming that his proposed difference in interaction effects between variables is a good proxy for encapsulation, which I kind of doubt; the modules themselves may well ‘interact’ in different ways, so a difference between them may not tell us all that much.

    I’m going to stop commenting though, because this is all based on my assumptions about what he did and why. Which means I actually don’t know what I’m on about. Interesting ideas, though.
