Happy Darwin Day!

I had hoped to celebrate Darwin Day with a longer post discussing how language is often viewed as a challenging puzzle for natural selection. My main worry is that the formal design metaphor used in much of linguistics has, incorrectly IMHO, diverted attention away from studying language as a biological system based on organic logic. If this doesn’t make much sense, you can do some background reading with Terrence Deacon’s paper, Language as an emergent function: Some radical neurological and evolutionary implications. Alas, that’s all I have to say on the matter for now, but if you’re looking for something related to Darwin, evolution and the origin of language, then I strongly suggest you head over to the excellent Darwin Correspondence Project and read their blog post on the subject:

Darwin started thinking about the origin of language in the late 1830s. The subject formed part of his wide-ranging speculations about the transmutation of species. In his private notebooks, he reflected on the communicative powers of animals, their ability to learn new sounds and even to associate them with words. “The distinction of language in man is very great from all animals”, he wrote, “but do not overrate—animals communicate to each other” (Barrett ed. 1987, p. 542-3). Darwin observed the similarities between animal sounds and various natural cries and gestures that humans make when expressing strong emotions such as fear, surprise, or joy. He noted the physical connections between words and sounds, exhibited in words like “roar”, “crack”, and “scrape” that seemed imitative of the things signified. He drew parallels between language and music, and asked: “did our language commence with singing—is this the origin of our pleasure in music—do monkeys howl in harmony”? (Barrett ed. 1987, p. 568).

Retiring Procrustean Linguistics

Many of you are probably already aware of the Edge 2014 question: what scientific ideas are ready for retirement? The question was derived from the Kuhnian-esque, and somewhat tongue-in-cheek, quote by theoretical physicist Max Planck:

A new scientific theory does not triumph by convincing its opponents and making them see the light, but rather because its opponents die, and a new generation grows up that is familiar with it.

Some of the big themes that jumped out at me were bashing the scientific method, bemoaning our enthusiasm for big data, and showing how we misunderstand and routinely misapply statistics. Other relevant candidates proposed for retirement were culture, learning, human nature, innateness, and brain plasticity. Lastly, on the language front, we had Benjamin Bergen and Nick Enfield weighing in against universal grammar and linguistic competency, whilst John McWhorter rallied against strong linguistic relativity and Dan Sperber challenged our conventional understanding of meaning.

And just so you’re aware: I’m not necessarily in agreement with all of the perspectives I’ve linked to above, but I do think a lot of them are interesting and definitely worth a read (if only to clarify your own position on the matter). On that note, you should probably go and read Norbert Hornstein’s post about the flaws in Bergen’s argument, which basically boil down to a conflation of I-languages and E-languages (and where we should expect to observe universal properties).

If I had to offer my own candidate for retirement, then it would be what Anne Buchanan over at the excellent blog, The Mermaid’s Tale, termed Procrustean Science:

In classical Greek mythology, Procrustes was a criminal who produced an iron bed and made his victims fit the bed…by cutting off any parts of their bodies that didn’t fit. The metaphorical use of the word means “enforcing uniformity or conformity without regard to natural variation or individuality.” It is in this spirit that Woese characterized much of modern biology as procrustean, because rather than adapt its explanations to the facts, the facts are forced to lie in a bed of theory that is taken for granted–and thus, the facts must fit!

Continue reading “Retiring Procrustean Linguistics”

On the entangled banks of representations (pt.1)

ResearchBlogging.org

Lately, I’ve taken some time out to read through a few papers I’d put on the backburner until my first-year review was completed. Now that’s out of the way, I found myself looking through Berwick et al.‘s review on Evolution, brain, and the nature of language. Much of the paper pulls off the impressive job of making it sound as if the field has arrived at a consensus in areas that are still hotly debated. Still, what I’m interested in for this post is something that is often treated as far less controversial than it is, namely the notion of mental representations. As an example, Berwick et al. posit that mind/brain-based computations construct mental syntactic and conceptual-intentional representations (internalization), with internal linguistic representations then being mapped onto their ordered output form (externalization). From these premises, the authors arrive at the reasonable enough assumption that language is an instrument of thought first, with communication taking a secondary role:

In marked contrast, linear sequential order does not seem to enter into the computations that construct mental conceptual-intentional representations, what we call ‘internalization’… If correct, this calls for a revision of the traditional Aristotelian notion: language is meaning with sound, not sound with meaning. One key implication is that communication, an element of externalization, is an ancillary aspect of language, not its key function, as maintained by what is perhaps a majority of scholars… Rather, language serves primarily as an internal ‘instrument of thought’.

If we take their conclusions for granted, and this is something I’m far from convinced by, there is still the question of whether we even need representations in the first place. If you were to read the majority of cognitive science, the answer is a fairly straightforward one: yes, of course we need mental representations, even if there’s no solid definition of what they are or the form they take in the brain. In fact, the notion of representations has become a major theoretical tenet of modern cognitive science, as evident in the way much of the field no longer treats it as a point of contention. The reason for this unquestioning acceptance has its roots in the notion that mental representations enrich an impoverished stimulus: that is, if an organism faces incomplete data, then it follows that it needs mental representations to fill in the gaps.

Continue reading “On the entangled banks of representations (pt.1)”

Language, economic behaviour, a fancy video and some marshmallows

Most of you are probably now familiar with the following video about Keith Chen’s work on The Effect of Language on Economic Behavior:

Given this blog’s link with Chen’s study (see Sean’s RT posts here and here), and given that Sean and I recently had our own paper published on the topic of these correlational studies, I thought I’d share some of my own thoughts on this video. First up, the video features some excellent animation, and it does a reasonable job of distilling the core argument of Chen’s paper. However, I do have some concerns, chiefly with the conclusion presented in the video that “even seemingly insignificant features of our language can have a massive impact on our health, our national prosperity and the very way we live and die“.

This is stated far too strongly. After all, the study is only correlational in nature, and there are no experiments supporting this claim. The video also makes no mention of the various critiques that have popped up around the web by professional linguists, such as this excellent post by Östen Dahl. Of course, we could hand-wave away these critiques and argue it’s just a fun video. But I worry that these popular renditions lend significant media weight to dubious and unsubstantiated claims, with the potential to influence social policy. Still, we can’t lay all the blame on the video. There’s something of an academic smokescreen at work in the way Chen writes up the paper: it reads as if he had a particular hypothesis, and then tested this using an available dataset. I’m not 100% sure this is the whole story. I wouldn’t be too surprised to hear that the initial finding was discovered rather than actively sought out in a strict hypothesis-testing sense. This is all conjecture on my part, and I could be completely wrong, but it does look like Chen was fishing for correlations: you throw your line into a large sea of data, find a particularly strong association, and then proceed to attach a hypothesis to it. Such practices are exactly the type of problem Sean and I were warning against in our paper. And as Geoff Pullum pointed out, Chen’s causal intuition could easily have been reversed and presented in an equally compelling fashion. It just happened to be the case that the correlation fell in one particular direction.
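To see why fishing for correlations is so risky, here’s a minimal sketch (my own toy simulation, not Chen’s analysis, with entirely made-up variables): generate a purely random “outcome” for 50 hypothetical countries along with 200 equally random predictors, and count how many predictors nonetheless clear the conventional p < .05 correlation threshold purely by chance.

```python
import random
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(42)
n_countries, n_predictors = 50, 200

# A purely random "savings rate" and 200 purely random "linguistic features"
outcome = [random.gauss(0, 1) for _ in range(n_countries)]
predictors = [[random.gauss(0, 1) for _ in range(n_countries)]
              for _ in range(n_predictors)]

# For n = 50, |r| > 0.28 roughly corresponds to p < .05 (two-tailed)
strong = [i for i, pred in enumerate(predictors)
          if abs(pearson(outcome, pred)) > 0.28]

print(f"{len(strong)} of {n_predictors} pure-noise predictors correlate "
      f"'significantly' with the outcome")
```

With enough predictors in the sea, you will always land a “significant” catch; attaching a hypothesis afterwards doesn’t make it any less spurious.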

Besides the numerous theoretical and methodological critiques of the paper, the simple fact of the matter is that Chen’s work is being presented as if it has demonstrated a causal relation. Let’s be clear about this: he hasn’t come close to making that point. All he’s found is a strong correlation. So far, the best we can say is that we’re at the hypothesis-generating stage, with the general hypothesis being that differences in the grammatical marking of the future influence future-oriented behaviours. Now, if we are to test this hypothesis, experimental work is going to be needed. I doubt this will be too difficult given the large literature on delayed gratification. One useful approach might be found in the Stanford Marshmallow Experiment:

Here, you could control for a whole host of factors, whilst seeing if delayed gratification varied according to the language of particular groups. Surely Chen would expect there to be differences between those populations with strong-FTR languages and those with weak-FTR languages? Also, I wouldn’t be too surprised if we discovered that marshmallow consumption is linked to a propensity to save as well as road traffic accidents, acacia trees and campfires. In short: Marshmallows are the social science equivalent of the Higgs Boson. They’ll unify everything.

Could the Higgsmallow unify all of social science?

New Perspectives on Duality of Patterning

For those of you who might be interested: Language and Cognition has a special issue on the nature and emergence of duality of patterning (paywall access, sorry!). As one of Hockett’s (1960) design features, duality of patterning is the property of human language that enables parts of language to be recombined in a systematic way to create new forms. In the introductory paper, de Boer, Sandler & Kirby (2012) identify two distinct levels at which we see duality of patterning: combinatorial (meaningless sounds can be combined into meaningful morphemes and words) and compositional (morphemes and words can be combined to create new constructions with different meanings). For Hockett, duality of patterning is not only a design feature of language (in that all human languages have it), but also a unique characteristic of human language.

These two assumptions have been challenged on several fronts. First, simple combinatorial structure is found in systems of primate vocalisations, albeit restricted to a relatively limited set of signals. Meanwhile, in Al-Sayyid Bedouin Sign Language (ABSL), the community does not have a conventionalised level of meaningless elements (although it does have compositional structure at the levels of morphology and syntax). These two examples offer important insights into the duality of patterning debate:

We see then from the case of ABSL that the need to express a large set of signals does not necessarily lead to combinatorial structure, while conversely from the animal systems, it appears that combinatorial structure does not necessarily need a very large set of signals to emerge. As combinatorial structure is the main defining characteristic of duality of patterning, it appears that both the status of duality of patterning as a design feature of language and the evolutionary pathways leading to it need to be rethought. (de Boer, Sandler & Kirby, 2012: 252).
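The power of the combinatorial level is easy to appreciate with a toy calculation (my own illustration, using a made-up phoneme inventory): recombining a handful of meaningless segments yields a signal space far larger than the inventory itself.

```python
from itertools import product

# A small inventory of six hypothetical, individually meaningless phonemes
phonemes = ["p", "t", "k", "a", "i", "u"]

# Recombining them into three-segment forms gives 6**3 possible words
forms = {"".join(seq) for seq in product(phonemes, repeat=3)}
print(len(forms))  # 216 distinct forms from just 6 building blocks
```

The point ABSL and the primate systems complicate is not this arithmetic, but the assumption that pressure for a large signal space is what drives languages to exploit it.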

The rest of the special issue is divided between theoretical and experimental/modelling contributions. The abstracts and links to the papers (again, paywall, sorry!) are posted below. In summary, the general picture emerging from these papers is that duality of patterning is not a clear-cut design feature of language, nor is it necessarily a unique property of our capacity for language. Furthermore, we should show a greater appreciation of the role that cultural evolution plays:

An apparent point of consensus from the papers in this special issue is that we should not see duality of patterning as a feature hard-wired into an innate language faculty, but rather as arising from multiple pressures operating on language as it emerges and changes in socially interacting populations. When we talk about the evolution of this design of language, then, we are referring more to cultural rather than biological evolution […] It appears that duality of patterning is a rather general state towards which sufficiently complex systems of signals evolve for different reasons: distinctiveness, learnability and a tendency to keep meaningful distinctions, while at the same time trying to make one’s utterances sound similar to those of others in the population. Thus, multiple cognitive processes seem to lead to duality of patterning and therefore, there are probably multiple evolutionary pathways that lead to duality of patterning as well. (de Boer, Sandler & Kirby, 2012: 257).

Continue reading “New Perspectives on Duality of Patterning”

Is ambiguity dysfunctional for communicatively efficient systems?

Based on yesterday’s post, where I argued that degeneracy emerges as a design solution to ambiguity pressures, a Reddit commenter pointed me to a cool paper by Piantadosi et al. (2012) that contained the following quote:

The natural approach has always been: Is [language] well designed for use, understood typically as use for communication? I think that’s the wrong question. The use of language for communication might turn out to be a kind of epiphenomenon… If you want to make sure that we never misunderstand one another, for that purpose language is not well designed, because you have such properties as ambiguity. If we want to have the property that the things that we usually would like to say come out short and simple, well, it probably doesn’t have that property (Chomsky, 2002: 107).

The paper itself argues against Chomsky’s position by claiming that ambiguity allows for more efficient communication systems. First, looking at ambiguity from the perspective of coding theory, Piantadosi et al. argue that any good communication system will leave out information already available in the context (assuming the context is informative about the intended meaning). Their other point, which they test through a corpus analysis of English, Dutch and German, is that as long as there are some ambiguities the context can resolve, ambiguity will be used to make communication easier. In short, ambiguity emerges from tradeoffs between ease of production and ease of comprehension, with communication systems favouring hearer inference over speaker effort:

The essential asymmetry is: inference is cheap, articulation expensive, and thus the design requirements are for a system that maximizes inference. (Hence … linguistic coding is to be thought of less like definitive content and more like interpretive clue.) (Levinson, 2000: 29).

If this asymmetry exists, and hearers are good at disambiguating in context, then a direct result of such a tradeoff should be that linguistic units which require less effort should be more ambiguous. This is what they found in results from their corpus analysis of word length, word frequency and phonotactic probability:

We tested predictions of this theory, showing that words and syllables which are more efficient are preferentially re-used in language through ambiguity, allowing for greater ease overall. Our regression on homophones, polysemous words, and syllables – though similar – are theoretically and statistically independent. We therefore interpret positive results in each as strong evidence for the view that ambiguity exists for reasons of communicative efficiency (Piantadosi et al., 2012: 288).
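The coding-theory intuition can be made concrete with a toy calculation (mine, not one from the paper): if context already narrows down the intended meaning, the information a word itself must carry drops from H(M) to H(M|C), so short forms can safely be reused for several meanings.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Four equiprobable meanings: an unambiguous code needs H(M) = 2 bits each
h_m = entropy([0.25, 0.25, 0.25, 0.25])

# Suppose two equally likely contexts (say, finance vs. nature), each
# narrowing the choice to two meanings: H(M | C) = 0.5 * 1 + 0.5 * 1 = 1 bit
h_m_given_c = 0.5 * entropy([0.5, 0.5]) + 0.5 * entropy([0.5, 0.5])

print(h_m, h_m_given_c)  # the context supplies the other bit for free
```

On this picture, an ambiguous word like “bank” isn’t a design flaw: it’s a short code whose missing bit the hearer recovers from context at no articulatory cost to the speaker.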

At some point, I’d like to offer a more comprehensive overview of this paper, but that will have to wait until I’ve read more of the literature. Until then, here are some graphs of the results from their paper:

Continue reading “Is ambiguity dysfunctional for communicatively efficient systems?”

Degeneracy emerges as a design feature in response to ambiguity pressures

Two weeks ago my supervisor, Simon Kirby, gave a talk on some of the work that’s been going on in the LEC. Much of his talk focused on one of the key areas in language evolution research: the emergence of the basic design features that underpin language as a system of communication. He gave several examples of these design features, mostly drawn from the eminent linguist, Charles Hockett, before moving on to one of the main areas of focus over the past few years: compositionality (the ability for complex expressions to derive their meaning from the combined meaning of their parts; see Michael’s post and Sean’s post for some good previous coverage). Simon’s argument is that compositionality, as well as some other design features of language, emerge from two competing constraints: a pressure to be useful (expressivity) and a pressure to be learned (compressibility).

The general gist of the talk was that by varying the relative strengths of these two constraints we can evolve very different systems of communication. To get something approaching language, we need to strike a balance between learning and use. First, naïve learning is required because it forces language to adapt to the learning bottleneck imposed by the maturational constraints on child learners. Still, even with this inter-generational learning pressure, language isn’t merely a passive task of remembering and reproducing a set of forms and meanings. Instead, we also need to account for usage dynamics: here, the system must be expressive, in the sense that signals are able to differentiate between meanings within a language.

From Kirby, Cornish & Smith’s (2008) work we know that a language heavily biased towards maximal expressivity looks very much like the initial generation of their experiments: an idiosyncratic set of one-form-to-one-meaning pairs without any systematic structure. It’s expressive because every possible meaning in the space has a label. By contrast, a stronger bias towards learnability results in highly compressible languages: that is, highly underspecified systems of communication, with the most extreme example being one form for all meanings. The result of balancing these two forces over Iterated Learning (henceforth, IL) is the emergence of compositionality: a learnable yet highly structured communication system that arises from a pressure to generalise over a set of novel stimuli.
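To make the two extremes concrete, here’s a toy comparison (my own sketch with invented forms, not the actual experimental languages): over a 3 × 3 meaning space, a holistic language needs nine arbitrary words, a degenerate language needs one, and a compositional language covers everything with six reusable morphemes.

```python
from itertools import product

sizes = ["big", "mid", "small"]
colours = ["red", "green", "yellow"]
meanings = list(product(sizes, colours))  # 9 meanings in total

# Holistic: one arbitrary word per meaning (maximally expressive, hard to learn)
holistic = {m: f"word{i}" for i, m in enumerate(meanings)}

# Degenerate: one word for everything (maximally learnable, useless in practice)
degenerate = {m: "wa" for m in meanings}

# Compositional: a size morpheme plus a colour morpheme (expressive AND learnable)
size_morph = {"big": "ka", "mid": "po", "small": "ti"}
colour_morph = {"red": "na", "green": "lu", "yellow": "se"}
compositional = {(s, c): size_morph[s] + colour_morph[c] for s, c in meanings}

def expressivity(lang):
    """Fraction of meanings that receive their own distinct signal."""
    return len(set(lang.values())) / len(lang)

print(expressivity(holistic))       # 1.0, at the cost of 9 memorised items
print(expressivity(degenerate))     # ~0.11, but only 1 item to learn
print(expressivity(compositional))  # 1.0, from just 6 reusable morphemes
```

The compositional language is the only one that scores well on both pressures at once, which is exactly why it keeps emerging when the two constraints are balanced.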

Continue reading “Degeneracy emerges as a design feature in response to ambiguity pressures”

Niche as a determinant of word fate in online groups (featuring @hanachronism and @richlitt)

ResearchBlogging.org

Last year, Altmann, Pierrehumbert & Motter (henceforth, APM) published a great paper in PLoS ONE: Niche as a determinant of word fate in online groups. Having referenced the paper extensively in my non-bloggy academic world, I thought it was about time I mentioned it on Replicated Typo. Below is the abstract:

Patterns of word use both reflect and influence a myriad of human activities and interactions. Like other entities that are reproduced and evolve, words rise or decline depending upon a complex interplay between their intrinsic properties and the environments in which they function. Using Internet discussion communities as model systems, we define the concept of a word niche as the relationship between the word and the characteristic features of the environments in which it is used. We develop a method to quantify two important aspects of the word niche: the range of individuals using the word and the range of topics it is used to discuss. Controlling for word frequency, we show that these aspects of the word niche are strong determinants of changes in word frequency. Previous studies have already indicated that word frequency itself is a correlate of word success at historical time scales. Our analysis of changes in word frequencies over time reveals that the relative sizes of word niches are far more important than word frequencies in the dynamics of the entire vocabulary at shorter time scales, as the language adapts to new concepts and social groupings. We also distinguish endogenous versus exogenous factors as additional contributors to the fates of words, and demonstrate the force of this distinction in the rise of novel words. Our results indicate that short-term nonstationarity in word statistics is strongly driven by individual proclivities, including inclinations to provide novel information and to project a distinctive social identity.

Continue reading “Niche as a determinant of word fate in online groups (featuring @hanachronism and @richlitt)”

Evolution of the Speech Code: Higher-Order Symbolism and the Linguistic Big Bang

Two months ago Daniel Silverman (San Jose State University) gave a talk at the LEC on the Evolution of the Speech Code: Higher-Order Symbolism and the Linguistic Big Bang. With his permission, I’ve posted below a PDF of a paper he’s written based on the talk — it’s really fascinating stuff and chock-a-block with ideas. Keep in mind that it’s a work in progress, but I’m sure he’ll appreciate any (informative) comments. So, on that note, go and read:

[gview file=”http://seedyroad.com/academics/Evolutionofthespeechcode.pdf” save=”1″]

Higgs Boson and Big Data

It’s not about cultural evolution, but I think most people with even a passing interest in science are gearing up to welcome the Higgs boson to the elementary particle party. Anyway, here’s a nicely put-together video explaining what the Higgs boson is and why its discovery is significant:

The Higgs Boson Explained from PHD Comics on Vimeo.

There’s also a more general point about needing to gather a huge amount of data (15 petabytes a year, enough to fill more than 1.7 million dual-layer DVDs) to detect the very small effect size predicted for the Higgs boson. Data of this magnitude inevitably comes with significantly more noise, which is why physicists have had to develop well-defined statistical methods (they even have their own statistics committee). It really is a massive achievement for modern science.
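As a quick sanity check on that DVD figure (my own back-of-envelope arithmetic, using decimal gigabytes and the nominal 8.5 GB capacity of a dual-layer DVD):

```python
data_gb_per_year = 15 * 1_000_000   # 15 petabytes in (decimal) gigabytes
dvd_dual_layer_gb = 8.5             # nominal dual-layer DVD capacity in GB

dvds = data_gb_per_year / dvd_dual_layer_gb
print(round(dvds))  # roughly 1.76 million DVDs a year
```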