You’re clever for your kids’ sake: A feedback loop between intelligence and early births

The gap between our cognitive skills and those of our closest evolutionary relatives is quite astonishing. Within a relatively short evolutionary time frame, humans developed a wide range of cognitive abilities, and bodies that are very different from those of other primates and animals. Many of these differences appear to be related to each other. A recent paper by Piantadosi and Kidd argues that human intelligence originates in human infants’ restriction of their birth size, leading to premature births and long weaning times that require intensive and intelligent care. This is an interesting hypothesis that links the ontogeny of the body with cognition.

Human weaning times are extraordinarily long. Human infants spend their first few months highly dependent on their caregivers, not just for food but for pretty much any interaction with the environment. Even once they are walking, they remain dependent on their caregivers for years. Hence, it would be good for their parents to stick around and care for them – instead of catapulting them over the nearest mountain. Piantadosi and Kidd argue that “[h]umans must be born unusually early to accommodate larger brains, but this gives rise to particularly helpless neonates. Caring for these children, in turn, requires more intelligence—thus even larger brains.” [p. 1] This creates a runaway feedback loop between intelligence and weaning times, similar to those observed in sexual selection.

Piantadosi and Kidd’s computational model takes into account infant mortality as a function of intelligence and head circumference, but it also takes into account the offspring’s likelihood of surviving into adulthood, which depends on parental care/intelligence. The predictions are made at the population level, and the model predicts a fitness landscape in which two optima emerge: populations either drift towards long development and smaller head circumference (a proxy for intelligence in the model) or they drift towards the second optimum – larger heads but earlier births. Once a certain threshold has been crossed, a feedback loop emerges and more intelligent adults are able to support less mature babies. However, more intelligent adults will have had even bigger heads when they were born – and their offspring thus need to be born even more prematurely in order to avoid complications at birth.
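To make the threshold idea concrete, here is a minimal sketch of a system with two stable outcomes. It is not Piantadosi and Kidd’s model: the single variable `x` (a hypothetical, normalised stand-in for head size/intelligence between 0 and 1), the `threshold` and the `rate` are all assumptions chosen purely for illustration. The only point is that the same update rule sends a population to one optimum or the other, depending on which side of the threshold it starts.

```python
def step(x, threshold=0.5, rate=0.5):
    """One generation of a toy bistable update (illustrative, not the paper's model).

    x is a hypothetical trait value in [0, 1]. Below the threshold the
    trait shrinks; above it, the feedback pushes the trait upwards.
    """
    return x + rate * x * (1.0 - x) * (x - threshold)


def simulate(x0, generations=500, threshold=0.5, rate=0.5):
    """Iterate the toy update from starting value x0 and return the end state."""
    x = x0
    for _ in range(generations):
        x = step(x, threshold, rate)
    return x


if __name__ == "__main__":
    for start in (0.4, 0.5, 0.6):
        print(f"start={start:.1f} -> end={simulate(start):.3f}")
```

Starting just below the threshold, the toy population settles at the low optimum; starting just above it, the feedback carries it to the high one. What pushes a population across the threshold in the first place is exactly what such a sketch (like the discussion above) leaves open.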

To test their model’s predictions, the authors also correlated weaning times and intelligence measures across primate species and found a high correlation. For example, bonobos and chimpanzees have an average weaning time of approximately 1100 days, and score highly on standardised intelligence measures. Lemurs, on the other hand, spend only 100 days with their offspring, and score much lower in intelligence. Furthermore, Piantadosi and Kidd also look at the relationship between weaning age and various other physical measures, such as the size of the neocortex, brain volume and body mass. However, weaning time remains the most reliable predictor in the model.

Piantadosi and Kidd’s model provides a very interesting perspective on how human intelligence could have been the product of a feedback loop between developmental maturity, neonatal head size and infant care. Such a feedback component could explain the considerable evolutionary change humans have undergone. Yet of the two optima (a late birth age with a small brain radius, and an early birth age with a large brain), most populations drift towards the later birth/smaller brain one (see graph 2.A in the paper). It appears that the model cannot explain the original evolutionary pressure for more intelligence that pushed humans over the edge: if early humans encountered an increased number of early births, why did those populations with early births not simply die out, instead of taking the relatively costly route of becoming more intelligent? Only once there is a pressure towards more intelligence is it possible that humans were pushed into a region leading to the self-reinforcing cycle of low birth age and high parental intelligence, and that this cycle drove humans towards much higher intelligence than they would have developed otherwise. Even if the account falls short of an ultimate explanation (i.e. why a certain feature has evolved, the reason), Piantadosi and Kidd have described an interesting proximate explanation (i.e. how a feature evolved, the mechanism).

Because the data are purely correlational, the reverse hypothesis might also hold: humans might be more intelligent because they spend more time interacting with their caregivers. In fact, a considerable share of infants’ experiences is modulated by their caregivers, and their unique experience might also create a strong embodied perspective on the emergence of social signals. For example, infants in their early years see a proportionately high number of faces (Fausey et al., 2016). Maybe infants’ long period of dependence is what makes them learn so well from the people around them, thereby allowing for the acquisition of cultural information and a more in-depth understanding of the world. On this view, the longer weaning time makes them pay much more attention to caregivers, providing a stimulus-rich environment that human infants are immersed in for much longer than other species. Whatever the connection might be, I think that this kind of research offers a fascinating view on how children develop and what makes us human.


Fausey, C. M., Jayaraman, S., & Smith, L. B. (2016, Jul). From faces to hands: Changing visual input in the first two years. Cognition, 152, 101–107. doi: 10.1016/j.cognition.2016.03.005
Piantadosi, S. T., & Kidd, C. (2016). Extraordinary intelligence and the care of infants. Proceedings of the National Academy of Sciences. doi: 10.1073/pnas.1506752113
Thanks to Denis for finding the article.

I know (1) that you think (2) it’s funny, and you know (3) that I know (4) that, too.

A large part of human humour depends on understanding that the intention of the person telling the joke might be different to what they are actually saying. The person needs to tell the joke so that you understand that they’re telling a joke, so they need to know that you know that they do not intend to convey the meaning they are about to utter… Things get even more complicated when we are telling each other jokes that involve other people having thoughts and beliefs about other people. We call this knowledge nested intentions, or recursive mental attributions. We can already see, based on my complicated description, that this is a serious matter and requires scientific investigation. Fortunately, a recent paper by Dunbar, Launay and Curry (2015) investigated whether the structure of jokes is restricted by the number of nested intentions required to understand them. They make a couple of interesting predictions about the mental processing involved in humour, and how it should be reflected in the structure and funniness of jokes. In today’s blogpost I want to discuss the paper’s methodology and some of its claims.


What’s in a Name? – “Digital Humanities” [#DH] and “Computational Linguistics”

In thinking about the recent LARB critique of digital humanities and of responses to it I couldn’t help but think, once again, about the term itself: “digital humanities.” One criticism is simply that Allington, Brouillette, and Golumbia (ABG) had a circumscribed conception of DH that left too much out of account. But then the term has such a diverse range of reference that discussing DH in a way that is both coherent and compact is all but impossible. Moreover, that diffuseness has led some people in the field to distance themselves from the term.

And so I found my way to some articles that Matthew Kirschenbaum has written more or less about the term itself. But I also found myself thinking about another term, one considerably older: “computational linguistics.” While it has not been problematic in the way DH is proving to be, it was coined under the pressure of practical circumstances and the discipline it names has changed out from under it. Both terms, of course, must grapple with the complex intrusion of computing machines into our life ways.

Digital Humanities

Let’s begin with Kirschenbaum’s “Digital Humanities as/Is a Tactical Term” from Debates in the Digital Humanities (2011):

To assert that digital humanities is a “tactical” coinage is not simply to indulge in neopragmatic relativism. Rather, it is to insist on the reality of circumstances in which it is unabashedly deployed to get things done—“things” that might include getting a faculty line or funding a staff position, establishing a curriculum, revamping a lab, or launching a center. At a moment when the academy in general and the humanities in particular are the objects of massive and wrenching changes, digital humanities emerges as a rare vector for jujitsu, simultaneously serving to position the humanities at the very forefront of certain value-laden agendas—entrepreneurship, openness and public engagement, future-oriented thinking, collaboration, interdisciplinarity, big data, industry tie-ins, and distance or distributed education—while at the same time allowing for various forms of intrainstitutional mobility as new courses are approved, new colleagues are hired, new resources are allotted, and old resources are reallocated.

Just so, the way of the world.

Kirschenbaum then goes into the weeds of discussions that took place at the University of Virginia while a bunch of scholars were trying to form a discipline. So:

A tactically aware reading of the foregoing would note that tension had clearly centered on the gerund “computing” and its service connotations (and we might note that a verb functioning as a noun occupies a service posture even as a part of speech). “Media,” as a proper noun, enters the deliberations of the group already backed by the disciplinary machinery of “media studies” (also the name of the then new program at Virginia in which the curriculum would eventually be housed) and thus seems to offer a safer landing place. In addition, there is the implicit shift in emphasis from computing as numeric calculation to media and the representational spaces they inhabit—a move also compatible with the introduction of “knowledge representation” into the terms under discussion.

How we then get from “digital media” to “digital humanities” is an open question. There is no discussion of the lexical shift in the materials available online for the 2001–2 seminar, which is simply titled, ex cathedra, “Digital Humanities Curriculum Seminar.” The key substitution—“humanities” for “media”—seems straightforward enough, on the one hand serving to topically define the scope of the endeavor while also producing a novel construction to rescue it from the flats of the generic phrase “digital media.” And it preserves, by chiasmus, one half of the former appellation, though “humanities” is now simply a noun modified by an adjective.

And there we have it.

Chomsky, Hockett, Behaviorism and Statistics in Linguistics Theory

Here’s an interesting (and recent) article that speaks to statistical thought in linguistics: The Unmaking of a Modern Synthesis: Noam Chomsky, Charles Hockett, and the Politics of Behaviorism, 1955–1965 (Isis, vol. 107, #1, pp. 49-73: 2016), by Gregory Radick (abstract below). Commenting on it at Dan Everett’s FB page, Yorick Wilks observed: “It is a nice irony that statistical grammars, in the spirit of Hockett at least, have turned out to be the only ones that do effective parsing of sentences by computer.”

Abstract: A familiar story about mid-twentieth-century American psychology tells of the abandonment of behaviorism for cognitive science. Between these two, however, lay a scientific borderland, muddy and much traveled. This essay relocates the origins of the Chomskyan program in linguistics there. Following his introduction of transformational generative grammar, Noam Chomsky (b. 1928) mounted a highly publicized attack on behaviorist psychology. Yet when he first developed that approach to grammar, he was a defender of behaviorism. His antibehaviorism emerged only in the course of what became a systematic repudiation of the work of the Cornell linguist C. F. Hockett (1916–2000). In the name of the positivist Unity of Science movement, Hockett had synthesized an approach to grammar based on statistical communication theory; a behaviorist view of language acquisition in children as a process of association and analogy; and an interest in uncovering the Darwinian origins of language. In criticizing Hockett on grammar, Chomsky came to engage gradually and critically with the whole Hockettian synthesis. Situating Chomsky thus within his own disciplinary matrix suggests lessons for students of disciplinary politics generally and—famously with Chomsky—the place of political discipline within a scientific life.

Culture shapes the evolution of cognition

A new paper, by Bill Thompson, Simon Kirby and Kenny Smith, has just appeared which contributes to everyone’s favourite debate. The paper uses agent-based Bayesian models that incorporate learning, culture and evolution to make the claim that weak cognitive biases are enough to create population-wide effects, making a strong nativist position untenable.



A central debate in cognitive science concerns the nativist hypothesis, the proposal that universal features of behavior reflect a biologically determined cognitive substrate: For example, linguistic nativism proposes a domain-specific faculty of language that strongly constrains which languages can be learned. An evolutionary stance appears to provide support for linguistic nativism, because coordinated constraints on variation may facilitate communication and therefore be adaptive. However, language, like many other human behaviors, is underpinned by social learning and cultural transmission alongside biological evolution. We set out two models of these interactions, which show how culture can facilitate rapid biological adaptation yet rule out strong nativization. The amplifying effects of culture can allow weak cognitive biases to have significant population-level consequences, radically increasing the evolvability of weak, defeasible inductive biases; however, the emergence of a strong cultural universal does not imply, nor lead to, nor require, strong innate constraints. From this we must conclude, on evolutionary grounds, that the strong nativist hypothesis for language is false. More generally, because such reciprocal interactions between cultural and biological evolution are not limited to language, nativist explanations for many behaviors should be reconsidered: Evolutionary reasoning shows how we can have cognitively driven behavioral universals and yet extreme plasticity at the level of the individual—if, and only if, we account for the human capacity to transmit knowledge culturally. Wherever culture is involved, weak cognitive biases rather than strong innate constraints should be the default assumption.
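The claim that culture can amplify a weak bias can be illustrated with a toy transmission chain. The sketch below is not one of the two models in the paper, and it leaves out biological evolution entirely. It assumes two hypothetical languages A and B, a learner with a weak prior preference for A (`prior_a`), noisy data (`eps`), and a learner who always picks the most probable language given a handful of utterances (`n`). All names and parameter values are made up for illustration.

```python
import math


def map_choice_is_a(k, n, prior_a, eps):
    """True if a learner who saw k 'a'-tokens out of n picks language A.

    Language A yields an 'a'-token with probability 1 - eps, language B with
    probability eps. The learner picks the language with the higher posterior.
    """
    log_prior_odds = math.log(prior_a / (1.0 - prior_a))
    log_likelihood_ratio = (2 * k - n) * math.log((1.0 - eps) / eps)
    return log_prior_odds + log_likelihood_ratio > 0


def prob_learner_picks_a(teacher_is_a, n, prior_a, eps):
    """Probability that the next generation ends up with language A."""
    p_token_a = 1.0 - eps if teacher_is_a else eps
    total = 0.0
    for k in range(n + 1):
        if map_choice_is_a(k, n, prior_a, eps):
            total += math.comb(n, k) * p_token_a**k * (1.0 - p_token_a) ** (n - k)
    return total


def stationary_share_of_a(n, prior_a, eps):
    """Long-run share of language A in a one-teacher, one-learner chain."""
    p_a_to_a = prob_learner_picks_a(True, n, prior_a, eps)
    p_b_to_a = prob_learner_picks_a(False, n, prior_a, eps)
    # Stationary distribution of a two-state Markov chain.
    return p_b_to_a / (p_b_to_a + (1.0 - p_a_to_a))


if __name__ == "__main__":
    share = stationary_share_of_a(n=3, prior_a=0.6, eps=0.45)
    print(f"prior for A: 0.60, long-run share of A: {share:.2f}")
```

With these particular (assumed) settings, the long-run share of language A ends up well above the learner’s 0.6 prior. The effect depends on the parameters: with cleaner data, the same learner simply follows the majority of what it hears and the weak bias has no visible effect.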


CfP: Interaction and Iconicity in the Evolution of Language

Following the ICLC theme session on “Cognitive Linguistics and the Evolution of Language” last year,  I’m guest-editing a Special Issue of the journal Interaction Studies together with Michael Pleyer, James Winters, and Jordan Zlatev. This volume, entitled “Interaction and Iconicity in the Evolution of Language: Converging Perspectives from Cognitive and Evolutionary Linguistics”, will focus on issues that emerged as common themes during the ICLC workshop.

Although many contributors to the theme session have already agreed to submit a paper, we would like to invite a limited number of additional contributions relevant to the topic of the volume. Here’s our Call for Papers.


Posture helps robots learn words, and infants, too.

What kind of information do children and infants take into account when learning new words? And to what extent do they need to rely on interpreting a speaker’s intention to extract meaning? A paper by Morse, Cangelosi and Smith (2015), published in PLoS One, suggests that bodily states such as body posture might be used by infants to acquire word meanings in the absence of the object named. To test their hypothesis, the authors ran a series of experiments using a word learning task with infants, but also with a self-learning robot, the iCub.


Future tense and saving money: Small number bias

Last week saw the release of the latest Roberts & Winters collaboration (with guest star Keith Chen). The paper, Future Tense and Economic Decisions: Controlling for Cultural Evolution, builds upon Chen’s previous work by controlling for historical relationships between cultures. As Sean pointed out in his excellent overview, the analysis was extremely complicated, taking over two years to complete and the results were somewhat of a mixed bag, even if our headline conclusion suggested that the relationship between future tense (FTR) and saving money is spurious. What I want to briefly discuss here is one of the many findings buried in this paper — that the relationship could be a result of a small number bias.

One cool aspect of the World Values Survey (WVS) is that it contains successive waves of data (Wave 3: 1995-98; Wave 4: 1999-2004; Wave 5: 2005-09; Wave 6: 2010-14). This allows us to test the hypothesis that FTR is a predictor of savings behaviour and not just an artefact of the structural properties of the dataset. What do I mean by this? Basically, independent datasets sometimes look good together: they produce patterns that line up neatly and produce a strong effect. One possible explanation for this pattern is that there is a real causal relationship (x influences y). Another possibility is that these patterns aligned by chance and what we’re dealing with is a small number bias: the tendency for small datasets to initially show a strong relationship that disappears with larger, more representative samples.

Since Chen’s original study, which only had access to Waves 3-5 (1995-2009), the WVS has added Wave 6, giving us an additional 5 years to see if the initial finding holds up to scrutiny. If the finding is a result of the small number bias, then we should expect FTR to produce stronger effects with smaller sub-samples of data, with the initial effect being washed out as more data is added. We can also compare the effect of FTR with that of unemployment and see if there are any differences in how these two variables react to more data being added. Unemployment is particularly useful because we’ve already got a clear causal story regarding its effect on savings behaviour: unemployed individuals are less likely to save than employed individuals, as the latter will simply have a greater capacity to set aside money for savings (of course, employment could also be a proxy for other factors, such as educational background and a decreased likelihood of engaging in risky behaviour).
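The logic of the small number bias can be shown with a quick simulation. This sketch does not use WVS data or the mixed effects models from the paper; it assumes two made-up variables that are completely unrelated and asks how large a regression slope we would estimate by chance alone at different sample sizes. The function names, sample sizes and number of trials are all hypothetical choices.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def mean_abs_slope(n, trials=200, seed=0):
    """Average size of the estimated slope when the true slope is zero."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]  # independent of xs
        sizes.append(abs(ols_slope(xs, ys)))
    return statistics.fmean(sizes)


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(f"n={n:5d}  mean |slope| = {mean_abs_slope(n):.3f}")
```

Even though there is no real relationship in the simulated data, small samples regularly produce sizeable slopes, and these shrink as the sample grows. A real effect, by contrast, should stabilise rather than fade as data is added.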

What did we find? Well, when looking at the coefficients from the mixed effect models, the estimated FTR coefficient is stronger with smaller sub-samples of data (FTR coefficients for Wave 3 = 0.57; Waves 3-4 = 0.72; Waves 3-5 = 0.41; Waves 3-6 = 0.26). As the graphs below show, when more data is added over the years a fuller sample is achieved and the statistical effect weakens. In particular, the FTR coefficient is at its weakest when all the currently available data is used. By comparison, the coefficient for employment status is weaker with smaller sub-samples of data (employment coefficient for Wave 3 = 0.41; Waves 3-4 = 0.54; Waves 3-5 = 0.60; Waves 3-6 = 0.61). That is, employment status does not appear to exhibit a small number bias, and as the sample size increases we can be increasingly confident that employment status has an effect on savings behaviour.
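For a rough sense of scale, the reported coefficients can be put side by side. The snippet below only restates the numbers quoted above (taking the Waves 3-5 FTR value to be 0.41) and computes how much each coefficient changes between the smallest and the largest sample; `relative_change` is just a helper name used here.

```python
# Coefficients as reported in the text, keyed by cumulative sample.
ftr = {"Wave 3": 0.57, "Waves 3-4": 0.72, "Waves 3-5": 0.41, "Waves 3-6": 0.26}
employment = {"Wave 3": 0.41, "Waves 3-4": 0.54, "Waves 3-5": 0.60, "Waves 3-6": 0.61}


def relative_change(coefficients):
    """Change from the first to the last cumulative sample, as a fraction."""
    values = list(coefficients.values())
    return (values[-1] - values[0]) / values[0]


if __name__ == "__main__":
    for name, coefficients in (("FTR", ftr), ("employment", employment)):
        print(f"{name:10s} {relative_change(coefficients):+.0%}")
```

On these numbers the FTR coefficient roughly halves between Wave 3 and Waves 3-6, while the employment coefficient grows by about half over the same span.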





So it looks like the relationship between savings behaviour and FTR is an artefact of the small number bias. But it could be the case that FTR does have a real effect, albeit a weaker one: we’ve just got a better resolution for variables like unemployment, and these are dampening the effect of FTR. All we can conclude for now is that the latest set of results suggests a much weaker effect of FTR on savings behaviour. When coupled with the findings of the mixed effect model (that FTR is not a significant predictor of savings behaviour), it strongly suggests this is a spurious finding. It’ll be interesting to see how these results hold up when Wave 7 is released.


Future tense and saving money: no correlation when controlling for cultural evolution

This week our paper on future tense and saving money is published (Roberts, Winters & Chen, 2015).  In this paper we test a previous claim by Keith Chen about whether the language people speak influences their economic decisions (see Chen’s TED talk here or paper).  We find that at least part of the previous study’s claims are not robust to controlling for historical relationships between cultures. We suggest that large-scale cross-cultural patterns should always take cultural history into account.

Does language influence the way we think?

There is a longstanding debate about whether the constraints of the languages we speak influence the way we behave. In 2012, Keith Chen discovered a correlation between the way a language allows people to talk about future events and their economic decisions: speakers of languages which make an obligatory grammatical distinction between the present and the future are less likely to save money.


Cognitive Linguistics and the Evolution of Language

On Tuesday, July 21st, this year’s International Cognitive Linguistics Conference will host a theme session on “Cognitive Linguistics and the Evolution of Language” co-organized by three Replicated Typo authors: Michael Pleyer, James Winters, and myself. In addition, two Replicated Typo bloggers are co-authors on papers presented in the theme session.

The general idea of this session goes back to previous work by James and Michael, who promoted the idea of integrating Cognitive Linguistics and language evolution research in several conference talks as well as in a 2014 paper – published, quite fittingly, in a journal called “Theoria et Historia Scientiarum”, as the very idea of combining these frameworks requires some meta-theoretical reflection. As both cognitive and evolutionary linguistics are in themselves quite heterogeneous frameworks, the question emerges what we actually mean when we speak of “cognitive” or “evolutionary” linguistics, respectively.

I might come back to this meta-scientific discussion in a later post. For now, I will confine myself to giving a brief overview of the eight talks in our session. The full abstracts can be found here.

In the first talk, Vyv Evans (Bangor) proposes a two-step scenario of the evolution of language, informed by concepts from Cognitive Linguistics in general and Langacker’s Cognitive Grammar in particular:

The first stage, logically, had to be a symbolic reference in what I term a words-to-world direction, bootstrapping extant capacities that Australopithecines, and later ancestral Homo shared with the great apes. But the emergence of a grammatical capacity is also associated with a shift towards a words-to-words direction symbolic reference: words and other grammatical constructions can symbolically refer to other symbolic units.

Roz Frank (Iowa) then outlines “The relevance of a ‘Complex Adaptive Systems’ approach to ‘language’” – note the scarequotes. She argues that “the CAS approach serves to replace older historical linguistic notions of languages as ‘organisms’ and as ‘species’”.

Sabine van der Ham, Hannah Little, Kerem Eryılmaz, and Bart de Boer (Brussels) then talk about two sets of experiments investigating the role of individual learning biases and cultural transmission in shaping language, in a talk entitled “Experimental Evidence on the Emergence of Phonological Structure”.

In the next talk, Seán Roberts and Stephen Levinson (Nijmegen) provide experimental evidence for the hypothesis that “On-line pressures from turn taking constrain the cultural evolution of word order”. Chris Sinha’s talk, entitled “Eco-Evo-Devo: Biocultural synergies in language evolution”, is more theoretical in nature, but no less interesting. Starting from the hypothesis that “many species construct “artefactual” niches, and language itself may be considered as a transcultural component of the species-specific human biocultural niche”, he argues that

Treating language as a biocultural niche yields a new perspective on both the human language capacity and on the evolution of this capacity. It also enables us to understand the significance of language as the symbolic ground of the special subclass of symbolic cognitive artefacts.

Arie Verhagen (Leiden) then discusses the question of whether public and private communication are “Stages in the Evolution of Language”. He argues against Tomasello’s idea that ““joint” intentionality emerged first and evolved into what is essentially still its present state, which set the stage for the subsequent evolution of “collective” intentionality” and instead defends the view that

these two kinds of processes and capacities evolved ‘in tandem’: A gradual increase in the role of culture (learned patterns of behaviour) produced differences and thus competition between groups of (proto-)humans, which in turn provided selection pressures for an increased capability and motivation of individuals to engage in collaborative activities with others.

James Winters (Edinburgh) then provides experimental evidence that “Linguistic systems adapt to their contextual niche”, addressing two major questions with the help of an artificial-language communication game:

(i) To what extent does the situational context influence the encoding of features in the linguistic system? (ii) How does the effect of the situational context work its way into the structure of language?

His results “support the general hypothesis that language structure adapts to the situational contexts in which it is learned and used, with short-term strategies for conveying the intended meaning feeding back into long-term, system-wide changes.”

The final talk, entitled “Communicating events using bodily mimesis with and without vocalization”, is co-authored by Jordan Zlatev, Sławomir Wacewicz, Przemysław Żywiczyński, and Joost van de Weijer (Lund/Torun). They introduce an experiment on event communication and discuss to what extent the greater potential for iconic representation in bodily reenactment, compared with vocalization, might lend support to a “bodily mimesis hypothesis of language origins”.

In the closing session of the workshop, this highly promising array of papers is discussed with one of the “founding fathers” of modern language evolution research, Jim Hurford (Edinburgh).

But that’s not all: Just one coffee break after the theme session, there will be a panel on “Language and Evolution” in the general session of the conference, featuring papers by Gareth Roberts & Maryia Fedzechkina; Jonas Nölle; Carmen Saldana, Simon Kirby & Kenny Smith; Yasamin Motamedi, Kenny Smith, Marieke Schouwstra & Simon Kirby; and Andrew Feeney.