Future tense and saving money: Small number bias

Last week saw the release of the latest Roberts & Winters collaboration (with guest star Keith Chen). The paper, Future Tense and Economic Decisions: Controlling for Cultural Evolution, builds upon Chen’s previous work by controlling for historical relationships between cultures. As Sean pointed out in his excellent overview, the analysis was extremely complicated, taking over two years to complete and the results were somewhat of a mixed bag, even if our headline conclusion suggested that the relationship between future tense (FTR) and saving money is spurious. What I want to briefly discuss here is one of the many findings buried in this paper — that the relationship could be a result of a small number bias.

One cool aspect about the World Values Survey (WVS) is that it contains successive waves of data (Wave 3: 1995-98; Wave 4: 1999-2004; Wave 5: 2005-09; Wave 6: 2010-14). This allows us to test the hypothesis that FTR is a predictor of savings behaviour and not just an artefact of the structural properties of the dataset. What do I mean by this? Basically, independent datasets sometimes look good together: they produce patterns that line up neatly and produce a strong effect. One possible explanation for this pattern is that there is a real causal relationship (x influences y). Another possibility is that these patterns aligned by chance and what we’re dealing with is a small number bias: the tendency for small datasets to initially show a strong relationship that disappears with larger, more representative samples.
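To make the small number bias concrete, here is a minimal simulation sketch. It does not use the WVS data or the mixed effects models from the paper; it simply draws two variables that are independent by construction and measures how large their correlation looks, on average, at different sample sizes. The function names, sample sizes and number of trials are all illustrative assumptions.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mean_abs_r(sample_size, trials=500, seed=1):
    """Average |r| between two INDEPENDENT variables at a given sample size."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # x and y are drawn separately, so any correlation is pure chance.
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        total += abs(pearson_r(xs, ys))
    return total / trials


if __name__ == "__main__":
    for n in (10, 40, 160):
        print(f"n = {n:4d}  mean |r| = {mean_abs_r(n):.3f}")
```

The average size of the chance correlation shrinks as the sample grows, which is the pattern we would expect a spurious effect to follow as more waves of data are added.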

Since Chen’s original study, which only had access to Waves 3-5 (1995-2009), the WVS has added Wave 6, giving us an additional 5 years to see if the initial finding holds up to scrutiny. If the finding is a result of the small number bias, then we should expect FTR to produce stronger effects with smaller sub-samples of data, with the initial effect being washed out as more data is added. We can also compare the effect of FTR with that of unemployment and see if there are any differences in how these two variables react to more data being added. Unemployment is particularly useful because we’ve already got a clear causal story regarding its effect on savings behaviour: unemployed individuals are less likely to save than employed individuals, as the latter will simply have a greater capacity to set aside money for savings (of course, employment could also be a proxy for other factors, such as educational background and a decreased likelihood of engaging in risky behaviour, etc.).

What did we find? Well, when looking at the coefficients from the mixed effect models, the estimated FTR coefficient is stronger with smaller sub-samples of data (FTR coefficients for Wave 3 = 0.57; Waves 3-4 = 0.72; Waves 3-5 = 0.41; Waves 3-6 = 0.26). As the graphs below show, when more data is added over the years a fuller sample is achieved and the statistical effect weakens. In particular, the FTR coefficient is at its weakest when all the currently available data is used. By comparison, the coefficient for employment status is weaker with smaller sub-samples of data (employment coefficient for Wave 3 = 0.41; Waves 3-4 = 0.54; Waves 3-5 = 0.60; Waves 3-6 = 0.61). That is, employment status does not appear to exhibit a small number bias, and as the sample size increases we can be increasingly confident that employment status has an effect on savings behaviour.





So it looks like the relationship between savings behaviour and FTR is an artefact of the small number bias. But it could be the case that FTR does have a real, albeit weaker, effect: we’ve just got a better resolution for variables like unemployment, and these are dampening the effect of FTR. All we can conclude for now is that the latest set of results suggests a much weaker effect of FTR on savings behaviour. When coupled with the findings of the mixed effect model (that FTR is not a significant predictor of savings behaviour), it strongly suggests this is a spurious finding. It’ll be interesting to see how these results hold up when Wave 7 is released.


Future tense and saving money: no correlation when controlling for cultural evolution

This week our paper on future tense and saving money is published (Roberts, Winters & Chen, 2015).  In this paper we test a previous claim by Keith Chen about whether the language people speak influences their economic decisions (see Chen’s TED talk here or paper).  We find that at least part of the previous study’s claims are not robust to controlling for historical relationships between cultures. We suggest that large-scale cross-cultural patterns should always take cultural history into account.

Does language influence the way we think?

There is a longstanding debate about whether the constraints of the languages we speak influence the way we behave. In 2012, Keith Chen discovered a correlation between the way a language allows people to talk about future events and their economic decisions: speakers of languages which make an obligatory grammatical distinction between the present and the future are less likely to save money.


Cognitive Linguistics and the Evolution of Language

On Tuesday, July 21st, this year’s International Cognitive Linguistics Conference will host a theme session on “Cognitive Linguistics and the Evolution of Language” co-organized by three Replicated Typo authors: Michael Pleyer, James Winters, and myself. In addition, two Replicated Typo bloggers are co-authors on papers presented in the theme session.

The general idea of this session goes back to previous work by James and Michael, who promoted the idea of integrating Cognitive Linguistics and language evolution research in several conference talks as well as in a 2014 paper – published, quite fittingly, in a journal called “Theoria et Historia Scientiarum”, as the very idea of combining these frameworks requires some meta-theoretical reflection. As both cognitive and evolutionary linguistics are in themselves quite heterogeneous frameworks, the question emerges what we actually mean when we speak of “cognitive” or “evolutionary” linguistics, respectively.

I might come back to this meta-scientific discussion in a later post. For now, I will confine myself to giving a brief overview of the eight talks in our session. The full abstracts can be found here.

In the first talk, Vyv Evans (Bangor) proposes a two-step scenario of the evolution of language, informed by concepts from Cognitive Linguistics in general and Langacker’s Cognitive Grammar in particular:

The first stage, logically, had to be a symbolic reference in what I term a words-to-world direction, bootstrapping extant capacities that Australopithecines, and later ancestral Homo shared with the great apes. But the emergence of a grammatical capacity is also associated with a shift towards a words-to-words direction symbolic reference: words and other grammatical constructions can symbolically refer to other symbolic units.

Roz Frank (Iowa) then outlines “The relevance of a ‘Complex Adaptive Systems’ approach to ‘language’” – note the scarequotes. She argues that “the CAS approach serves to replace older historical linguistic notions of languages as ‘organisms’ and as ‘species’”.

Sabine van der Ham, Hannah Little, Kerem Eryılmaz, and Bart de Boer (Brussels) then talk about two sets of experiments investigating the role of individual learning biases and cultural transmission in shaping language, in a talk entitled “Experimental Evidence on the Emergence of Phonological Structure”.

In the next talk, Seán Roberts and Stephen Levinson (Nijmegen) provide experimental evidence for the hypothesis that “On-line pressures from turn taking constrain the cultural evolution of word order”. Chris Sinha’s talk, entitled “Eco-Evo-Devo: Biocultural synergies in language evolution”, is more theoretical in nature, but no less interesting. Starting from the hypothesis that “many species construct “artefactual” niches, and language itself may be considered as a transcultural component of the species-specific human biocultural niche”, he argues that

Treating language as a biocultural niche yields a new perspective on both the human language capacity and on the evolution of this capacity. It also enables us to understand the significance of language as the symbolic ground of the special subclass of symbolic cognitive artefacts.

Arie Verhagen (Leiden) then discusses the question of whether public and private communication are “Stages in the Evolution of Language”.  He argues against Tomasello’s idea that “‘joint’ intentionality emerged first and evolved into what is essentially still its present state, which set the stage for the subsequent evolution of ‘collective’ intentionality” and instead defends the view that

these two kinds of processes and capacities evolved ‘in tandem’: A gradual increase in the role of culture (learned patterns of behaviour) produced differences and thus competition between groups of (proto-)humans, which in turn provided selection pressures for an increased capability and motivation of individuals to engage in collaborative activities with others.

James Winters (Edinburgh) then provides experimental evidence that “Linguistic systems adapt to their contextual niche”, addressing two major questions with the help of an artificial-language communication game:

(i) To what extent does the situational context influence the encoding of features in the linguistic system? (ii) How does the effect of the situational context work its way into the structure of language?

His results “support the general hypothesis that language structure adapts to the situational contexts in which it is learned and used, with short-term strategies for conveying the intended meaning feeding back into long-term, system-wide changes.”

The final talk, entitled “Communicating events using bodily mimesis with and without vocalization”, is co-authored by Jordan Zlatev, Sławomir Wacewicz, Przemysław Żywiczyński, and Joost van de Weijer (Lund/Torun). They introduce an experiment on event communication and discuss to what extent the greater potential for iconic representation in bodily reenactment, compared with vocalization, might lend support to a “bodily mimesis hypothesis of language origins”.

In the closing session of the workshop, this highly promising array of papers is discussed with one of the “founding fathers” of modern language evolution research, Jim Hurford (Edinburgh).

But that’s not all: Just one coffee break after the theme session, there will be a panel on “Language and Evolution” in the general session of the conference, featuring papers by Gareth Roberts & Maryia Fedzechkina; Jonas Nölle; Carmen Saldana, Simon Kirby & Kenny Smith; Yasamin Motamedi, Kenny Smith, Marieke Schouwstra & Simon Kirby; and Andrew Feeney.

A Note on Dennett’s Curious Comparison of Words and Apps

I continue to think about Dan Dennett’s inadequate account of words-as-memes in his paper, The Cultural Evolution of Words and Other Thinking Tools (PDF), Cold Spring Harbor Symposia on Quantitative Biology, Volume LXXIV, pp. 1-7, 2009. You find the same account in, for example, this video of a talk he gave in 2011: “A Human Mind as an Upside Down Brain”. I feel it warrants (yet another) long-form post. But I just don’t want to wrangle my way through that now. So I’m just going to offer a remark that goes a bit beyond what I’ve already said in my working paper, Cultural Evolution, Memes, and the Trouble with Dan Dennett, particularly in the post, Watch Out, Dan Dennett, Your Mind’s Changing Up on You!.

In that article Dennett asserts that “Words are not just like software viruses; they are software viruses, a fact that emerges quite uncontroversially once we adjust our understanding of computation and software.” He then uses Java applets to illustrate this comparison. I believe he overstates the similarity between words and apps or viruses to the point where the comparison has little value. The adjustment of understanding that Dennett calls for is too extreme.

In particular, and here is my new point, it simply vitiates the use of computation as an idea in understanding and modeling mental processes. Dennett has spent much of his career arguing that the mind is fundamentally a computational process. Words are thus computational objects and our use of them is a computational process.

Real computational processes are precise in their nature and the requirements of their physical implementation – and there is always a physical implementation for real computation. Java is based on a certain kind of computational objects and processes, a certain style of computing. But not all computing is like that. What if natural language computing isn’t? What happens to the analogy then?

How spurious correlations arise from inheritance and borrowing (with pictures)

James and I have written about Galton’s problem in large datasets.  Because two modern languages can have a common ancestor, the traits that they exhibit aren’t independent observations.  This can lead to spurious correlations: patterns in the data that are statistical artefacts rather than indications of causal links between traits.

However, I’ve often felt like we haven’t articulated the general concept very well.  For an upcoming paper, we created some diagrams that try to present the problem in its simplest form.

Spurious correlations can be caused by cultural inheritance 


Above is an illustration of how cultural inheritance can lead to spurious correlations.  At the top are three independent historical cultures, each of which has a bundle of various traits which are represented as coloured shapes.  Each trait is causally independent of the others.  On the right is a contingency table for the colours of triangles and squares.  There is no particular relationship between the colour of triangles and the colour of squares.  However, over time these cultures split into new cultures.  Along the bottom of the graph are the currently observable cultures.  We now see a pattern has emerged in the raw numbers (pink triangles occur with orange squares, and blue triangles occur with red squares).  The mechanism that brought about this pattern is simply that the traits are inherited together, with some combinations replicating more often than others: there is no causal mechanism whereby pink triangles are more likely to cause orange squares.
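The inheritance scenario can also be sketched in a few lines of code. The trait values and the number of descendants per ancestor below are hypothetical, chosen only to mirror the description above; they are not taken from the diagram or from any real dataset. The sketch compares the raw counts across modern cultures with the counts obtained when each lineage is only counted once.

```python
from collections import Counter

# Hypothetical ancestral cultures: (triangle colour, square colour).
# Every combination occurs exactly once, so there is no pattern to start with.
ANCESTORS = {
    "A": ("pink", "orange"),
    "B": ("blue", "red"),
    "C": ("pink", "red"),
}

# Hypothetical number of modern cultures descended from each ancestor.
N_DESCENDANTS = {"A": 4, "B": 4, "C": 1}


def descendants(ancestors, n_descendants):
    """Each modern culture copies both traits unchanged from its ancestor."""
    return [
        (name, traits)
        for name, traits in ancestors.items()
        for _ in range(n_descendants[name])
    ]


def raw_counts(cultures):
    """Count trait combinations, treating every modern culture as independent."""
    return Counter(traits for _, traits in cultures)


def lineage_counts(cultures):
    """Count trait combinations once per lineage (one data point per ancestor)."""
    return Counter(traits for _, traits in set(cultures))


if __name__ == "__main__":
    modern = descendants(ANCESTORS, N_DESCENDANTS)
    print("raw counts:    ", dict(raw_counts(modern)))
    print("lineage counts:", dict(lineage_counts(modern)))
```

In the raw counts, pink triangles with orange squares and blue triangles with red squares each appear four times, while pink triangles with red squares appear once. Counting by lineage gives one observation per combination, which is all the independent evidence there ever was.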

Spurious correlations can be caused by borrowing


Above is an illustration of how borrowing (or areal effects or horizontal cultural inheritance) can lead to spurious correlations.  Three cultures (left to right) evolve over time (top to bottom).  Each culture has a bundle of various traits which are represented as coloured shapes.  Each trait is causally independent of the others.  On the right is a count of the number of cultures with both blue triangles and red squares.  In the top generation, only one out of three cultures has both.  Over some period of time, the blue triangle is borrowed from the culture on the left to the culture in the middle, and then from the culture in the middle to the culture on the right.  By the end, all cultures have blue triangles and red squares.  The mechanism that brought about this pattern is simply that one trait spread through the population: there is no causal mechanism whereby blue triangles are more likely to cause red squares.
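The borrowing scenario can be sketched in the same spirit. The starting traits below are hypothetical (in particular, I assume that all three cultures already have red squares and differ only in their triangles), and `borrow` and `count_both` are illustrative helper names rather than anything from our analysis code.

```python
def borrow(cultures, trait, donor, recipient):
    """Return a new generation in which `recipient` copies one trait from `donor`."""
    new_generation = {name: dict(traits) for name, traits in cultures.items()}
    new_generation[recipient][trait] = cultures[donor][trait]
    return new_generation


def count_both(cultures):
    """Number of cultures with both a blue triangle and a red square."""
    return sum(
        1
        for traits in cultures.values()
        if traits["triangle"] == "blue" and traits["square"] == "red"
    )


# Hypothetical starting point: only the left culture has a blue triangle.
generation_0 = {
    "left": {"triangle": "blue", "square": "red"},
    "middle": {"triangle": "pink", "square": "red"},
    "right": {"triangle": "orange", "square": "red"},
}

# The blue triangle spreads left -> middle, then middle -> right.
generation_1 = borrow(generation_0, "triangle", "left", "middle")
generation_2 = borrow(generation_1, "triangle", "middle", "right")

if __name__ == "__main__":
    for i, generation in enumerate((generation_0, generation_1, generation_2)):
        print(f"generation {i}: {count_both(generation)} of 3 cultures have both")
```

The count goes from 1 to 2 to 3 even though nothing in the code links triangle colour to square colour: only the triangle trait is ever copied.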

A similar effect would be caused by a bundle of causally unrelated features being borrowed, as shown below.


Tone and Humidity: FAQ

Everett, Blasi & Roberts (2015) review literature on how inhaling dry air affects phonation, suggesting that lexical tone is harder to produce and perceive in dry environments.  This leads to a prediction that languages should adapt to this pressure, so that lexical tone should not be found in dry climates, and the paper presents statistical evidence in favour of this prediction.

Below are some frequently asked questions about the study (see also the previous blog post explaining the statistics).

Continue reading “Tone and Humidity: FAQ”

Reminder for upcoming conferences

The deadline is approaching for several relevant calls for papers:

At this year’s International Congress of Phonetic Sciences in Glasgow there will be a special interest group on the Evolution of our phonetic capabilities. It will focus on the interaction between biological and cultural evolution and encourages work from different modalities too. The deadline is 16th Feb. The call for papers is here.

There’s also a special discussant session on Sound change and speech evolution at ICPhS headed by Andy Wedel. The deadline for the actual conference is 1st Feb. Call for Papers here.

The next event in the ways to (proto)language conference is being held in Rome! The deadline is also 1st Feb. Call for Papers here.

This year’s CogSci is being organised by the guys at Cognitive and Information Sciences at the University of California in Merced, who do some great stuff related to language evolution. The deadline is 1st Feb as well, and the call for papers is here.

Happy submitting!

The Vocal Iconicity Challenge!

Do you fancy the prospect of putting your communication skills to the test and winning $1000? If so, you should probably go and check out The Vocal Iconicity Challenge: http://sapir.psych.wisc.edu/vocal-iconicity-challenge/

Devised by Gary Lupyan and Marcus Perlman, of the University of Wisconsin-Madison, the aim of the game is to devise a system of vocalizations to communicate a set of Paleolithic-relevant meanings. The team whose vocalizations are guessed most accurately will be crowned the Vocal Iconicity Champion (and win the $1000 Saussure Prize!). More information is on their website.

Evolve an App Name

Edit: The results are out!

I’m working with the Language in Interaction project to create an App game about linguistic diversity.  It’s a game where you listen to several recordings of people talking and have to match the ones who are speaking the same language.  It’s quite a lot like the Great Language Game, but we’re using many lesser-known languages from the DOBES archive.

But first – we need a name.  Help us create one with the power of Iterated Learning!

Click to take part in our 1-minute experiment to evolve an app name.

We’ll throw some app names at you, you try to remember them, then we throw your names at someone else.
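For the curious, the transmission chain idea can be sketched as follows. This is not the code behind the experiment: real participants have structured memory biases, whereas this toy `recall` function just swaps letters at random, and the alphabet, error rate and function names are all assumptions for illustration.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def recall(name, error_rate, rng):
    """Toy memory: each letter is replaced by a random one with probability error_rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < error_rate else letter
        for letter in name
    )


def transmission_chain(seed_names, generations, error_rate=0.1, seed=0):
    """Pass a list of names down a chain; each generation recalls the previous output."""
    rng = random.Random(seed)
    history = [list(seed_names)]
    for _ in range(generations):
        history.append([recall(name, error_rate, rng) for name in history[-1]])
    return history


if __name__ == "__main__":
    # The seed names here are made up for the example.
    for generation, names in enumerate(transmission_chain(["lingo", "babel"], 5)):
        print(generation, names)
```

Each generation only ever sees what the previous one produced, so changes accumulate along the chain rather than being corrected against the original names.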

Here’s a screenshot of the App in development:


(P.S.: I’ve done this kind of thing before to evolve a band name)

Languages adapt to their contextual niche (Winters, Kirby & Smith, 2014)

Last week saw the publication of my latest paper, with co-authors Simon Kirby and Kenny Smith, looking at how languages adapt to their contextual niche (link to the OA version and here’s the original). Here’s the abstract:

It is well established that context plays a fundamental role in how we learn and use language. Here we explore how context links short-term language use with the long-term emergence of different types of language systems. Using an iterated learning model of cultural transmission, the current study experimentally investigates the role of the communicative situation in which an utterance is produced (situational context) and how it influences the emergence of three types of linguistic systems: underspecified languages (where only some dimensions of meaning are encoded linguistically), holistic systems (lacking systematic structure) and systematic languages (consisting of compound signals encoding both category-level and individuating dimensions of meaning). To do this, we set up a discrimination task in a communication game and manipulated whether the feature dimension shape was relevant or not in discriminating between two referents. The experimental languages gradually evolved to encode information relevant to the task of achieving communicative success, given the situational context in which they are learned and used, resulting in the emergence of different linguistic systems. These results suggest language systems adapt to their contextual niche over iterated learning.


Context clearly plays an important role in how we learn and use language. Without this contextual scaffolding, and our inferential capacities, the use of language in everyday interactions would appear highly ambiguous. And even though ambiguous language can and does cause problems (as hilariously highlighted by the ‘What’s a chicken?’ case), it is also considered to be communicatively functional (see Piantadosi et al., 2012).  In short: context helps in reducing uncertainty about the intended meaning.

If context is used as a resource in reducing uncertainty, then it might also alter our conception of how an optimal communication system should be structured (e.g., Zipf, 1949). With this in mind, we wanted to investigate the following questions: (i) To what extent does the context influence the encoding of features in the linguistic system? (ii) How does the effect of context work its way into the structure of language?  To get at these questions we narrowed our focus to look at the situational context: the immediate communicative environment in which an utterance is situated and how it influences the distinctions a speaker needs to convey.

Of particular relevance here is Silvey, Kirby & Smith (2014): they show that the incorporation of a situational context can change the extent to which an evolving language encodes certain features of referents. Using a pseudo-communicative task, where participants needed to discriminate between a target and a distractor meaning, the authors were able to manipulate which meaning dimensions (shape, colour, and motion) were relevant and irrelevant in conveying the intended meaning. Over successive generations of participants, the languages converged on underspecified systems that encoded the feature dimension which was relevant for discriminating between meanings.

The current work extends these findings in two ways: (a) we added a communication element to the setup, and (b) we further explored the types of situational context we could manipulate.  Our general hypothesis, then, is that these artificial languages should adapt to the situational context in predictable ways based on whether or not a distinction is relevant in communication.

Continue reading “Languages adapt to their contextual niche (Winters, Kirby & Smith, 2014)”