Update: I have carried out some more analyses that paint a different picture to the one presented below. Oops!
A recently accepted paper by Keith Chen has been getting a lot of press coverage. Chen has discovered a close link between the properties of the language people speak and their economic decisions. People who speak languages which mark the future tense differently to the present tense tend to make fewer provisions for the future. This includes economic decisions such as being less likely to save money, but also secondary indicators such as greater prevalence of smoking and obesity.
The hypothesis is that marking the future tense differently makes the future seem further away, and therefore you are less likely to plan for the future.
Chen has talked about this hypothesis at a TED conference and the work has been covered in the media, most recently in a BBC economics column (which, to be fair, was quite critical). The hypothesis has been criticised by several linguists, notably on Language Log (including a great model post by Mark Liberman), where Chen gave a response. The data has been criticised (e.g. English is coded as 'strong future tense marking', but has a range of ways of using the present tense for future time reference), as has the thinking behind the hypothesis itself (e.g. why wouldn't marking a difference in the language actually make the future MORE salient?). Some have also pointed out weaknesses in the statistical claim. For instance, Östen Dahl has pointed out that speaking a language with front rounded vowels is also a good predictor of economic decisions.
Here at Replicated Typo, we have discussed many cases of spurious correlations: statistical links between cultural traits that are unlikely to be causal. James Winters and I recently published a paper on the dangers of making claims based on large-scale, cross-cultural statistics. Basically, it's very easy to find statistical links between any two variables because cultural traits are inherited in bundles (they are not independent).
In this post, I address an issue that I haven't seen systematically answered yet: Chen predicts that there is a correlation between future tense marking and economic decisions, and finds a strong link. However, he should also predict that future tense marking is a stronger predictor than other linguistic variables. In other words, can we find a different aspect of language that is even better at predicting economic behaviour? Here I test the link between the propensity to save money and many different linguistic factors.
Looking for correlations
I used the World Values Survey, the same dataset used by Chen, to compare the propensity to save money with 144 features from the World Atlas of Language Structures. For each linguistic variable, I ran a linear regression with propensity to save money as the dependent variable. The independent variables were the linguistic variable plus age, sex, employment status, marriage status, religion, number of children and survey year. I compared the F-statistic (goodness of model fit) for each regression. This is the same approach as in Chen's analysis.
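To make the procedure concrete, here is a minimal sketch in Python. It is not the code behind the analysis in this post, and it simplifies heavily: it leaves out the demographic controls, so each F-statistic is just the one-way ANOVA F for a single categorical linguistic variable (which equals the overall F-test of a regression with that variable as its only predictor). The function names, variable names and toy data are all hypothetical.

```python
from collections import defaultdict


def f_statistic(groups, outcomes):
    """One-way ANOVA F-statistic for one categorical predictor.

    Assumption: no control variables (age, sex, etc.) are included,
    unlike the regressions described in the post.
    """
    by_group = defaultdict(list)
    for g, y in zip(groups, outcomes):
        by_group[g].append(y)
    n, k = len(outcomes), len(by_group)
    grand_mean = sum(outcomes) / n
    # Variation explained by group membership.
    ss_between = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                     for ys in by_group.values())
    # Variation left over inside each group.
    ss_within = sum((y - sum(ys) / len(ys)) ** 2
                    for ys in by_group.values() for y in ys)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def share_beaten(target, f_by_variable):
    """Fraction of the other variables with a smaller F-statistic than target."""
    others = [f for name, f in f_by_variable.items() if name != target]
    return sum(f < f_by_variable[target] for f in others) / len(others)


# Hypothetical toy data: 1 = respondent saved money, 0 = did not.
saves = [1, 1, 0, 1, 0, 0, 1, 0]
features = {
    "future_tense": ["strong"] * 4 + ["weak"] * 4,
    "uvulars": ["none", "none", "some", "none", "some", "some", "none", "none"],
    "rel_clause_order": ["NRel", "RelN"] * 4,
}

f_by_variable = {name: f_statistic(values, saves)
                 for name, values in features.items()}
print(f_by_variable)
print(share_beaten("future_tense", f_by_variable))
```

The second function is the comparison that matters here: a hypothesis about one variable is only interesting if that variable sits well above the bulk of the others.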
The future tense marking variable had an F-statistic greater than 65% of the linguistic variables. Below is a histogram of the resulting F-statistics with a red line indicating the strength of the future tense variable.
Here are the top-ranked linguistic variables for predicting economic decisions:
- Uvular Consonants
- Third Person Zero of Verbal Person Marking
- Order of Relative Clause and Noun
- Alignment of Verbal Person Marking
- M-T Pronouns
- Ditransitive Constructions: The Verb Give
- Position of Interrogative Phrases in Content Questions
- Indefinite Pronouns
- Order of Degree Word and Adjective
And here are some graphs demonstrating links between economic decisions and linguistic variables:
Doing the analysis by aggregating over language families produces a similar result: the future tense variable has an F-statistic greater than 70% of the linguistic variables.
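As a rough illustration of what aggregating over language families can look like, the sketch below collapses respondents to one row per family, so that related languages are not counted as independent data points. This is one plausible reading rather than the exact procedure used here: taking the most common feature value and the mean saving rate per family are assumptions, and the rows are made-up examples.

```python
from collections import Counter, defaultdict


def aggregate_by_family(rows):
    """Collapse respondent rows to one row per language family.

    Each input row is (family, feature_value, saved) with saved in {0, 1}.
    Returns {family: (most common feature value, mean saving rate)}.
    Assumption: majority value and mean rate are the aggregation rules.
    """
    values = defaultdict(Counter)
    saved = defaultdict(list)
    for family, value, s in rows:
        values[family][value] += 1
        saved[family].append(s)
    return {
        fam: (values[fam].most_common(1)[0][0],
              sum(saved[fam]) / len(saved[fam]))
        for fam in saved
    }


# Hypothetical respondents; the codings are illustrative only.
rows = [
    ("Indo-European", "strong", 1),
    ("Indo-European", "strong", 0),
    ("Indo-European", "weak", 1),
    ("Uralic", "weak", 1),
    ("Uralic", "weak", 1),
    ("Sino-Tibetan", "weak", 0),
]

family_rows = aggregate_by_family(rows)
print(family_rows)
```

The same F-statistic comparison can then be run on these family-level rows instead of on individual respondents.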
In contrast to Chen's hypothesis, linguistic variables that have nothing to do with concepts of time are equally good at predicting economic decisions. That is, a hypothesis that linked the presence of uvular consonants to economic decisions would have as much statistical support as the 'Whorfian economics' hypothesis. It's quite difficult to imagine a causal reason for such a link (indeed, I'm claiming that there isn't a direct causal link), and so Chen's hypothesis seems more plausible. But then the question arises: what is the role of statistical analyses in evaluating hypotheses?
James Winters and I argue that statistical analyses can help generate, motivate and explore hypotheses. However, they have weaknesses, so they need to be supported by other methods such as experiments, models and theoretical work (Roberts & Winters, 2012). Chen has recently tested his hypothesis by looking at uses of morphological future tense marking in weather reports. While this is impressive, it still relies on large-scale, cross-cultural statistics. It would be interesting to see whether economic decisions could be manipulated in a lab experiment by priming different expressions of time (e.g. Chen's example of "Rain is likely this weekend." with present tense 'is', versus "It will likely rain this weekend." with future tense 'will rain'). I'd be surprised if someone isn't already running this experiment.
Change over time
Another point I'd like to raise is the variation in economic decisions over time. Below are some graphs illustrating the change in economic decisions within linguistic groups:
The proportion of speakers of a given language who save money can change by up to 80 percentage points over the years the survey was carried out. For example, 100% of Italian speakers were saving money in 1997, compared to 18.4% in 2000. This gives an idea of how small the effect of language might be.
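The calculation behind this kind of comparison is simple. The sketch below, in Python, computes the share of savers for each language in each survey year and then the largest gap between any two years. The record layout, function names and data are hypothetical; the languages "A" and "B" are placeholders, not survey results.

```python
from collections import defaultdict


def saving_rate_by_year(records):
    """Map language -> {year: share of respondents who saved}.

    Each record is (language, survey_year, saved) with saved in {0, 1}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [savers, total]
    for language, year, saved in records:
        cell = counts[language][year]
        cell[0] += saved
        cell[1] += 1
    return {lang: {yr: s / n for yr, (s, n) in years.items()}
            for lang, years in counts.items()}


def largest_swing(rates):
    """Largest gap, in percentage points, between any two survey years."""
    return {lang: 100 * (max(by_year.values()) - min(by_year.values()))
            for lang, by_year in rates.items()}


# Hypothetical records for two placeholder languages.
records = [
    ("A", 1997, 1), ("A", 1997, 1),
    ("A", 2000, 1), ("A", 2000, 0), ("A", 2000, 0), ("A", 2000, 0),
    ("B", 1997, 1), ("B", 1997, 0),
    ("B", 2000, 1), ("B", 2000, 0),
]

rates = saving_rate_by_year(records)
swings = largest_swing(rates)
print(swings)
```

Because a language's grammar does not change between survey waves, a large swing within one language has to come from something other than the linguistic variable.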
There is a strong statistical link between future tense marking and economic decisions. This is an intriguing finding, and Chen's hypothesis is very interesting and deserves further research. However, this link may not be significantly greater than links between other cultural traits, so there's no principled reason to highlight this link in particular without further evidence.
Without a more fleshed-out theory of how linguistic categorisation of time is related to perception of time, this kind of approach risks damaging the reputation of more sophisticated work on linguistic relativity, and also the reputation of economics. More importantly, misinterpretations of this work could lead to changes in public perception and policy. For example, one journalist covered the story using the title "Want to end the various global debt crises? Try abandoning English, Greek, and Italian in favor of German, Finnish, and Korean."
Chen, M. (2011). The Effect of Language on Economic Behavior: Evidence from Savings Rates, Health Behaviors, and Retirement Assets. SSRN Electronic Journal. DOI: 10.2139/ssrn.1914379
Roberts, S. & Winters, J. (2012). Social Structure and Language Structure: the New Nomothetic Approach. Psychology of Language and Communication, 16 (2), 89-112. DOI: 10.2478/v10057-012-0008-6