Among our lineage of ancestors, humans are the only species left without a coat of body hair. Keeping in mind the thermoregulation of bare skin, we speculate on the conditions under which nakedness evolved. Could it have been coupled with bipedality?
One of the salient features of mammals is the possession of body hair. Well, most of them at least: we stand as living proof against that. How, where and why did our body hair disappear and nakedness evolve? While Darwin argued that nakedness evolved as a sexual ornament, Andersson disagreed on the premise that, if sexual traits like shiny plumage are indicative of good health, skin devoid of hair would convey poor health and would not attract mates. It is important to determine the initial step of this denudation. A coat of body hair prevents too much heat from reaching the body in the daytime, as well as shielding it from cold at night. Protection from wind, wounds, bites and UV radiation also features among its advantages. Why, then, did Homo sapiens end up losing one great layer of protection? If one believes in ‘survival of the fittest’, the benefits stemming from the near disappearance of human body hair must surely be great enough to outweigh the costs of losing these protective functions. The repository of hypotheses trying to explain this step of evolution is still growing.
This year at EvoLang, I’m releasing CHIELD: The Causal Hypotheses in Evolutionary Linguistics Database. It’s a collection of theories about the evolution of language, expressed as causal graphs. The aim of CHIELD is to build a comprehensive overview of evolutionary approaches to language. Hopefully it’ll help us find competing and supporting evidence, link hypotheses together into bigger theories and generally help make our ideas more transparent. You can access CHIELD right now, but hang around for details of the challenges.
The first thing that CHIELD can help express is the (sometimes unexpected) causal complexity of theories. For example, Dunbar (2004) suggests that gossip replaced physical grooming in humans to support increasingly complicated social interactions in larger groups. However, the whole theory is actually composed of 29 links, involving predation risk, endorphins and resource density:
The graph above might seem very complicated, but it was actually constructed just by going through the text of Dunbar (2004) and recording each claim about variables that were causally linked. By dividing the theory into individual links it becomes easier to think about each part.
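To make this concrete, here is a minimal sketch of how a theory can be recorded as individual causal links, in the spirit of CHIELD. The specific links and variable names below are illustrative paraphrases of Dunbar (2004), not the database's actual encoding.

```python
# Each link is a (cause, effect, supporting note) triple, mirroring the idea
# of recording each causal claim in a text as a separate edge.
links = [
    ("predation risk", "group size", "larger groups deter predators"),
    ("group size", "social complexity", "more members, more relationships"),
    ("social complexity", "grooming time required", "bonds need maintenance"),
    ("grooming time required", "vocal grooming (gossip)",
     "gossip reaches several listeners at once"),
]

def effects_of(variable, links):
    """Return the variables directly caused by `variable`."""
    return [effect for cause, effect, _ in links if cause == variable]

print(effects_of("group size", links))  # -> ['social complexity']
```

Once a theory is stored as an edge list like this, finding where two documents intersect reduces to looking for shared variable names.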
Second, CHIELD also helps find other theories that intersect with this one through variables like theory of mind, population size or the problem of freeriders, so you can also use CHIELD to explore multiple documents at once. For example, here are all the connections that link population size and morphological complexity (9 papers so far in the database):
The first thing to notice is that there are multiple hypotheses about how population size and morphological complexity are linked. We can also see at a glance that there are different types of evidence for each link. Some links are supported by multiple studies and methods, while others are currently just hypotheses without direct evidence.
However, CHIELD won’t work without your help! CHIELD has built-in tools for you – yes YOU – to contribute. You can edit data, discuss problems and add your own hypotheses. It’s far from perfect and of course there will be disagreements. But hopefully it will lead to productive discussions and a more cohesive field.
Which brings us to the challenges …
The EvoLang Causal Graph challenge: Contribute your own hypotheses
You can add data to CHIELD using the web interface. The challenge is to draw your EvoLang paper as a causal graph. It’s fun! The first two papers to be contributed will become part of my poster at EvoLang.
Here are some tips:
Break down your hypothesis into individual causal links.
Try to use existing variable names, so that your hypothesis connects to other work. You can find a list of variables here, or the web interface will suggest some. But don’t be afraid to add new variables.
Try to add direct quotes from the paper to the “Notes” field to support the link.
If your paper is already included, do you agree about the interpretation? If not, you can raise an issue or edit the data yourself.
Bonus Challenge: Contribute 5 papers, become a co-author!
I’ll be writing an article about the database and some initial findings for the Journal of Language Evolution. If you contribute 5 papers or more, then you’ll be added as a co-author. As an incentive to contribute further, co-authors will be ordered by the number of papers they contribute. This offer is open to anyone studying evolutionary linguistics, not just people presenting at EvoLang. You should check first whether the paper you want to add has already been included.
Bonus Challenge: Contribute some code, become a co-author!
CHIELD is open source. The GitHub repository for CHIELD has some outstanding issues. If you contribute some programming to address them, you’ll become a co-author on the journal article.
We live in an age where we have more data on more languages than ever before, and more data to link it with from other domains. This should make it easier to test hypotheses involving adaptation, and also to spot new patterns that might be explained by adaptation. For example, the proposed link between climate and tone languages could never have been investigated without massive global databases. However, there is not much discussion of the overall approach to research in this area.
This week I published a paper in a special issue on the Adaptive Value of Languages, outlining the maximum robustness approach to these problems. I then try to apply this approach to the debate about the link between tones and climate.
In a nutshell, I suggest that research in this area should follow three principles:
Instead of aiming for the most valid test for a hypothesis, we should consider as many sources of data and as many processes as possible. Agreement between them supports a theory, but differences can also highlight which parts of a theory are weak.
Researchers should be more explicit about the causal effects in their hypotheses. Formal tools from causal graph theory can help formulate tests, recognise weaknesses and avoid talking past each other.
Realistically, a single paper can’t be the final word on a topic, and shouldn’t aim to. Statistical studies of large-scale, cross-cultural data are very complicated, and we should expect small steps to establishing causality.
I apply these ideas to the debate about tone and climate. Caleb Everett also published a paper in this issue showing that speakers in drier regions use vowels less frequently in their basic vocabulary. I test whether the original link with tone and the new link with vowels hold up when using different data sources and different statistical frameworks. The correlation with tone is not robust, while the correlation with vowels seems more promising.
I then suggest some ideas for alternative methodological approaches to this theory that could be tested. For example:
An iterated artificial learning experiment
A phonetic study of vowel systems
A historical case-study of 5 Bantu languages
A corpus study of tone use in Cantonese and conversational repair in Mandarin
In 2016, Casey Hattrey combined literary genres that had long been kept far apart from each other: science fiction, academic funding applications and cultural evolution theory. Space Funding Crisis I: Persister was a story that tried to “put the fun in academic funding application and the itch in hyper-niche”. It was criticised as “unrealistic and too centered on academics to be believable” and “not a very good book”. Dan Dediu’s advice was “better not even start reading it,” and Fiona Jordan’s review was literally a four-letter word. Still, that hasn’t stopped Hattrey from writing the sequel that the title of the first book tried to warn us about.
Space Funding Crisis II: Resister continues to follow the career of space linguist Karen Arianne. Just when she thought she’d gotten out of academia, the shadowy Central Academic Funding Council Administration pulls her back in for one more job. Or at least a part-time post-doc. Her mission: solve the mystery of the great convergence. Over thousands of years of space-faring, human linguistic diversity has exploded, but suddenly people have started speaking the same language. What could have caused this sinister twist? Who are the Panini Press? And what exactly is research insurance? Arianne’s latest adventure sees her struggle against ‘splainer bots, the conference mafia and her own inability to think about the future.
To say that this was the “difficult second book” would give too much credit to the first. Hattrey seems to have learned nothing about writing or science since the last time they ventured into the weird world of self-published online novels. The characters have no distinct voice, the plot doesn’t make much sense and there are eye-watering levels of exposition. In the appendix there’s even an R script which supports some of the book’s predictions, and even that is badly composed. Even some of the apparently over-the-top futuristic ideas like insurance for research hypotheses are a bit behind existing ideas like using prediction markets for assessing replicability.
If there is a theme between the poorly formatted pages, then it’s emergence: complex patterns arising from simple rules. Arianne has a kind of spiritual belief in just reacting, Braitenberg-like, to the here-and-now rather than planning ahead. Apparently Hattrey intends this to translate into a criticism of the pressures of early-career academic life. But this never really materialises out of the bland dialogue and the insistence on putting lasers everywhere.
Still, where else are you going to find a book that makes fun of the slow science movement, generative linguistics and theories linking the emergence of tone systems to the climate?
A lot of evolutionary talks and papers nowadays touch upon language complexity (at least nine papers did so at Evolang 2016). One reason is probably that complexity is a very convenient testbed for hypotheses that establish causal links between linguistic structure and extra-linguistic factors. Do factors such as population size, social network structure, or the proportion of non-native speakers shape language change, making certain structures (for instance, morphologically simpler ones) more evolutionarily advantageous and thus more likely? Or don’t they? If they do, how exactly?
Recently, quite a lot has been published on this topic, including attempts at rigorous quantitative tests of the existing hypotheses. One problem all such attempts face is that complexity can be understood in many different ways, and operationalized in even more. Unsurprisingly, the outcome of a quantitative study depends on which measure you choose! Unfortunately, there is currently little consensus about how the measures themselves can be evaluated and compared.
To overcome this, we are organizing a shared task, “Measuring Language Complexity”, a satellite event of Evolang 2018, to take place in Toruń on April 15. Shared tasks are widely used in computational linguistics, and we strongly believe they can prove useful in evolutionary linguistics too. The task is to measure the linguistic complexity of a predefined set of 37 language varieties belonging to 7 families (and then to discuss the results, and their mutual agreement or disagreement, at the workshop). See the detailed CfP and other details here.
So far, the interest from the evolutionary community has been rather weak. But there is still time! We extended the deadline until February 28 and are looking forward to receiving your submissions!
As mentioned on this blog before, evolutionary thinking can help the study of various cultural practices, not just language. The cultural evolution perspective is currently seeing an interesting case of global growth and coordination – the widely featured founding of the Cultural Evolution Society (also on replicatedtypo), the recent inaugural conference and its follow-ups are bringing a diverse set of researchers around the same table. If this has passed you by unnoticed, there are some nice resources gathered on the society website.
Evolutionary thinking seems useful for various purposes. However, does it work the same everywhere, and can research progress in one domain be easily carried over to another?
We invite contributions from cultural evolution researchers of various persuasions and interests to talk about their work and how evolutionary models help with it. The deadline for abstracts is February 14.
Discussion of individual contributions will hopefully lead to a better understanding of the commonalities and differences in how cultural evolution is applied in different areas, and help build an understanding of how to use evolutionary thinking most productively – what its prospects and limitations are. We aim to build common ground through plenty of space and opportunities for formal and informal discussion on site.
Both case studies and general perspectives are welcome. In addition to original research, we encourage participants to think about the following questions:
– What do you get out of cultural evolution research?
– How should we best apply evolutionary thinking to culture?
– What matters when we apply this to different domains or timescales?
Deadline for abstracts: February 14, 2018
Event dates: June 6-8
Location: Tartu University, Estonia
Full call for papers and information on the website. Also available as PDF.
This year’s topic is ‘triggers of change’: What causes a sound system or lexicon or grammatical system to change? How can we explain rapid changes followed by periods of stability? Can we predict the direction and rate of change according to external influences?
We have also added two new researchers to our keynote speaker list, which now stands as:
A new paper by Anita Slonimska and myself attempts to link global tendencies in the lexicon to constraints from turn taking in conversation.
Question words in English sound similar (who, why, where, what …), so much so that this class of words is often referred to as wh-words. This regularity exists in many languages, though the phonetic form of the similarity differs; for example:
eTa; eedi; ekkaDa
In her Master’s thesis, Anita suggested that these similarities help conversation flow smoothly. Turn taking in conversation is surprisingly swift, with the usual gap between turns being only 200ms. This is even more surprising when one considers that retrieving, planning and beginning to pronounce even a single word takes around 600ms. Therefore, speakers must begin planning what they will say at least 400ms before the current speaker has finished speaking (as demonstrated by many recent studies, e.g. Barthel et al., 2017). Starting your turn late can be interpreted as uncooperative, or lead to missing out on a chance to speak.
Perhaps the harshest environment for turn-taking is answering a content question. Responders must understand the question, retrieve the answer, plan their utterance and begin speaking. It makes sense to expect that cues would evolve to help responders recognise that a question is coming. Indeed there are many paralinguistic cues, such as rising intonation (even at the beginning of sentences) and eye gaze. Another obvious cue is question words, especially when they appear at the beginning of question sentences. Slonimska hypothesised that wh-words sound similar in order to provide an extra cue that a question is about to be asked, so that the speaker can begin preparing their turn early.
We tried to test this hypothesis, firstly by simply asking whether wh-words really do have a tendency to sound similar within languages. We combined several lexical databases to produce a word list for 1000 concepts in 226 languages, including question words. We found that question words are:
More similar within languages than between languages
More similar than other sets of words (e.g. pronouns)
Often composed of salient phonemes
Of course, there are several possible confounds, such as languages being historically related, and many wh-words being derived from other wh-words within a language. We attempted to control for these using stratified permutation, excluding analysable forms, and comparing wh-words to many other sets of words, such as pronouns, which are subject to the same processes. Not all languages have similar-sounding wh-words, but across the whole database the tendency was robust.
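The core permutation logic can be sketched in a few lines: compare the similarity of wh-words within languages to a baseline in which forms are shuffled across languages. The similarity measure here (difflib's sequence ratio) and the tiny word lists are illustrative stand-ins, not the paper's actual measure or data.

```python
import random
from difflib import SequenceMatcher
from itertools import combinations

# Toy word lists; the real study used 226 languages from lexical databases.
languages = {
    "English": ["who", "what", "where", "when", "why"],
    "French":  ["qui", "quoi", "ou", "quand", "pourquoi"],
    "German":  ["wer", "was", "wo", "wann", "warum"],
}

def mean_within_similarity(langs):
    """Average pairwise string similarity of forms within each language."""
    sims = []
    for words in langs.values():
        sims += [SequenceMatcher(None, a, b).ratio()
                 for a, b in combinations(words, 2)]
    return sum(sims) / len(sims)

observed = mean_within_similarity(languages)

# Permutation baseline: shuffle all forms across languages and re-measure.
rng = random.Random(1)
all_words = [w for ws in languages.values() for w in ws]
n_perm, count_higher = 1000, 0
for _ in range(n_perm):
    rng.shuffle(all_words)
    shuffled = {l: all_words[i * 5:(i + 1) * 5]
                for i, l in enumerate(languages)}
    if mean_within_similarity(shuffled) >= observed:
        count_higher += 1
p = count_higher / n_perm  # proportion of permutations at least as similar
print(observed, p)
```

The stratified permutation in the paper is more careful than this (shuffling within language families, for instance), but the idea is the same: the observed within-language similarity is compared against a distribution of shuffled baselines.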
Another prediction is that the wh-word cues should be more useful if they appear at the beginning of question sentences. We tested this using typological data on whether wh-words appear in initial position. While the trend was in the right direction, the result was not significant when controlling for historical and areal relationships.
Despite this, we hope that our study shows that it is possible to connect constraints from turn taking to macro-level patterns across languages, and then test the link using large corpora and custom methods.
Anita will be presenting an experimental approach to this question at this year’s CogSci conference. We show that /w,h/ is a good predictor of questions in real English conversations, and that people actually use /w,h/ to help predict that a question is coming up.
Slonimska, A., & Roberts, S. G. (2017). A case for systematic sound symbolism in pragmatics: Universals in wh-words. Journal of Pragmatics, 116, 1-20. Article. PDF.
A new paper by Monica Tamariz, myself, Isidro Martínez and Julio Santiago uses an iterated learning paradigm to investigate the emergence of iconicity in the lexicon. The languages were mappings between written forms and a set of shapes that varied in colour, outline and, importantly, how spiky or round they were.
We found that languages which begin with no iconic mapping develop a bouba-kiki relationship when the languages are used for communication between two participants, but not when they are just learned and reproduced. The measure of the iconicity of the words came from naive raters.
Here’s one of the languages at the end of a communication chain, and you can see that the labels for spiky shapes ‘sound’ more spiky:
These experiments were actually run way back in 2013, but as is often the case, the project lost momentum. Monica and I met last year to look at it again, and we did some new analyses. We worked out whether each new innovation that participants created increased or decreased iconicity. We found that new innovations are equally likely to result in higher or lower iconicity: mutation is random. However, in the communication condition, participants re-used more iconic forms: selection is biased. That fits with a number of other studies on iconicity, including Verhoef et al., 2015 (CogSci proceedings) and Blasi et al. (2017).
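A toy simulation makes the mutation/selection contrast above concrete: innovations change iconicity at random, but when the more iconic variant is preferentially re-used, iconicity ratchets upward. All numbers here are invented for illustration; this is not the paper's model.

```python
import random

def run(biased, steps=200, seed=1):
    """Simulate iconicity over generations of innovation.

    biased=True mimics the communication condition (the more iconic
    variant is re-used); biased=False mimics learning-only transmission
    (the innovation is kept regardless of its iconicity).
    """
    rng = random.Random(seed)
    iconicity = 0.0
    for _ in range(steps):
        variant = iconicity + rng.gauss(0, 0.1)  # unbiased innovation
        if biased:
            iconicity = max(iconicity, variant)  # biased selection
        else:
            iconicity = variant                  # random drift
    return iconicity

# Biased selection accumulates iconicity; unbiased copying just drifts.
print(run(biased=True), run(biased=False))
```

Even with identical random innovations, the biased condition ends far higher, which is the qualitative pattern the analysis of innovations found.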
Matthew Jones, Gabriella Vigliocco and colleagues have been working on similar experiments, though their results are slightly different. Jones presented this work at the recent symposium on iconicity in language and literature (you can read the abstract here), and will also present at this year’s CogSci conference, which I’m looking forward to reading:
Jones, M., Vinson, D., Clostre, N., Zhu, A. L., Santiago, J., Vigliocco, G. (forthcoming). The bouba effect: sound-shape iconicity in iterated and implicit learning. Proceedings of the 36th Annual Meeting of the Cognitive Science Society.
Our paper is quite short, so I won’t spend any more time on it here, apart from one other cool thing: for the final set of labels in each generation we measured iconicity using scores from naive raters, but for the analysis of innovations we had hundreds of extra forms. We used a random forest to predict iconicity ratings for the extra labels from unigrams and bigrams of the rated labels. It accounted for 89% of the variance in participant ratings on unseen data. This is a good improvement over older techniques, such as using the average iconicity of the individual letters in a label, since a random forest allows the weighting of particular letters to be estimated from the data, and also allows for non-linear effects when two letters are combined.
However, it turns out that most of the prediction is done by a simple decision tree with just 3 unigram variables: labels were rated as more spiky if they contained a ‘k’, ‘j’ or ‘z’ (our experiment was run in Spanish):
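Written out by hand, the tree's logic is essentially the rule below. This is a paraphrase for illustration: the real tree was learned from rater data and predicted numeric iconicity ratings, not categorical labels.

```python
def predict_spikiness(label):
    """Crude spiky/round judgement from the presence of salient unigrams."""
    if "k" in label or "j" in label or "z" in label:
        return "spiky"
    return "round"

print(predict_spikiness("kizo"))  # -> spiky
print(predict_spikiness("bamo"))  # -> round
```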
So the method was a bit overkill in this case, but might be useful for future studies.
All data and code for doing the analyses and random forest prediction is available in the supporting information of the paper, or in this github repository.
One of the fundamental principles of linguistics is that speakers who are separated in time or space will start to sound different, while speakers who interact with each other will start to sound similar. Historical linguists have traced the diversification of languages using objective linguistic measurements, but so far there has never been a widespread test of whether languages that are further away on a family tree, or more physically distant from each other, actually sound different to human listeners.
An opportunity arose to test this in the form of The Great Language Game: a web-based game where players listen to a clip of someone talking and have to guess which language is being spoken. It was played by nearly one million people from 80 countries, and so is, as far as we know, the biggest linguistic experiment ever. Actually, this is probably my favourite table I’ve ever published (note the last row):
[Table: number of guesses by continent of IP-address]
We calculated the probability of confusing any of the 78 languages in the Great Language Game for any of the others (excluding guesses about a language if it was an official language of the country the player was in). Players were good at this game – on average getting 70% of guesses correct.
Using partial Mantel tests, we found that languages are more likely to be confused if they are:
Geographically close to each other;
Similar in their phoneme inventories;
Similar in their lexicon;
Closely related historically (but this effect disappears when controlling for geographic proximity).
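The underlying computation is simple: given records of which language was played and which was guessed, confusion probabilities are just conditional relative frequencies. The toy records below are invented; the real analysis also excluded guesses about a language that was official in the player's country.

```python
from collections import Counter, defaultdict

# Toy (true language, guessed language) records; the real data had
# millions of guesses over 78 languages.
guesses = [
    ("Swedish", "Norwegian"), ("Swedish", "Swedish"), ("Swedish", "Norwegian"),
    ("Norwegian", "Swedish"), ("Norwegian", "Norwegian"),
    ("Japanese", "Japanese"), ("Japanese", "Korean"),
]

counts = defaultdict(Counter)
for true_lang, guessed in guesses:
    counts[true_lang][guessed] += 1

def p_confusion(true_lang, guessed):
    """P(player guesses `guessed` | the clip is `true_lang`)."""
    total = sum(counts[true_lang].values())
    return counts[true_lang][guessed] / total

print(p_confusion("Swedish", "Norwegian"))  # -> 0.666...
```

The inverse of these probabilities can then serve as a perceptual distance between languages, which is what feeds into the neighbour net below.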
We also used random forest analyses to show that a language is more likely to be guessed correctly if it is often mentioned in literature, is the main language of an economically powerful country, is spoken by many people, or is spoken in many countries.
We visualised the perceptual similarity of languages by using the inverse probability of confusion to create a neighbour net:
This diagram shows a kind of subway map for the way languages sound. The shortest route between two languages indicates how often they are confused for one another – so Swedish and Norwegian sound similar, but Italian and Japanese sound very different. The further you have to travel, the more different two languages sound. So French and German are far away from many languages, since these were the best-guessed in the corpus.
The labels we’ve given to some of the clusters are descriptive, rather than being official terms that linguists use. The first striking pattern is that some languages are more closely connected than others, for example the Slavic languages are all grouped together, indicating that people have a hard time distinguishing between them. Some of the other groups are more based on geographic area, such as the ‘Dravidian’ or ‘African’ cluster. The ‘North Sea’ cluster is interesting: it includes Welsh, Scottish Gaelic, Dutch, Danish, Swedish, Norwegian and Icelandic. These diverged from each other a long time ago in the Indo-European family tree, but have had more recent contact due to trade and invasion across the North Sea.
The whole graph splits between ‘Western’ and ‘Eastern’ languages (we refer to the political/cultural divide rather than any linguistic classification). This probably reflects the fact that most players were Western, or at least could probably read the English website. That would certainly explain the linguistically confused “East Asian” cluster. There are also a lot of interconnected lines, which indicates that some languages are confused for multiple groups, for example Turkish is placed halfway between “West” and “East” languages.
It was also possible to create neighbour nets for responses from specific parts of the world. While the general pattern is similar, there are also some interesting differences. For example, respondents from North America were quite likely to confuse Yiddish and Hebrew. These come from different language families, but are spoken by mainly Jewish populations, and this may form part of players’ cultural knowledge of these languages.
In contrast, players from Africa placed Hebrew with the other Afro-Asiatic languages.
Results like this suggest that perception may be shaped by our linguistic history and cultural knowledge.
We also did some preliminary analyses of the phoneme inventories of languages, using binary decision trees to explore which sounds make a language distinctive; they identified some rare and salient features as critical cues to distinctiveness.
The analyses were complicated because we knew little about the individual players beyond the country of their IP address. However, Hedvig and I, together with a team from the Language in Interaction consortium (Mark Dingemanse, Pashiera Barkhuysen and Peter Withers), have created a version of the game called LingQuest that does collect players’ linguistic backgrounds. It also asks participants to compare sound files directly, rather than use written labels.