The Color Game: Challenges for App projects

Over at ICCI there are a couple of blog posts by Olivier Morin about a project I’m involved in, the Color Game. The first post provides an introduction to the app and how it will contribute to research on language and communication. And, as I mentioned on Twitter, the second blog post highlights one of the Color Game’s distinct advantages over traditional experiments:

An ambitious project

What I want to briefly mention is that the Color Game is an extremely ambitious project that marks the culmination of two years’ worth of work. A major challenge from a scientific perspective has been to design multiple projects that get the most out of the potential data. Experiments are normally laser-focused on meticulously testing a narrow set of predictions. This is quite rightly viewed as a positive quality, and it is why well-designed experiments are far better suited for discerning mechanistic and causal explanations than other research methods. But I think the Color Game does make some headway in addressing long-standing constraints:

  • Limitations in sample size and representation.
  • Technical challenges of scaling up complex methods.
  • Underlying motivation for participation.

Sample size and representation

Discussions about the limitations of experiments, in terms of both sample size and the populations those samples represent, are abundant. Such issues are particularly prevalent in the ongoing replication and reproducibility crisis. Looking at just the first week of data for the Color Game, there are already over 1,000 players from a wide variety of countries:

Color Game players from around the world. Darker, redder colours indicate more concentrated regions of players. From: http://cognitionandculture.net/blog/color-game/the-color-games-world

By contrast, many psychological experiments will be lucky to get an n of 100, and this number is often determined on the basis of reaching sufficient statistical power for the analyses (cautionary note: a large sample size can also be the source of big inferential errors). It is also the case that standard psychology populations are distinctly WEIRD. Apps can help connect researchers with populations that are normally inaccessible, especially given the proliferation of mobile phones.
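To make the power point concrete, here is a minimal sketch of the kind of calculation behind those sample sizes, using the standard normal approximation for a two-sided, two-sample comparison (the function name and defaults are my own illustration, not any particular study’s procedure):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided, two-sample comparison,
    via the normal approximation to the t-test (it slightly
    underestimates the exact t-based answer)."""
    z = NormalDist().inv_cdf
    n = 2 * ((z(1 - alpha / 2) + z(power)) / effect_size) ** 2
    return ceil(n)

# A "medium" effect (d = 0.5) already calls for roughly 64 participants
# per group, and a small effect (d = 0.2) for several hundred:
print(n_per_group(0.5), n_per_group(0.2))
```

With an app pulling in over a thousand players in a week, even small-effect analyses become feasible, which is precisely where lab samples of n ≈ 100 fall short.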

Technical challenges

The Color Game’s larger and more diverse sample leads to my second point: that scaling up complex methods is both costly and technically challenging. Even though web experiments are booming, and this can mitigate the downside of having a small n, they are often extremely simple and restricted. Prioritising simplicity is fine if it is premised on scientific principles, but there is also the temptation to make design choices for reasons of expediency.

So, to give one example: if you want participants to complete your experiment, then making the experiment shorter (by restricting the number of trials and/or the time it takes to complete a trial) increases the probability that they finish. It can also lead to methodological decisions that make the task technically easier. All else being equal, it is simpler to create a pseudo-communicative task (where participants are told they are communicating with someone, even though they aren’t) than it is to create an actual communicative task. The same goes for using feedback over repair mechanisms.

All experiments face these problems. But, anecdotally, they seem to be acutely problematic for web-based experiments. Just to be clear: I’m not making a judgement about whether or not a study suffered from making a particular methodological choice. The point is simply that these design choices should (where possible) put scientific consequences above technical and practical expediency. My worry is that when scientific considerations are not prioritised, you lose too much in terms of generalisability to real-world phenomena. And, even when this is not the case and the experiment is justifiably simple, I wouldn’t be surprised to find that this creates a bias in the types of web experiments performed. In short, there’s the possibility that web-based experiments systematically underutilise certain methodological designs, leading to a situation where web experiments occupy and explore a much narrower region of the design space.

I hope that the Color Game makes some small steps towards avoiding this pitfall. For instance, we incorporated features not often found in other web-based communication game experiments, such as the ability to communicate synchronously or asynchronously, and for participants to engage in simple repair mechanisms instead of receiving feedback. Players are also free to choose who they want to play with in the forum, giving a much more naturalistic flavour to the interaction dynamics. This allows for self-organisation, and it’ll be interesting to see what role (if any) the emergent population structure plays in structuring the languages. App games therefore offer a promising avenue for retaining the technically complex features of traditional lab experiments whilst profiting from the larger sample sizes of web experiments.

Having a more complex set-up also allowed us to pre-register six projects that aim to answer distinct questions about the emergence and evolution of communication systems. Achieving a similar goal with other methods would be far more costly in terms of time and money. But there are downsides. One is that the changes and requirements imposed by a single project can impact the scope and design of all the other projects. Imagine you have a project which requires that the population size parameter is manipulated (FYI, this is not a Color Game project): every other project now needs to control for this fact, be it through methodological choices (e.g., you only sample populations with a given number of players) or in the statistical analyses.

In some sense, this reintroduces the complexity of the real world back into the app, both in terms of its upsides and its downsides. Suffice to say, we tried to minimise these conflicts as much as possible, but in some cases they were simply unavoidable. Also, even if there are cases where this introduces unforeseen consequences for the reliability of our results, we can always follow up on our findings with more traditional lab experiments and computer models.

Underlying motivation

Assuming I haven’t managed to annoy anyone who isn’t using app-based experiments, I’ve saved my most controversial point for last. It’s a hard sell, and I’m not even sure I fully buy it, but I think the underlying motivation for playing apps is very different from the motivation for participating in a standard experiment. At the task level, the Color Game is not too dissimilar from other experiments: you are motivated to continue playing via points, and to get points in the first place you need to communicate successfully. Where it differs is in why people participate in the first place. In short, the Color Game is different because people principally play it for entertainment (or, at least, that’s what I keep telling myself). Although lab-based experiments are often fun, this normally stands as an ancillary concern that’s not considered crucial to the scientific merits of a study.

Undergraduate experiments are (in)famously built on rewards of cookies and cohort obligations, and it is fair to say that most lab experiments incentivise participation via monetary remuneration (although this might not be the only reason why someone participates). Yet humans engage in all sorts of behaviours for endogenous rewards, and app games are really nice examples of such behaviour. People are free to download the game (or not), they can play as little or as much as they please, and, as I’ve already mentioned, there is freedom in their choice of interaction partners. Similarly, in the real world, people have flexibility in when and why they engage in communicative behaviour, with monetary gain being just a small subset of those reasons (e.g., a large part of why you don’t have to go far to find a motivational speaker is because they earn money for public lectures and other speaking events).

If you’re interested, and want to see what all the fuss is about, feel free to download the app (available on Android and iOS):


What’s in a Name? – “Digital Humanities” [#DH] and “Computational Linguistics”

In thinking about the recent LARB critique of digital humanities, and the responses to it, I couldn’t help but think, once again, about the term itself: “digital humanities.” One criticism is simply that Allington, Brouillette, and Golumbia (ABG) had a circumscribed conception of DH that left too much out of account. But then the term has such a diverse range of reference that discussing DH in a way that is both coherent and compact is all but impossible. Moreover, that diffuseness has led some people in the field to distance themselves from the term.

And so I found my way to some articles that Matthew Kirschenbaum has written more or less about the term itself. But I also found myself thinking about another term, one considerably older: “computational linguistics.” While it has not been problematic in the way DH is proving to be, it was coined under the pressure of practical circumstances and the discipline it names has changed out from under it. Both terms, of course, must grapple with the complex intrusion of computing machines into our life ways.

Digital Humanities

Let’s begin with Kirschenbaum’s “Digital Humanities as/Is a Tactical Term” from Debates in the Digital Humanities (2011):

To assert that digital humanities is a “tactical” coinage is not simply to indulge in neopragmatic relativism. Rather, it is to insist on the reality of circumstances in which it is unabashedly deployed to get things done—“things” that might include getting a faculty line or funding a staff position, establishing a curriculum, revamping a lab, or launching a center. At a moment when the academy in general and the humanities in particular are the objects of massive and wrenching changes, digital humanities emerges as a rare vector for jujitsu, simultaneously serving to position the humanities at the very forefront of certain value-laden agendas—entrepreneurship, openness and public engagement, future-oriented thinking, collaboration, interdisciplinarity, big data, industry tie-ins, and distance or distributed education—while at the same time allowing for various forms of intrainstitutional mobility as new courses are approved, new colleagues are hired, new resources are allotted, and old resources are reallocated.

Just so, the way of the world.

Kirschenbaum then goes into the weeds of discussions that took place at the University of Virginia while a bunch of scholars were trying to form a discipline. So:

A tactically aware reading of the foregoing would note that tension had clearly centered on the gerund “computing” and its service connotations (and we might note that a verb functioning as a noun occupies a service posture even as a part of speech). “Media,” as a proper noun, enters the deliberations of the group already backed by the disciplinary machinery of “media studies” (also the name of the then new program at Virginia in which the curriculum would eventually be housed) and thus seems to offer a safer landing place. In addition, there is the implicit shift in emphasis from computing as numeric calculation to media and the representational spaces they inhabit—a move also compatible with the introduction of “knowledge representation” into the terms under discussion.

How we then get from “digital media” to “digital humanities” is an open question. There is no discussion of the lexical shift in the materials available online for the 2001–2 seminar, which is simply titled, ex cathedra, “Digital Humanities Curriculum Seminar.” The key substitution—“humanities” for “media”—seems straightforward enough, on the one hand serving to topically define the scope of the endeavor while also producing a novel construction to rescue it from the flats of the generic phrase “digital media.” And it preserves, by chiasmus, one half of the former appellation, though “humanities” is now simply a noun modified by an adjective.

And there we have it. Continue reading “What’s in a Name? – “Digital Humanities” [#DH] and “Computational Linguistics””

Chomsky, Hockett, Behaviorism and Statistics in Linguistics Theory

Here’s an interesting (and recent) article that speaks to statistical thought in linguistics: The Unmaking of a Modern Synthesis: Noam Chomsky, Charles Hockett, and the Politics of Behaviorism, 1955–1965 (Isis, vol. 107, #1, pp. 49–73, 2016), by Gregory Radick (abstract below). Commenting on it at Dan Everett’s FB page, Yorick Wilks observed: “It is a nice irony that statistical grammars, in the spirit of Hockett at least, have turned out to be the only ones that do effective parsing of sentences by computer.”

Abstract: A familiar story about mid-twentieth-century American psychology tells of the abandonment of behaviorism for cognitive science. Between these two, however, lay a scientific borderland, muddy and much traveled. This essay relocates the origins of the Chomskyan program in linguistics there. Following his introduction of transformational generative grammar, Noam Chomsky (b. 1928) mounted a highly publicized attack on behaviorist psychology. Yet when he first developed that approach to grammar, he was a defender of behaviorism. His antibehaviorism emerged only in the course of what became a systematic repudiation of the work of the Cornell linguist C. F. Hockett (1916–2000). In the name of the positivist Unity of Science movement, Hockett had synthesized an approach to grammar based on statistical communication theory; a behaviorist view of language acquisition in children as a process of association and analogy; and an interest in uncovering the Darwinian origins of language. In criticizing Hockett on grammar, Chomsky came to engage gradually and critically with the whole Hockettian synthesis. Situating Chomsky thus within his own disciplinary matrix suggests lessons for students of disciplinary politics generally and—famously with Chomsky—the place of political discipline within a scientific life.

The evolution of phonetic capabilities: causes, constraints and consequences

At next year’s International Congress of Phonetic Sciences in Glasgow there will be a special interest group on the evolution of our phonetic capabilities. It will focus on the interaction between biological and cultural evolution, and encourages work from different modalities too. The call for papers is below:

In recent years, there has been a resurgence in research in the evolution of language and speech. New techniques in computational and mathematical modelling, experimental paradigms, brain and vocal tract imaging, corpus analysis and animal studies, as well as new archeological evidence, have allowed us to address questions relevant to the evolution of our phonetic capabilities.

This workshop invites contributions from researchers that address the emergence of our phonetic capabilities. We are interested in empirical evidence from models and experiments which explore the evolutionary pressures causing the emergence of our phonetic capabilities, both in biological and cultural evolution, and the consequences biological constraints will have on processes of cultural evolution and vice versa. Contributions are welcome to cover not only the evolution of our physical ability to produce structured signals in different modalities, but also cognitive or functional processes that have a bearing on the emergence of phonemic inventories. We are also interested in contributions which look at the interaction between these two areas, which are often dealt with separately in the field: that is, the interaction between physical constraints imposed by a linguistic modality and cognitive constraints born from learning biases and functional factors, and the consequences this interaction will have on emerging linguistic systems and inventories.

Contributions must fit the submission requirements on the main ICPhS 2015 call for papers page.

Contributions can be sent as an attachment to hannah@ai.vub.ac.be by 16th February 2015.

The deadline is obviously quite far away, but feel free to use the same email address above to ask any questions about suitability of possible submissions or anything else.

The History of Modern Linguistic Theory: Seuren on Chomsky

For those interested in the history of modern linguistic theory, Noam Chomsky is a major figure, though one whose influence is rapidly waning. I recommend the recent series of blog posts by Pieter Seuren. I quoted from his first post in Chomsky’s Linguistics, a Passing Fancy?, but you can go directly to Seuren’s first post: Chomsky in Retrospect – 1. What’s particularly interesting to me at this moment is that Chomsky had been associated with machine translation:
While at Harvard during the early 1950s, and later at the MIT department of machine translation, he engaged—as an amateur—in some intensive mathematical work regarding the formal properties of natural language grammars, whereby the notion that a natural language should be seen as a recursively definable infinite set of sentences took a central position. One notable and impressive result of this work was the so-called Chomsky hierarchy of algorithmic grammars, original work indeed, as far as we know, but which has now, unfortunately, lost all relevance.
I of course have known about the Chomsky hierarchy for decades, but hadn’t realized that Chomsky was that close to people actually working in computational linguistics. For Chomsky computation obviously was a purely abstract activity. Real computation, computation that starts somewhere, goes through a succession of states, and then produces a result, that is NEVER an abstract activity. It may be arcane, complex, and almost impossible to follow, but it is always a PHYSICAL process taking place in time and consuming resources (memory space and energy).
That sense of the physical is completely missing in the Chomsky tradition, and in its offshoots – I’m thinking particularly of the Lakoff line of work on embodied cognition. There is embodiment and there is embodiment. It is one thing to assert that the meanings of words and phrases are to be found in human perception and action, which is what Lakoff asserts, and quite something else to figure out how to get a physical device – whether a bunch of cranks and gears, a huge vacuum-tube-based electrical contraption, a modern silicon-based digital computer, or an animal brain – to undertake computation.
But that’s a distraction from the main object of this note, which is to list the further posts in Seuren’s Chomsky retrospect.

* * * * *

Continue reading “The History of Modern Linguistic Theory: Seuren on Chomsky”

Five pre-doctoral and two post-doctoral fellowships in the evolution of shared semantics in computational environments

Some readers may find the following of interest:

The ESSENCE (Evolution of Shared SEmaNtics in Computational Environments, www.essence-network.eu) Marie Curie Initial Training Network is offering five Early-Stage Researcher (pre-doctoral) and two Experienced Researcher (post-doctoral) positions, to start in February 2014. The application deadline for these posts is 15th December 2013.

This is a rare opportunity to be involved in a highly prestigious European training network for outstanding applicants in an emergent and important research area, led by internationally leading groups in their fields!

ESSENCE conducts research and provides research training in various aspects of translating human capabilities for negotiating meaning to open computational environments such as the web, multi-robot systems, and sensor networks. The network will support 15 pre- and post-doctoral fellows who will work toward a set of different research projects within this overall theme, ranging from symbol grounding and ontological reasoning to game-theoretic models of communication and crowdsourcing.

ESSENCE involves a top-quality consortium of internationally leading research institutions which will act as hosts for the following projects in the current recruitment round:

Early-Stage Researchers (36 months):
– Communication Planning (CISA, Informatics, The University of Edinburgh, UK)
– Concept Convergence: Argumentation and Agreement over Meaning (IIIA-CSIC, Barcelona, Spain)
– The Social Construction of Conceptual Space (ILLC, Universiteit van Amsterdam, The Netherlands)
– Sociolinguistics and Network Games (ILLC, Universiteit van Amsterdam, The Netherlands)
– Open-ended Robot Interaction (AI Lab, Vrije Universiteit Brussel, Belgium)

Early-Stage Researchers must, at the time of recruitment by the host organisation, be in the first 4 years (full-time equivalent research experience) of their research careers, and not yet have a doctoral degree.

Experienced Researchers (24 months):
– The ESSENCE Platform: Architecture (CISA, Informatics, The University of Edinburgh, UK)
– The ESSENCE Challenge (Information Engineering and Computer Science, Università degli Studi di Trento, Italy)

Experienced Researchers must (at the time of recruitment by the host organisation) be in possession of a doctoral degree, or have at least four years of full-time equivalent research experience, and have less than five years of full-time equivalent research experience (including time spent on doctoral research).

For both categories, research experience is measured from the date when they obtained the degree which formally entitled them to embark on a doctorate.

All positions are very competitively remunerated (significantly above the respective average national salaries/studentships for pre- and post-doctoral positions) and aimed at outstanding candidates. Please consult the individual descriptions of projects at http://www.essence-network.eu/hiring for detailed salary information.

Researchers can be of any nationality, though at the time of recruitment by the host organisation, researchers must not have resided or carried out their main activity (work, studies, etc) in the country of their host organisation for more than 12 months in the 3 years immediately prior to the reference date. (Short stays such as holidays and/or compulsory national service are not taken into account.)

The ESSENCE network aims to attract 40% female recruited researchers. Female applicants are explicitly encouraged to apply and are treated preferentially whenever they are as qualified as male candidates. The ESSENCE network will encourage flexible working hours at each host institution and/or the opportunity to work part-time from home if necessary. ESSENCE will provide specific support for female researchers in terms of targeted training events and dedicated mentoring.

All applicants are asked to pre-apply at http://www.essence-network.eu/hiring. Please contact Dr Michael Rovatsos (mrovatso@inf.ed.ac.uk) for informal enquiries.

Evolution in a Changing Environment

Following on from the Baronchelli et al. paper a couple of months ago, PLOS ONE has published “Evolution in a Changing Environment” by the same authors. Both papers argue that if language is rapidly changing (and it is), then generalist, neutral genes, rather than specialist ones, are advantageous. This suggests that language is more likely the result of general cognitive abilities, since language change happens so rapidly. In contrast to the last paper, though, this one focuses much less on (specifically) linguistic change, and features a super sexy stochastic interacting particle model (if you’re into that sort of thing).

Abstract:

We propose a simple model for genetic adaptation to a changing environment, describing a fitness landscape characterized by two maxima. One is associated with “specialist” individuals that are adapted to the environment; this maximum moves over time as the environment changes. The other maximum is static, and represents “generalist” individuals not affected by environmental changes. The rest of the landscape is occupied by “maladapted” individuals. Our analysis considers the evolution of these three subpopulations. Our main result is that, in the presence of a sufficiently stable environmental feature, as in the case of an unchanging aspect of a physical habitat, specialists can dominate the population. By contrast, rapidly changing environmental features, such as language or cultural habits, are a moving target for the genes; here, generalists dominate, because the best evolutionary strategy is to adopt neutral alleles not specialized for any specific environment. The model we propose is based on simple assumptions about evolutionary dynamics and describes all possible scenarios in a non-trivial phase diagram. The approach provides a general framework to address such fundamental issues as the Baldwin effect, the biological basis for language, or the ecological consequences of a rapid climate change.

Baronchelli A, Chater N, Christiansen MH, Pastor-Satorras R (2013) Evolution in a Changing Environment. PLoS ONE 8(1): e52742. doi:10.1371/journal.pone.0052742
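The paper’s actual model is a stochastic interacting particle system, but its central intuition can be caricatured in a few lines of mean-field code (parameter names and values below are my own illustration, not the paper’s): specialists have the highest fitness but lose a fraction v of their offspring to the maladapted class each generation as the environment moves, while generalists are untouched by environmental change.

```python
def evolve(v, w_s=2.0, w_g=1.5, w_m=1.0, generations=200):
    """Iterate population fractions of specialists (s), generalists (g)
    and maladapted individuals (m). v is the rate of environmental
    change: the fraction of specialist offspring left behind each
    generation because the fitness optimum has moved."""
    s, g, m = 1 / 3, 1 / 3, 1 / 3
    for _ in range(generations):
        s_new = s * w_s * (1 - v)      # specialists still tracking the optimum
        g_new = g * w_g                # generalists: fitness independent of v
        m_new = m * w_m + s * w_s * v  # specialists the environment outran
        total = s_new + g_new + m_new
        s, g, m = s_new / total, g_new / total, m_new / total
    return s, g, m

print(evolve(0.05))  # slow change: specialists dominate
print(evolve(0.5))   # fast change (language-like): generalists dominate
```

Even this toy version reproduces the qualitative result: a slowly moving target rewards specialisation, while a rapidly moving target like language makes the generalist strategy the winner.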


Corpus Linguistics, Literary Studies, and Description

One of my main hobbyhorses these days is description. Literary studies has to get a lot more sophisticated about description, which is mostly taken for granted and so is not done very rigorously. There isn’t even a sense that there’s something there to be rigorous about. Perhaps corpus linguistics is a way to open up that conversation.

The crucial insight is this: What makes a statement descriptive IS NOT how one arrives at it, but the role it plays in the larger intellectual enterprise.

A Little Background Music

Back in the 1950s there was this notion that the process of aesthetic criticism took the form of a pipeline that started with description, moved on to analysis, then interpretation and finally evaluation. Academic literary practice simply dropped evaluation altogether and concentrated its efforts on interpretation. There were attempts to side-step the difficulties of interpretation by asserting that one is simply describing what’s there. To this Stanley Fish has replied (“What Makes an Interpretation Acceptable?” in Is There a Text in This Class?, Harvard 1980, p. 353):


The basic gesture, then, is to disavow interpretation in favor of simply presenting the text: but it actually is a gesture in which one set of interpretive principles is replaced by another that happens to claim for itself the virtue of not being an interpretation at all.


And that takes care of that.

Except that it doesn’t. Fish is correct in asserting that there’s no such thing as a theory-free description. Literary texts are rich and complicated objects. When the critic picks this or that feature for discussion, those choices are made with something in mind. They aren’t innocent.

But, as Michael Bérubé has pointed out in “There is Nothing Inside the Text, or, Why No One’s Heard of Wolfgang Iser” (in Gary Olson and Lynn Worsham, eds., Postmodern Sophistries, SUNY Press 2004, pp. 11-26), there is interpretation and there is interpretation, and they’re not alike. The process by which the mind’s eye makes out letters and punctuation marks from ink smudges is interpretive, for example, but it’s rather different from throwing Marx and Freud at a text and coming up with meaning.

Thus I take it that the existence of some interpretive component to any description need not imply that it is impossible to descriptively carve literary texts at their joints. And that’s one of the things that I want from description: to carve texts at their joints.

Of course, one has to know how to do that. And THAT, it would seem, is far from obvious.

Literary History, the Future: Kemp Malone, Corpus Linguistics, Digital Archaeology, and Cultural Evolution

In scientific prognostication we have a condition analogous to a fact of archery—the farther back you draw your longbow, the farther ahead you can shoot.
– Buckminster Fuller

The following remarks are rather speculative in nature, as many of my remarks tend to be. I’m sketching large conclusions on the basis of only a few anecdotes. But those conclusions aren’t really conclusions at all, not in the sense that they are based on arguments presented prior to them. I’ve been thinking about cultural evolution for years, and about the need to apply sophisticated statistical techniques to large bodies of text—really, all the texts we can get, in all languages—by way of investigating cultural evolution.

So it is no surprise that this post arrives at cultural evolution and concludes with remarks on how the human sciences will have to change their institutional ways to support that kind of research. Conceptually, I was there years ago. But now we have a younger generation of scholars who are going down this path, and it is by no means obvious that the profession is ready to support them. Sure, funding is there for “digital humanities”, and so deans and department chairs can get funding and score points for successful hires. But you can’t build a profound and new intellectual enterprise on financially-driven institutional gamesmanship alone.

You need a vision, and though I’d like to be proved wrong, I don’t see that vision, certainly not on the web. That’s why I’m writing this post. Consider it a sequel to an article I published back in 1976 with my teacher and mentor, David Hays: Computational Linguistics and the Humanist. This post presupposes the conceptual framework of that vision, but does not restate or endorse its specific recommendations (given in the form of a hypothetical program for simulating the “reading” of texts).

The world has changed since then and in ways neither Hays nor I anticipated. This post reflects those changes and takes as its starting point a recent web discussion about recovering the history of literary studies by using the largely statistical techniques of corpus linguistics in a kind of digital archaeology. But like Tristram Shandy, I approach that starting point indirectly, by way of a digression.

Who’s Kemp Malone?

Back in the ancient days when I was still an undergraduate, and we tied an onion in our belts as was the style at the time, I was at an English Department function at Johns Hopkins and someone pointed to an old man and said, in hushed tones, “that’s Kemp Malone.” Who is Kemp Malone? I thought. From his Wikipedia bio:

Born in an academic family, Kemp Malone graduated from Emory College as it then was in 1907, with the ambition of mastering all the languages that impinged upon the development of Middle English. He spent several years in Germany, Denmark and Iceland. When World War I broke out he served two years in the United States Army and was discharged with the rank of Captain.

Malone served as President of the Modern Language Association, and other philological associations … and was etymology editor of the American College Dictionary, 1947.

Who’d have thought the Modern Language Association was a philological association? Continue reading “Literary History, the Future: Kemp Malone, Corpus Linguistics, Digital Archaeology, and Cultural Evolution”

3rd Linguistic Conference for Doctoral Students: Interdisciplinary Perspectives on Language, Discourse, and Culture

Here’s a link to another conference that might be of interest:

The 3rd Linguistic Conference for Doctoral Students will take place at Heidelberg University, Germany, from 5–6 April 2013. The overarching topic of the conference will be: “Interdisciplinary Perspectives on Language, Discourse, and Culture.” The deadline for submissions is 15 February.

I’ve included the Call for Papers below (The Call for Papers can also be downloaded here):

Continue reading “3rd Linguistic Conference for Doctoral Students: Interdisciplinary Perspectives on Language, Discourse, and Culture”