We’ve had responses from about 70 people, and we have some results. Some are summarised on the live results page.
Astoundingly, people actually managed to get 9 numbers shown for only 210 ms! Replicated Typo’s very own James Winters was one of those mavericks, but puts it down to luck.
There were some early leaders, but in the last few hours, the player known as ‘mjb’ has really kicked everybody’s ass and got to the top of all three leaderboards. Who are you, magic human? Let us know!
First, here’s the data from Inoue & Matsuzawa (2007), for 5 numerals shown over a range of latencies (left) compared to our findings for the QHImp Qhallenge (right):
The program aimed to show numerals for 210ms, 300ms, 400ms, 600ms and 1 second. However, the latencies in our data are variable because of differences in online processing speeds. The data we’re using is the actual time difference between displaying the numerals and masking them. This data is binned into “less than 220ms”, “less than 350ms” etc. The y axis shows the number of buttons pressed in the correct order. I’ve put 99% confidence intervals on the means.
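For anyone who wants to try the same kind of summary on their own data, here is a rough sketch of the binning and the confidence intervals. It is not the script we actually used: only the 220ms and 350ms edges come from the description above, the remaining edges are assumed, the function names are made up for illustration, and the interval uses a plain normal approximation.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import NormalDist, mean, stdev

# Upper bin edges in ms. 220 and 350 are from the post; 500 and 800 are assumed.
BIN_EDGES_MS = [220, 350, 500, 800]
BIN_LABELS = ["<220", "<350", "<500", "<800", ">=800"]


def bin_label(latency_ms):
    """Return the label of the bin that a measured latency falls into."""
    return BIN_LABELS[bisect_right(BIN_EDGES_MS, latency_ms)]


def mean_with_ci(scores, confidence=0.99):
    """Return (mean, half-width) of a normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(scores) / len(scores) ** 0.5
    return mean(scores), half_width


def summarise(trials):
    """trials: iterable of (measured_latency_ms, buttons_correct) pairs."""
    by_bin = defaultdict(list)
    for latency_ms, n_correct in trials:
        by_bin[bin_label(latency_ms)].append(n_correct)
    # A bin needs at least two trials for a standard deviation to exist.
    return {label: mean_with_ci(s) for label, s in by_bin.items() if len(s) > 1}
```

Binning on the measured latency rather than the intended one means a trial that was meant to last 210ms but actually lasted 240ms lands in the “less than 350ms” bin.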
While the latency clearly affects accuracy, the QHImp players appear to be scoring much higher than the humans in Inoue & Matsuzawa (2007), and comparably to Ayumu. There are some differences in the methodology, but in the graph above we haven’t even looked at just the best players, or players’ responses after a certain amount of training – this is the entire raw data. I’ll have to look back over this to make sure I’m not including dodgy data etc.
Here are the results for ‘Chimp mode’ where 9 numbers were displayed at around 210ms. Humans do pretty poorly, but some actually manage 9 numbers! I’ll have to think more about how to quantify the probability of having 4 trials guessing 9 numbers in a row…
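One possible starting point, under the strong assumption that a player is guessing completely at random: there are 9! = 362,880 possible orders for 9 buttons, so a single perfect trial has a chance of 1 in 362,880. The sketch below (function names are hypothetical) turns that into the binomial probability of getting at least some number of perfect trials out of a given number of attempts. Real players are obviously not guessing blindly, so this is only a baseline.

```python
from math import exp, factorial, lgamma, log, log1p


def p_perfect_by_chance(n_numerals=9):
    """Chance of pressing n distinct buttons in the right order by pure guessing."""
    return 1 / factorial(n_numerals)


def p_at_least(k, n_trials, p):
    """Binomial probability of k or more successes in n_trials independent trials.

    Terms are computed in log space so that large n_trials do not overflow.
    Assumes 0 < p < 1.
    """
    total = 0.0
    for i in range(k, n_trials + 1):
        log_choose = lgamma(n_trials + 1) - lgamma(i + 1) - lgamma(n_trials - i + 1)
        total += exp(log_choose + i * log(p) + (n_trials - i) * log1p(-p))
    return total


# Example: chance of 4 or more perfect 9-numeral trials in a made-up 1000 attempts.
p_four_or_more = p_at_least(4, 1000, p_perfect_by_chance())
```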
Astoundingly, one player played over 1200 trials in chimp mode – 9 numerals at 210 ms. Here’s a graph of their performance over time:
So, is this player improving? A Spearman rank-order correlation was marginally significant (rho = 0.05, p = 0.068), and a Page’s L trend test appeared to be significant, but I’ve not used this statistic before (L = 6108, z-score = 104, p < 0.001). It’s marginal, but this player might be improving. The data from the live page looks like it has a positive slope.
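For the curious, here is a minimal sketch of how a Spearman coefficient for "score against trial order" can be computed by hand. It is not the code behind the numbers above, the function names are made up, and it only gives rho, not the p-value. The idea is to rank both variables (tied scores share the average of their ranks) and then take an ordinary Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Rank values starting from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for j in range(start, end + 1):
            ranks[order[j]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(scores):
    """Spearman correlation between trial order (1, 2, 3, ...) and score."""
    trial_index = list(range(1, len(scores) + 1))
    return pearson(average_ranks(trial_index), average_ranks(scores))
```

A positive rho means later trials tend to have higher scores. With scores that only run from 0 to 9 there will be a lot of ties, which is why the tie handling matters here.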
We’re continuing to tune the program, and have implemented a version that uses shades of colour instead of numbers – we’ll release this when we get the chance. In the meantime, you can keep playing the game here and we’ll continue to collect data.