Underwood and Sellers 2015: Cosmic Background Radiation, an Aesthetic Realm, and the Direction of 19thC Poetic Diction

I’ve read and been thinking about Underwood and Sellers 2015, How Quickly Do Literary Standards Change?, both the blog post and the working paper. I’ve got a good many thoughts about their work and its relation to the superficially quite different work that Matt Jockers did on influence in chapter nine of Macroanalysis. I am, however, somewhat reluctant to embark on what might become another series of long-form posts, which I’m likely to need in order to sort out the intuitions and half-thoughts that are buzzing about in my mind.

What to do?

I figure that at the least I can just get it out there, quick and crude, without a lot of explanation. Think of it as a mark in the sand. More detailed explanations and explorations can come later.

19th Century Literary Culture has a Direction

My central thought is this: Both Jockers on influence and Underwood and Sellers on literary standards are looking at the same thing: long-term change in 19th Century literary culture has a direction – where that culture is understood to include readers, writers, reviewers, publishers and the interactions among them. Underwood and Sellers weren’t looking for such a direction, but have (perhaps somewhat reluctantly) come to realize that that’s what they’ve stumbled upon. Jockers seems a bit puzzled by the model of influence he built (pp. 167-168); but in any event, he doesn’t recognize it as a model of directional change. That interpretation of his model is my own.

When I say “direction” what do I mean?

That’s a very tricky question. In their full paper Underwood and Sellers devote two long paragraphs (pp. 20-21) to warding off the spectre of Whig history – the horror! the horror! In the Whiggish view, history has a direction, and that direction is a progression from primitive barbarism to the wonders of (current Western) civilization. When they talk of direction, THAT’s not what Underwood and Sellers mean.

But just what DO they mean? Here’s a figure from their work:

[Figure: 19C Direction]

Notice that we’re depicting time along the X-axis (horizontal), from roughly 1820 at the left to 1920 on the right. Each dot in the graph, regardless of color (red, gray) or shape (triangle, circle), represents a volume of poetry, and its position on the X-axis is the volume’s publication date.

But what about the Y-axis (vertical)? That’s tricky, so let us set that aside for a moment. The thing to pay attention to is the overall relation of these volumes of poetry to that axis. Notice that as we move from left to right, the volumes seem to drift upward along the Y-axis, a drift that’s easily seen in the trend line. That upward drift is the direction that Underwood and Sellers are talking about. That upward drift was not at all what they were expecting.

Drifting in Space

But what does the upward drift represent? What’s it about? It represents movement in some space, and that space represents poetic diction or language. What we see along the Y-axis is a one-dimensional reduction or projection of a space that in fact has 3200 dimensions. Now, that’s not how Underwood and Sellers characterize the Y-axis. That’s my reinterpretation of that axis. I may or may not get around to writing a post in which I explain why that’s a reasonable interpretation.

For now, I’m simply going to flat out assert it. If that means that, at this point, I’m simply talking to myself, well then, so be it. I’ve got to make sense of this stuff for myself before I have even a ghost of a chance of explaining my thoughts to anyone else.

What are those 3200 dimensions? Each one corresponds to one of the words that Underwood and Sellers have used in their statistical model; each of those words occurs in the corpus of texts they’re investigating. So a position along the Y-axis corresponds to a location in the space of that model. The upward drift of the trend line thus tracks the movement of poetry in that 3200-D space.

But what’s that space? As I said, it’s derived from the words in the corpus under investigation. It’s an attempt at an abstract characterization, a “distant” reading, of the poems. So, we might call it a poetic space or, following Dan Dennett, a design space; for that matter, we might call it Fredericka, or Bramble. But why don’t we call it 32K19, for 3200 dimensions in the 19th century? At the moment I just want a name I can use for the purpose of referring to the space.

Now, how do we get from 3200 dimensions to only one? The basic move is to examine a set of elite periodicals that review poetry; Underwood and Sellers chose 14. (You’ll probably need to read their work in order to understand what I’m saying about it.) The editors of those periodicals choose to review some of those volumes of poetry (the ones represented by red triangles), but only some. Think of that decision as having the effect of making a yes/no (review or not) decision about a 3200-D object. That judgement reduces or projects complex objects (volumes of poems) located in space 32K19 onto a single dimension having two points (yes or no). The statistical model that Underwood and Sellers have constructed is an attempt to mirror (“predict” is the word they use) the yes/no decisions made by those editors.

In short, the basic, the fundamental reduction of a high dimensional object (a body of poetry) to a single dimension has been done by human beings, the editors of journals. Underwood and Sellers constructed their model so as to mimic that reduction in order to undertake a “distant” reading of a relatively large corpus of texts. Once they’d done so they discovered that the poems in that corpus tended to drift upward in space 32K19 along the dimension of the reduction. Just why that should be the case, they don’t know, though they offer some speculations.
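A toy sketch may make that reduction concrete. It assumes a logistic-regression style model, which is my stand-in for "statistical model" and not necessarily what Underwood and Sellers built. The five-word vocabulary, the made-up rule by which the "editors" choose, and every function name are hypothetical. The point is only that a yes/no judgment about a many-dimensional object yields a single score per volume.

```python
import math
import random

def sigmoid(z):
    # Clamp z so math.exp never overflows on extreme scores.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def project(vector, weights, bias):
    """Collapse a word-frequency vector to one number between 0 and 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, vector)))

def train_logistic(vectors, labels, epochs=500, rate=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            err = project(x, weights, bias) - y
            weights = [w - rate * err * xi for w, xi in zip(weights, x)]
            bias -= rate * err
    return weights, bias

def make_toy_corpus(n=200, dims=5, seed=0):
    """Hypothetical corpus: 5 'words' instead of 3200."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(dims)]
        # Invented rule: the 'editors' review volumes that use
        # word 0 more heavily than word 1.
        labels.append(1 if x[0] - x[1] > 0 else 0)
        vectors.append(x)
    return vectors, labels

vectors, labels = make_toy_corpus()
weights, bias = train_logistic(vectors, labels)

# Each volume's position on the single reduced dimension (the "Y-axis").
scores = [project(x, weights, bias) for x in vectors]
accuracy = sum((s > 0.5) == bool(y) for s, y in zip(scores, labels)) / len(labels)
print(f"agreement with the toy editors: {accuracy:.2f}")
```

The model never sees the rule itself, only the yes/no outcomes, yet its scores line the volumes up along one dimension. Plotting such scores against publication date is the kind of picture the figure above presents.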

Influence, Direction, and Jockers’ Macroanalysis

Let’s set that aside for the moment and consider my reanalysis of Jockers’ work on influence in 19th century novels. Jockers wasn’t examining poetry; he was examining novels. The space he constructed didn’t have 3200 dimensions. It had only 600 or so. Thus it is, in some sense, a smaller space, but still way more than we can visualize.

Working with some 3000 texts, he located each one of them in his 600-D space and then calculated the distance between each pair of texts. Jockers reasoned that, if one author Q was influenced by some earlier author, W, then Q’s texts should be relatively close to those of W. With that in mind Jockers then constructed a graph containing only links between highly similar texts.
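The procedure just described can be sketched in a few lines. This is a minimal illustration, not Jockers' code: Euclidean distance, the cutoff value, and the four three-dimensional "texts" are all assumptions made for the example.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity_graph(texts, threshold):
    """Return edges (name_a, name_b, distance) for every pair of texts
    whose distance is at or below the threshold."""
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(texts.items(), 2):
        d = euclidean(vec_a, vec_b)
        if d <= threshold:
            edges.append((name_a, name_b, d))
    return edges

# Hypothetical texts in a 3-D space (standing in for roughly 600 dimensions).
texts = {
    "A": [0.0, 0.0, 0.0],
    "B": [0.1, 0.0, 0.0],
    "C": [0.2, 0.1, 0.0],
    "D": [1.0, 1.0, 1.0],
}

edges = similarity_graph(texts, threshold=0.25)
for a, b, d in edges:
    print(f"{a} -- {b}  distance {d:.3f}")
```

Text D is far from everything else, so it ends up with no links at all. Note that nothing in the input records when a text was written.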

This is a 2-D projection of that graph:


Each node in the graph represents a single text; the edges connect highly similar texts, where the length of the edge is proportional to the degree of similarity (the shorter the length, the greater the similarity). As we move from left to right through the space (from gray to magenta) we move from the early 19th century to the late 19th century.

And yet, as Jockers emphasized in his discussion, there is no temporal information in the database. Why then does arranging texts according to degree of similarity result in an overall temporal ordering? Using an informal geometric line of reasoning, I have argued that this implies that the process that produced those texts must be following a one-dimensional gradient in the 600-D space that Jockers used to characterize those texts.
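Here is a toy version of that geometric argument. It assumes texts that drift steadily along one direction in a 20-dimensional space, plus random noise; the drift rate, the noise level, and the ordering procedure (projecting onto the line between the two most distant texts) are all choices made for illustration and are not how Jockers laid out his graph. The ordering step sees only the positions of the texts, never their dates.

```python
import math
import random

def make_drifting_texts(n=60, dims=20, noise=0.5, seed=1):
    """One text per 'year', drifting along a fixed direction plus noise.
    Returns the texts in shuffled order along with their hidden years."""
    rng = random.Random(seed)
    direction = [1.0 / math.sqrt(dims)] * dims  # unit-length gradient
    texts = [[year * d + rng.gauss(0.0, noise) for d in direction]
             for year in range(n)]
    years = list(range(n))
    rng.shuffle(years)
    return [texts[y] for y in years], years

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def order_by_similarity(points):
    """Position each text along the line joining the two most distant texts.
    Uses distances and positions only, no dates."""
    a, b = max(((p, q) for i, p in enumerate(points) for q in points[i + 1:]),
               key=lambda pq: distance(*pq))
    axis = [y - x for x, y in zip(a, b)]
    return [sum((pi - ai) * xi for pi, ai, xi in zip(p, a, axis))
            for p in points]

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result

def spearman(xs, ys):
    """Rank correlation (valid here because there are no ties)."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

points, hidden_years = make_drifting_texts()
positions = order_by_similarity(points)
rho = spearman(positions, hidden_years)
# The sign of rho is arbitrary, since the axis can point either way.
print(f"rank correlation with hidden dates: {abs(rho):.3f}")
```

When the drift dominates the noise, ordering by similarity alone recovers the temporal order almost perfectly. If the texts wandered with no preferred direction, there would be no reason to expect that.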

In short, just as the poetry that Underwood and Sellers examined is moving in some direction in space 32K19, so the novels Jockers has examined are moving in some direction in his 600-D novel space. We’ve got two different spaces and movement along some direction in each of them. What if, upon deeper analysis, those two spaces turn out to be the same space – call it 19th Century Anglophone literary culture? Does that mean that the direction we’re dealing with in each case is, in fact, the same direction?

If so, what is that direction?

Into an Autonomous Aesthetic Realm

I have a thought or two on that topic. But let’s back into it. Here’s a passage from the Wikipedia entry on the discovery of cosmic microwave background radiation:

Working at Bell Labs in Holmdel, New Jersey, in 1964, Arno Penzias and Robert Wilson were experimenting with a supersensitive, 6 meter (20 ft) horn antenna originally built to detect radio waves bounced off Echo balloon satellites. To measure these faint radio waves, they had to eliminate all recognizable interference from their receiver. They removed the effects of radar and radio broadcasting, and suppressed interference from the heat in the receiver itself by cooling it with liquid helium to −269 °C, only 4 K above absolute zero.

When Penzias and Wilson reduced their data they found a low, steady, mysterious noise that persisted in their receiver. This residual noise was 100 times more intense than they had expected, was evenly spread over the sky, and was present day and night. They were certain that the radiation they detected on a wavelength of 7.35 centimeters did not come from the Earth, the Sun, or our galaxy. After thoroughly checking their equipment, removing some pigeons nesting in the antenna and cleaning out the accumulated droppings, the noise remained. Both concluded that this noise was coming from outside our own galaxy—although they were not aware of any radio source that would account for it...

When a friend (Bernard F. Burke, Prof. of Physics at MIT) told Penzias about a preprint paper he had seen by Jim Peebles on the possibility of finding radiation left over from an explosion that filled the universe at the beginning of its existence, Penzias and Wilson began to realize the significance of their discovery. The characteristics of the radiation detected by Penzias and Wilson fit exactly the radiation predicted by Robert H. Dicke and his colleagues at Princeton University.

Penzias and Wilson weren’t looking for evidence of the Big Bang, but that’s what they discovered.

Jockers, Underwood, and Sellers weren’t looking for directionality in 19th century literary culture, but that’s what they stumbled into. But what is that directionality about? Consider a passage from one of Edward Said’s last essays, “Globalizing Literary Study,” published in 2001 in PMLA (vol. 116, pp. 64-65). He says:

I myself have no doubt, for instance, that an autonomous aesthetic realm exists, yet how it exists in relation to history, politics, social structures, and the like, is really difficult to specify. Questions and doubts about all these other relations have eroded the formerly perdurable national and aesthetic frameworks, limits, and boundaries almost completely. The notion neither of author, nor of work, nor of nation is as dependable as it once was, and for that matter the role of imagination, which used to be a central one, along with that of identity has undergone a Copernican transformation in the common understanding of it.

Just what Said means by “an autonomous aesthetic realm” (a notion he has from Adorno, I believe) is by no means clear. He certainly doesn’t mean a realm that is outside society and history and free of social and historical forces, which is why he indicated that the existence of such a realm “is really difficult to specify”. But if the idea is to mean anything, it must indicate a field of action that is not completely determined by various social institutions.

Here is what I said at the end of a longish position and methodological paper, Literary Morphology: Nine Propositions in a Naturalist Theory of Form:

How then can we account for the traits of an individual literary work, or a body of works? The writer’s brain, that is what is directly responsible for those works. Everything that acts upon and through the writer is somehow present in the writer’s brain. But that brain consists of [billions] of neurons, each of which is linked to thousands and tens of thousands of other neurons. Such a system is complex beyond our comprehension and understanding. Given that complexity it is not unreasonable to think of the artist exercising autonomous powers of imagination. These powers are not ethereal, disembodied, and outside history, but they cannot be accounted for in any simple way. The brain is irreducibly complex. That it is the brain that is complex does not somehow mean that it is “outside” or “other than” the person. It is the person.

That person is at the interactive nexus of cultural and biological forces. The cultural forces are the cumulative result of historical processes extending back into the past a million or ten million years ago, where they vanish into biological forces extending billions of years back to the beginning of life on earth. Though each person’s brain is subject to cultural influence, it nonetheless bears the forms and processes of events that are much older. That biology is ever available to resist, to sidestep, moribund and oppressive cultural forces. It is thus precisely because human aesthetics is grounded in human biology that it has a means of resisting and eventually working around oppressive institutions.

Speculative? Sure, why not?

Could it be that the direction that Jockers has found in one case and Underwood and Sellers in another is evidence of that biologically grounded resistance and striving? My head aches at the thought of attempting such an argument in the best of intellectual circumstances. The project of attempting such an argument in the face of a discipline that is proudly above and beyond mere biology is, well, it is not to be contemplated.

That aside, there is much to be done independently of mounting that particular argument. Perhaps the major task to be undertaken is that of exploring the relationships between the very different methodologies used by Jockers on the one hand and Underwood and Sellers on the other. I have asserted that, by using different methods of exploring different kinds of texts, but from roughly the same historical period, they’ve arrived at a common phenomenon, a direction for literary change. Jockers was using similarity between texts as a proxy for influence and had no interest in the difference between elite and non-elite literary culture. Underwood and Sellers had no interest in influence nor any direct interest in similarity between texts. How, then, could they end up looking at the same thing?

There is an easy answer to that last question: Because that’s the way the world is. What we need to investigate is how two such very different tools can reveal the same pattern in the world. It seems to me that such exploration will force us to think about the nature and mechanisms of language, human judgment, statistical reasoning, and who knows what else.

That’s a tall order.

Given that this post has turned out to be much longer than I had in mind when I started drafting it several hours ago, I make no promises about whether or not I will say anything more on these matters. Oh, I probably will, but just what and when, I cannot predict much less promise.