Beyond Quantification: Digital Criticism and the Search for Patterns

I've collected some recent posts (from New Savanna) on patterns into a working paper. It's online at SSRN. Here's the abstract and the introduction.

Abstract: Literary critics seek patterns, whether patterns in individual texts or patterns in large collections of texts. Valid patterns are taken as indices of causal mechanisms of one sort or another. Most abstractly, a pattern emerges or is enacted as some machine makes its way in an environment. An ecological niche is a pattern “traced” by an organism in its environment. Literary texts are themselves patterns traced by writers (and readers) through their life worlds. Patterns are frequently described through visualizations. The concept of pattern thus dissolves the apparent conflict between quantification and meaning, for quantification is but a means to describing a pattern. It is up to the critic to determine whether or not a pattern is meaningful by identifying the mechanism that produced the pattern. Examples from Shakespeare and Joseph Conrad.

Introduction: Patterns and Descriptions

There is a sense, of course, in which I’ve been aware of and have been perceiving and thinking about patterns all my life. They are ubiquitous after all. But it wasn’t until I began studying cognitive science with the late David Hays that “pattern” became a term of art. Hays and his students were developing a network model of cognitive structure – such models became common in the 1970s. Such networks admit of two general kinds of computational process, path tracing and pattern recognition. Path tracing is computationally easy, while pattern recognition is not. Human beings, however, are very good at perceiving and recognizing patterns.

What put the idea before me, though, as something demanding specific thought, were remarks Franco Moretti made in coming to grips with his work on the network analysis of plot structure. In Network Theory, Plot Analysis (Literary Lab Pamphlet 2, 2011, p. 11) he noted that he “did not need network theory; but I probably needed networks.... What I took from network theory were less concepts than visualization.” We then examine the visualizations to determine whether or not they indicate patterns that are worth further exploration.

That, it seems to me, should put to rest fears about the incommensurability of numbers and meaning or, even worse, anxiety about infecting humanistic inquiry with quantitative evil. It’s not about numbers and counting. It’s about patterns. Numerical work is subordinate to and in service of looking for patterns, whether patterns in individual texts, as Moretti was doing in his work on plot structures, or patterns in collections of hundreds and thousands of texts spanning decades or more of historical time.

But, just what IS a pattern anyhow? How do we tell the difference between patterns and, well, non-patterns? Those are tricky questions, questions I pursue in the posts that make up this working paper. If what we’re looking for is some a priori way of specifying what patterns are so that we can then theorize about patterns in a general way, then I think we’re in trouble. In the sections, “Pattern” as a Term of Art and Patterns as Epistemological Objects, I suggest that there is no such thing. What emerges from those discussions is something like this: A pattern is something that emerges or is enacted as some machine makes its way in an environment in which it either survives or fails – where the italicized terms are understood in a very general and abstract sense. Thus understood, patterns are relations between machines and environments.

The level of abstraction and generalization I have in mind is that which is typical of theoretical computer science, a set of disciplines in which I am by no means expert. Nonetheless I will hazard a few remarks. Consider the opening of Wikipedia’s entry on computational complexity:

Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. A computational problem is understood to be a task that is in principle amenable to being solved by a computer, which is equivalent to stating that the problem may be solved by mechanical application of mathematical steps, such as an algorithm.

A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used. The theory formalizes this intuition, by introducing mathematical models of computation to study these problems and quantifying the amount of resources needed to solve them, such as time and storage. Other complexity measures are also used, such as the amount of communication (used in communication complexity), the number of gates in a circuit (used in circuit complexity) and the number of processors (used in parallel computing).

It is the second paragraph we need to think about, for it is about resources. Pattern recognition isn’t the only kind of problem that is computationally difficult, but it is one of them. And, of course, not all computational operations are particularly demanding.

My thought, then, is this: if patterns are computationally difficult, and computational difficulty is about resources – time, storage, communication lines, processors – then patterns are things that exist for, are defined by, computers – abstractly understood in the most general case, but by actual computational devices in some specific cases. No wonder thinking about them is difficult!

Now, in digital criticism, pattern recognition is being done by the critics, not by the computer. Computing of the “big data” kind requires a fair bit of computational horsepower, but much of it is well within the range available in current laptop machines. The number crunching is not, in fact, computationally complex. It’s relatively straightforward but simply requires a lot of CPU cycles and a fair amount of storage to keep track of the data and of intermediate results.

Pattern recognition, on the other hand, typically leads to what is called combinatorial explosion in which intermediate results multiply as the computation proceeds and there is no guarantee that, at some point, the intermediate results will become “absorbed” into the ongoing work and finally disappear, leaving you with a sure result – either a recognized pattern, or the certainty that no pattern is there. People do that kind of thing relatively well – lots of resources in the form of processors, where each neuron in the nervous system is considered to be a processor, giving us 100 billion processors.
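The contrast between path tracing and pattern recognition can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the toy ring network, the function names, and above all the brute-force search, which stands in for pattern recognition only in the crudest way and is not a model of how Hays's networks or human perceivers work. It shows just one thing: tracing a path touches each node at most once, while naively hunting for even a tiny pattern means testing a number of candidates that multiplies with every slot the pattern adds.

```python
from collections import deque
from itertools import permutations


def ring(n):
    """Hypothetical toy network: n nodes joined in a circle (undirected)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def trace_path(graph, start, goal):
    """Path tracing by breadth-first search; each node is queued at most once."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def find_pattern(graph, pattern_edges, pattern_size):
    """Brute-force pattern search (an illustrative stand-in, not a real recognizer).

    Tries every ordered assignment of network nodes to the pattern's slots and
    returns (match or None, number of candidates tried).
    """
    tried = 0
    for candidate in permutations(sorted(graph), pattern_size):
        tried += 1
        if all(candidate[b] in graph[candidate[a]] for a, b in pattern_edges):
            return candidate, tried
    return None, tried


TRIANGLE = [(0, 1), (1, 2), (0, 2)]
FOUR_CLIQUE = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

if __name__ == "__main__":
    net = ring(8)
    print(trace_path(net, 0, 4))            # a path around the ring
    print(find_pattern(net, TRIANGLE, 3))    # no triangle: 8*7*6 = 336 candidates
    print(find_pattern(net, FOUR_CLIQUE, 4)) # no 4-clique: 8*7*6*5 = 1680 candidates
```

On an eight-node ring the path search finishes after visiting a handful of nodes, while the failed search for a three-node pattern tests 336 candidates and the failed search for a four-node pattern tests 1,680. The network has not grown at all; only the pattern has, by a single slot.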

So, while patterns are what digital criticism is about, the computers aren’t being used to recognize the patterns. They’re being used to create the information displays, the visualizations, in which we recognize patterns. I note as well that this is true whether one is looking for patterns in individual texts or in large collections of texts.

And, I warrant, it is true even if you aren’t using computers at all. I’ve been writing a great deal about ring-composition over the past few years, but I’ve not been using computers to help me identify ring-forms – except in the trivial sense that I am using word-processing software and I make tables and charts to help in the search. What I end up with are descriptions, another frequent blogging topic.

A description is a description, whether it is created by hand or with computer aids. The visualizations so common in digital criticism are descriptive in kind. And so, I believe, we need to think explicitly about description: How do we formulate descriptions?

What are the roles of verbal description, visualization, and even mathematical and logical formalism? What are we describing? Patterns, or at any rate, possible patterns. Whether those patterns are real or not, that is, whether or not they indicate some causal process at work in the world, that’s something we have to determine. Just how we do that, well...we’ve got new disciplines to create, do we not?

* * * * *

This working paper consists of the following posts from New Savanna:

From Quantification to Patterns in Digital Criticism: Here I take up the general question of computation and quantification as set forth by several investigators and assert that patterns are what we’re seeking through the use of computers, not numbers.

Rens Bod on Patterns: This is an abstract of a paper Bod gave on patterns, with the link to the paper.

Pattern: Ramsay on Shakespeare, and Beyond: In which I examine an essay in which Ramsay uses network diagrams to describe patterns of scene locations in Shakespeare plays and in which he wonders: What are these patterns good for?

Epiphenomena? Ramsay on Patterns, Again: I look at a more recent Ramsay piece in which he again foregrounds the notion of pattern, suggesting that patterns are “emergent textual epiphenomenal”. Emergent, perhaps. Epiphenomenal, no. “They’re the main event.”

“Pattern” as a Term of Art: This is where I introduce the idea of a pattern as a relationship between a machine and its environment. I draw this conclusion by thinking about the ecological niche, where the machine is an organism and its environment is its life world. To this I have appended the abstract for an article David Hays and I wrote on evolution and complexity.

Patterns as Epistemological Objects: Now I develop that idea, this time using examples from human cognition and reasoning. As a specific example I consider paragraph length in Conrad’s Heart of Darkness, which exhibits a remarkable pattern. But is it real?

Patterns and Literature: Literature itself is a means of tracing patterns through the world. Hermeneutical methodologies attempt to explicate the patterns that given texts trace in the world; the more sophisticated methodologies grapple with the fact that those texts, after all, exist in the world and so are components of the paths they trace. The naturalist critic, however, “takes a step back” from the text and concentrates on describing the formal patterns in the text itself, the patterns through which the text imposes itself on life.

* * * * *

Let me suggest, finally, that as I have, following others, recently sketched out an ontology based on objects (Living with Abundance in a Pluralist Cosmos: Some Metaphysical Sketches), I am now approaching an epistemology based on patterns. Pattern-oriented epistemology anyone?

  • Do you have any sense of when man first had the cognitive ability to recognize patterns? Did it come late to our genus--as Homo sapiens--or early, when Homo habilis first devised tools? I wonder if it was part of our growing brain's evolution long ago.

    Thanks for your thoughts. Fascinating.