A Future for Empirical Reader Studies - Journal of Cultural Analytics



A Future for Empirical Reader Studies

October 19, 2021 EDT

by James F. English

As a literary scholar I’ve watched the Journal of Cultural Analytics establish itself in just five short years as the leading venue for computational analysis of literary texts. Based on some of the journal’s recent articles as well as wider trends in the field, I see a great opportunity over the next five years for it also to become the leading venue for computational analysis of literary readers. That would be a boon for scholars looking to publish technically complex work on digital reading cultures and for everyone interested in broadening the sociology of literature to include neglected readerships and emergent forms of literary reception.

Readers have long been the most elusive objects of literary study. Back in the 1980s, when I was in grad school, John Sutherland pointed to the publishing industry as "the hole at the centre of literary sociology." His message was received, and, thanks to a generation of work by book historians, the advances in that area have been huge. While there’s still much to be learned about how the publishing business shapes literary culture, one would not say today that "scholarly ignorance about book trade and publishing history technicalities" is the glaring weakness of literary sociological research (Sutherland 1988).

But I wish Sutherland had directed attention to a second and even larger hole in literary sociology: the scholarly ignorance about actual readers and readerly behaviors, i.e. readers engaging with literature outside the environment of the school. Book history has been able to fill in some portions of this gap, too, as have psychologists and sociologists of reading. But compared with what we know about the production side of the literary world (the machinery of publishing as well as its products), our grasp of the reception side remains weak. As David Miall has repeatedly lamented, literary scholars have never shown any serious "commitment to empirical study with actual readers" (Miall 345). Even among those of us who might be interested in understanding the reading practices of ordinary, non-professional, non-academic people, virtually none have training in relevant methods of empirical research such as ethnographic fieldwork, human-subject experiments, or survey research. We are far more adept at inferring a reader from a text, or reconstructing an historical readership from data on sales and borrowing, or taking academic engagement with a text as the be-all and end-all of its reception, than we are at chasing down readers "in the wild." And Digital Humanities has mostly followed suit with the wider discipline, approaching questions about readers, where it approaches them at all, through analysis of literary textual corpora and/or corpora of professional reviews and scholarly articles. That work can be tremendously inventive and illuminating, but it doesn’t address questions about the behavior of ordinary readers, and isn’t intended to.

This is why I find the recent turn to the data of social reading so energizing. It marks a sharper and potentially more consequential deviation from standard critical practice than the advent of distant reading itself. It shifts our attention onto the space of what, in their analysis of user reviews on Goodreads, Mavrody, McGrath, Nomura, and Sherman call the "vernacular critical system." Theirs is one of three articles JCA has published since early 2020 that approach questions about popular literary taste and values through quantitative analysis of users’ data scraped from the Goodreads site. Provisional though they are, the findings in these articles suggest the importance of what might be learned from the data of social reading. With respect to gender, for example, the women-dominated space of vernacular criticism appears to be less biased than academic criticism toward the work of male authors (Bourrier and Thelwall). With respect to race and ethnicity, on the other hand, vernacular criticism seems to support a vision of literature that is significantly less diverse than the contemporary academic curriculum (Walsh and Antoniak). By discovering differences among sets of readers, this kind of work frees us from the tedious circularity of reasoning which has us constantly projecting imagined readers who are similar to ourselves: our habit of making what Piper, in Can We Be Wrong?, calls "recipient generalizations."

The authors of these pieces are clear that even large datasets scraped from Goodreads are not statistically representative samples of all user activity on that site, and that Goodreads users are not a representative sample of all reader/reviewers online. Nor for that matter are people who rate, review, and discuss books online necessarily representative of the population of people who read in general. What we are getting from this kind of work, at this early stage in the field’s emergence, are glimpses into particular zones of engagement with literature, not a comprehensive model of the space of everyday reading. But online social reading has become a massive phenomenon, as hundreds of millions of people deposit onto social media detailed traces not just of their purchasing and borrowing habits, their star-ratings and written reviews, but also their curatorial and classificatory strategies, their casual book-chatting and book-quarreling with friends, their bookish photos, favorite book shops, book-club affiliations, and much more.

This is occurring across many apps and platforms, involving many different clusters and communities of readers. A 2020 article by Pianzola, Rebora, and Lauer, published in PLOS ONE, presents computational analysis of paragraph-by-paragraph reader comments entered into Wattpad, a social storytelling site with a strong tilt toward YA fanfic. At more than 80 million active monthly users, Toronto-based Wattpad is roughly as large as Amazon’s subsidiary Goodreads, but whereas users on the slow-growing Goodreads site tend toward middle age, the rapidly expanding Wattpad reports that more than 80% of its users are 24 or younger, with an average age of 20. Wattpad is thus, as Pianzola et al. remark, a valuable source of data for scholars who "want to understand the reading culture of the youngest generations in the 21st century" and on that basis to anticipate possible futures for literary reading beyond simplistic and overly familiar narratives of contraction and decline.

One way to describe this turn toward digital social reading would be as the realization in literary studies of the vision of cultural analytics laid out by Lev Manovich in his manifesto for the first issue of JCA: a convergence of "digital humanities" with "social computing." For Manovich, the exciting potential of this merger had mainly to do with advantages of scale. Digital humanities as he described it was a small field, centered on literature and history, and mostly working with small datasets of historically privileged cultural artifacts. It stood to gain from partnership with social computing, a huge field, inherently sociological but spanning many disciplines, and working with very large datasets of utterly ordinary cultural practices.

I would want to qualify this emphasis on scale. Even large scrapes of social reading sites may yield only small samples suitable for addressing particular research questions. At Penn’s Price Lab, for example, we are currently studying readers on Goodreads who strongly favor either romance or mystery novels. Given how little data exists in most users’ accounts, it requires a scrape of 2 million random Goodreads users to yield 200 that we can confidently include in our analysis. Pianzola’s team gathered a dataset of 2.5 million comments left by the Wattpad readers of 20 popular literary classics and 20 popular works of teen fiction. When they filtered the data to select the comments of highly engaged users who left at least 100 comments in each of these two categories of fiction, they arrived at fewer than 5,000 comments, written by just 18 users. What makes this sort of data interesting is not its scale but the specific readerly practices and orientations it brings into view, such as the tendency observed by the Pianzola team toward more intensive user-user engagement when reading classics (as measured by the much higher frequency of comments referring to the comments of other users). For these teenage Wattpad users, engagement with literary classics is a more social, more collaborative activity than is engagement with teen fiction.
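The kind of filtering described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Pianzola team's actual pipeline: the record layout (a "user" and a "category" field per comment), the category labels, and the function names are all assumptions made for the example.

```python
from collections import Counter

def select_engaged_users(comments, categories=("classic", "teen"), min_per_category=100):
    """Return the set of users who left at least `min_per_category`
    comments in EVERY one of the given categories.

    `comments` is a list of dicts with hypothetical keys "user" and "category".
    """
    # Count comments per (user, category) pair.
    counts = Counter((c["user"], c["category"]) for c in comments)
    users = {c["user"] for c in comments}
    return {
        u for u in users
        if all(counts[(u, cat)] >= min_per_category for cat in categories)
    }

def filter_comments(comments, engaged_users):
    """Keep only the comments written by the selected users."""
    return [c for c in comments if c["user"] in engaged_users]

# Toy data: only user "a" comments on both categories often enough.
toy = (
    [{"user": "a", "category": "classic"}] * 3
    + [{"user": "a", "category": "teen"}] * 2
    + [{"user": "b", "category": "classic"}] * 5
    + [{"user": "c", "category": "teen"}] * 4
    + [{"user": "c", "category": "classic"}] * 1
)

engaged = select_engaged_users(toy, min_per_category=2)
kept = filter_comments(toy, engaged)
```

Even in this toy case the sample shrinks quickly: of three users and fifteen comments, one user and five comments survive a threshold of two comments per category, and none survive the default threshold of 100.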

The ability to slice and filter social reading data to home in on the distinct reading practices of particular groups of readers is needed to study groups that have been largely excluded from models of the reading class. We know from major national surveys that a substantial majority of literary readers in the US and the UK are college-educated white women. Empirical researchers such as Tony Bennett and the Cultural Capital and Social Exclusion group have had to use minority ethnic boost sampling to gather even rudimentary data on BIPOC readers, and such efforts have still left entire swathes of readers (Black British males, for example, or, in the US, Asian American millennials) well below the threshold of visibility. While it is a challenge to derive good demographic data from sites like Goodreads and Wattpad, with careful filtering of the sites’ vast numbers of users it is possible to center analysis on these kinds of smaller, socially defined fractions of the reading class, and to gain a better sense of their distinct place in the larger relational system of literary reception. The global reach and multilingual content of the sites should make it possible also to compare reading practices among users in different countries, reading and reviewing in different languages. The stories and commentaries on Wattpad span 50 languages; as of 2015, nearly a quarter of the site’s content was in Tagalog or Turkish (Nowatka). The analytics of social reading could help to steer the JCA toward a less Anglophone future, as strongly urged by Hoyt Long and Cecily Raynor.

This kind of empirical work on the reception side of literature is by its nature piecemeal, involving a diverse array of teams and individual scholars using various data-driven approaches to study many different kinds of online readers and reading practices. JCA is ideally positioned to select and promote the best of this work as it emerges, and thus to provide some coordination and quality control for the field as a whole.

WORKS CITED

Bennett, Tony, Mike Savage, Elizabeth Silva, Alan Warde, Modesto Gayo-Cal and David Wright. Culture, Class, Distinction. New York: Oxford UP, 2009.

Bourrier, Karen, and Mike Thelwall. "The Social Lives of Books: Reading Victorian Literature on Goodreads." Journal of Cultural Analytics 5.1 (2020): 1-34. doi: 10.22148/001c.12049

Manovich, Lev. "The Science of Culture? Social Computing, Digital Humanities and Cultural Analytics." Journal of Cultural Analytics 1.1 (2016): 1-15. doi: 10.22148/16.004

Mavrody, Nika, Laura B. McGrath, Nichole Nomura, and Alexander Sherman. "Voice." Journal of Cultural Analytics 6.2 (2021): 218-242. doi: 10.22148/001c.22222

Miall, David. "Emotions and the Structuring of Narrative Responses." Poetics Today 32.2 (2011): 323-348. doi: 10.1215/03335372-1162704

Nowatka, Edward. "Will Wattpad Attract a Billion Users? CEO says ‘Easy.’" Publishing Perspectives (July 31, 2015). https://publishingperspectives.com/2015/07/will-wattpad-attract-a-billion-users-ceo-says-easy/

Pianzola, Federico, Simone Rebora, and Gerhard Lauer. "Wattpad as a Resource for Literary Studies. Quantitative and Qualitative Examples of the Importance of Digital Social Reading and Readers’ Comments in the Margins." PLOS ONE (January 15, 2020). doi: 10.1371/journal.pone.0226708

Piper, Andrew. Can We Be Wrong? The Problem of Textual Evidence in a Time of Data. Cambridge UP, 2020. doi: 10.1017/9781108922036

Sutherland, John. "A Hole at the Centre of Literary Sociology." Critical Inquiry 14.3 (1988). doi: 10.1086/448457

Walsh, Melanie, and Maria Antoniak. "The Goodreads Classics: A Computational Study of Readers, Amazon, and Crowdsourced Amateur Criticism." Journal of Cultural Analytics 6.2 (2021): 243-287. doi: 10.22148/001c.22221

"Wattpad Announces 80 Million Monthly User Milestone." Press Release August 15, 2019. Archived at: company.wattpad.com/archives/2019-8-15-wattpad-announces-80-million-monthly-user-milestone.
