On November 9th, 2023, the NULab for Digital Humanities and Computational Social Science and the Digital Scholarship Group hosted one of our Digital Humanities Office Hours events, this one centered on the theme of measuring narrative and literature. Four scholars presented their work: Juliana Spahr (Professor, Mills College), Stephanie Young (Adjunct Professor, Mills College), Yakov Bart (Associate Professor, D’Amore-McKim School of Business), and Samsun Knight (Assistant Professor, University of Toronto Rotman School of Management).
The event began with a presentation from Dr. Spahr and Dr. Young about their project, “Contemporary Literature’s Vexed Democratization,” which investigates the concept of prestige within the literary field through data collection, computational analysis, archival research, and close reading. Central to their work is an exploration of the contradictions that exist within the field today, particularly the tension between the historical understanding of literature as within the “purview of the elite” and the more recent increase in access to the field that has come about due to technological advancements and shifts in the publishing market. With the substantial increase in the number of titles published annually (about 6,000 prior to 1990, compared with closer to 60,000 now), Spahr and Young wanted to know if this decrease in gatekeeping around publishing was reflected in a broader democratization of the literary field and, in particular, if it had resulted in any substantial change to the demographics of literary production.
One of the ways in which they attempted to answer this question was by looking at changes in the path to publishing for writers. As poets themselves, Dr. Spahr and Dr. Young are both all too aware of the restrictions and exclusions that plague this journey. So they chose to look at the MFA classroom: has this environment become more diverse as publishing has increased? What they found was that yes, diversity in MFA programs has increased, but with an important caveat: the only substantial increase in diversity was at the “debt-generating institutions” (i.e., institutions that do not provide funding for their MFA students). Spahr and Young view this as a sort of “tax of entry,” in which writers of certain races and classes must pay more to gain access to the literary field.
They also looked at literary prizes, since these provided a more contained, publicly available data set, unlike MFA classes, where not all institutions publish such demographic information. Additionally, a prize functions as a stamp of authority for the writer receiving it, making it an ideal indicator of prestige and/or exclusion. First, they looked at changes to literary prizes in general and found that the number of prizes has indeed increased at basically the same rate as the number of publications (at least within the United States). What they found about the winners is, at least on the surface, exciting: the number of female and Black prize winners has significantly increased over the last century. In particular, the number of Black prize winners surpassed the number of white winners in 2017.
However, they also found that literary prizes were becoming increasingly exclusionary in another way: socioeconomic status. The percentage of prize winners with degrees from elite institutions appears to have remained relatively constant across the past one hundred years, even though the percentage of white writers with such degrees began to decline in the 1970s. Conversely, the percentage of Black writers with elite degrees has gone up. So, while literary prizes have become more diverse in some ways, they too seem to suffer from the “tax of entry” issue that MFA programs do. This is especially relevant given that most literary prizes are judged by former prize winners, meaning that the current judging committees tend to resemble the winners of about 20 years ago, i.e., predominantly white men with graduate degrees from elite institutions.
Having made significant strides in better understanding the diversity, or lack thereof, in the literary field, Dr. Spahr and Dr. Young would next like to turn to a new dataset: sales. In doing so, they will be able to investigate the ways in which this appearance of diversity affects the popularity of the writers themselves. This will be an especially telling dataset for the project, as prestige literature is largely supported by philanthropy rather than by market demand alone. An exploration of sales data, then, will provide yet another window into the makeup and insiderness of the field, as well as the barriers to its points of entry. It will also help the researchers better understand the ways in which broader societal events may impact the field, such as an increase in Black prize winners after the highly publicized murders of Black civilians and the subsequent Black Lives Matter protest movement.
We then heard from Dr. Bart and Dr. Knight on their “BookNet” project, which works to create a new means of collecting “narrative experience” data. Their work tracks the ways in which changes in the literary environment, particularly in publishing, are reflected in writing styles themselves. Part of the difficulty of measuring this, as they see it, lies in the intangible experience that goes into writing and reading. To explain this, Knight referenced the saying “the best writing happens when the author is trusting the reader to do 50% of the work.” This points to the slippery nature of attempting to measure writing styles but also suggests that there is significance in being able to track such styles across time.
So, while there is work already being done through language processing that identifies what is happening in a text, Dr. Bart and Dr. Knight are interested in filling the gap in the data that would allow for tracking how a text makes a reader feel, and how this has changed across time. While this work seems inherently important, these scholars explained one of the major reasons why they feel such tracking could benefit the literary field: the narrative industries (publishing, film, etc.). These industries use trend tracking to market the stories they are trying to sell. This means that, based on the data that is accessible right now, they look for stories that most closely resemble economically successful stories from the past. For example, as Dr. Knight explained it, a publishing house may look through the manuscripts it receives in search of a text that it can market as “Gone Girl meets Normal People,” as it knows that such a text will be profitable.
Such “trend tracking” practices are troubling because they have real-world implications for which texts and which authors get promoted, which can often mean the perpetuation of gender and racial discrimination. Knight and Bart’s data would attempt to illuminate what goes into writing styles and how they are interpreted. In doing so, they hope to provide the narrative industries with a more nuanced understanding of the texts they are looking for and to eliminate, or at least reduce, the bias that currently exists.
Thanks in part to funding from an NULab Seedling Grant, Dr. Knight and Dr. Bart have partnered with a publisher and are preparing to survey readers at scale on their perceptions of a text right after they have read it. In building this dataset, they also hope to provide researchers with the materials needed to both ask and answer a whole new set of questions: ones that may have seemed impossible before.
Both of these projects, “Contemporary Literature’s Vexed Democratization” and “BookNet,” demonstrate the potential of digital tools to help us better understand the humanities, a field that has too often felt separate from technological advancement. As an educator, I am particularly excited by the work these scholars are doing to highlight injustice in the literary field and how this relates to the academy.