On February 24, 2021, the NULab hosted a well-attended virtual panel on "The Ethics of Digital Image Analysis." Moderated by Laura Nelson, assistant professor of Sociology at Northeastern University, the panel included four speakers: Alex Hanna, sociologist and senior research scientist on Google's Ethical AI team; Eunsong Kim, assistant professor of English at Northeastern University; Luke Stark, assistant professor in the Faculty of Information & Media Studies (FIMS) at the University of Western Ontario; and Lauren Tilton, assistant professor of Digital Humanities and director of the Distant Viewing Lab at the University of Richmond. Nelson introduced the panelists and spoke to how important it is today "to take stock of how we're using big data, particularly image data" from both historical and contemporary perspectives. Nelson incorporates computational image analysis in her own work, and observes trends in the use of image data as a reviewer for a sociology journal.
Alex Hanna introduced her research into the "algorithmic unfairness" of facial recognition datasets, and the role of racial and gendered biases in shaping the "training data" that constitutes the infrastructure of machine learning. Facial recognition software routinely misidentifies non-white people, and this bias is connected to the image datasets that machine learning models are trained on. To understand biases in machine learning, Hanna argues for the critical and historical interpretation of datasets such as ImageNet, which intended to "map the entire world of objects." She references the findings of Kate Crawford and Trevor Paglen in "Excavating AI," where the authors discovered that ImageNet's "person" category relied on many stigmatizing sub-categories to describe people in photographs. But Hanna emphasizes that making datasets "sufficiently" representative does not necessarily rectify hierarchies. She referred to the IBM "Diversity in Faces" dataset, which took images "mostly without people's consent," revealing that in some cases "inclusion could be predatory." Hanna concludes that examining "the genealogy of machine learning data" can reveal the motivations behind datasets and their ramifications for power, fairness, and inclusion.
Eunsong Kim presented on the colonialist structures that persist in the digital image collections of major cultural heritage institutions, such as Harvard's Peabody Museum and the Getty Museum. Both museums have digitized their collections in recent years with the intention of accessibility, yet neither has repatriated the "colonial objects" in its possession. There is an underlying tension between the image, understood as "digitally free," and the museums' objects, which are "colonially bound." Kim focuses particularly on the Peabody's collection of Harvard biologist Louis Agassiz's daguerreotypes of enslaved people from 1850. Daguerreotypes were an early form of photography produced on copper plates, and physical access to them is limited because of their susceptibility to fading over time. Agassiz's daguerreotypes were meant to support his racist theory of polygenesis, the theory that humans of different races came from distinct biological origins. These images are digitized and in the public domain, yet according to Kim the Peabody has maintained "a circulatory and exclusionary power" over the physical images. She concludes that open access projects at museums might "unsettle some questions concerning property," yet so far they have maintained unequal power relations.
Luke Stark presented on his research with Jesse Hoey, professor of Computer Science at the University of Waterloo, on "affective computing" and the role of emotion in image analysis. Stark argued that contemporary projects using artificial intelligence to recognize emotions in "images of human faces" are connected to eugenicist Francis Galton's attempt to measure "intrinsic culpability and criminality" from people's faces. Stark reviewed competing philosophies of emotion, particularly the assumptions we attach to emotions, in order to provide context for current debates about facial recognition technology. He referenced Lauren Rhue's study, which found that facial recognition software frequently misidentifies emotion in Black faces by inaccurately assigning negative emotions to them. Stark concludes that there are "broader social problems" with using AI to recognize emotions such as "aggression in the context of law enforcement," and that simply striving for accurate data is misguided.
Lauren Tilton presented on her collaboration with colleagues at the Distant Viewing Lab to use Photogrammar, a tool which Tilton co-created with Taylor Arnold, to archive and map 170,000 photographs of the United States during the Great Depression and World War II. These photographs were commissioned by the Farm Security Administration and Office of War Information, and many of them have become iconic (such as Dorothea Lange's photographs of Americans during the Depression). Tilton reflected upon the methodological questions that arise in using "computational techniques to analyze visual culture on a large scale." Because the assumptions of researchers shape the characteristics that algorithms look for to identify images, Tilton critically examines the image collection's metadata from the Library of Congress. She is working on a distant viewing toolkit in Photogrammar which can unearth assumptions encoded in image data and measure "image similarity" in order to suggest conclusions about "visual tropes throughout the collection."
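To give a concrete sense of what measuring "image similarity" can mean in practice, the sketch below ranks images by the cosine similarity of their feature vectors. This is a minimal, illustrative example, not the Distant Viewing Lab's actual implementation: the filenames and feature values are invented, and a real pipeline would extract the vectors from images with a neural network rather than hard-code them.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query, corpus, top_k=3):
    """Rank corpus images by feature similarity to a query image."""
    ranked = sorted(corpus.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical feature vectors; a real system would compute these
# from the photographs themselves (filenames are invented).
features = {
    "photo_a.jpg": [0.90, 0.10, 0.30],
    "photo_b.jpg": [0.20, 0.80, 0.50],
    "photo_c.jpg": [0.85, 0.15, 0.35],
}
query = [0.88, 0.12, 0.32]
print(most_similar(query, features, top_k=2))
# → ['photo_a.jpg', 'photo_c.jpg']
```

Grouping images whose vectors point in similar directions is one simple way a collection-scale tool could surface recurring compositions, or "visual tropes," for a researcher to then interpret.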
The event had a lively Q&A where audience members and panelists discussed topics including the ethics of scraping data from the public domain, the potential for studying protest photography, and Google’s reliance on Internet users’ labor to train its image recognition software. The NULab is looking forward to similarly engaging discourse at upcoming NULab-sponsored events this Spring.