Partially supported by a NULab Seedling Grant.
Warpland 2.0 is a creative writing project exploring the potential of fine-tuned neural network language models for generating poetry. Several artists have already used large language models such as GPT-2 and GPT-3 (GPT: Generative Pre-trained Transformer) to generate text, from stories to lyrics to poetry to melodies. These models can benefit from a process called “fine-tuning,” in which a specific textual corpus augments the model’s original training data. For example, a language model fine-tuned on the corpus of Shakespeare becomes more capable of generating Shakespeare-like text. Large language models come with their share of problems: they are trained on heavily biased data, massive amounts of internet text scraped from sites like Reddit and Twitter. Warpland 2.0 attempts to shift the model’s potential outputs by fine-tuning it on a corpus of the work of Gwendolyn Brooks. One goal is to give the model a context for discussing Black people that is more nuanced and sensitive than that of the out-of-the-box model. This serves the larger goal of using the fine-tuned model to generate Brooks-inflected ekphrastic poems in response to the artworks of Kara Walker.
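For context on what fine-tuning involves in practice: OpenAI’s GPT-3 fine-tuning workflow of this era expected training data as a JSONL file of prompt/completion pairs. The sketch below shows how such a file might be prepared; the filenames, prompts, and completion texts are hypothetical placeholders, not the project’s actual corpus or method.

```python
import json

# Hypothetical prompt/completion pairs in the JSONL format that the
# GPT-3 fine-tuning API expected. The texts are placeholders only.
examples = [
    {
        "prompt": "Write a poem in response to a silhouette of a figure in flight.\n\n###\n\n",
        "completion": " a poem would go here END",
    },
    {
        "prompt": "Write a poem about a kitchenette building.\n\n###\n\n",
        "completion": " another poem would go here END",
    },
]

def write_training_file(examples, path):
    """Serialize the examples as JSONL: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

write_training_file(examples, "brooks_finetune.jsonl")
```

The `\n\n###\n\n` separator and the ` END` stop token follow the conventions OpenAI’s documentation recommended at the time; the resulting file would then be uploaded to the fine-tuning service to train a custom model.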
The first iteration of this network, Warpland 1.0, was created using GPT-2; poems written in collaboration with it were published in the journal Here. The second iteration, Warpland 2.0, was created using GPT-3, with support from the NULab Seedling Grant funding access to the OpenAI models and fine-tuning apparatus. In addition to being a creative project, this work explores the limits and problems entwined with using these kinds of proprietary models, and the limits of natural language generation for creative purposes. Example outputs from the 2.0 model have been presented at the Banff Centre for Arts and Creativity and at In2Writing: The First Workshop on Intelligent and Interactive Writing Assistants, held at the 60th Annual Meeting of the Association for Computational Linguistics. At this stage, substantial text has been generated in response to prompts, but results are mixed. The next stage is to conduct further rounds of fine-tuning and to consult other natural language researchers on the best methods for fine-tuning and prompting.
Lillian-Yvonne Bertram, Faculty, English
Christie Towers, Research Assistant, UMass Boston; Ann Wilberton, Research Assistant, UMass Boston