SOTU language analysis reveals history’s ‘twists and turns’



“Luck,” “lows,” and “Lesbian.” “Dodge,” “dusted,” and “drowning.” “Tesla,” “eBay,” and “Instagram.”

Ben Schmidt was up late Tuesday night, tweeting out these and more than 70 other words that had never appeared in a State of the Union address until President Barack Obama delivered his annual report to Congress that evening.

“Some of the words were surprising, but many others were tactical and say something important about the state of American society today,” said Schmidt, an assistant professor of history and a core faculty member in the NU Lab for Texts, Maps, and Networks, Northeastern’s center for digital humanities and computational social science.

Using Bookworm, a simple and powerful way to visualize trends in digitized texts, Schmidt analyzed the language of Obama’s presidential address and of every SOTU speech dating back to 1790. He and a Harvard colleague created the platform for text analysis in 2011, and have since used it to examine the language of everything from newspaper articles to more than 500 episodes of The Simpsons.

Schmidt’s latest project was made possible in part by a grant from the National Endowment for the Humanities. The grant enabled him to enhance Bookworm through collaboration with the HathiTrust Digital Library, which holds 3.9 billion pages of digitized materials.

He discussed the project in an interactive article in The Atlantic, for which he used Bookworm to comb through all 224 State of the Union addresses and rank the frequency with which each president used each word. In an interactive companion piece, he and Mitch Fraas of the University of Pennsylvania used natural language processing algorithms to identify more than 16,000 mentions of 1,410 different places that presidents have referenced since the very first State of the Union more than 200 years ago. Foreign policy historians including Gretchen Heefner, an assistant professor of history at Northeastern, provided historical context, explaining how the speeches reflect America’s changing role in the world.
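As a rough illustration of what ranking word frequency by president involves, here is a minimal Python sketch. It is not Bookworm's code or Schmidt's method: the function names, the per-1,000-words rate, and the toy speech text are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def counts_by_president(speeches):
    """Map each president to a Counter of the words in all of their speeches."""
    totals = defaultdict(Counter)
    for president, text in speeches:
        totals[president].update(tokenize(text))
    return totals


def rank_presidents(totals, word):
    """Rank presidents by uses of `word` per 1,000 words, highest first."""
    rates = {}
    for president, counter in totals.items():
        total_words = sum(counter.values())
        if total_words:
            rates[president] = 1000 * counter[word] / total_words
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Toy stand-in data (hypothetical), not real speech text.
speeches = [
    ("President A", "Freedom and freedom again for all"),
    ("President B", "College is the path to opportunity"),
    ("President A", "We defend freedom"),
]
totals = counts_by_president(speeches)
ranking = rank_presidents(totals, "freedom")
```

Dividing by each president's total word count keeps presidents who gave longer or more numerous addresses from dominating the ranking simply because they said more words overall.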

The findings, Schmidt wrote in The Atlantic, “reveal how the words presidents use reflect the twists and turns of American history.”

The word “freedom,” he said, was used sparingly until Franklin D. Roosevelt placed the “Four Freedoms” at the center of his 1941 address. Since then, the word has gained popularity, particularly among Republican presidents. George W. Bush, for example, said “freedom” more than 70 times, while Obama has used the word fewer than 10 times, including once Tuesday night.

Like “freedom,” “college” features an unmistakable partisan tilt. Democratic presidents, such as Obama, Bill Clinton, and John F. Kennedy, have used the word far more than Republican presidents, such as Bush, Ronald Reagan, and Gerald Ford. Bush, for example, said “college” five times, while Obama has used the word on more than 50 occasions, including 12 times Tuesday night.

Obama’s speech on Tuesday evening also included several words that had not been spoken at a State of the Union in more than 100 years. According to Schmidt, Obama said “vacations” for the first time since Millard Fillmore used the word in 1851 and uttered “masterful” for the first time since Theodore Roosevelt’s 1901 address.
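Finding words that are new to the State of the Union, or that have not been heard in decades, amounts to looking up the last earlier year each word appeared. The Python sketch below shows one way that lookup could work. It is an assumption-laden illustration rather than Schmidt's actual code: the function names are hypothetical and the speech text is invented, with the years chosen only to echo the examples above.

```python
import re


def tokenize(text):
    """Return the set of distinct lowercase words in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def last_prior_use(earlier_speeches, new_speech):
    """For each word in new_speech, give the latest earlier year it appeared.

    Words that never appeared before map to None.
    """
    last_seen = {}
    # Process speeches oldest first so later years overwrite earlier ones.
    for year, text in sorted(earlier_speeches):
        for word in tokenize(text):
            last_seen[word] = year
    return {word: last_seen.get(word) for word in tokenize(new_speech)}


# Invented toy text (hypothetical), keyed by year.
earlier = [
    (1851, "long vacations"),
    (1901, "a masterful plan"),
    (2014, "a plan for jobs"),
]
result = last_prior_use(earlier, "masterful vacations on instagram")
never_used = sorted(word for word, year in result.items() if year is None)
```

In this toy data, “masterful” was last seen in 1901 and “vacations” in 1851, while “instagram” and “on” have no earlier use and land in `never_used`.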

Meanwhile, Obama’s references to China (three) and the Middle East (two) dovetailed with the rhetorical choices of recent presidents, whose language has reflected their interest in particular countries and regions. According to Schmidt and Heefner’s account in The Atlantic, the Middle East became a fixture of presidential addresses following the 1979 Iran hostage crisis. China, whose “mentions in the State of the Union follow the sine wave of American interest,” once again became a frequent topic of presidential address following Nixon’s 1972 visit.

“In many ways, the places Obama mentioned continued the trend of the past few decades,” Schmidt explained. “His speech focused on a band of countries in the Middle East, while there was little discussion of Africa, which has been characteristic of presidential address in general and Obama’s in particular.”

-By Jason Kornwitz
