A Data Analysis of the November Democratic Debate

Can We Learn Anything from Debate Data?

The thing that bothers me most about political commentary is the lack of objectivity. After every Democratic debate, pundits on social and traditional media offer their biased takes on who “won” the contest. Writers put forward subjective experiences of the candidates, readers search for confirmation of their prior opinions, and no one learns anything.

While I am no less conflicted than other authors, I have tried to analyze past debates through data rather than impressions. I’m interested in seeing whether any kind of objective reality can be parsed from these performances. With this view in mind, I’ve taken a look at the transcript from the November Democratic debate to see what stands out as objectively true.

Which Candidate Got the Most Words in?

Political supporters have often accused debate moderators of favoritism. While this can be hard to prove, it is an objective fact that some candidates are favored in terms of speaking time. As with the last debate, Elizabeth Warren got the most time to talk, suggesting that moderators consider her the frontrunner.

Interestingly, Cory Booker, Pete Buttigieg, and Amy Klobuchar also got significant speaking time, despite polling much worse than the top candidates. This sort of attention is a good sign for these centrist pols and a bad one for Kamala Harris, who shares similar policy positions. Finally, while Bernie Sanders had much less time to speak than his polling would predict, Sanders’ support is not reliant on traditional media sources. He should continue to do well as an outsider despite being dismissed by mainstream news outlets.
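For readers who want to reproduce this kind of tally, here is a minimal sketch of counting words per speaker. It assumes a transcript where each turn starts with an upper-case name followed by a colon, which may not match the real transcript's layout. The sample text, the speaker labels, and the function name `words_per_speaker` are all made up for illustration.

```python
import re
from collections import Counter

# Made-up excerpt; the "NAME: text" layout is an assumption about the transcript.
SAMPLE = """\
CANDIDATE A: We need big structural change in this country.
CANDIDATE B: I agree with that.
CANDIDATE A: And we can pay for it.
"""

# A speaker label: upper-case letters, spaces, periods, apostrophes or hyphens, then a colon.
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z .'-]+):\s*(.*)$")


def words_per_speaker(transcript):
    """Return a Counter mapping each speaker label to the number of words spoken."""
    totals = Counter()
    current = None
    for line in transcript.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:
            current, text = match.group(1), match.group(2)
        else:
            text = line  # treat unlabeled lines as a continuation of the last turn
        if current is not None:
            totals[current] += len(text.split())
    return totals


counts = words_per_speaker(SAMPLE)
for speaker, total in counts.most_common():
    print(speaker, total)
```

Word counts are only a proxy for speaking time, since candidates talk at different speeds, but they are easy to compute from a transcript alone.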

Who “Won” Over the Audience?

In my opinion, the best way to “win” a debate is to arouse passion among supporters. This passion translates into campaign donations, media exposure, and social proof of popularity among casual viewers. By this definition, the best metric for deciding who “won” a debate is which candidate got the most applause.

Surprisingly, Amy Klobuchar led the pack in this debate, getting 5 separate applause breaks from the audience. Other candidates also did well in this regard. The big losers here are Tom Steyer and Joe Biden, who seemed unable to connect with the crowd in any significant way.

A second, less reliable metric of doing well in a debate is audience laughter. Here Andrew Yang was the big winner, with Bernie Sanders, Pete Buttigieg, Cory Booker, and Kamala Harris also doing well. Laughter is a sign that a candidate is likeable. Again Biden scored poorly. His one laugh line seemed to come at his own expense, when he forgot that there have been two black women senators (one of whom was standing next to him).
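Applause and laughter counts like these can be tallied in a similar way. The sketch below assumes the transcript marks audience reactions inline as `(APPLAUSE)` and `(LAUGHTER)` and credits each one to the speaker whose turn it appears in. Both the marker format and the sample lines are assumptions for illustration, not quotes from the debate.

```python
import re
from collections import Counter, defaultdict

# Made-up excerpt; the inline "(APPLAUSE)" / "(LAUGHTER)" markers are an assumption.
SAMPLE = """\
CANDIDATE A: I have a record of getting things done. (APPLAUSE)
CANDIDATE B: That line went better in rehearsal. (LAUGHTER)
CANDIDATE A: And I can win in tough districts. (APPLAUSE)
CANDIDATE C: Let me finish my point.
"""

SPEAKER_LINE = re.compile(r"^([A-Z][A-Z .'-]+):")
REACTION = re.compile(r"\((APPLAUSE|LAUGHTER)\)")


def reactions_per_speaker(transcript):
    """Return {speaker: Counter({"APPLAUSE": n, "LAUGHTER": m})}."""
    tallies = defaultdict(Counter)
    current = None
    for line in transcript.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:
            current = match.group(1)
        if current is None:
            continue  # skip any text before the first labeled turn
        for kind in REACTION.findall(line):
            tallies[current][kind] += 1
    return tallies


tallies = reactions_per_speaker(SAMPLE)
for speaker, reactions in tallies.items():
    print(speaker, dict(reactions))
```

One caveat with this approach: a reaction that lands on its own line after a moderator interjects would be credited to the moderator, so real transcripts may need some manual cleanup.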

What Were the Candidates’ Messages?

When it comes to candidate policy positions, there is some measure of objective fact and a great deal of subjective interpretation. After all, no one expects candidates to carry out all of their campaign promises. Since policy positions are speculative, it can be important to notice what words the candidates use when discussing their views. These words will tell you what their real focus is.

Word Cloud for All Democratic Candidates

For most candidates, the most common word used during the debate was “people”. In their own ways, each politician is promising to address the problems of all Americans (though they may pander to specific factions of the party).
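A word cloud is essentially a picture of a word-frequency table. As a rough sketch of the underlying count, the snippet below finds the most common words in a block of text after dropping filler words. The stopword list is a small hand-picked set and the sample sentence is invented, so treat both as assumptions rather than the exact setup behind the clouds in this post.

```python
import re
from collections import Counter

# Small hand-picked stopword list (an assumption; real tools ship much longer ones).
STOPWORDS = {
    "the", "a", "an", "and", "of", "to", "in", "for", "on", "with",
    "we", "i", "is", "are", "it", "that", "this", "our", "have", "will",
}


def top_words(text, n=3):
    """Return the n most common non-stopword tokens as (word, count) pairs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS).most_common(n)


# Invented sample sentence, not a quote from the debate.
SAMPLE = (
    "The people of this country want change. "
    "We will fight for the people, and the people will win."
)

top = top_words(SAMPLE, n=1)
print(top)
```

Running the same function on each candidate's combined turns would give one frequency table per candidate, which is the input a word cloud needs.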

The only exceptions in this debate were Amy Klobuchar, Kamala Harris, and Pete Buttigieg. While they all used the same platitudes as other candidates, their areas of focus were slightly different. Word clouds taken from the speech of the three candidates are shown below.

From left to right: Klobuchar, Harris, Buttigieg

From left to right we see that Sen. Klobuchar is focussed on getting things “done”. Kamala Harris aims to be the candidate of “women”. Finally Pete Buttigieg wants to prove that despite his young age, he has the necessary “experience”.

When we look at the top candidates, a better measure of how they differ is the choice of adjectives they use. These words reflect how they see the issues as well as the election. The plots below show the most common adjectives used by Joe Biden (left), Bernie Sanders (middle), and Elizabeth Warren (right).

While both Warren and Biden talk about being “able” to achieve their goals, Bernie mercilessly focusses on the negative realities of American life. While both he and Warren are “tired” of billionaires and exploitation, it is Sanders who focusses specifically on the “corruption” of government. Warren’s rhetoric is a bit more measured, discussing “big” companies controlling Washington and how it affects “little” girls and boys specifically. In the end, the differences between Sanders and Warren are reflected in how negative they are willing to be in their condemnation of American capitalism and society as a whole.

Which Candidate’s Speech Had the Highest Reading Level?

As a fun exercise, I thought it would be interesting to measure the Flesch-Kincaid reading level associated with each candidate during the debate. The score rises with both the average number of words per sentence and the average number of syllables per word. High scores like those of Tulsi Gabbard (12th grade reading level) and Pete Buttigieg (11th grade level) indicate long sentences, long words, or both. Low scores like those of Elizabeth Warren (6th grade level) indicate the use of simpler language.
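For the curious, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. Below is a minimal sketch of that formula. The syllable counter is a crude vowel-group heuristic I am assuming for illustration, so its scores will differ somewhat from those of dedicated readability libraries, and the function names are my own.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count vowel groups, dropping a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level using the heuristic syllable counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


# Invented sample sentences, not quotes from the debate.
simple = flesch_kincaid_grade("We will win. We can do it.")
dense = flesch_kincaid_grade(
    "Comprehensive institutional restructuring necessitates "
    "extraordinary legislative coordination."
)
print(round(simple, 2), round(dense, 2))
```

Very short, plain text can score below zero, which is a quirk of the formula rather than a bug. The scores are most meaningful on longer passages such as a candidate's full set of turns.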

While it is tempting to be proud of complex speech from a candidate, this is usually a sign of bad communication. Most advertising in the US is targeted at a 6th grade reading level, and it has been shown that short words stick better in the mind. For instance, Donald Trump famously uses words that reflect a 5th grade reading level. It may seem shameful to communicate in such an uncomplicated way. Still, I think candidates who do so will be better understood and have a better chance to get their messages across in a general election.

Final Thoughts

Ultimately, the only true measure of a candidate’s success is how many votes they get. Nevertheless, before the final stages of the Democratic primary, I think that there are still truths to be learned from analyzing candidate speech. By looking at the words candidates use and how they use them, we may be able to predict their successes, explain their failures, and gain insights into how they think. At the very least it’s a fun approach that doesn’t rely on the same tired “takes” from the media.

Married engineer in San Francisco. Interested in words, networks, and human abstractions. Opinions expressed are solely my own.
