Trump vs Biden, US Election 2020: What does the data show us?

We're going to do a little analysis of the U.S. elections using data that has been generated over the last few months.

Over the past few months, Donald Trump and Joe Biden have traveled to different states of the USA to give speeches on their proposals to improve the United States. But what are the main topics each one raises at these events? What kind of language does each one use? To try to answer these questions, we will apply semantic analysis using natural language processing techniques.

What data has been used?

The transcripts of the first and last debates, as well as each candidate's speeches, were obtained from the web.

Main topics of each one

The first presidential debate was chaotic: neither candidate respected the other's speaking time or the topic they were supposed to discuss.

As can be seen in the following image, the words most used by both were very similar. Both relied on verbs like 'going' and nouns like 'people' when describing the promises they wanted to carry out. In addition, if we look at the phrases that were repeated the most, Donald Trump made six accusations against Hunter Biden over the 3.5 million dollars he claimed Hunter Biden had received from Russia, and the rest of his phrases were mainly recommendations that Joe Biden look into matters more carefully. Joe Biden, on the other hand, used phrases such as 'to be able to', 'doesn't know how to do' or 'that is not true' to reproach Donald Trump, arguing that most of his proposals and accusations were either unfounded or false.

First debate semantic analysis
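The article does not say how the word and phrase rankings were produced. One plausible way to get them is plain frequency counting over word tokens and n-grams. The following is a minimal sketch under that assumption; the stopword list, the function names and the sample text are all illustrative and not taken from the original analysis.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; a real analysis would use a fuller one).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "are", "we", "i", "that", "it", "in"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(text, n=10):
    """Return the n most frequent tokens that are not stopwords."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    return Counter(tokens).most_common(n)

def top_ngrams(text, size=3, n=10):
    """Return the n most frequent runs of `size` consecutive tokens."""
    tokens = tokenize(text)
    grams = zip(*(tokens[i:] for i in range(size)))
    return Counter(" ".join(g) for g in grams).most_common(n)

# Made-up sample text, not a transcript excerpt.
sample = "We have to open. We have to open the schools. People want to open."
print(top_words(sample, n=3))
print(top_ngrams(sample, size=3, n=3))
```

Stopwords are removed for single words but kept for phrases, since repeated phrases such as 'we have to open' are made mostly of common words.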

In the last debate, interruptions were prohibited so that neither candidate's speaking time would be cut short. They spent most of their time discussing the main topics of the debate.

As can be seen in the semantic analysis of this debate, Donald Trump used the word 'China' on many occasions to blame the country for Covid-19. Moreover, he argued that the best solution is to reopen restaurants, schools, businesses, and even the whole country ('we have to open').

On the other hand, Joe Biden used the word 'president' a lot to argue that the wide spread of the virus in the United States was due to Donald Trump and his administration.

Last debate semantic analysis

After their last debate, both spent their days traveling across different states, giving speeches about their proposals and criticizing their opponent. Unlike Joe Biden, Donald Trump opted to allow an audience at most of his speeches.

What both have said during these speeches has been very similar to what they said in the first debate. Each candidate's most repeated phrases summarize his key points well.

Joe Biden has been using phrases that contain 'shut down' to say that he is not going to shut down the country's economy; he is going to shut down the virus. He has also argued that the country 'can't stand another four more years' with a president like Donald Trump.

For his part, Donald Trump has claimed that the United States has achieved results that are the best 'in the history of our country'. He has also used optimistic language, saying that 'they are going to continue to keep on' and that he is going to win and deliver a better economy.

Last seven speeches semantic analysis

The debates in statistics

Comparing the first and last debates, the way of speaking and the respect for turns were clearly quite different. But does that change actually show up in the numbers?

During the first debate, Donald Trump's interventions averaged 122 words and Joe Biden's 140 words. These lengths are very short because of the continuous interruptions; Donald Trump alone made about 130. In the last debate, the average length of an intervention grew by 76.1% for Trump and 105.4% for Joe Biden, who was the most affected in the first debate.

Average words per intervention
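These averages and growth rates can be reproduced with very little code once a transcript is split into speaker turns. The sketch below assumes the transcript is available as a list of (speaker, text) pairs and that words are split on whitespace. Both are assumptions, since the article does not describe its data format, and the sample turns are made up.

```python
from collections import defaultdict

def average_words_per_turn(turns):
    """turns: iterable of (speaker, text). Returns {speaker: mean words per turn}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for speaker, text in turns:
        totals[speaker] += len(text.split())  # whitespace word count
        counts[speaker] += 1
    return {s: totals[s] / counts[s] for s in totals}

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# Made-up turns, only to show the expected input shape.
turns = [
    ("Trump", "We have to open the country"),
    ("Biden", "That is not true"),
    ("Trump", "Wrong"),
]
print(average_words_per_turn(turns))
print(percent_change(100, 150))
```

Read the other way around, the article's figures imply last-debate averages of roughly 215 words for Trump (122 × 1.761) and roughly 288 words for Biden (140 × 2.054).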

In both debates, Donald Trump spoke the most, with a total of 7241 words in the first and 7792 in the last. That is roughly 10% more than Joe Biden.

Total words of each one

Although Donald Trump spoke the most, Joe Biden used longer and more complex sentences, as can be seen in the following histogram, which shows the average length of each candidate's sentences in characters.

Number of characters per sentence
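Sentence length in characters depends on how sentences are split, which the article does not specify. The sketch below uses a deliberately naive rule (split after '.', '!' or '?' followed by whitespace) as an assumption. A dedicated sentence tokenizer would handle abbreviations and transcript quirks better.

```python
import re
from statistics import mean

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def average_sentence_chars(text):
    """Mean sentence length in characters, or 0.0 for empty input."""
    sentences = split_sentences(text)
    return mean(len(s) for s in sentences) if sentences else 0.0

# Made-up sample text, not a transcript excerpt.
sample = "Hi there. How are you? Fine!"
print(split_sentences(sample))
print(average_sentence_chars(sample))
```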

The distribution of how often they named each other is very curious. Trump naming Biden accounts for 73.3% of all mentions, by far the largest share, and Trump even named himself more often than Joe Biden did.

Percentage of mentions
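A mention breakdown like this one can be approximated by counting, for each speaker, how often each surname appears in that speaker's turns. The sketch below makes that assumption and matches only the literal surnames 'Trump' and 'Biden'. The original analysis may also have counted first names or titles, and the sample turns are made up.

```python
import re
from collections import Counter

NAMES = ("Trump", "Biden")  # literal surnames only (an assumption)

def mention_shares(turns):
    """turns: iterable of (speaker, text).
    Returns {(speaker, name): percent of all counted mentions}."""
    counts = Counter()
    for speaker, text in turns:
        for name in NAMES:
            counts[(speaker, name)] += len(re.findall(rf"\b{name}\b", text))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: 100.0 * value / total for key, value in counts.items()}

# Made-up turns, only to show the expected input shape.
turns = [
    ("Trump", "Biden said it. Biden knows. Trump wins."),
    ("Biden", "Trump is wrong."),
]
print(mention_shares(turns))
```

Each key pairs the speaker with the name mentioned, so ("Trump", "Trump") is the share of mentions in which Trump names himself.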


NLP techniques cannot predict who will win this election, but they can help us see what each candidate is interested in and understand his behavior.

As we have seen, Trump defends the progress the United States has made under his presidency and is optimistic about the election results. Joe Biden, on the other hand, argues that most of the problems in the United States today are Trump's fault, and he will fight to become President of the United States in 2020.
