- Salesforce partnered with The Pudding to create an app showing the topics dominating Congress’ discourse in the run-up to the 2020 U.S. elections
- They used a highly accurate AI-powered model to analyze Congressional tweets and discover which topics representatives mention the most (Alexandria Ocasio-Cortez on climate change, Mitch McConnell on immigration, for instance)
- Updated daily, the politically neutral app offers a quick and easy way to understand what matters in Congress, and stave off information overload
- Visit the app to find out who tweets about an issue more than the rest of Congress, as well as activity by party and state
Imagine you’re standing in a square crowded with people on soapboxes, each one shouting into a megaphone. To your left, TV screens beam out political commentary in a roar of competing messages. To your right, pamphleteers jostle to hand out flyers that say they can help you decide how to vote. And the light aircraft flying overhead? Each one’s trailing a slogan-filled megabanner.
If that sounds strangely familiar, research shows it’s the way most of us feel when it comes to information about politics these days — that is, overloaded to the point of exhaustion. A recent survey found a sizable portion of social media users say they feel more worn out than excited by the number of political posts they encounter. Another study found almost seven in 10 Americans have news fatigue.
Recently, however, Ryan Van Wagoner, Director of Product Marketing at Salesforce, had an idea. With information overload likely to become even more pressing as Americans head to the polls next year, what if there was a way to cut through the noise? More specifically, what if Salesforce could produce an AI algorithm to analyze Congressional tweets in a way that showed where representatives stood on key topics — everything from healthcare and jobs to climate change and gun control?
Van Wagoner is one of a small group of AI experts working on Salesforce’s AI Research team. “By applying the latest developments in AI research, I was pretty sure we could create an app to help people better understand their government officials at state, regional, and national levels, in just a few clicks,” he says.
Harnessing the power of Einstein
And so, working with interactive data visualization experts at The Pudding, the team set about creating an app that could do just that.
A collective of journalist-engineers who specialize in explaining big cultural issues through interactive visual essays, The Pudding seemed a natural fit to help.
“We use original datasets, primary research, and interactivity to explore complex topics,” explains Matt Daniels, The Pudding’s CEO. “Our goal is to advance public discourse and avoid media echo chambers.”
Daniels and his team believed that the Twitter project would tick those boxes. That’s because it was an opportunity to give people a metapicture of Congress that didn’t really exist in any other way. After all, like other social media platforms, Twitter is a fire hose of information. Currently, nearly all of the 535 members of Congress have active Twitter accounts, making it the most popular social media platform for lawmakers across both parties and age groups.
Likewise, AI experts inside Salesforce are constantly looking for examples of how Einstein — the artificial intelligence layer baked into the Salesforce Platform — can empower people with AI tools to make smarter decisions and be more productive.
“Over the last few years, we’ve worked hard to research and develop a bunch of ready-made tools harnessing Einstein that can help make the world a better place. We’ve investigated using computer vision to identify cancer in images, for example, along with a tool that can count sharks off the coast, then feed that information to lifeguards. The possibilities are limitless,” says Michael Jones, Director of AI Research at Salesforce.
The challenge for both teams was to see if Einstein could be harnessed in a new direction — to help people rapidly identify, at scale, the key topics dominating Congress’ political discourse in the run-up to the elections.
“In the end, deciding to go ahead with this project came down to our collective confidence in Einstein,” says The Pudding journalist Charlie Smart. “We realized wielding a tool like that would actually make this project easy for us. It meant we’d be able to quickly surface some really cool results without needing to bring in a roster of data scientists to pull it all off.”
Now all the team had to do was build the tool.
Training the model
As a first step, Salesforce and The Pudding set about training an AI-powered model to recognize political topics mentioned in tweets. Their goal was to see which topics were being discussed most, and in that way grasp emerging trends. At the same time, they trained their model to recognize names and places mentioned in tweets.
The difference between AI and earlier generations of computing is that it can learn from examples, then devise its own solutions to problems. This is made possible by deep learning algorithms, which ingest large amounts of information and can handle unstructured data, such as written language, images, and speech.
These advances in deep learning make it possible for Einstein to complete tasks with much higher degrees of accuracy than would have been possible even a few years ago. That’s because it uses multilayered algorithms that analyze data more accurately than older, simpler systems that relied on techniques such as keyword matching. Here, each layer of processing builds on the analysis performed by the layers before it, detecting patterns so that data can be classified automatically into labeled categories.
In this case, the team first trained their model on approximately 3,000 tweets that they had manually classified into 20 primary categories, including agriculture, criminal justice, healthcare, guns, and jobs. “That helped the system develop a probability that a tweet fell within a given issue,” says Jones.
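To make the idea of “a probability that a tweet fell within a given issue” concrete, here is a minimal sketch. It is not the Einstein model: it swaps in a simple word-count (naive Bayes) classifier, and the six training examples and the function names are invented for illustration. The real model was trained on thousands of hand-labeled tweets across 20 categories.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-labeled examples, invented for this sketch.
TRAINING = [
    ("farm subsidies help rural growers", "agriculture"),
    ("crop insurance for family farms", "agriculture"),
    ("expand coverage and lower drug prices", "healthcare"),
    ("protect patients with preexisting conditions", "healthcare"),
    ("new factory brings jobs to our district", "jobs"),
    ("unemployment is down and wages are up", "jobs"),
]


def train(examples):
    """Count words per topic and how many examples each topic has."""
    word_counts = defaultdict(Counter)
    topic_counts = Counter()
    for text, topic in examples:
        topic_counts[topic] += 1
        word_counts[topic].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, topic_counts, vocab


def topic_probabilities(text, model):
    """Return a dict mapping each topic to an estimated P(topic | text)."""
    word_counts, topic_counts, vocab = model
    total_examples = sum(topic_counts.values())
    log_scores = {}
    for topic, n_examples in topic_counts.items():
        score = math.log(n_examples / total_examples)  # prior for the topic
        denom = sum(word_counts[topic].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # skip words never seen in training
                # Add-one smoothing so unseen word/topic pairs are not zero.
                score += math.log((word_counts[topic][word] + 1) / denom)
        log_scores[topic] = score
    # Turn log scores into probabilities that sum to 1.
    peak = max(log_scores.values())
    weights = {t: math.exp(s - peak) for t, s in log_scores.items()}
    norm = sum(weights.values())
    return {t: w / norm for t, w in weights.items()}


model = train(TRAINING)
probs = topic_probabilities("lower drug prices for patients", model)
best = max(probs, key=probs.get)
print(best, round(probs[best], 2))
```

The point of the sketch is the output shape: every tweet gets a score for every category, and the app can then keep the most likely one.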
Overcoming training challenges
The team used a couple of Einstein Language APIs to help the model parse through all the data. “One of the APIs is Einstein Intent, which works to classify tweets into a topic,” says Van Wagoner. “Take a July 24 tweet from Congressman Jim Baird, for example, about supporting the development of young innovators in STEM. Our classifier would be able to categorize that as an education-related tweet.”
Einstein’s deep learning algorithms are able to link this kind of tweet to education, even though the actual word “education” isn’t mentioned. As Van Wagoner explains, “We can feed it a thousand tweets about education and what it will learn is, ‘Oh, STEM has something to do with this topic.’ Traditional, rules-based AI techniques would completely miss that.”
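The contrast Van Wagoner describes can be sketched in a few lines. This is only an illustration under stated assumptions: the labeled tweets are invented, the example tweet is a paraphrase rather than the actual text, and simple word-to-topic counts stand in for the deep learning model Einstein uses.

```python
from collections import Counter, defaultdict


def keyword_rule(text):
    """Rules-based baseline: fires only if the literal topic word appears."""
    return "education" if "education" in text.lower() else None


# Hypothetical labeled tweets, invented for this sketch.
LABELED = [
    ("funding stem programs in our schools", "education"),
    ("students need better stem teachers", "education"),
    ("proud of our local stem scholarship winners", "education"),
    ("border security funding vote today", "immigration"),
    ("new visa rules for seasonal workers", "immigration"),
]


def learn_associations(examples):
    """Count, for every word, how many tweets of each topic contain it."""
    assoc = defaultdict(Counter)
    for text, topic in examples:
        for word in set(text.lower().split()):
            assoc[word][topic] += 1
    return assoc


def learned_guess(text, assoc):
    """Pick the topic whose training tweets share the most words with the text."""
    votes = Counter()
    for word in set(text.lower().split()):
        votes.update(assoc.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


assoc = learn_associations(LABELED)
tweet = "supporting young innovators in stem"  # paraphrased, not a real tweet

rule_result = keyword_rule(tweet)             # the word "education" never appears
learned_result = learned_guess(tweet, assoc)  # "stem" was seen under education
print(rule_result, learned_result)
```

The keyword rule returns nothing because the tweet never says “education,” while the learned associations link “stem” to that topic from the examples alone.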
However, this kind of approach is hard to perfect. That’s due to the way language keeps changing, as much as the need for contextual knowledge about the world. Sentiment analysis today is fairly good at identifying whether a statement in plain English is positive or negative, for instance. But acronyms, sarcasm, and irony — as well as just plain old poor data — are among the things that can still throw the algorithms off.
The team discovered this to their cost when they found their model was putting tweets related to holidays into “military” and “Armed Forces” categories.
“Turned out that was because there were a handful of tweets in the training data saying ‘Happy Veterans Day!’” says The Pudding journalist and designer Charlie Smart. “The classifier wasn’t reading that as a holiday-related topic, but as a military one. Adding additional category training data on holidays helped us quickly fix that problem.”
Throughout the training process, the experts ran tests to see how well their system was able to classify tweets. “Initially, these came back with only 60% accuracy levels,” says Jones. “We obviously needed to improve that — no congressperson wants their opinions misrepresented, and we have a serious responsibility to ensure that no one using this tool is in any way misled.”
The team used a few different tools and model metrics to understand where there were inaccuracies and how they could solve them.
“We discovered the system didn’t really have a good idea about certain topics — it was mixing up ‘economics and public finance’ with ‘energy,’ for example,” says Van Wagoner. “Basically, some of our training data had been misclassified, so the machine was confused.”
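The article does not name the tools or metrics the team used. One common way to find this kind of mix-up is to compare predicted labels with hand-checked ones and count which pairs of categories get confused. The sketch below assumes a small, invented set of test results and hypothetical function names.

```python
from collections import Counter

# Hypothetical (true label, predicted label) pairs from a held-out test set.
RESULTS = [
    ("energy", "energy"),
    ("energy", "economics and public finance"),
    ("economics and public finance", "energy"),
    ("economics and public finance", "economics and public finance"),
    ("economics and public finance", "energy"),
    ("healthcare", "healthcare"),
    ("healthcare", "healthcare"),
    ("guns", "guns"),
    ("jobs", "jobs"),
    ("jobs", "jobs"),
]


def accuracy(pairs):
    """Share of test tweets whose predicted topic matches the true topic."""
    correct = sum(1 for truth, predicted in pairs if truth == predicted)
    return correct / len(pairs)


def confusions(pairs):
    """Count each (true, predicted) combination where the model was wrong."""
    return Counter((t, p) for t, p in pairs if t != p)


acc = accuracy(RESULTS)
worst_pair, worst_count = confusions(RESULTS).most_common(1)[0]
print(f"accuracy: {acc:.0%}")
print(f"most common mix-up: {worst_pair[0]} labeled as {worst_pair[1]} ({worst_count}x)")
```

A tally like this points to the categories whose training examples are worth re-checking by hand.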
By refining their model in an iterative process, ultimately feeding it around 900,000 tweets as part of the training, the team got accuracy levels up to their 90% benchmark. Now, they could look at the results they were getting and know they were trustworthy.
How Congress tweets
So how does the tool work, exactly, and what has it revealed? “If you go to the project site, you can quickly find results by representative: for instance, which individual tweets about an issue more, compared to the rest of Congress. You can also find out which party and state tweets about different issues more,” Daniels says.
For instance, about 12% of Mitch McConnell’s tweets were about immigration, compared to only 5% of all Congress’ tweets.
“Or you can look at specific topics — immigration, for one. Our results show just how many more members of Congress in border states like Arizona and Texas tweet about that than those in northern states. So, we have geographic patterns that we’re bringing out, as well as patterns by party.”
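The McConnell comparison above comes down to simple arithmetic: a member’s share of tweets on a topic set against the share for Congress as a whole. The sketch below reproduces the 12% and 5% figures with invented tweet counts. The function name and the counts are assumptions, not the app’s actual code or data.

```python
def topic_share(topics, topic):
    """Fraction of labeled tweets that fall under the given topic."""
    return topics.count(topic) / len(topics) if topics else 0.0


# Hypothetical topic labels, sized only to reproduce the 12% and 5% shares.
member_topics = ["immigration"] * 12 + ["other"] * 88
congress_topics = ["immigration"] * 50 + ["other"] * 950

member_share = topic_share(member_topics, "immigration")      # 12 of 100
congress_share = topic_share(congress_topics, "immigration")  # 50 of 1,000
relative = member_share / congress_share

print(f"member: {member_share:.0%}, Congress: {congress_share:.0%}, "
      f"ratio: {relative:.1f}x")
```

On these numbers the member tweets about immigration roughly 2.4 times as often as Congress overall. The same calculation, grouped by party or state instead of by member, would give the other comparisons Daniels mentions.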
Other revealing insights come from looking at when lawmakers tweet about topics — and when they don’t. “A big question for us is: ‘Okay, what are the trends in topics that people tweet about over time?’” says Smart. “We can see, for example, that some members of Congress tweet more about an issue like climate change, say, when a particular incident relating to this topic has cropped up in the media. Meanwhile, others also raise the topic even when it’s not part of the news cycle.” Alexandria Ocasio-Cortez, for instance, tweets about the environment dramatically more than the Congressional average.
In short, the tool can make it clear what congressional commentary is powered by the news cycle and what is not. “Its findings aren’t tied to any media echo chamber,” says Daniels. “Instead, there’s potential to highlight the issues that will stay as relevant one year from now as they are today.”
For Van Wagoner, the fact that the tool is politically neutral makes it a particularly powerful asset. “We don’t provide prescriptive answers, or our opinion on any given set of tweets,” he says. “Say we see a particular congressperson is tweeting a lot about redistributive taxation. There’s no comment at all from us or The Pudding on whether that’s a good or a bad thing. We just want to surface this information for the general population in as unbiased a way as we possibly can, then let them decide.”
Cutting through the noise
With the 2020 elections approaching, Salesforce and The Pudding plan to update their app on a daily basis as new congressional tweets come in.
“It’s been pretty exciting seeing the more finalized mock-ups and then the live link and actually being able to say, ‘Wow. This isn’t just a big spreadsheet of numbers anymore. This is something that can actually be useful to people, starting now,’” says Van Wagoner.
“The potential is amazing,” agrees Jones. “There’s also the notion that even on Capitol Hill itself, candidates and representatives might find it useful. Imagine the conversation: ‘Okay, my press kit says I’m big on immigration, but how am I coming across on the issue on Twitter, compared to others?’ Our tool could help to gauge that.”
Ultimately, however, the team agrees that the real kick they’re getting from their Twitter project is simply putting a world-class AI-powered tool in ordinary people’s hands. Jones puts it like this: “If our app can help people better understand daily commentary from political leaders and more easily engage with them as citizens, our job is done.”
To find out more about Salesforce’s AI research, check out https://einstein.ai/