A Quick Guide to Identifying Twitterbots Using AI

Learn how bots affected the 2016 U.S. presidential election and see why intelligent AI algorithms must phase out spam, bots, and fake content from social media platforms.

In a recent blog post, we discussed how to identify fake accounts and potential spammers on Twitter. Filtering out such accounts is important for getting reliable and accurate insights. Many firms and individuals have gone a step further and now use Twitterbots to automate and speed up content delivery. One study estimated that active bots may account for as much as 15% of all Twitter users.

Initially, Twitterbots were made to reduce human effort. Take Netflix Bot, for example. It tweets whenever a new show or movie is added to Netflix.

Netflix Bot in action!

There are some other "extraordinary" ones, as well. For example, someone has created an online version of Big Ben that marks the passing of every hour. Now that humanity is spending more and more of its time online, it is only a matter of time before our monuments have an online presence, too!

Not all bots are this benign, however. A large number of Twitterbots post huge amounts of malicious and spam content on the platform, and I am sure you can find some in your own list of followers. According to Wikipedia, bots played a role in the 2016 U.S. presidential election.

Role of Twitterbots in U.S. Presidential Elections

A subset of Twitterbots programmed to complete social tasks played an important role in the United States 2016 presidential election. Researchers estimated that pro-Trump bots generated four tweets for every pro-Clinton automated account and out-tweeted pro-Clinton bots 7:1 on relevant hashtags during the final debate. Deceiving Twitter bots fooled candidates and campaign staffers into retweeting misappropriated quotes and accounts affiliated with incendiary ideals.

Twitterbots and spammers try to cloud the views of other users by constantly promoting fake news and opinions. Because no human effort is required, bots can tirelessly keep tweeting about a topic and help it trend. For a political analyst, market researcher, or anyone else seeking to do in-depth analysis using social media, it is important to identify and filter out these bots to get genuine, unbiased opinions.

The Hypothesis

Our AI-driven approach to identifying bots on social media is based on this hypothesis: tweets made by bots relate to a very narrow topic or context, while humans’ tweets are much more diverse.

How Did We Do It?

To test this hypothesis, we crawled the latest tweets posted by a large, diverse sample of Twitter accounts. For each account, we converted the text of each tweet into a vector and measured how similar the tweets were by averaging the distances between those vectors.

If a handle keeps tweeting about the same topic and theme, its tweets (the individual data points) lie close together in the vector space because they are semantically similar. These closely packed tweets form a cluster. We quantify the similarity by calculating the cosine distance between pairs of data points.
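
As a rough illustration of that cosine distance, here is a minimal Python sketch. The study does not say how tweets were turned into vectors, so the bag-of-words counts below are an assumption made only for this example, and the helper names tweet_to_vector and cosine_distance are hypothetical.

```python
import math
import re
from collections import Counter


def tweet_to_vector(text: str) -> Counter:
    """Turn a tweet into a sparse bag-of-words vector (token -> count)."""
    # Assumption: lowercase word counts stand in for whatever embedding the study used.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # Convention chosen here: an empty tweet is maximally distant.
    return 1.0 - dot / (norm_a * norm_b)


# Two tweets on the same theme versus two tweets on unrelated themes.
same_topic = cosine_distance(
    tweet_to_vector("New movie added on Netflix"),
    tweet_to_vector("New show added on Netflix"),
)
different_topic = cosine_distance(
    tweet_to_vector("New movie added on Netflix"),
    tweet_to_vector("Traffic is terrible this morning"),
)
print(same_topic < different_topic)  # True
```

With word counts, the distance falls between 0 (same direction) and 1 (no words in common). A semantic embedding would also place differently worded tweets about the same topic close together, which plain word counts cannot do.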

A representation of clusters

The table below shows the results of the analysis. Here, mean distance is the average of the cosine distances between all pairs of an account's tweets. The lower the mean distance, the more similar the tweets. The aforementioned Big Ben bot has the lowest mean distance among the chosen accounts because its posts contain only the word BONG.
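
The mean distance described above can be sketched in Python as an average over every unordered pair of tweet vectors. This is an illustrative sketch rather than the study's actual code: the function names are hypothetical, and the small three-dimensional vectors are invented purely for the example.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cosine similarity for two dense, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(vectors: Sequence[Sequence[float]]) -> float:
    """Average the cosine distance over every unordered pair of tweet vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two tweets to compare")
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical tweet vectors, invented for illustration only.
narrow_account = [[1.0, 0.0, 0.0], [2.0, 0.1, 0.0], [3.0, 0.0, 0.1]]  # one theme
varied_account = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # unrelated themes

print(mean_pairwise_distance(narrow_account) < mean_pairwise_distance(varied_account))  # True
```

An account whose tweets all point in nearly the same direction gets a mean distance close to 0, which is the pattern the hypothesis associates with bots.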

Mean distance table

We also chose a few spammer accounts to highlight the difference between a bot and a spammer (we did a similar analysis to detect spammer accounts). Spammers post about multiple topics from time to time, whereas bots generally stick to a specific theme. Thus, a spammer's mean distance is far greater than a bot’s. Notice that the mean distance of TOIIndiaNews (a leading Indian news publisher) is close to that of the bots. Such handles generally follow a standardized structure when posting news, so their mean distance is relatively small.

Impacts of Bots on the Real World

Below, I list a few cases where Twitterbots were influential and why it is important to identify them.

  • The number of followers on social media is considered a popularity metric for celebrities. But should it be, really? As mentioned earlier, around 15% of Twitter users might be bots. Thus, the raw number of followers is not a reliable metric for popularity. During the 2012 U.S. presidential election, it was reported that 29.9% of Barack Obama’s followers might have been bots or fake accounts. The figure for Mitt Romney was around 21.9%. The number of followers after removing bots and spammers can serve as a better popularity metric.
  • Twitterbots have been said to have influenced the opinions of voters by tweeting and retweeting tons of pro-Trump content during the 2016 U.S. presidential election. As mentioned earlier, pro-Trump bots generated four tweets for every pro-Clinton automated account and out-tweeted pro-Clinton bots 7:1 on relevant hashtags during the final debate. Some of the content shared by these bots was fake and deceptive. Thus, it becomes really important to clearly identify these bots to get views and opinions only from real people.
  • The recently concluded French presidential election also saw the involvement of bots. Just before the election, a massive 9 GB of classified campaign documents related to Emmanuel Macron was posted online. Twitterbots kept posting about it and helped make it a trending topic just hours before the election, though it seems to have had little effect on the outcome, as Macron won comfortably (which we predicted correctly using AI).
  • Suppose a brand hires a marketing agency for a publicity campaign. To judge the campaign's efficacy, the brand needs to know whether its virality came from a push by spammers and bots. If it did, the campaign may hurt the brand, which ends up with a falsely inflated number of followers. These bots are not real customers, so the brand loses on both ends.
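
The adjusted follower count suggested in the first case above amounts to simple arithmetic. The sketch below assumes the share of fake followers is already known. The function name and the follower total of one million are hypothetical, while the 29.9% and 21.9% shares are the reported figures quoted above.

```python
def adjusted_followers(total_followers: int, fake_fraction: float) -> int:
    """Estimate the genuine follower count after discounting bots and fake accounts."""
    if not 0.0 <= fake_fraction <= 1.0:
        raise ValueError("fake_fraction must be between 0 and 1")
    return round(total_followers * (1.0 - fake_fraction))


# Hypothetical follower total; only the fake-account shares come from the article.
print(adjusted_followers(1_000_000, 0.299))  # 701000
print(adjusted_followers(1_000_000, 0.219))  # 781000
```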

These are a few notable cases in which bots have influenced the views of an audience. Though originally meant to play a more helpful role on social media, bots are now mostly treated as spam on Twitter, and social media platforms are constantly being optimized to fight this menace. Like any other technology, bots can help in many ways if used ethically, including customer support, marketing, and general business development. Interesting times are ahead as we enter the era of machine intelligence. It is up to intelligent AI algorithms to help us phase out spam, bots, and fake content from social media platforms.

The above study was carried out by Karna AI, the Market Research division of ParallelDots, Inc.

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
