
Using Big Data to Spot Extremists Online Before They Post Anything Dangerous


We take a look at the work of data scientists and software engineers out of MIT who are using predictive analytics and text analysis on Twitter.


There is a growing demand among politicians and the wider public for social networks to identify and remove hate speech from their websites. Doing so is easier said than done, but a new study from the Massachusetts Institute of Technology highlights how extremists can potentially be identified even before they post any threatening content.

The scale of the problem was highlighted in 2016 when Twitter revealed that it had shut down around 360,000 ISIS-related accounts. Identifying most such accounts depends on users reporting them, and there is little that can be done to prevent a suspended user from simply creating another account.

"Social media has become a powerful platform for extremist groups, ranging from ISIS to white nationalist "alt-right" groups," the study authors say. "These groups use social networks to spread hateful propaganda and incite violence and terror attacks, making them a threat to the general public."

Spotting Them Early

The research team examined Twitter data from around 5,000 users who had already been identified as ISIS members, or who were connected to known ISIS members. This data included around 4.8 million tweets, together with data on account suspensions, both of the accounts themselves and of their friends and followers.

The team used this data to create a model that they believe allows them to predict new extremist users, while also identifying multiple accounts belonging to the same person. The model can also predict the network connections of extremist users who have been banned and have since created a new account.
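To make the idea of linking a new account to a suspended one more concrete, here is a minimal sketch in Python. It is not the study's model, which the article does not describe in detail. It assumes, purely for illustration, that overlap between friend lists is one usable signal, and it scores that overlap with Jaccard similarity. The function names and the sample account data are hypothetical.

```python
def friend_overlap(old_friends, new_friends):
    """Jaccard similarity between two sets of friend IDs (0.0 to 1.0)."""
    old, new = set(old_friends), set(new_friends)
    union = old | new
    if not union:
        return 0.0  # two empty friend lists carry no signal
    return len(old & new) / len(union)


def rank_candidates(suspended_friends, candidates):
    """Sort candidate accounts by friend overlap with a suspended account.

    `candidates` maps a (hypothetical) account name to its set of friend IDs.
    Returns a list of (name, score) pairs, highest score first.
    """
    scored = [
        (name, friend_overlap(suspended_friends, friends))
        for name, friends in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up data: friends of a suspended account and of two new accounts.
suspended = {"a", "b", "c", "d"}
candidates = {
    "new_1": {"a", "b", "c", "x"},  # shares three friends with the old account
    "new_2": {"y", "z"},            # shares none
}

for name, score in rank_candidates(suspended, candidates):
    print(f"{name}: {score:.2f}")
```

A real system would presumably combine many more signals than this single score, such as profile details and the suspension history mentioned above.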

The researchers were also able to correctly identify around 70% of the additional Twitter accounts used by extremists, with just 2% of the accounts they flagged being false positives.
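The two percentages measure different things, and a small worked example helps separate them. The 70% figure is the share of real additional accounts that were found (recall), while the 2% figure is the share of flagged accounts that turned out to be wrong (the false discovery rate). The sketch below uses made-up counts chosen only to reproduce those two ratios. They are not figures from the study, and `detection_metrics` is a hypothetical helper.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Return (recall, false_discovery_rate) for a set of flagged accounts."""
    flagged = true_positives + false_positives   # everything the model flagged
    actual = true_positives + false_negatives    # everything it should have found
    recall = true_positives / actual
    false_discovery_rate = false_positives / flagged
    return recall, false_discovery_rate


# Hypothetical counts, chosen to match the ratios quoted in the article.
recall, fdr = detection_metrics(
    true_positives=686,   # extremist accounts correctly flagged
    false_positives=14,   # ordinary accounts flagged by mistake
    false_negatives=294,  # extremist accounts the model missed
)

print(f"recall: {recall:.0%}")
print(f"false discovery rate: {fdr:.0%}")
```

Under these assumed counts, 700 accounts are flagged in total, 14 of them wrongly, and 294 real accounts go undetected.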

"We created a new set of operational capabilities to deal with the threat posed by online extremists in social networks," the researchers explain. "We are able to predict who is an extremist before they post any content, and are then able to predict where they will re-enter the network after they are suspended. In short, we can automatically figure out who is an extremist and keep them off the social network."

The study focused purely on spotting potential members of ISIS, but the team is confident that the approach can easily be used to spot various other kinds of extremists too.

"Users that engage in some form of online extremism or harassment will have very similar behavioral characteristics in social networks," said Klausen. "They will connect to a specific set of users which form their extremist group. They will create new accounts which will resemble their old accounts after being suspended, and when they return to the social network following a suspension, there is a high probability they will reconnect with certain former friends."


Topics: big data, text analysis, predictive analytics, data analysis

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
