Using Twitter to know where to avoid

The early days of social media, and especially Twitter, were typified by accusations that people were merely sharing what they had for dinner. Researchers from the University of Rochester have taken what used to be a stick to beat social media with and turned it into a way of grading the quality of a nearby restaurant.

The researchers believe their monitoring system can not only aid diners looking for somewhere great to eat, but also public health officials looking for more frequent ways of checking up on restaurants.

The system, called nEmesis, analyses millions of tweets, hunting for people who report an attack of food poisoning after visiting a restaurant. You might think, or at least hope, that this would be a relatively small number, but over a four-month period the researchers found 480 such mentions in New York City alone, from a total of 23,000 restaurant visitors (about 2%). What’s more, the data collected correlated well with public health data on those diners.

“The Twitter reports are not an exact indicator – any individual case could well be due to factors unrelated to the restaurant meal – but in aggregate the numbers are revealing,” said Henry Kautz, chair of the computer science department at the University of Rochester and co-author of the paper. In other words, a “seemingly random collection of online rants becomes an actionable alert,” according to Kautz, which can help detect cases of foodborne illness in a timely manner.

The system is pretty clever. First, it monitors the tweets people post. Using the geotagging feature on our phones, it then identifies which of those tweets were sent from known restaurant locations. Once it has a target in its sights, it monitors that person’s tweets for the next 72 hours to see whether they tweet about being ill.
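To make those steps concrete, here is a minimal Python sketch of the visit-then-follow logic. It is an illustration under stated assumptions, not the researchers' code: the `Tweet` record, the 25-metre matching radius, the `sick_visitors` function and the keyword-based `looks_sick` check are all hypothetical stand-ins (the real system relies on a trained model rather than keywords).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt
from typing import Optional


@dataclass
class Tweet:
    # Hypothetical record; real tweet data carries many more fields.
    user: str
    time: datetime
    text: str
    lat: Optional[float] = None  # None when the tweet is not geotagged
    lon: Optional[float] = None


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))


def looks_sick(text):
    # Assumption: a crude keyword check standing in for a trained classifier.
    keywords = ("food poisoning", "stomach ache", "throwing up")
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def sick_visitors(tweets, restaurants, radius_m=25.0, window=timedelta(hours=72)):
    """Map each restaurant name to the users who tweeted from it and then
    posted a sick-sounding tweet within the follow-up window."""
    timelines = {}
    for tweet in sorted(tweets, key=lambda t: t.time):
        timelines.setdefault(tweet.user, []).append(tweet)

    reports = {}
    for user, timeline in timelines.items():
        for visit in timeline:
            if visit.lat is None or visit.lon is None:
                continue  # only geotagged tweets can count as a visit
            for name, (rlat, rlon) in restaurants.items():
                if haversine_m(visit.lat, visit.lon, rlat, rlon) > radius_m:
                    continue
                deadline = visit.time + window
                if any(
                    visit.time < later.time <= deadline and looks_sick(later.text)
                    for later in timeline
                ):
                    reports.setdefault(name, set()).add(user)
    return reports
```

Calling `sick_visitors(tweets, {"Example Diner": (40.75, -73.99)})` with a list of `Tweet` objects returns a dictionary of restaurant names to sets of users. As the quote above notes, any single match could be unrelated to the meal, so the output is only meaningful in aggregate.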

The researchers found that around a third of the public health inspection scores could be reliably predicted from their Twitter data.

In addition to the algorithm at the heart of nEmesis, the researchers used Mechanical Turk to help train it. They paid human workers to categorise certain tweets so that the system could learn which ones to take note of and, of course, which to ignore.
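The idea of learning from human-labelled tweets can be sketched with a tiny text classifier. The example below uses a naive Bayes model purely for illustration; the article does not say which model nEmesis uses, and the function names and sample tweets are invented for the sketch.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labelled):
    """Build word counts per class from (text, is_sick) pairs, such as
    labels collected from paid human annotators.
    Assumes both classes appear at least once in the training data."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_sick in labelled:
        doc_counts[is_sick] += 1
        word_counts[is_sick].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return True if the tweet looks more like the 'sick' class."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so an unseen word does not zero the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]


# Invented examples standing in for human-labelled tweets.
labelled = [
    ("got food poisoning from lunch", True),
    ("my stomach hurts so sick", True),
    ("throwing up all night feel sick", True),
    ("great dinner with friends", False),
    ("lunch was amazing", False),
    ("sick beat on this new track", False),
]
model = train(labelled)
```

With this toy data, `predict(model, "i feel sick and my stomach hurts")` comes out as sick, while `predict(model, "dinner with friends was great")` does not. The labelled example "sick beat on this new track" shows why human judgement helps: the word "sick" alone is not a reliable signal.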

Of course, many more diners eat out without tweeting about it, so the data may not be wholly representative, but that isn’t really the point. The system isn’t designed to be a silver bullet, but rather a complementary source of information alongside public health data. It also provides real-time information that the public data perhaps lacks, whilst offering very cost-effective monitoring of the restaurant market.
