
Using Search Data to Detect Cancer


Despite the billions that have been spent trying to beat cancer, it remains one of the deadliest diseases known to man. The last year has seen a number of attempts to use AI to better diagnose the disease in the hope that faster detection can lead to better treatment.



Earlier this year, for instance, I wrote about a project called GenePattern, which is designed to help researchers identify clusters of genetic variations that, when pooled together, could signify how cancer cells are activated or how they might respond to treatments.

The researchers have made the algorithm freely available to the scientific community via GenePattern.org in the hope that it will encourage further development by users.

“This computational analysis method effectively uncovers the functional context of genomic alterations, such as gene mutations, amplifications, or deletions, that drive tumor formation,” the researchers say.

Early Detection

In a second project, documented in a recently published paper, researchers at Microsoft mined data from the Bing search engine to try to spot cancer.

The data, from over 6.4 million users, suggests that we might be able to detect pancreatic cancer from someone's search queries. This was possible because users were typing in symptoms of the disease some time before they actually sought treatment for it.

So, when the researchers mined the data for queries related to the symptoms of the disease and then cross-referenced them with other potential risk factors, they were able to detect the cancer some five months before the users were officially diagnosed.
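To make the idea of measuring lead time from search logs concrete, here is a minimal sketch in Python. It is not the researchers' method: the log format, the symptom phrases, the diagnosis date, and the function names are all hypothetical, made up purely for illustration. It simply finds the first date on which a user searched for a symptom-like phrase and counts the days until a known diagnosis date.

```python
from datetime import date

# Hypothetical symptom phrases for illustration only; the study's real
# term list and matching rules are not described in this article.
SYMPTOM_TERMS = ("yellowing skin", "dark urine", "abdominal pain", "itchy skin")


def first_symptom_search(log, user_id, terms=SYMPTOM_TERMS):
    """Return the earliest date the user searched for any symptom term."""
    dates = [
        day
        for uid, day, query in log
        if uid == user_id and any(term in query.lower() for term in terms)
    ]
    return min(dates) if dates else None


def lead_time_days(log, user_id, diagnosis_day):
    """Days between the first symptom search and the diagnosis, or None."""
    first = first_symptom_search(log, user_id)
    if first is None or first > diagnosis_day:
        return None
    return (diagnosis_day - first).days


# Made-up log rows: (user_id, search date, query text).
log = [
    ("u1", date(2016, 1, 10), "Dark urine causes"),
    ("u1", date(2016, 2, 2), "abdominal pain after eating"),
    ("u1", date(2016, 3, 1), "best pizza near me"),
    ("u2", date(2016, 1, 5), "cheap flights"),
]

# Assumed diagnosis date for user u1, also made up.
print(lead_time_days(log, "u1", date(2016, 6, 8)))
```

A real system would need far more than substring matching, not least the cross-referencing against other risk factors mentioned above, but the sketch shows the basic shape of the question being asked of the data.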

The team is at pains to point out that, at this stage, the finding does not mean an accurate diagnostic tool is available, although that is clearly the direction they are heading in.

“The goal is not to perform the diagnosis,” they say. “The goal is to help those at highest risk to engage with medical professionals who can actually make the true diagnosis.”

It’s part of a wider area of study that attempts to use things such as search queries and social media postings to ascertain lifestyle characteristics. Various such studies have found these signals to be highly effective markers, but this is perhaps one of the first to show their potential for something as serious as cancer.

The authors acknowledge that their approach is not without drawbacks: the team had to do quite a bit of work to clean up the dataset and remove any potential outliers that would muddy the results.

It does, however, suggest that we may one day be able to use this kind of approach to get a head start on the treatment of many of the most damaging diseases mankind faces today. It’s a trend that’s well worth keeping an eye on.



Published at DZone with permission of Adi Gaskell, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
