Over a million developers have joined DZone.

Will They Blend? Experiments in Data and Tool Blending

DZone 's Guide to

Will They Blend? Experiments in Data and Tool Blending

Today's article in the "Will they Blend" series is on "The GDPR Force Meets Customer Intelligence — Is It The Dark Side?"

· AI Zone ·
Free Resource

In this blog series, we’ll be experimenting with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with NoSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when IBM Watson meets Google News, Hadoop Hive meets Excel, R meets Python, or MS Word meets MongoDB?

Today: The GDPR Force Meets Customer Intelligence — Is It The Dark Side?

The Challenge

The European Union is introducing the General Data Protection Regulation on May 25, 2018. This new law has been called the world’s strongest and most far-reaching law aimed at strengthening citizens' fundamental rights in the digital age. The new law applies to any organization processing personal data about a citizen or resident in the EU regardless of that organization's location. It is definitely a force to be reckoned with as the fines for non-compliance are extremely high.

Many of the law’s articles deal with how personal data is collected and processed and includes special emphasis, guidelines, and restrictions on what the EU calls “automatic profiling with personal data.” So what has that to do with us data scientists? Many of us use methods and machine learning to create new fact-based insight from the customer or personal data that we collect so that we can take decisions, possibly automatically. So the law applies to us. But that customer intelligence we create is fundamental to the successful running of our organizations business.

Does this mean this new force will put limits on what we do? Or even worse, does it mean that we have to go over to the dark side to help our businesses? Absolutely not! The new law defines that we must gain permission, perform tests, and document. In no way does it restrict what honest data scientists can do. If anything, it provides us with opportunities.

Topic: GDPR meets Customer Intelligence

Challenge: Apply machine learning algorithms to create new fact-based insights taking the new GDPR into account

The Experiment

Distilling the GDPR down for data scientists means we need to look at six topics:

  1. Identify fields of information we will use that contain personal data
  2. Identify any fields of information that contain “special categories of data” (i.e.: those that could discriminate)
  3. If any special categories of data exists, ensure that there are no highly correlated fields that could mimic the special categories
  4. Possibly anonymize any personal data so that the GDPR regulations do not apply and …
  5. …Explain an automatic processing decision (or a model or rule in our case) in a way that is understandable
  6. Document our process that uses all of these data for “automatic processing”

Figure 1. This workflow shows an example of analyzing customer data while adhering to GDPR.

And that is exactly what we do. Since the details of each of these steps and the appropriate workflow examples go far beyond a blog, you can get a full white paper as well as KNIME public server examples.

Workflows are available on the EXAMPLES server under 50_Applications/34_GDPR_examples

Whitepaper "Taking a proactive approach to GDPR with KNIME" is available here.

If we then go on to use the KNIME Server, we can have these standard workflows available for anyone doing work on personal data.

The Results

While the GDPR does introduce new requirements that many of us will need to follow if we are using personal data, in no way does it limit us. If anything, it allows us to surface to our management and others within our organization, showing that what we do with customer intelligence using KNIME is well thought through, well documented, and fully supportive of our organization’s GDPR requirements — so yes, customer intelligence and the GDPR blend! It's definitely not the dark side at all!

data science ,artificial intelligence

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}