GDPR: Threat or Opportunity?

We take a look at what can be expected when the GDPR legislation takes effect in May of this year. We also examine whether there is a way to view this as an opportunity.

In May 2018, the European Union’s General Data Protection Regulation (GDPR) will take effect. The regulation makes sweeping changes to data privacy law and applies to any company dealing with the personal data of EU data subjects. The GDPR contains a number of core data processing and retention mandates, which will generally require major overhauls of current enterprise data practices.

I will not cover every detail of the GDPR in this post, mainly because I am not a lawyer. Instead, I want to discuss the general threat the GDPR poses to enterprises and the urgent need to be compliant. As a tool for GDPR compliance, I have found the GRAKN.AI database to be very useful.

I currently work as a full-stack developer for the European Respiratory Society (ERS). The ERS is an international organization, based in Switzerland, that brings together physicians, healthcare professionals, scientists, and other experts working in respiratory medicine. In this capacity, I have begun to build a proof of concept of a system that uses GRAKN.AI not only to achieve GDPR compliance for our data but also to improve the user experience across our many websites by providing content personalization and recommendation. In addition to content recommendations within our websites, I also intend to use Grakn to provide recommendations for our conferences. This system will build personalized congress itineraries, helping our delegates navigate our two hundred plus sessions over five days. Ultimately, the ERS will be able to provide an overview of pulmonary medicine with its “pulmonary knowledge base.”

Now that you’re aware of my background and the general use case I am building with GRAKN.AI, it would be helpful to understand a bit more about what GDPR is and why compliance is essential.

GDPR and Compliance

The GDPR is a huge paradigm change. Before the GDPR, data regulation was mostly optional, and one of its most visible effects was in newsletters, where the unsubscribe link became mandatory. The GDPR now requires “privacy by design and by default” (Art. 25). This means that any new application has to be designed around privacy rather than treating it as an afterthought, and you have to be able to demonstrate that this is, in fact, the case. The software might even need to be certified, as the GDPR encourages certification:

The Member States, the supervisory authorities, the Board and the Commission shall encourage, in particular at Union level, the establishment of data protection certification mechanisms and of data protection seals and marks, for the purpose of demonstrating compliance with this Regulation of processing operations by controllers and processors. The specific needs of micro, small, and medium-sized enterprises shall be taken into account. (Art 42.1)

Being certified and able to showcase a mark or a seal for a product could be a key differentiator: it may decide whether a company wins new contracts or loses current clients.

Consider a small service provider that works with big international companies. Suppose one of those companies is inspected for data compliance because someone complained about receiving unsolicited emails that the small provider sent. The big company may be fined up to 4% of its annual global turnover, and it will want its money back. Lawyers will soon be knocking on the door of the small company.

This means that big companies will only work with partners that are GDPR-compliant, as they will not want to risk 4% of their turnover. Companies that fail to provide proof of their compliance might find themselves out of business.

Of course, “privacy by design” does not exempt older software. It will need to be adapted in order to comply. That process also needs to be documented, as it is important that a company can prove that it did everything possible to be compliant.

Given the size and importance of the EU market, the GDPR has serious consequences for companies across the world. Nobody knows exactly what will happen when these regulations take effect. What we know right now are the penalties for non-compliance:

Under the GDPR, organizations in breach of the GDPR can be fined up to 4% of annual global turnover or €20 Million (whichever is greater). This is the maximum fine that can be imposed for the most serious infringements, e.g. not having sufficient customer consent to process data or violating the core of Privacy by Design concepts. There is a tiered approach to fines, e.g. a company can be fined 2% for not having their records in order (article 28), not notifying the supervising authority and data subject about a breach, or not conducting impact assessment. It is important to note that these rules apply to both controllers and processors — meaning ‘clouds’ will not be exempt from GDPR enforcement. (ref: Key Changes)

The GDPR is clearly a threat if you do not comply, as the maximum fine for the most serious infringements is €20 million or 4% of annual global turnover, whichever is greater. The 4% ceiling applies when four percent of global turnover exceeds twenty million euros.
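The "whichever is greater" rule quoted above is simple arithmetic, and a minimal sketch makes the break-even point visible. The function name and the sample turnover figures are illustrative assumptions; the result is only the ceiling of the top tier, not a prediction of what a regulator would actually impose.

```python
FLAT_CAP_EUR = 20_000_000   # flat ceiling quoted in the "Key Changes" summary
TURNOVER_PERCENT = 4        # percentage ceiling from the same summary


def max_top_tier_fine(annual_global_turnover_eur: int) -> float:
    """Return the ceiling of the top fine tier: the greater of the two caps."""
    percent_cap = annual_global_turnover_eur * TURNOVER_PERCENT / 100
    return max(FLAT_CAP_EUR, percent_cap)


# Hypothetical turnovers; the two caps meet at EUR 500 million.
for turnover in (100_000_000, 500_000_000, 2_000_000_000):
    print(f"turnover {turnover:>13,} EUR -> ceiling {max_top_tier_fine(turnover):>12,.0f} EUR")
```

Below €500 million in turnover the flat €20 million ceiling dominates, which is why the rule weighs proportionally more heavily on smaller companies.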

Given this, we should first ask what sort of data GDPR covers. The regulator defines personal data as:

Any information related to a natural person or ‘Data Subject,’ that can be used to directly or indirectly identify the person. It can be anything from a name, a photo, an email address, bank details, posts on social networking websites, medical information, or a computer IP address. (ref: FAQ)

Under the GDPR, a company collecting such data has to ask for consent clearly and state what the data will be used for:

The conditions for consent have been strengthened, and companies will no longer be able to use long illegible terms and conditions full of legalese, as the request for consent must be given in an intelligible and easily accessible form, with the purpose for data processing attached to that consent. Consent must be clear and distinguishable from other matters and provided in an intelligible and easily accessible form, using clear and plain language. It must be as easy to withdraw consent as it is to give it. (ref: Key Changes)

Moreover, users must have an easy way to delete, or to request deletion of, the data a company holds about them, and to revoke any consent they have given. The data also needs to be portable, which means that a user can request, at any time, a copy of the data a company holds about them and transfer it to somebody else.
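As a rough illustration of these requirements, consent can be modeled as a record tied to a single purpose, with withdrawal being as simple an operation as granting, and with an export method for portability. This is a minimal sketch, not a compliance implementation: the class names, field names, and the choice of JSON as the export format are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Consent:
    purpose: str                        # what the data will be used for
    granted_at: str
    withdrawn_at: Optional[str] = None  # None while the consent is active


@dataclass
class DataSubject:
    email: str
    consents: Dict[str, Consent] = field(default_factory=dict)

    def grant(self, purpose: str) -> None:
        self.consents[purpose] = Consent(purpose, _now())

    def withdraw(self, purpose: str) -> None:
        # Withdrawing takes one call, just like granting.
        consent = self.consents.get(purpose)
        if consent is not None and consent.withdrawn_at is None:
            consent.withdrawn_at = _now()

    def may_process(self, purpose: str) -> bool:
        consent = self.consents.get(purpose)
        return consent is not None and consent.withdrawn_at is None

    def export(self) -> str:
        """Portable copy of everything held about this subject (JSON assumed)."""
        return json.dumps(asdict(self), indent=2)


subject = DataSubject("delegate@example.org")
subject.grant("newsletter")
print(subject.may_process("newsletter"))  # True
subject.withdraw("newsletter")
print(subject.may_process("newsletter"))  # False
```

Keeping the withdrawn record, rather than deleting it, is a design choice in this sketch: it preserves evidence of when consent existed, which helps when a company has to demonstrate compliance.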

Additionally, one of the core mandates of the GDPR is privacy by design. All new systems need to have privacy at their core rather than added later as an afterthought. These changes have a huge impact on companies: they basically have to change every system that collects user data to be compliant, and they have to make sure that the data of a user who has requested deletion is really deleted.

What Will Happen When the GDPR Takes Effect?

How far will regulators go to enforce the GDPR? As noted in the previous section, no one really knows how GDPR compliance will be enforced until some companies are caught for non-compliance and fined. Only then will we know how the law is applied. At this juncture, many questions remain open.

Some of these questions are broadly political: for example, what sorts of companies will be the first to be fined? Will ‘examples’ be made to single out certain industries that haven’t been good data privacy practitioners? But, beyond these political questions, there are just as many important technical questions.

To give just one example: What about backups? If I have two years of backed-up data and somebody requires that their data be deleted, should I modify all the backups? If I keep a “difference table” somewhere that will re-delete the user whenever a backup is restored, is that user really deleted, since I still have a trace of that user? These questions are endless.
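The “difference table” idea can be sketched in a few lines: backups are left untouched, and a ledger of erased user IDs is re-applied every time a backup is restored. The class and method names below are hypothetical, and the sketch deliberately leaves the open question visible: the ledger itself still holds an identifier for every erased user.

```python
from typing import Dict, Set


class ErasureLedger:
    """Remembers which users asked to be deleted, so restores can re-delete them."""

    def __init__(self) -> None:
        self._erased_ids: Set[str] = set()

    def record(self, user_id: str) -> None:
        # Note: this ID is exactly the "trace" the question above worries about.
        self._erased_ids.add(user_id)

    def reapply(self, restored: Dict[str, dict]) -> Dict[str, dict]:
        """Return the restored data with every erased user filtered out again."""
        return {uid: row for uid, row in restored.items()
                if uid not in self._erased_ids}


# A backup taken before the erasure request (hypothetical data).
backup = {"u1": {"email": "a@example.org"}, "u2": {"email": "b@example.org"}}

ledger = ErasureLedger()
ledger.record("u2")               # u2 asks to be deleted after the backup was taken

live = ledger.reapply(backup)     # restoring the old backup does not bring u2 back
print(sorted(live))               # ['u1']
```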

Some may think that they will anonymize data and it will solve all their issues as it will prevent “indirect identification.” Most likely it will not. Consider the following:

"De Montjoye and colleagues examined three months of credit card transactions for 1.1 million people, all of which had been scrubbed of any [personally identifiable information]. Still, 90% of the time he managed to identify individuals in the dataset using the date and location of just four of their transactions. By adding knowledge of the price of the transactions, he increased “reidentification” (the academic term for spotting an individual in anonymized data) to 94%. Additionally, women were easier to reidentify than men, and reidentification ability increased with income of the consumer." - Harvard Business Review
"Latanya Sweeney found that 87 percent of the population in the United States, 216 million of 248 million, could likely be uniquely identified by their five-digit ZIP code, combined with their gender and date of birth." - Wired

These examples show that it is extremely difficult to anonymize data. “Ultimately, the hallmark of both anonymization and pseudonymization is that the data should be nearly impossible to re-identify. This theory, however, has its practical and mathematical limits.” A data point on its own may be anonymous, but when many data points are put together, they might lead to re-identification. Unfortunately, the legislator provides no clear guidelines on anonymization.
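The mechanism behind both findings can be shown on toy data: each attribute alone matches several people, but the combination singles individuals out. The sketch below measures the fraction of records that are unique for a given set of attributes. The records are invented for illustration and the function name is an assumption; it is not the method used in either study.

```python
from collections import Counter
from typing import Dict, List, Sequence


def unique_fraction(records: List[Dict[str, str]], attributes: Sequence[str]) -> float:
    """Fraction of records whose combination of `attributes` appears exactly once."""
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


# Invented records with names already removed ("anonymized").
records = [
    {"zip": "10001", "gender": "F", "dob": "1980-01-01"},
    {"zip": "10001", "gender": "F", "dob": "1975-06-30"},
    {"zip": "10001", "gender": "M", "dob": "1980-01-01"},
    {"zip": "10002", "gender": "M", "dob": "1980-01-01"},
]

print(unique_fraction(records, ["zip"]))                   # 0.25
print(unique_fraction(records, ["zip", "gender"]))         # 0.5
print(unique_fraction(records, ["zip", "gender", "dob"]))  # 1.0
```

Every record becomes unique once the three attributes are combined, even though no single attribute identifies anyone on its own.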

Although no guidelines are available, data anonymization or pseudonymisation is very important for compliance. The GDPR defines pseudonymisation as follows:

‘Pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person... ( Art 4.5)

This pseudonymisation, according to the GDPR itself, gives the data controller some latitude to use or reuse data beyond the consent explicitly given by the user. Article 6.4.e names it among the safeguards to take into account:

The controller shall, in order to ascertain whether processing for another purpose is compatible with the purpose for which the personal data are initially collected, take into account, inter alia: […] e) the existence of appropriate safeguards, which may include encryption or pseudonymisation.
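One common way to approximate the definition in Art. 4.5 is a keyed hash: identifiers are replaced by an HMAC, and the key plays the role of the “additional information” that must be kept separately. The sketch below uses only Python's standard library. Whether this technique is sufficient under the GDPR is an assumption, not something the regulation states, and the function name is hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (hex encoded)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# The key is the "additional information": in a real system it would live
# in a separate, access-controlled store, not next to the data.
key = secrets.token_bytes(32)

pseudonym = pseudonymise("delegate@example.org", key)

# The same person always maps to the same pseudonym, so records can still be
# joined and analyzed without exposing the email address.
print(pseudonym == pseudonymise("delegate@example.org", key))  # True
```

A keyed hash is used instead of a plain hash because an attacker without the key cannot simply hash a list of known email addresses and compare. As the re-identification studies above suggest, this protects the identifier itself, not the other attributes stored alongside it.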

Of course, the lack of clear guidelines is a problem, as in our age of Big Data and machine learning it will be very difficult to guarantee pseudonymisation. But if a company can prove that it has taken all reasonable measures to make sure that the data is reused responsibly, it should be on the safe side. This, of course, is not legal advice.

How Can We View the GDPR as an Opportunity?

The GDPR is clearly a challenge, as companies have to change all their systems, and this costs time and money. Companies must also re-evaluate the parties with whom they work. Importantly, companies also have to change the way the whole organization handles data: the change is not only technical. The GDPR is about a general approach towards data privacy and protection. In theory, an assistant keeping Excel sheets on a computer could expose the whole company to a fine that can reach €20 million or more.

I would argue, though, that despite the challenge inherent in this change, the GDPR is also an opportunity. It is an opportunity to review all the systems a company holds and uses, to review the flow of data, and to pinpoint areas where it could be improved. It is also a legal opportunity, as you are given a unique chance to review or break contracts that conflict with the incoming law, so you could walk away from bad deals made earlier in the life of the company.

But, mostly, I see it as a data opportunity. Indeed, to be GDPR-compliant, you have to provide a way to easily delete all the data of a user and give the user an overview of what the company holds on them. Thus, arguably, the regulators are asking companies to create a user tracking system, as companies need to know everything about users. This has huge potential value. It certainly costs money and resources to put in place, thus we should make the most of it!

So, instead of viewing the GDPR as an obstacle that must be tackled, we should instead embrace the opportunity to provide a dashboard to users that lets them deal with their data and that will also let the company know how scattered their data is across systems, such that data can be easily kept track of and deleted if required. That’s the GDPR side of things.
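The core of such a dashboard is an inventory that records which system holds which fields about which user. The sketch below is one possible shape for it, with hypothetical class, method, and system names; the actual deletion calls to each system are out of scope and only hinted at by the return value of `erase`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


class DataInventory:
    """Tracks where each user's personal data lives across systems."""

    def __init__(self) -> None:
        # user_id -> system name -> set of field names held there
        self._locations = defaultdict(lambda: defaultdict(set))

    def register(self, user_id: str, system: str, fields: Iterable[str]) -> None:
        self._locations[user_id][system].update(fields)

    def overview(self, user_id: str) -> Dict[str, List[str]]:
        """What the company holds on this user, grouped by system."""
        return {system: sorted(fields)
                for system, fields in self._locations.get(user_id, {}).items()}

    def erase(self, user_id: str) -> List[str]:
        """Forget the user and return the systems that must purge their data."""
        return sorted(self._locations.pop(user_id, {}))


inventory = DataInventory()
inventory.register("u1", "crm", ["name", "email"])        # hypothetical systems
inventory.register("u1", "newsletter", ["email"])

print(inventory.overview("u1"))  # {'crm': ['email', 'name'], 'newsletter': ['email']}
print(inventory.erase("u1"))     # ['crm', 'newsletter']
```

The same structure answers both GDPR questions, namely what is held about a user and where it must be deleted, and it also shows the company how scattered its data is.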

But companies should build on this system to also track their content and add user behaviors, such as what users have read, what events they attended, what they bought, what they commented on, and what they clicked on. Imagination is the limit. When you have all this data in one system, you can start improving the user experience by personalizing it. Therefore, as much as the GDPR is a challenge for enterprises, it also offers companies an opportunity to build knowledge bases from which to reason with data and extract value with recommender systems.

GRAKN.AI is a wonderful tool to help companies get there. In the next post, we’ll see some of the specifics in using GRAKN.AI for GDPR-compliance.

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.