Data Breaches in the Age of Cloud Data Platforms


Why security is everyone's responsibility


Between 2015 and 2018, I was leading a data engineering team for a financial services company. We were the first team in the company to use Azure, and we built a data science environment.

Leading the first cloud implementation project put us under the microscope. We spent months discussing and configuring security, networking and governance.

Who Gets Fired in the Case of a Data Breach?

“Valdas, who gets fired in case of a data breach?” - my lead engineer asked me out of the blue.

“Has anything happened?!” - some words raise your cortisol (stress hormone) level and heart rate, and “data breach” is one of them.

“No. I am curious. We build data pipelines. We configure the network and firewall. There is no one else with Azure experience to review it.”

“Well… There is a security department… But we are the ones building everything.” - I mumbled.

It was obvious to both of us that we would be the first to get interrogated in case of a data leakage. Leading an autonomous data team with a mandate to choose technology was no longer fun. It got me thinking:

  • Are we protected from all the possible attacks?
  • How do we ensure we do not make stupid mistakes?
  • What is the security department’s responsibility?
  • How are security responsibilities divided between us and a cloud provider?
  • How do we bridge the security knowledge gap in the team?

Security concerns

Photo by Adi Goldstein on Unsplash

It’s good to learn from your mistakes. It’s better to learn from other people’s mistakes. -
Warren Buffett

In 2017, Equifax, an American credit reporting agency, announced a data breach. They exposed the personal information of 147 million people.

What did the hackers find?

  • First and last names.
  • Social Security numbers.
  • Birth dates.
  • Addresses.
  • Driver’s license numbers.

The hackers looked for exposed assets. A public-facing web server without the latest patch was a perfect victim. The attackers accessed internal Equifax servers by exploiting a vulnerability in Apache Struts.

See, unpatched vulnerabilities are one of the methods attackers use to access internal networks. Security specialists call such a method an attack vector.

Table 1 Equifax attack surface matrix - step 1

Having access to an internal network does not yet mean access to data. The next attack vector used against Equifax was compromising employee credentials. Finding a server with usernames and passwords was a breeze.
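Credentials left in readable files are easy to find for defenders too. Below is a minimal sketch of a scan for lines that look like stored passwords. The key names in the pattern and the sample file contents are assumptions made up for illustration; real secret scanners use much larger rule sets.

```python
import re

# Hypothetical pattern: matches keys such as "db_password = ..." or "api_key: ...".
SECRET_PATTERN = re.compile(
    r"(?i)(password|passwd|pwd|secret|api_key)\s*[:=]\s*(\S+)"
)

def find_plaintext_secrets(text):
    """Return (line_number, key) pairs for lines that look like stored secrets."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            findings.append((number, match.group(1).lower()))
    return findings

# Invented configuration file contents.
sample = """\
db_host = reports.internal
db_user = svc_reporting
db_password = Summer2017!
"""

print(find_plaintext_secrets(sample))
```

Any hit is a candidate for moving into a key vault or a secrets manager.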

Table 2 Equifax attack surface matrix - step 2 & 3

Misfortunes Come in Pairs - an Old Polish Proverb

In fact, the breach was a combination of attacks targeting specific devices and applications. The term for all the possible attack points is the attack surface. The matrix is one way to represent it.
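An attack surface matrix can also be kept as data rather than as a picture. The sketch below assumes a matrix that maps attack vectors to the assets they expose. The findings are hypothetical and only loosely modeled on the Equifax story; they do not reproduce the tables in this post.

```python
from collections import defaultdict

# Hypothetical findings as (attack vector, asset) pairs.
findings = [
    ("unpatched vulnerability", "public-facing web server"),
    ("compromised credentials", "internal server"),
    ("compromised credentials", "database"),
]

def build_matrix(pairs):
    """Group findings into {attack vector: set of exposed assets}."""
    matrix = defaultdict(set)
    for vector, asset in pairs:
        matrix[vector].add(asset)
    return dict(matrix)

def render(matrix):
    """Render one text row per vector, with 'x' marking each exposed asset."""
    assets = sorted({a for exposed in matrix.values() for a in exposed})
    lines = ["vector | " + " | ".join(assets)]
    for vector in sorted(matrix):
        cells = ["x" if a in matrix[vector] else "-" for a in assets]
        lines.append(vector + " | " + " | ".join(cells))
    return "\n".join(lines)

matrix = build_matrix(findings)
print(render(matrix))
```

Keeping the matrix in code makes it easy to add a row whenever a new vector shows up.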

Access to the internal network and weak credentials opened up Equifax’s databases. Under the guise of an authorized user, the attackers took the following steps:

  • Performed 9000 scans of the databases.
  • Extracted information into small temporary archives.
  • Downloaded data from the Equifax servers.
  • Removed the temporary archives once completed.

Table 3 Equifax attack surface matrix - step 4 & 5

Unpatched servers, weak passwords and a loosely controlled network led to losing protected data. In other words, they caused a data breach.

At Equifax, the data breach happened by exploiting 5 attack vectors.

“I hear you, man! I am going to focus on fixing these 5 loopholes and my servers are bulletproof!” - I hear someone shouting

Unfortunately, the list of all possible attack vectors is way longer, and hackers keep discovering new issues. Also, each company has a unique technology landscape, a different combination of hardware and software, just like the combination of your wallpaper and desktop icons is unique to you.

Table 4 Attack surface matrix example

The expanded table above includes more attack vectors. How does it compare to your IT landscape?

Saying “I am sorry” Is Not Enough

Actually, there are fines and settlements, depending on the impact of the data breach and the contents leaked.

Equifax has to pay up to $700 million in fines as part of a settlement with federal authorities over a data breach.

See, it is an expensive mistake.

To date, it is the biggest penalty imposed by the US Federal Trade Commission.

In Europe, there is The General Data Protection Regulation (GDPR).

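GDPR sets forth fines of up to 10 million euros or, in the case of an undertaking, up to 2% of its total worldwide annual turnover, whichever is higher. For the most serious infringements, the limits double to 20 million euros or 4%.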
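The cap is simple arithmetic. Here is a minimal sketch, assuming the "whichever is higher" rule from GDPR Article 83. The function name and the turnover figures are invented for illustration, and the result is only an upper bound, not a prediction of an actual fine.

```python
def gdpr_fine_cap(annual_turnover_eur, fixed_cap_eur=10_000_000, turnover_percent=2):
    """Upper bound of a fine: the higher of the fixed cap and the turnover share."""
    turnover_share = annual_turnover_eur * turnover_percent // 100
    return max(fixed_cap_eur, turnover_share)

# Invented company with 100 million euros turnover: 2% is 2 million, so the fixed cap applies.
print(gdpr_fine_cap(100_000_000))

# Invented company with 2 billion euros turnover: 2% is 40 million, which exceeds the fixed cap.
print(gdpr_fine_cap(2_000_000_000))

# Same company under the upper tier of Article 83 (20 million euros or 4%).
print(gdpr_fine_cap(2_000_000_000, fixed_cap_eur=20_000_000, turnover_percent=4))
```

For large companies the percentage dominates, which is why the fines quoted for global brands run far above the fixed amounts.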

The biggest penalty under GDPR to date is a fine of 50 million euros imposed on Google. The company didn’t clarify data processing and usage for ad targeting.

British Airways’ website diverted users’ traffic to a hacker website. This resulted in hackers stealing the personal data of more than 500 000 customers. Result? Proceedings are ongoing, with a possible fine of 200 million euros.

Marriott exposed 339 million guest records. Fine? 110 million euros.

Both British Airways and Marriott operate in the industries hit hardest by COVID-19. Hence, the regulator has delayed its final decision.

Should everyone in IT worry about security?

data security meme

Photo by National Cancer Institute on Unsplash

Presumably, you work with a data warehouse or a data lake. Often, it runs on servers in a strict security zone. In other words, you can’t simply open up Google Search or Stack Overflow there. There is no internet access. Similarly, external users can’t access the servers.

I have bad news for you too:

  1. The Equifax breach shows a multistep approach. The attackers might get into the network through seemingly unrelated systems.
  2. A data warehouse or a data lake stores the most important enterprise data. What is not obvious at first glance is that you connect to other systems to get that data. That makes it a central place to find all their credentials, too.

You should be especially careful with systems storing sensitive customer data. Under the GDPR, sensitive data is:

  • Data revealing racial or ethnic origin
  • Political opinions
  • Religious or philosophical beliefs
  • Trade union membership
  • Genetic or biometric data
  • Data concerning health
  • Data concerning a person’s sexual orientation.

Cloud to the Rescue!?

cloud provider meme

Photo by nappy from Pexels

One of the most popular cloud storage services is Amazon Web Services (AWS) S3. It is general-purpose storage for data, files and movies. New stories about exposed AWS S3 buckets appear regularly.

Noam Rotem and Ran Locar published one of the latest leak reports, with S3 in the leading role.

They identified a database containing highly sensitive files from several British consulting firms.

What did the white hat hackers find?

  • Full names
  • Addresses
  • Phone numbers
  • Email addresses
  • Dates of birth
  • National Insurance numbers
  • Immigration and Visa statuses
  • Nationalities
  • Salary details
  • And more

It is just the tip of the iceberg.

In this case, the files were stored in an AWS S3 bucket. It is important to note that open, publicly viewable S3 buckets are not a flaw of AWS. They are usually the result of an error by the owner of the bucket.

AWS user interface

Secure AWS S3 buckets


  • Amazon provides detailed instructions to help users secure S3 buckets.
  • A customer applies the instructions to keep the data secure.
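One common owner error is a bucket policy that grants access to everyone. The sketch below assumes the standard bucket policy JSON layout and flags Allow statements with a wildcard principal. The bucket name and statement IDs are hypothetical, and the check is a heuristic: it ignores policy conditions, ACLs and the account-level Block Public Access settings.

```python
import json

def _as_list(value):
    """Policies may hold a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]

def is_public_principal(principal):
    """True if the principal is '*' or {'AWS': '*'} (also inside a list)."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS", []))
    return False

def public_statements(policy_json):
    """Return the Allow statements that grant access to everyone."""
    policy = json.loads(policy_json)
    return [
        s for s in _as_list(policy.get("Statement", []))
        if s.get("Effect") == "Allow" and is_public_principal(s.get("Principal"))
    ]

# Hypothetical policy for a bucket called "example-bucket".
policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamOnly", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:role/data-team"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

for statement in public_statements(policy):
    print("Public access granted by statement:", statement["Sid"])
```

A check like this fits well into a deployment pipeline, so a public statement is caught before it reaches production.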

Responsibilities Between You and Your Cloud Provider

Azure, AWS, and GCP have something they call “the shared responsibility model”. I am going to use Microsoft's approach to explain it.

As you move to Azure, some responsibilities transfer to Microsoft. The areas of responsibility between you and Microsoft depend on the deployment type.

Azure shared responsibility model

Regardless of the deployment type, the following responsibilities are always retained by you:

  • Data
  • Your devices
  • Access and account management

Help Needed! Where Are the Security Experts?

Security isn't my responsibility meme

Photo by John Amachaab on Unsplash

By now, you know more about possible cyber-attacks. Also, the cloud providers do not protect you from everything. Who else can help you to avoid data breaches? The security department?

Unless you build security solutions, the security teams do not participate in development. Instead, they focus on:

  1. Keeping production systems secure
  2. Education and guidance
  3. Solution reviews
  4. Last-minute saves through live system monitoring

“Information Security is always coming up with a million reasons why anything we do will create a security hole that alien space-hackers will exploit to pillage our entire organization and steal all our code, intellectual property, credit card numbers, and pictures of our loved ones.” - The Phoenix Project by Gene Kim

Development (the builders) wants to deploy solutions into production. Security and operations see new releases and updates as potential enemies. They are gatekeepers.

Takeshi Castle

One of my favorite IT books is The Phoenix Project by Gene Kim. It tells a story of a fictional company and their struggles with an important IT project.

The lessons I learned from “The Phoenix Project”:

  • Teams own the product they develop
  • Integrate security into your daily work
  • Strive for trust between development, operations and security

Solution? Development teams need a facilitator role between development, operations and security. Someone who understands the new system, potential threats, infrastructure & networking requirements.

You need a DevOps engineer in your team!

What’s Next? My Three Recommendations

Cloud computing is the future. Cloud services slash development time and enable novel possibilities. At the same time, they expose you to new risks.

The cloud providers integrate advanced security mechanisms to keep you safe. Some of it works by default, some needs extra effort. In fact, enabling data encryption, patching your servers or preventing DDoS has never been easier.

Don’t be lazy, and take care of your IT systems security

First, understand security threats and be able to mitigate them. Do not rely blindly on a cloud provider or the security department.

Every team should have at least one person who understands firewalls, encryption, networking, etc.

JFK security meme

Secondly, Minimum Viable Products (MVPs) are not the best-designed pieces of software. MVPs are small in functionality, but often run in production environments.

adding temporary feature meme

In another blog post, I shared a standard process to run a Big Data prototype.

Remember, running an MVP is not an excuse to overlook your security best practices!

keep MVPs on short leash

Third, understand potential threats and make sure you configure:

  • Firewall.
  • Encryption at rest.
  • Encryption in transit.
  • Authorization.
  • Authentication.
  • Password and key management.
  • Patching and updates.
  • Azure configuration.
  • Networking.
  • Cross-site scripting.
  • Deployment.
  • Hundreds of other nitty-gritty details.
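Some of the items above can be checked automatically. This is a minimal sketch, assuming the deployment settings are available as a dictionary. All setting names are hypothetical and would have to be mapped to what your platform actually exposes.

```python
# Hypothetical required settings; the keys are invented for this sketch.
REQUIRED = {
    "firewall_enabled": True,
    "encryption_at_rest": True,
    "encryption_in_transit": True,
    "secrets_in_key_vault": True,
}

def failed_checks(config, required=REQUIRED):
    """Return the sorted names of required settings that are missing or wrong."""
    return sorted(k for k, v in required.items() if config.get(k) != v)

# Invented MVP configuration: one setting is off and one was never set.
mvp_config = {
    "firewall_enabled": True,
    "encryption_at_rest": False,
    "encryption_in_transit": True,
}

print(failed_checks(mvp_config))
```

A missing setting counts as a failure on purpose: an item nobody configured is exactly the kind of detail that gets forgotten.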

Hopefully, you don’t forget about anything. That would be expensive… (see the Equifax story above).

To ensure I don’t forget about tiny configuration details, I always follow my security checklist:

Finally, Who Gets Fired After a Breach?

One question raised at the beginning of this post still stays unanswered - who gets fired after a breach?

In 2017, McAfee, an American global computer security software company, did a survey among IT security leaders. They asked the same question:

Who is fired after security breach

Obviously, whenever “sh*t hits the fan”, it affects not only business and technology leaders. Surprise, surprise! Engineers are responsible for their implementations too.

Published at DZone with permission of Valdas Maksimavičius . See the original article here.

Opinions expressed by DZone contributors are their own.
