Security in Enterprise-Ready Data Lake

What about Hadoop? Follow along with Neeraj Sabharwal as he takes you through security in an enterprise-ready data lake.


Customer: Is Hadoop enterprise-ready? 

Me *standing next to the whiteboard*: Yes, and that's why we use the term "Enterprise-Ready Data Lake."

Imagine that there are 3 points.

  1. You need to prove your identity to get into the lake, and then you need permissions to access the data.
  2. Once you are authenticated and authorized, the next demand is to manage the lifecycle of the data, from requirement to retirement, as an automated process.
  3. The lifecycle management process needs to be integrated with a governance solution that manages data about data (metadata), data lineage, auditing, and more, to fulfill security and compliance requirements.


Point 1

Entry point: You must have strong authentication in place to get into the system, because more users will be coming in to access data as we move away from silos of data to a centralized repository. Access management must be easy, i.e., the security solution should have one centralized place to administer (create, define, and manage) security policies. Once users get in and have access, we need to track their actions, and that's auditing. Finally, data must be encrypted both in motion and at rest.
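As a rough illustration of the centralized policy and auditing idea above, here is a minimal Python sketch. The names `Policy`, `PolicyStore`, and `check_access` are hypothetical and are not the API of any Hadoop security component. Authentication (for example, via Kerberos) is assumed to have already happened before `check_access` is called.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which users may perform which actions on a resource."""
    resource: str
    users: frozenset
    actions: frozenset


@dataclass
class PolicyStore:
    """One central place to define and evaluate policies; every decision is audited."""
    policies: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def add_policy(self, resource, users, actions):
        self.policies.append(Policy(resource, frozenset(users), frozenset(actions)))

    def check_access(self, user, resource, action):
        # Deny by default; allow only if some policy matches user, resource, and action.
        allowed = any(
            p.resource == resource and user in p.users and action in p.actions
            for p in self.policies
        )
        # Record the decision whether or not access was granted.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "resource": resource,
            "action": action,
            "allowed": allowed,
        })
        return allowed


store = PolicyStore()
store.add_policy("/lake/sales", users={"alice"}, actions={"read"})
print(store.check_access("alice", "/lake/sales", "read"))  # matches the policy
print(store.check_access("bob", "/lake/sales", "read"))    # no matching policy
print(len(store.audit_log))                                # both attempts are recorded
```

The point of the sketch is the shape, not the details: policies live in one store, access is denied unless a policy allows it, and denied attempts are audited just like granted ones.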

Point 2

Security is in place, and now we know that data ingestion is occurring with full security. Now the business wants to manage the lifecycle of data in one common place: data replication, retention, rules for handling late data arrival, data mirroring, and visualization of the complete data pipeline.
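To make the retention and late-arrival rules above a little more concrete, here is a minimal Python sketch. `LifecycleRule` and its fields are hypothetical names chosen for illustration, and the 90-day and 2-day values are assumptions, not recommendations or the configuration format of any particular tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class LifecycleRule:
    """Hypothetical per-dataset rule; field names are illustrative only."""
    retention_days: int     # how long a partition is kept after its data date
    late_arrival_days: int  # how long after the data date late records are accepted


def is_expired(rule, partition_date, today):
    """True once the partition is older than the retention window."""
    return today - partition_date > timedelta(days=rule.retention_days)


def accepts_late_data(rule, partition_date, today):
    """True while the partition is still inside its late-arrival window."""
    return today - partition_date <= timedelta(days=rule.late_arrival_days)


rule = LifecycleRule(retention_days=90, late_arrival_days=2)
today = date(2016, 4, 1)
print(is_expired(rule, date(2015, 12, 1), today))         # partition is 122 days old
print(accepts_late_data(rule, date(2016, 3, 31), today))  # partition is 1 day old
```

Keeping such rules as data in one place, rather than scattering them across individual jobs, is what allows the lifecycle to be managed and automated centrally.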

Point 3

Once data lifecycle management is in place, we will be generating more data about data (metadata), and there is existing legacy metadata that needs to be exchanged with the Hadoop system. This creates the requirement for a data governance solution. This solution should provide complete data lineage, metadata exchange, and search functionality.
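Data lineage, as described above, can be pictured as a graph from each dataset back to the datasets it was built from. The following Python sketch assumes a plain dictionary as the metadata store; the dataset names and the `upstream` helper are hypothetical and do not represent any governance product's API.

```python
def upstream(lineage, dataset):
    """Return every dataset that `dataset` was derived from, directly or indirectly."""
    seen = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical lineage metadata: each dataset maps to the datasets it was built from.
lineage = {
    "sales_report": ["sales_clean"],
    "sales_clean": ["sales_raw", "customers_raw"],
}
print(sorted(upstream(lineage, "sales_report")))
```

A question such as "which raw sources feed this report?" then becomes a simple graph walk, which is the kind of query a governance solution is expected to answer for audit and compliance purposes.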


Customer: Yes, this is exactly what we are looking for. All of this must be well integrated, and please provide it as a 100% open source but enterprise-ready solution.

Solution:

Security

Data Lifecycle Management

Data Governance







HCC https://community.hortonworks.com

Happy Hadooping!!!

Kerberos is a must in production.

Join me.


Topics:
hadoop, security, big data
