
Security in Enterprise-Ready Data Lake

What about Hadoop? Follow along with Neeraj Sabharwal as he takes you through security in an enterprise-ready data lake.



Customer: Is Hadoop enterprise-ready? 

Me *standing next to the whiteboard*: Yes, and that's why we use the term "Enterprise-Ready Data Lake."

Consider three points.

  1. You need to prove your identity to get into the lake (authentication), and then you need permissions to access data (authorization).
  2. Once users are authenticated and authorized, the next demand is to manage the lifecycle of data, from requirement to retirement, as an automated process.
  3. The lifecycle management process needs to be integrated with a governance solution that manages metadata (data about data), data lineage, auditing, and more, to fulfill security and compliance requirements.


Point 1

Entry point: You must have strong authentication in place to get into the system, because more users will be coming in to access data as we move away from silos of data to a centralized repository. Access management must be easy, i.e., the security solution should offer one central place to administer (create, define, and manage) security policies. Once users are in and have access, we need to track their actions, and that's auditing. Finally, data must be encrypted both in motion and at rest.
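To make the "one central place" idea concrete, here is a minimal sketch of a policy store that answers authorization questions and audits every decision. It is only an illustration: `Policy`, `PolicyStore`, the user `alice`, and the path `/lake/sales` are all hypothetical names, not the API of any Hadoop security component, and authentication and encryption are assumed to happen elsewhere.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    """One access rule: which user may perform which actions on which resource."""
    user: str
    resource: str
    actions: frozenset


@dataclass
class PolicyStore:
    """Hypothetical central place to define policies and record audit events."""
    policies: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def add_policy(self, user, resource, actions):
        self.policies.append(Policy(user, resource, frozenset(actions)))

    def is_allowed(self, user, resource, action):
        # Deny by default: access is granted only if some policy matches.
        allowed = any(
            p.user == user and p.resource == resource and action in p.actions
            for p in self.policies
        )
        # Every decision, allowed or denied, is recorded for auditing.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "resource": resource,
            "action": action,
            "allowed": allowed,
        })
        return allowed


store = PolicyStore()
store.add_policy("alice", "/lake/sales", {"read"})

print(store.is_allowed("alice", "/lake/sales", "read"))   # True
print(store.is_allowed("alice", "/lake/sales", "write"))  # False
print(len(store.audit_log))                               # 2
```

Because every check goes through the same store, policies and the audit trail live in one place instead of being scattered across individual tools.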

Point 2

With security in place, we know that data ingestion is occurring securely. Now the business wants to manage the lifecycle of data in one common place: data replication, retention, rules for handling late-arriving data, data mirroring, and visualization of the complete data pipeline.
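As a rough illustration of one of these lifecycle rules, the sketch below applies a retention policy to dated partitions of a dataset. The dataset names, the retention periods, and the `expired_partitions` helper are assumptions made up for this example; a real lifecycle tool would also cover replication, mirroring, and late-data handling.

```python
from datetime import date, timedelta

# Hypothetical retention rules: dataset name -> days to keep each partition.
RETENTION_DAYS = {"clickstream": 30, "billing": 365}


def expired_partitions(dataset, partition_dates, today):
    """Return the partition dates older than the dataset's retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS[dataset])
    return sorted(d for d in partition_dates if d < cutoff)


today = date(2016, 6, 30)
partitions = [date(2016, 5, 1), date(2016, 6, 1), date(2016, 6, 29)]

print(expired_partitions("clickstream", partitions, today))
# [datetime.date(2016, 5, 1)]
```

Keeping the rules in one table like `RETENTION_DAYS` is the point: the business changes a number in one place rather than editing each ingestion job.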

Point 3

Once data lifecycle management is in place, we will be generating more metadata (data about data), and there is existing legacy metadata that needs to be exchanged with the Hadoop system. This creates the requirement for a data governance solution. This solution should provide complete data lineage, metadata exchange, and search functionality.
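Data lineage can be pictured as a graph of "derived from" links between datasets. The sketch below, with invented dataset names and a hypothetical `upstream` helper, walks that graph to find everything a report ultimately depends on. It is not how any particular governance product stores lineage, just a way to see what the term means.

```python
# Hypothetical lineage metadata: each dataset maps to the datasets it was derived from.
LINEAGE = {
    "sales_report": ["sales_clean"],
    "sales_clean": ["sales_raw", "customer_dim"],
    "sales_raw": [],
    "customer_dim": [],
}


def upstream(dataset, lineage):
    """Return every dataset that `dataset` directly or indirectly depends on."""
    seen = set()
    stack = list(lineage.get(dataset, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(lineage.get(current, []))
    return seen


print(sorted(upstream("sales_report", LINEAGE)))
# ['customer_dim', 'sales_clean', 'sales_raw']
```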


Customer: Yes, this is exactly what we are looking for. All of this must be well integrated, and please provide it as a 100% open source but enterprise-ready solution.

Solution:

Security

Data Lifecycle Management

Data Governance







HCC https://community.hortonworks.com

Happy Hadooping!!!

Kerberos is a must in production.

Join me.


Topics:
hadoop, security, big data

Opinions expressed by DZone contributors are their own.
