What’s New in HDP 2.6 for Enterprise Data Governance and Security? (Part 1)

By building security and data governance into the platform, we ensure that capabilities are administered consistently across all the components or data engines.

By Srikanth Venkat and Syed Mahmood · May 15, 2017 · News
Hortonworks continues to advance the Hortonworks Data Platform (HDP) as an integrated portfolio of enterprise security and governance products for big data. By building security and data governance into the platform, we ensure that these capabilities are administered consistently across all the components or data engines, and when new engines are added to the platform they inherit the same level of security and governance.

This is why we’ve responded to customer requests for more enterprise capabilities with substantial investment in security and data governance options. Hortonworks recently announced the general availability of HDP version 2.6. This is a review of the new features and functionality that are introduced as part of Apache Atlas, Apache Ranger, and Apache Knox in HDP 2.6.

Data Governance

Hortonworks has been working alongside the Apache community on critical advancements for open metadata and governance via Apache Atlas. The vision for the Apache Atlas project is to provide core metadata-driven governance services for Hadoop and enterprise data ecosystems. Key enterprise metadata and governance features in Atlas include:

  • Data lineage/provenance visualization.
  • Data classification.
  • Metadata catalog and search.
  • Enterprise-ready real-time metadata and lineage ingestion with Hive, Sqoop, and Storm/Kafka.
  • Extensible APIs for custom metadata ingestion and APIs to register custom models.
  • Apache Ranger integration for classification-based security.
  • Robust metadata repository that provides a flexible metamodel to capture technical, business, and operational metadata.
  • Out-of-box metadata models for Hive, Storm, Sqoop, HDFS, Kafka, and HBase.

Atlas 0.8.0, included with HDP 2.6, offers the following key enhancements.

Structured Higher-Level API (ATLAS-1223, ATLAS-1241, ATLAS-1234, ATLAS-1308)

Apache Atlas APIs have been re-platformed to v2 and enhanced significantly to make them easier for the community and partners to consume. The new, streamlined API makes it easier for users and partners to build extensions and to accomplish more with a more succinct API set. The community has also added Swagger-based API documentation, which will improve onboarding of new users and help the community develop faster by making it easier for developers to understand how to use the APIs. The earlier v1 APIs that were available in releases up to HDP 2.5 (Apache Atlas 0.7.x) are deprecated as of HDP 2.6, and support for them will be terminated in a future release.

Revamped Search User Experience With Basic/Advanced Search (ATLAS-1630)

The metadata catalog search experience has been streamlined to offer a performant and efficient search interface. Atlas supports two modes: a basic search that lets users combine entity type, classification, and name (including wildcard support), and an advanced search that uses the Apache Atlas SQL-like query language (DSL).
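
As a rough illustration of the two search modes, the sketch below only builds request URLs and sends nothing. The host name, port, and the parameter names (`typeName`, `classification`, `query`, `limit`) are assumptions based on the v2 search endpoints and should be checked against the Apache Atlas V2 API docs for your release. The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for an Atlas server; host and port are placeholders.
ATLAS_BASE = "http://atlas.example.com:21000/api/atlas/v2"


def basic_search_url(type_name=None, classification=None, query=None, limit=25):
    """Build a basic-search URL from entity type, classification, and name text."""
    params = {}
    if type_name:
        params["typeName"] = type_name
    if classification:
        params["classification"] = classification
    if query:
        params["query"] = query  # free text; may contain a wildcard such as *
    params["limit"] = limit
    return f"{ATLAS_BASE}/search/basic?{urlencode(params)}"


def dsl_search_url(dsl, limit=25):
    """Build an advanced-search URL carrying a SQL-like DSL query string."""
    return f"{ATLAS_BASE}/search/dsl?{urlencode({'query': dsl, 'limit': limit})}"
```

For example, `basic_search_url("hive_table", "PII", "sales*")` combines all three basic criteria, while `dsl_search_url('hive_table where name = "sales_fact"')` passes an illustrative DSL query through unchanged apart from URL encoding.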

Separation of Lineage and Impact in Visualization (ATLAS-1667)

In HDP 2.6, the distinction between lineage and impact is shown clearly and visually for data assets. Lineage answers the question of where an entity originated (its source or provenance) and is represented by the upstream path through all data assets and processes leading up to the current data asset. Impact, on the other hand, answers the question of how a specific data asset is being used and which other (derivative or dependent) data assets it affects. Impact is shown via the downstream path through all data assets and processes leading out of the current data asset. Lineage and impact analysis, which are valuable enterprise features for forensic analysis, auditing, and compliance, are now even better in HDP 2.6. In the Lineage and Impact widget on an asset detail page, lineage is shown with green arrows and impact with red arrows.
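
The upstream/downstream distinction can be sketched as two traversals over the same directed graph. This is a toy model, not Atlas code: the asset and process names in `EDGES` are hypothetical, and each edge points from an input to whatever it feeds.

```python
from collections import defaultdict, deque

# Hypothetical data flow: each edge points from an input to what it feeds.
EDGES = [
    ("raw_orders", "etl_clean"),
    ("etl_clean", "orders_clean"),
    ("orders_clean", "build_report"),
    ("build_report", "sales_report"),
    ("orders_clean", "build_features"),
    ("build_features", "ml_features"),
]


def _reachable(start, adjacency):
    """Breadth-first walk returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage(asset):
    """Upstream path: everything the asset was derived from."""
    upstream = defaultdict(list)
    for src, dst in EDGES:
        upstream[dst].append(src)
    return _reachable(asset, upstream)


def impact(asset):
    """Downstream path: everything that depends on the asset."""
    downstream = defaultdict(list)
    for src, dst in EDGES:
        downstream[src].append(dst)
    return _reachable(asset, downstream)
```

Here `lineage("orders_clean")` walks back to `etl_clean` and `raw_orders`, while `impact("orders_clean")` walks forward to both report and feature outputs, which matches the green (upstream) and red (downstream) arrows described above.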

Classification (Tag)-Based Policy Support for HDFS, Kafka, HBase (ATLAS-1309)

Building on the classification-based security framework introduced in HDP 2.5 for Hive, the community has extended classification-based security coverage across the ecosystem. HDFS, Kafka, and HBase can now have classification policies applied via the integration of Atlas tagging with Apache Ranger's tag-based policies. This new capability provides unified policy authoring and reduces security administration overhead in large enterprises: policies are simple to author, yet the framework is extensive enough to be applied uniformly across multiple Hadoop components.
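
To show why a single tag policy can cover several components, here is a deliberately simplified toy model. It is not Ranger's evaluation algorithm, and every class, function, and resource name in it is hypothetical. The only idea it demonstrates is that the policy is keyed on the tag, so any resource carrying that tag is covered regardless of component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    component: str   # e.g. "hdfs", "kafka", "hbase"
    name: str
    tags: frozenset  # classifications attached to the resource


@dataclass(frozen=True)
class TagPolicy:
    tag: str
    allowed_groups: frozenset


def decide(user_groups, resource, policies):
    """Toy decision: True/False if a tag policy applies, None if none does."""
    applicable = [p for p in policies if p.tag in resource.tags]
    if not applicable:
        return None  # no tag policy applies; defer to other kinds of policy
    # Every applicable tag policy must admit at least one of the user's groups.
    return all(bool(p.allowed_groups & set(user_groups)) for p in applicable)
```

With one `TagPolicy("PII", frozenset({"compliance"}))`, a Kafka topic and an HDFS path that both carry the `PII` tag receive the same decision without any per-component policy being written.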

Knox SSO for Atlas UI (ATLAS-1244)

HDP 2.5 introduced enterprise-ready SSO capabilities for the Hadoop ecosystem by adding SAML v2-based SSO authentication via Apache Knox for the Apache Ambari and Apache Ranger UIs. In HDP 2.6, this framework has been extended to the Atlas UI, which now also participates in SSO via Apache Knox.

Manually Create/Update Entities to Support HDFS, HBase, Kafka, and Custom Entities (ATLAS-1193)

In HDP 2.6, the community has added the capability to create and update different types of entities in the UI. This feature enables manual addition, metadata maintenance, and curation of data assets, especially those for which built-in connectors or hooks are not yet available. Users can now register or update types (including custom types) with a REST call and subsequently define and manage all of the metadata for entities of that type via a manual, form-based UI in Apache Atlas. Once those entities are created, they can be classified, and tag-based Ranger policies can be applied to those entity types.
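
The register-then-create workflow can be sketched as two request payloads. The code below only builds the URL and JSON body for each step and sends nothing. The endpoint paths and field names (`entityDefs`, `superTypes`, `attributeDefs`, `qualifiedName`) reflect our understanding of the v2 API and are assumptions to verify against the Apache Atlas V2 API docs. The host, the `sftp_file` type, and the helper names are hypothetical.

```python
import json

# Assumed base URL for an Atlas server; host and port are placeholders.
ATLAS_BASE = "http://atlas.example.com:21000/api/atlas/v2"


def typedef_request(type_name, attribute_names):
    """Return (url, JSON body) that would register a custom entity type."""
    body = {
        "entityDefs": [
            {
                "name": type_name,
                "superTypes": ["DataSet"],  # assumed built-in parent type
                "attributeDefs": [
                    {
                        "name": attr,
                        "typeName": "string",
                        "isOptional": True,
                        "cardinality": "SINGLE",
                    }
                    for attr in attribute_names
                ],
            }
        ]
    }
    return f"{ATLAS_BASE}/types/typedefs", json.dumps(body)


def entity_request(type_name, qualified_name, **attributes):
    """Return (url, JSON body) that would create one entity of that type."""
    body = {
        "entity": {
            "typeName": type_name,
            "attributes": {"qualifiedName": qualified_name, **attributes},
        }
    }
    return f"{ATLAS_BASE}/entity", json.dumps(body)
```

A caller would POST the first payload once to register the type, then POST the second for each asset, for example `entity_request("sftp_file", "sftp://host/in/a.csv", name="a.csv")`. The same entities could equally be filled in through the form-based UI described above.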

Getting Started

That’s just a brief overview of the new features. Please check out the following links to learn more about HDP 2.6 data governance features and how to get started:

  • Product web pages: Apache Atlas and Security and Governance

  • Product documentation

  • Apache Atlas V2 API docs

  • Recent DataWorks Summit presentations on Apache Atlas

  • HDP 2.6 Sandbox

Published at DZone with permission of Srikanth Venkat, DZone MVB. See the original article here.
