Waterline Data Brings Automated Data Cataloging to Hortonworks Data Platform through Integration with Apache Atlas

With Rapid Discovery, Governance and Time to Value for All Data Lake Assets, Customers Can Dramatically Accelerate Self-Service Analytics

Waterline Data, The Smart Data Catalog Company, today announced the integration of its Smart Data Catalog with Apache Atlas within Hortonworks Data Platform (HDP). The announcement was made at Hadoop Summit, held in San Jose, June 28-30.

The Apache Atlas project provides a data governance framework and capabilities for Hadoop that address many compliance requirements. With the addition of Waterline Data’s Smart Data Catalog, Apache Atlas users can replace manual tagging of metadata with an automated process that rapidly classifies the data assets in their data lake, including new data as it is created. Unlike catalogs that scan historical SQL logs, Waterline Data automatically catalogs every field of data in the data lake while capturing and learning from tribal knowledge.

HDP is the industry's only true secure, enterprise-ready open source Apache Hadoop distribution based on a centralized architecture (YARN). HDP addresses the complete needs of data-at-rest, powers real-time customer applications and delivers robust analytics that accelerate decision-making and innovation.

“We are very excited that Waterline Data has integrated their automated smart data cataloging capabilities with Apache Atlas, which brings added value to Waterline and Hortonworks users,” said Matt Morgan, Vice President of Product and Alliance Marketing, Hortonworks. “This helps customers rapidly organize their data lake, enabling more secure, compliant and optimal use of their data through Atlas.”

With this announcement, Waterline Data has earned the Governance Ready badge. The company previously earned HDP Certification and YARN integration certification.

This new integration allows common customers to:

  • Accelerate data discovery, governance and time to value through smart data discovery capabilities
  • Provide data engineers, data scientists and business analysts with secure self-service access to trusted, high-quality data for faster understanding and use
  • Automatically update Atlas with all the metadata Waterline uncovers
  • Facilitate data compliance and trust by discovering sensitive data and data lineage

Furthermore, as part of the company’s integration with Apache Atlas via HDP, Waterline Data will begin importing the data lineage information captured in Apache Atlas.

“No data lake can be opened up without proper data governance,” said Alex Gorelik, CEO of Waterline Data. “If compliance isn’t assured, the data simply isn’t usable. That’s why our new integration with Apache Atlas is so significant. As soon as organizations begin to realize they can replace manual tagging with rapid, automatic cataloging, we expect to see a dramatic rise in the adoption and expanded use of Hadoop.”
