Data Lakes: Managed Ingestion [Video]
Check out this video to learn about the variation across data sources and the Hadoop distribution chosen.
The classic method for data ingestion in Hadoop relies on a number of different technologies, each with its own configuration and scaling issues. These technologies require expertise to ingest the data correctly and to ensure that the ingest meets the organization's SLAs.
The video below shows how ingestion varies with the data source and with the Hadoop distribution chosen.
Managed ingestion is more than just using scripts to automate the movement of data into your Data Lake. It means not only having a defined, repeatable process for data ingest but also having a way to manage that ingestion.
A managed ingestion process gives you the tooling to move data into your system programmatically and consistently. When that process has issues, it provides the means to trace the failure back to its root cause.
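To make the idea concrete, here is a minimal sketch in Python of what "managed" could mean in practice: every run goes through the same steps, and every run leaves a record that can be inspected when something goes wrong. The names ManagedIngest and IngestRun, and the use of a local directory as a stand-in for the Data Lake landing zone, are assumptions made for illustration. They do not come from any Hadoop tool or from the video.

```python
import shutil
import traceback
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


@dataclass
class IngestRun:
    """Record of one ingestion attempt, kept for later inspection."""
    source: str
    started_at: datetime
    status: str = "running"        # "running", "succeeded", or "failed"
    files_copied: int = 0
    error: Optional[str] = None    # traceback text if the run failed


@dataclass
class ManagedIngest:
    """Hypothetical wrapper: runs the same ingest steps every time
    and keeps a history of all runs."""
    landing_dir: Path              # local stand-in for the Data Lake landing zone
    history: List[IngestRun] = field(default_factory=list)

    def run(self, source_dir: Path) -> IngestRun:
        record = IngestRun(source=str(source_dir),
                           started_at=datetime.now(timezone.utc))
        self.history.append(record)
        try:
            if not source_dir.is_dir():
                raise FileNotFoundError(f"source directory missing: {source_dir}")
            self.landing_dir.mkdir(parents=True, exist_ok=True)
            # Copy files in a fixed order so repeated runs behave identically.
            for path in sorted(source_dir.iterdir()):
                if path.is_file():
                    shutil.copy2(path, self.landing_dir / path.name)
                    record.files_copied += 1
            record.status = "succeeded"
        except Exception:
            # Keep the full traceback so the root cause can be examined later.
            record.status = "failed"
            record.error = traceback.format_exc()
        return record

    def failures(self) -> List[IngestRun]:
        """Return every failed run in the order it happened."""
        return [r for r in self.history if r.status == "failed"]
```

A caller would create one ManagedIngest per landing location, call run() for each source, and review failures() to see which sources failed and why. A plain copy script would move the same files. The difference here is the run history, which is the part that makes the process manageable.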
Published at DZone with permission of Adam Diaz, DZone MVB. See the original article here.