
How Microservices Are Imperative to Bedrock Data Lake Management


Zaloni’s Bedrock Data Lake Management platform is built modularly, using microservices, to enable, orchestrate, and govern a Hadoop infrastructure.



Microservices are a key enabler of large, scalable systems that make optimal use of hardware. Companies such as Netflix have pioneered their use and shown what they can achieve: Netflix has used microservices to deliver a service that accounts for over 35% of downstream internet traffic in North America.

Leveraging microservices means architecting software as small, modular programs or services that communicate through well-defined interfaces. Because each service can be deployed on separate hardware, the system can scale quickly to handle higher loads. Well-defined communication channels also make high availability achievable: if a process dies, fault tolerance ensures that the process is restarted.
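As a minimal sketch of that restart-on-failure pattern (not taken from Bedrock; the service script name and port here are hypothetical), a supervisor can launch a service as a child process and relaunch it whenever it exits:

```python
import subprocess
import sys
import time

# Hypothetical long-running service; any service command works here.
SERVICE_CMD = [sys.executable, "ingest_service.py", "--port", "8080"]

def supervise(cmd, restart_delay=2.0):
    """Keep a service process alive: if it dies, log the exit and restart it."""
    while True:
        proc = subprocess.Popen(cmd)
        proc.wait()  # blocks until the child process exits
        print(f"service exited with code {proc.returncode}; "
              f"restarting in {restart_delay}s", file=sys.stderr)
        time.sleep(restart_delay)  # brief back-off before restarting

if __name__ == "__main__":
    supervise(SERVICE_CMD)
```

In production this role is typically filled by an init system or cluster scheduler, but the principle is the same: the failure of one small service is detected and repaired without taking down the rest of the system.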

Implementing Hadoop holds the promise of ingesting, analyzing, and managing large datasets. Data can be pulled from existing infrastructure (mainframes, data warehouses, flat files, etc.) or from new sources (SaaS products like Salesforce, feeds from services like Twitter, etc.). As data needs grow, this requires a software infrastructure that scales across the entire data management process, and microservices are an ideal way of achieving that goal. Legacy programs can retrofit some modularity, but doing so is a major architectural change that can require a prohibitively large investment for limited benefit. The ideal approach is to build a product from the ground up with a modular architecture.
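To illustrate one self-contained ingestion step of the kind such an architecture decomposes into (a sketch under assumptions: the directory paths are hypothetical, while `hdfs dfs -put` is the standard Hadoop CLI for loading files), a small service could watch a landing directory for flat files and copy new arrivals into HDFS:

```python
import subprocess
from pathlib import Path

# Hypothetical locations; adjust to your environment.
LANDING_DIR = Path("/data/landing")   # flat files arrive here
HDFS_TARGET = "/datalake/raw/"        # raw zone of the data lake

def ingest_new_files():
    """Copy each new flat file into HDFS, then mark it as ingested."""
    for path in LANDING_DIR.glob("*.csv"):
        # Standard Hadoop CLI call to load a local file into HDFS.
        subprocess.run(
            ["hdfs", "dfs", "-put", str(path), HDFS_TARGET],
            check=True,  # raise on failure so the file is retried later
        )
        path.rename(path.with_name(path.name + ".done"))  # mark as ingested

if __name__ == "__main__":
    ingest_new_files()
```

Because the step does one job and communicates only through the file system and HDFS, more copies of it can be deployed as volume grows without touching the rest of the pipeline.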

Zaloni’s Bedrock Data Lake Management platform is built modularly, using microservices, to enable, orchestrate, and govern a Hadoop infrastructure. It is built to manage any velocity, variety, and volume of data, provided there is hardware to support it. Among others, Bedrock leverages two key microservices: the Bedrock Data Collection Agent (BDCA) and the Workflow Executor (WFE). The BDCA manages data movement into and out of the data lake, as well as within the lake itself. The WFE manages jobs that run in Hadoop and reports execution status back to Bedrock. Both microservices can be deployed on the same edge node where Bedrock resides, and as the business grows and needs change, the BDCA and WFE can run on as many edge nodes as necessary. This opens more 'pipes' into the heart of the data processing engine and ultimately increases performance.
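To make the WFE pattern concrete (a hypothetical sketch of the pattern only: Bedrock's actual APIs, endpoints, and job format are not described in this article, so the URLs and job fields below are invented, while `hadoop jar` is the standard Hadoop CLI), an executor on an edge node might poll a coordinator for jobs, launch them in Hadoop, and report status back:

```python
import subprocess
import time

import requests  # assumes the `requests` package is installed

# Hypothetical coordinator endpoints; not Bedrock's real API.
JOBS_URL = "http://bedrock-host:9090/api/jobs/next"
STATUS_URL = "http://bedrock-host:9090/api/jobs/{job_id}/status"

def run_job(job):
    """Launch a Hadoop job via the standard `hadoop jar` CLI and wait for it."""
    result = subprocess.run(
        ["hadoop", "jar", job["jar"], job["main_class"], *job["args"]]
    )
    return "SUCCEEDED" if result.returncode == 0 else "FAILED"

def poll_forever(interval=10):
    """Ask the coordinator for work, execute it, and report the outcome."""
    while True:
        resp = requests.get(JOBS_URL, timeout=30)
        if resp.status_code == 200:
            job = resp.json()
            status = run_job(job)
            requests.post(
                STATUS_URL.format(job_id=job["id"]),
                json={"status": status},
                timeout=30,
            )
        time.sleep(interval)  # no work (or job done); wait, then poll again

if __name__ == "__main__":
    poll_forever()
```

Running several such executors on additional edge nodes is exactly what adds the extra 'pipes' described above: each node independently pulls work and pushes status, so throughput scales with the number of nodes.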


