
How Microservices Are Imperative to Bedrock Data Lake Management


Zaloni’s Bedrock Data Lake Management platform is built modularly, using Microservices, to enable, orchestrate, and govern a Hadoop infrastructure.



Microservices are a key enabler of large, scalable implementations that make optimal use of hardware. Companies such as Netflix have pioneered their use and shown what they can achieve: Netflix has used Microservices to support a service responsible for over 35% of all internet download traffic in North America.

Leveraging Microservices means developing software in an architecture that allows deployment of small, modular programs or services that communicate in a well-defined way. By deploying the services on separate hardware, the software can quickly scale to handle higher loads. Well-defined communication channels also enable high availability: if a process dies, fault tolerance ensures that it is started again.
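The restart-on-failure idea can be sketched in a few lines. This is a minimal, illustrative supervisor loop in Python, not any particular platform's implementation; the `flaky_service` function is a hypothetical stand-in for a service process that crashes and comes back.

```python
def run_with_restart(service, max_restarts=3):
    """Run `service()`; if it fails, start it again.

    A minimal sketch of the fault-tolerance idea: a supervisor restarts
    a failed service instead of letting the failure take down the system.
    """
    restarts = 0
    while True:
        try:
            return service()
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise

# Hypothetical flaky service: fails twice, then succeeds.
attempts = {"count": 0}

def flaky_service():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("service crashed")
    return "ok"

print(run_with_restart(flaky_service))  # restarted twice, then prints "ok"
```

In a real deployment the supervision is done at the process level (by an orchestrator or init system) rather than inside the application, but the contract is the same: a dead service is detected and restarted.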

Implementing Hadoop holds the promise of ingesting, analyzing, and managing large datasets. Data can be pulled from existing infrastructure (mainframes, Data Warehouses (DWs), flat files, etc.) or from new sources (SaaS products like Salesforce, feeds from services like Twitter, etc.). As data needs increase, this requires a software infrastructure that scales across the entire data management process, and Microservices are an ideal way of achieving that goal. Although legacy programs can implement some modularity, retrofitting them is a major architectural change that can require a prohibitively large investment for limited benefit. The ideal approach is to build a product from the ground up with a modular architecture.
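One way to picture pulling from heterogeneous sources is to put each source behind the same small interface. The sketch below is purely illustrative and assumes nothing about any vendor's API; the connector functions are hypothetical stand-ins for real readers of mainframe extracts, DW exports, Salesforce objects, or Twitter feeds.

```python
def ingest(sources):
    """Pull records from heterogeneous sources behind a single interface."""
    records = []
    for source in sources:
        records.extend(source())
    return records

# Illustrative connectors only -- a real one would read a mainframe extract,
# a data-warehouse export, a Salesforce object, or a Twitter feed.
def flat_file_source():
    return [{"id": 1, "origin": "flat_file"}]

def salesforce_source():
    return [{"id": 2, "origin": "salesforce"}]

records = ingest([flat_file_source, salesforce_source])
print(len(records))  # 2 records, regardless of where each came from
```

Because each connector is an independent, self-contained unit, adding a new source means deploying one more small service rather than modifying a monolith.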

Zaloni’s Bedrock Data Lake Management platform is built modularly, using Microservices, to enable, orchestrate, and govern a Hadoop infrastructure. It is built to manage any velocity, variety, and volume of data as long as there is hardware to support it. Bedrock leverages two key Microservices, among others: the Bedrock Data Collection Agent (BDCA) and the Workflow Executor (WFE). The BDCA manages the movement of data into and out of the Data Lake as well as within the Lake itself. The WFE manages jobs that run in Hadoop and reports their execution status back to Bedrock. Both Microservices can be deployed on the same edge node where Bedrock resides; as the business grows and needs change, the BDCA and WFE can run on as many edge nodes as necessary. This opens more ‘pipes’ into the heart of the data processing engine and ultimately increases performance.
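The scale-out pattern described above can be sketched as jobs distributed round-robin across however many edge-node executors are deployed. This is a hedged illustration of the general pattern only; `EdgeNodeExecutor` and `dispatch` are hypothetical names, not Bedrock's actual API.

```python
from itertools import cycle

class EdgeNodeExecutor:
    """Hypothetical stand-in for an executor agent on one edge node."""

    def __init__(self, node):
        self.node = node

    def run(self, job):
        # A real executor would submit the job to Hadoop and report
        # its execution status back to the management platform.
        return {"job": job, "node": self.node, "status": "SUCCEEDED"}

def dispatch(jobs, executors):
    """Round-robin jobs across the deployed edge-node executors."""
    pool = cycle(executors)
    return [next(pool).run(job) for job in jobs]

executors = [EdgeNodeExecutor(f"edge-{i}") for i in range(3)]
results = dispatch(["job-a", "job-b", "job-c", "job-d"], executors)
print([r["node"] for r in results])  # ['edge-0', 'edge-1', 'edge-2', 'edge-0']
```

Adding an edge node is then just adding one more executor to the pool: the dispatcher spreads load over it with no change to the jobs themselves.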


Topics:
big data, microservices, hadoop, Bedrock Data

Published at DZone with permission of Aashish Majethia. See the original article here.

Opinions expressed by DZone contributors are their own.
