Apache Gearpump: An Introduction

Let's take a look at Apache's real-time, big data streaming engine, Gearpump. After a quick introduction, we dive into how to get started with using the platform.

Apache Gearpump is a real-time big data streaming engine. It was conceived at Intel in mid-2014, developed as an open-source project on GitHub from the start, and entered Apache incubation on March 8, 2016. The name Gearpump refers to the engineering term “gear pump,” a very simple pump that consists of only two gears yet is very powerful at moving water. Unlike other streaming engines, Gearpump’s engine is event/message based. In initial benchmarks, we were able to process 18 million messages per second (with a message length of 100 bytes) at a latency of 8 ms on a four-node cluster.

The Highlights

  • Extremely high throughput and low latency stream processing.
  • Configurable message delivery guarantee (at least once, exactly once).
  • Application hot redeployment.
  • Comprehensive dashboard for application monitoring.
  • Native Storm application compatibility.
  • Native Samoa application compatibility.
  • Friendly and extensible APIs.

Before you can submit and run your first Gearpump application, you will need a running Gearpump service. There are multiple ways to run Gearpump: local mode, standalone mode, YARN mode, or Docker mode.

The easiest option is local mode, which works on any Linux, Mac OS X, or Windows desktop with zero configuration.

The example below assumes you are running in local mode. If you are running Gearpump in one of the other modes, you will need to configure the Gearpump client to connect to the Gearpump service by putting the path of the gear.conf configuration file on the classpath. Within this file, set the parameter gearpump.cluster.masters to the address(es) of the correct Gearpump master(s).
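As a rough illustration of that change, the sketch below writes a minimal gear.conf override in HOCON syntax. The host names master1 and master2, the port 3000, and the client-conf output directory are placeholders made up for this example, not values from a real cluster. It deliberately writes to a separate directory: the gear.conf shipped with the distribution contains other settings, so in practice you would more likely edit the masters entry in that file than replace it.

```shell
# Sketch only: host names, port, and output directory are assumptions.
GEAR_CONF_DIR="${GEAR_CONF_DIR:-./client-conf}"
mkdir -p "$GEAR_CONF_DIR"

# Point the Gearpump client at the cluster master(s).
cat > "$GEAR_CONF_DIR/gear.conf" <<'EOF'
gearpump {
  cluster {
    # Replace with the host:port of each master in your cluster.
    masters = ["master1:3000", "master2:3000"]
  }
}
EOF

# Print the file so it can be checked before submitting an application.
cat "$GEAR_CONF_DIR/gear.conf"
```

The directory that holds this file then needs to be on the client's classpath so that the setting is picked up when an application is submitted.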

Steps to Submit Your First Application

Step 1: Submit Application

Once the cluster is running, you can submit the example WordCount application to it.

Open another shell:

# Run the WordCount example
bin/gear app -jar examples/wordcount-2.11-0.8.4-assembly.jar org.apache.gearpump.streaming.examples.wordcount.WordCount

Step 2: View Application Status and Metrics

To view the application status and metrics, start the Web UI services and browse to the dashboard to check the status. The default username and password are “admin:admin”; see the UI Authentication documentation to find out how to manage users.


Published at DZone with permission of Furkan Kamaci , DZone MVB. See the original article here.

