
Concord.io: Processing Data in Motion

Tim Spann interviews Concord co-founder Shinji Kim about their distributed stream processing framework and lists some quick specs about the technology.


Concord.io is a new stream processing framework that is written in C++, is very fast, and scales. There are a lot of stream processing frameworks, but this looks to be the only one written in C++ with APIs for multiple languages such as Go and Java.


For code snippets, check out the walkthrough.


I spoke with Shinji Kim, co-founder of Concord Systems in NYC, about what is going on with Concord. Concord currently runs on Apache Mesos, but could run on other environments such as YARN. Its main use is processing data in motion: it sits between message queues and databases, where it is used for fast data enrichment, aggregation, filtering, and deduplication. Because it is written in C++, there is no garbage collection cycle to slow down processing and no JVM heap memory management to worry about. It supports popular fast data sources such as Kafka and Kinesis and data sinks such as HDFS and MySQL. It uses a pub/sub operator model (jobs are composed through metadata). Users can manage local state in a key-value store backed by RocksDB.

  • It supports writing applications in Python, Ruby, Go, Java, Scala, and C++ using its API.

  • Containerized execution environment.

  • Uses Twitter's open source Zipkin for monitoring.

  • Concord Systems reports benchmarks of Concord (500K QPS/node at 10ms/event) vs. Spark Streaming (100K QPS/node at a 1s batch window).
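To make the pub/sub operator model and local key-value state described above more concrete, here is a minimal sketch in plain Python. It does not use Concord's actual API: the `Operator`, `Dedup`, `CountByUser`, and `Router` classes and the stream names are all hypothetical, and an in-memory dict stands in for the RocksDB-backed store. It only illustrates the idea that operators declare which streams they consume and produce, and that a job is composed from that metadata alone.

```python
from collections import defaultdict


class Operator:
    """Hypothetical base class: an operator names the streams it reads and writes."""
    inputs = ()   # names of streams this operator subscribes to
    outputs = ()  # names of streams this operator publishes to

    def process(self, stream, key, value, emit):
        raise NotImplementedError


class Dedup(Operator):
    """Drops records whose key has already been seen."""
    inputs = ("raw-events",)
    outputs = ("unique-events",)

    def __init__(self):
        # Assumption: a plain dict stands in for a persistent key-value store.
        self.seen = {}

    def process(self, stream, key, value, emit):
        if key in self.seen:
            return  # duplicate, drop it
        self.seen[key] = True
        emit("unique-events", key, value)


class CountByUser(Operator):
    """Keeps a running count of events per user (a simple aggregation)."""
    inputs = ("unique-events",)
    outputs = ("user-counts",)

    def __init__(self):
        self.counts = defaultdict(int)

    def process(self, stream, key, value, emit):
        user = value["user"]
        self.counts[user] += 1
        emit("user-counts", user, self.counts[user])


class Router:
    """Toy in-process router that wires operators together by stream name only."""

    def __init__(self, operators):
        self.subscribers = defaultdict(list)
        for op in operators:
            for stream in op.inputs:
                self.subscribers[stream].append(op)
        self.sink = defaultdict(list)  # every record published, per stream

    def publish(self, stream, key, value):
        self.sink[stream].append((key, value))
        for op in self.subscribers[stream]:
            op.process(stream, key, value, self.publish)


def run_demo():
    router = Router([Dedup(), CountByUser()])
    events = [
        ("e1", {"user": "alice"}),
        ("e2", {"user": "bob"}),
        ("e1", {"user": "alice"}),  # duplicate event id
        ("e3", {"user": "alice"}),
    ]
    for key, value in events:
        router.publish("raw-events", key, value)
    return router.sink


if __name__ == "__main__":
    for stream, records in run_demo().items():
        print(stream, records)
```

Note that neither operator refers to the other: `CountByUser` only knows that it reads `unique-events`. In a real distributed framework the router's role would be played by the cluster runtime, and operator state would need to survive restarts, which this sketch does not attempt.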

The API is only missing the ubiquitous hipster Node.js and R. If this product becomes open source, those APIs are very likely to appear.

It's an interesting product to consider if you are hitting some performance walls in streaming.


Topics:
streaming, big data, jvm, c++

