Apache Kafka (kafka.apache.org) is a "distributed publish-subscribe messaging system" designed to handle massive volumes of data without the management and scaling issues traditionally associated with JMS implementations. Written in Scala, it runs on the JVM and integrates easily into Java applications.
From the project's website:
"It is designed to support the following:
Persistent messaging with O(1) disk structures that provide constant time performance even with many TB of stored messages.
High-throughput: even with very modest hardware Kafka can support hundreds of thousands of messages per second.
Explicit support for partitioning messages over Kafka servers and distributing consumption over a cluster of consumer machines while maintaining per-partition ordering semantics."
This presentation will introduce Apache Kafka and present a few use cases for it as an integration point between different systems, including standalone Java applications and a Hadoop cluster.
Chris Curtin is the head of Technical Research at Silverpop. He has over 20 years of experience developing large-scale applications with Java, C++, Linux/Unix, and relational databases.