The Big Data landscape is constantly changing, from Apache Spark to Apache Flink to a dozen other projects, each with its own fast-moving releases. It helps to hear from the experts, so here is a selection of presentations.
Below are excerpts from the abstracts or descriptions of the respective talks.
ElasticSearch, Hadoop and Friends
"A practical overview of using Elasticsearch within a Hadoop environment to perform real-time indexing, search and data analysis."
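The data structure at the heart of Elasticsearch's real-time search is the inverted index: as each document is indexed, its terms are mapped back to the documents containing them. A minimal plain-Python sketch of that idea (illustrative only, not the Elasticsearch API):

```python
from collections import defaultdict

class InvertedIndex:
    """Toy inverted index: term -> set of document ids."""

    def __init__(self):
        self.index = defaultdict(set)
        self.docs = {}

    def add(self, doc_id, text):
        """Index a document as it arrives ("real-time" indexing)."""
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = self.index[terms[0]].copy()
        for term in terms[1:]:
            result &= self.index[term]
        return result

idx = InvertedIndex()
idx.add(1, "error in hadoop job scheduler")
idx.add(2, "hadoop job completed successfully")
idx.add(3, "spark job failed")
print(idx.search("hadoop job"))  # {1, 2}
```

Elasticsearch adds analysis, scoring, and distribution on top, but lookups against a structure like this are why search stays fast even as documents stream in.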
Apache MetaModel for Data Insights
"In this presentation, Alberto Rodríguez will talk about using Metamodel as an abstraction layer that allows querying different data sources with a SQL-like language to get the most of your data."
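The core idea behind MetaModel, one SQL-like query language over heterogeneous sources, can be sketched with the standard library: here sqlite3 stands in for the abstraction layer, joining a CSV snippet and a plain Python list (this is not MetaModel's API, just the concept; the table and column names are made up):

```python
import csv
import io
import sqlite3

# Two "sources": a CSV snippet and an in-memory Python list.
csv_orders = "order_id,customer_id,amount\n1,10,99.5\n2,11,15.0\n3,10,42.0\n"
customers = [(10, "ACME Corp"), (11, "Globex")]

# Load both behind a single queryable interface.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, customer_id INT, amount REAL)")
conn.execute("CREATE TABLE customers (customer_id INT, name TEXT)")
for row in csv.DictReader(io.StringIO(csv_orders)):
    conn.execute("INSERT INTO orders VALUES (?, ?, ?)",
                 (int(row["order_id"]), int(row["customer_id"]),
                  float(row["amount"])))
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)

# One SQL query spanning both "sources".
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.name ORDER BY total DESC
""").fetchall()
print(rows)  # [('ACME Corp', 141.5), ('Globex', 15.0)]
```

MetaModel does the same without copying the data into a database: each connector translates the query into operations on the underlying source (CSV, Excel, NoSQL stores, and so on).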
Time Series applications with Apache Spark and HBase
"This hands-on tutorial will help you get a jump-start on scaling distributed computing by taking an example time series application and coding through different aspects of working with such a dataset. We will cover building an end to end distributed processing pipeline using various distributed stream input sources, Apache Spark, and Apache HBase, to rapidly ingest, process and store large volumes of high-speed data."
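The ingest-process-store shape of such a pipeline can be shown in miniature; in this sketch plain Python stands in for Spark (windowed processing) and HBase (keyed storage), with made-up sensor readings:

```python
from collections import defaultdict

# Ingest: (sensor_id, epoch-second timestamp, value) readings.
readings = [
    ("s1", 0, 1.0), ("s1", 30, 3.0), ("s1", 70, 5.0),
    ("s2", 10, 2.0), ("s2", 65, 4.0),
]

def window_key(ts, width=60):
    """Bucket a timestamp into a fixed-width time window."""
    return ts - ts % width

# Process: mean value per sensor per 60-second window.
sums = defaultdict(lambda: [0.0, 0])
for sensor, ts, value in readings:
    key = (sensor, window_key(ts))
    sums[key][0] += value
    sums[key][1] += 1

# Store: row key (sensor, window start) -> aggregate, HBase-style.
store = {key: total / count for key, (total, count) in sums.items()}
print(store[("s1", 0)])  # 2.0 (mean of 1.0 and 3.0)
```

In the real pipeline, Spark parallelizes the windowed aggregation across a cluster and HBase gives fast writes and scans over the (sensor, time) row keys, but the dataflow is the same.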
Stream Processing with Apache Kafka
"The concept of stream processing has been around for a while and most software systems continuously transform streams of inputs into streams of outputs. Yet the idea of directly modeling stream processing in infrastructure systems is just coming into its own after a few decades on the periphery."
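That "streams in, streams out" model can be illustrated with a chain of generators, each stage consuming one stream and yielding another, much as a Kafka Streams topology chains processors (a plain-Python illustration, not the Kafka API):

```python
def parse(lines):
    """Stage 1: split a stream of lines into a stream of words."""
    for line in lines:
        yield from line.lower().split()

def count(words):
    """Stage 2: emit a running count for each word seen so far."""
    totals = {}
    for word in words:
        totals[word] = totals.get(word, 0) + 1
        yield word, totals[word]  # flows downstream immediately

input_stream = ["the quick fox", "the lazy dog"]
output_stream = list(count(parse(input_stream)))
print(output_stream[-1])  # ('dog', 1)
```

The difference in an infrastructure system like Kafka is that the streams are durable, partitioned, and shared between machines, so stages can be restarted or scaled independently.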
Introduction to Apache Spark (Fast Data)
"From Big Data to Fast Data with Functional Reactive Containerized Microservices and AI-driven Monads in a galaxy far far away..."
Real-time programming with SparkML, Cassandra, AKKA
"Banks are innovating. The purpose of this innovation is to transform bank services into meaningful and frictionless customer experiences. A key element in order to achieve that ambitious goal is by providing well tailored and reactive APIs and provide them as the building blocks for greater and smoother customer journeys and experiences. For these API’s to work, internal processes have to evolve as well from batch processing to real-time event processing."
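The batch-to-real-time shift the abstract describes can be reduced to a toy example (hypothetical account events, not a real banking API): a batch job computes one answer after all the data has arrived, while an event processor emits an updated answer on every event:

```python
events = [("deposit", 100), ("withdraw", 30), ("deposit", 50)]

def batch_balance(events):
    """Batch style: wait for the full day's events, compute once."""
    return sum(a if kind == "deposit" else -a for kind, a in events)

def stream_balances(events):
    """Event style: update and emit the balance on every event,
    so an API can serve it immediately."""
    balance = 0
    for kind, amount in events:
        balance += amount if kind == "deposit" else -amount
        yield balance

print(batch_balance(events))          # 120
print(list(stream_balances(events)))  # [100, 70, 120]
```

Both produce the same final answer; the event-driven version just makes every intermediate state available the moment it exists, which is what a reactive customer-facing API needs.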
Real-time ingestion of AVRO Data
"Last year, during BDS14, two Allegro engineers shared their experience by presenting pitfalls and mistakes that were made when implementing data ingestion pipelines in our company. This time, Maciej Arciuch presents a brand new design and shows how accepting good advices can result in a drastic design change and how making mistakes can teach us a lot."
Securing Big Data at rest for Hadoop, Cassandra and MongoDB
"This session shows how to secure different Big Data sensitive data items such as log files, metastore databases, control files, config files, data directories or data files for different Big Data technologies."
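One facet of securing data at rest is tamper detection: storing a keyed digest (HMAC) alongside each sensitive file so unauthorized modification is caught on read. A standard-library sketch of that facet only; actual encryption of the contents would use a library such as `cryptography`, and the key here is a placeholder that belongs in a key-management system, not in code:

```python
import hashlib
import hmac

SECRET_KEY = b"placeholder-key-from-a-kms"  # never hard-code in practice

def sign(data: bytes) -> str:
    """Keyed digest stored next to the data file."""
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()

def verify(data: bytes, digest: str) -> bool:
    """Constant-time comparison to avoid timing side channels."""
    return hmac.compare_digest(sign(data), digest)

block = b'{"user": 42, "card": "****1234"}'  # e.g. a data file's bytes
digest = sign(block)
print(verify(block, digest))                # True
print(verify(block + b"tampered", digest))  # False
```

The same pattern applies to the items the talk lists (log files, metastore databases, config files): integrity via keyed digests, confidentiality via encryption, and both keyed from a store outside the data directory.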
Hello World with Spark
"To see how Spark works, we are going to create a website that displays a letter we wrote to our friends when we were on vacation. As we learn how to do more complicated things with Spark, we will add to our letter and make it more dynamic."
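The talk's running example, a letter page that grows more dynamic over time, boils down to a template filled in with data at request time. A framework-neutral stdlib sketch of that shape (the names and places are made up; the talk itself builds the page with Spark):

```python
from string import Template

# The vacation letter as a template, filled in per request.
letter = Template(
    "Dear $friend,\n\n"
    "Greetings from $place! The weather is $weather.\n\n"
    "See you soon,\n$sender"
)

def render(friend, place, weather, sender="Alex"):
    """What a web route handler would return for a letter page."""
    return letter.substitute(friend=friend, place=place,
                             weather=weather, sender=sender)

page = render("Sam", "Barcelona", "sunny")
print(page.splitlines()[0])  # Dear Sam,
```

Making the letter "more dynamic" then just means feeding the template more data per request, which is exactly the progression the talk walks through.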