Building a real-time application starts with connecting the pieces of your data pipeline.
To make fast and informed decisions, organizations need to rapidly ingest application data, transform it into a digestible format, store it, and make it easily accessible — all at sub-second speed.
A typical real-time data pipeline is architected as follows:
- Application data is ingested through a distributed messaging system to capture and publish feeds.
- A transformation tier distils information, enriches data, and delivers it in the right formats.
- Data is stored in an operational (real-time) data warehouse for persistence, easy application development, and analytics.
- From there, data can be queried with SQL to power real-time dashboards.
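To make the four stages concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production design: an in-process `queue.Queue` stands in for the distributed messaging system, an in-memory SQLite database stands in for the operational data warehouse, and the event fields (`service`, `status`, `latency_ms`) and function names are hypothetical.

```python
import json
import queue
import sqlite3

# Assumption: an in-process queue stands in for a distributed messaging system.
feed = queue.Queue()


def ingest(events):
    """Publish raw application events to the feed as JSON strings."""
    for event in events:
        feed.put(json.dumps(event))


def transform(raw):
    """Parse a raw message and enrich it with a derived error flag."""
    record = json.loads(raw)
    record["is_error"] = int(record["status"] >= 500)
    return record


def store(conn, record):
    """Persist one transformed record in the warehouse table."""
    conn.execute(
        "INSERT INTO requests (service, status, latency_ms, is_error) "
        "VALUES (?, ?, ?, ?)",
        (record["service"], record["status"],
         record["latency_ms"], record["is_error"]),
    )


def run_pipeline(conn):
    """Drain the feed, transforming and storing each message."""
    while not feed.empty():
        store(conn, transform(feed.get()))
    conn.commit()


def dashboard_query(conn):
    """Aggregate per-service metrics with SQL, as a dashboard would."""
    return conn.execute(
        "SELECT service, COUNT(*), SUM(is_error), AVG(latency_ms) "
        "FROM requests GROUP BY service ORDER BY service"
    ).fetchall()


# Assumption: in-memory SQLite stands in for the real-time data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests "
    "(service TEXT, status INTEGER, latency_ms REAL, is_error INTEGER)"
)

# Hypothetical sample events.
ingest([
    {"service": "api", "status": 200, "latency_ms": 12.0},
    {"service": "api", "status": 503, "latency_ms": 40.0},
    {"service": "web", "status": 200, "latency_ms": 8.0},
])
run_pipeline(conn)
rows = dashboard_query(conn)
for row in rows:
    print(row)
```

In a real deployment each stand-in would be replaced by the corresponding tier, but the shape stays the same: publish, transform, persist, then query with SQL.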
As new applications generate more complex data at higher volumes, it is important to build an infrastructure for fast data analysis that supports real-time dashboards, predictive analytics, and machine learning.
At this year’s Spark Summit East, MemSQL Product Manager Steven Camina shared how to build an ideal technology stack to enable real-time analytics.
Video: Building the Ideal Stack for Real-Time Analytics
Use Cases Featured in the Presentation
Pinterest: Monitoring A/B Experiments in Real-Time
Learn how Pinterest built a real-time experiment metrics pipeline, and how they use it to set up experiments correctly, catch bugs, and avoid disastrous changes. More in this blog post from the Pinterest Engineering Team.
Energy Company: Analyzing Sensor Data
Learn how a leading energy company built a real-time data pipeline with Kafka and MemSQL to monitor the status of drill heads using sensor data. Doing so has dramatically reduced the risk of drill bit breakage and allows for more accurate forecasting of drill bit replacement.