2nd Annual Apache Flink Survey Reveals Enterprises Are All-In on Stream Processing
87% of devs plan to deploy more data streaming apps in 2018, with machine learning being the most popular type of app developers are building or planning to build.
Enterprises are investing heavily in stream processing technology, according to the second annual Apache Flink® user survey, which data Artisans announced a few days ago. The vast majority (87%) of organizations surveyed are planning to deploy more applications powered by the Apache Flink software in 2018.
Of the dozens of new application types developers are building or planning to build, machine learning (64%) (both for model scoring [34%] and model training [30%]), anomaly detection/system monitoring (27%), and business intelligence/reporting (25%) are the most popular, followed by recommendation/decision engines (22%) and security/fraud detection (19%), to round out the top five.
Most respondents (70%) say their team or department is growing and hiring in 2018. A smaller majority (59%) expect their team or departmental budget to increase.
Drawing on insights from 217 IT leaders, software engineers, application developers, and data/systems architects from 28 countries, the survey shows that the ability to react to data in the moment is becoming a top priority among enterprises of all sizes, from small organizations with under $1 million in annual sales (10% of respondents) to very large enterprises with over $1 billion in annual sales (18% of respondents). By adopting Flink and a data streaming architecture, enterprises can get insights from their data in milliseconds.
Current and Future Use
Global companies such as Alibaba, ING, Netflix, SK Telecom, Telefonica, and Uber use Flink as their stream processing platform of choice for large-scale stateful applications that manage high volumes of data. One-quarter of respondents are processing at least one billion events per day, with 1% processing at least one trillion events per day:
- 1% process one trillion or more events per day
- 24% process between one billion and 999 billion events per day
- 18% process between 100 million and 999 million events per day
- 43% process up to 99 million events per day
The volume of events is expected to grow exponentially as organizations implement more live data applications in the coming years. Of those who are planning to deploy more Flink applications in 2018:
- 62% expect to deploy one to five more applications
- 11% say six to ten more applications
- 8% say 10+ applications
- 7% expect to deploy a whopping 20+ additional applications in 2018
Apache Flink’s streaming execution model can be used for processing both continuous (streaming) datasets and static (finite or batch) datasets to cover a broad range of data processing use cases within a single platform. Today, 46% of respondents use Flink only for continuous (streaming) data, while 47% use it for a mixture of continuous data and static (finite) datasets and 6% use it only for static datasets.
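The idea that a static dataset can be treated as a stream that happens to end can be illustrated with a minimal sketch. This is plain Python, not Flink's API: the `running_counts` operator and the `endless_clicks` source are hypothetical names invented for illustration, and the page names are made-up sample data. The same operator is applied unchanged to a finite list and to an unbounded generator.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, Tuple


def running_counts(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (key, count_so_far) after each event; works on any iterable."""
    counts: Counter = Counter()
    for key in events:
        counts[key] += 1
        yield key, counts[key]


def endless_clicks() -> Iterator[str]:
    """Hypothetical unbounded source: cycles through page names forever."""
    pages = ["home", "cart", "home"]
    i = 0
    while True:
        yield pages[i % len(pages)]
        i += 1


# Static (finite) dataset: the stream ends, and the last update emitted
# for each key is the final batch result.
batch_result = dict(running_counts(["home", "cart", "home"]))

# Continuous dataset: the same operator runs, but only a prefix of its
# output can ever be observed, so we take the first four updates.
stream_prefix = list(islice(running_counts(endless_clicks()), 4))

print(batch_result)
print(stream_prefix)
```

In this sketch the batch case is just the streaming case with an end of input, which is the general shape of the "single platform" claim above. A real engine adds state management, parallelism, and fault tolerance that this toy version leaves out.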
“This year’s survey presents clear evidence that stream processing is becoming widely adopted across enterprises of all sizes and in a variety of industries outside of technology, with financial services, insurance, real estate, and telecommunications leading the pack,” says Kostas Tzoumas, Co-Founder and CEO of data Artisans and a PMC member of Apache Flink. “The market is expected to reach upwards of $13 billion USD by 2021, and we’re seeing a range of new applications being put into production, including machine learning, security and fraud detection, systems monitoring, and the Internet of Things. We are privileged to be part of such a vibrant community, and data Artisans is committed to ensuring Flink is constantly evolving to meet future use cases and that we are providing the training, services, and support infrastructure to enable users to maximize the full potential of their data applications.”
Since implementing Flink, respondents have seen many benefits. 46% of respondents reported that high volumes of data are now available in real time (enabling them to move beyond batch processing), and 46% also said it is easier for them to build distributed applications. Other benefits include:
- 37% report improved scalability of applications
- 35% have seen improved performance of applications
- 32% cite simplified application design
- 29% report reduced application complexity
Apache Flink has also been credited with driving tangible business benefits beyond the IT team by accelerating the innovation cycle, helping to keep systems up and running, and boosting revenue. These benefits will likely grow as companies expand their use of live data applications:
- 23% are able to bring new applications online faster
- 20% see improved reliability of applications
- 15% say their systems are more resilient
- 11% report cost savings
- 4% have seen an increase in revenue
Satisfaction and Areas of Focus and Development
92% of respondents expressed satisfaction with Flink, of whom 58% were very or completely satisfied. Looking at the specific areas that rank highest (very or completely satisfied), Flink’s strength in managing high-volume, high-velocity streaming datasets is evident in the top four areas of satisfaction:
- 76% for event time handling
- 74% for DataStream API (stream processing)
- 72% for throughput and latency
- 71% for windowing and watermarks
Apache Flink is among the fastest growing Apache Software Foundation projects. As more companies adopt and configure Flink to their organization’s specific needs, more user support will be needed. The top requests for new features or developments among the survey respondents were additional documentation, programming guides, and resources for getting started (55%); better tooling for non-engineering users (43%); and more support for programming languages beyond Java, Scala, and SQL (34%).
About the Survey
To better understand how organizations are using and plan to use Apache Flink software and to learn which features they like best and what features and improvements they would like to see in the future, data Artisans commissioned Researchscape International to conduct an online survey of 217 IT leaders, software engineers, application developers, data/systems architects, data scientists and analysts. The survey was fielded from November 6 to December 1, 2017.
Out of 28 countries, respondents were most often from the United States (24%), China (13%), and Germany (12%). 57% worked at organizations with 100-9,999 employees, one-third at organizations with up to 999 employees, and 14% worked at organizations with 10,000 or more employees. One-third of the organizations had annual sales of $1M-$100M, nearly one-fifth (18%) topped $1B in annual sales, while 10% earned under $1M and 8% earned between $100M and $500M.