
DataWorks Summit 2017 (DWS17) Wrap-Up


Get a summary of the most interesting news in big data from the recent open summit on big data, DataWorks Summit 2017 (DWS17).


First off, this event really expanded on what was done in Munich earlier this year, adding a lot of cool new announcements, a new partnership, and some new projects.

The biggest non-technical news was that IBM will be using HDP as its official Hadoop distribution. This is a big deal, as it helps consolidate the Hadoop market. IBM already uses, and is part of, ODPi, so the move makes a lot of sense. With EMC Pivotal, Microsoft, IBM, and Hortonworks all on the same open Apache version of Hadoop, and with guidance from ODPi, this critical enterprise platform gets a very open release. Even better, IBM's Apache committers can now work more closely with Hortonworks to ensure the continued fast development of critical enterprise projects like Apache Ranger and Knox.

HDF 3.0 was released, and it includes Apache NiFi 1.2 along with the new Streaming Analytics Manager (SAM) and Schema Registry projects. The release includes revolutionary software that builds real-time streaming applications on top of a unified API that generates, builds, and deploys Storm applications. The unified API allows for other streaming options in the near term, like Apache Beam, Spark, and Flink.

Silicon Angle has a number of great interviews from the event.

The keynotes provide a lot of direction on the future of big data, AI, and streaming.

Another great talk came from Verizon, on its self-service data lake.


There was a great talk by the team on seamless access control with Apache Spark and Ranger.

Here are my top 10 talks of the event:

  1. Cloudy With a Chance of Hadoop

  2. Apache Hadoop 3.0 Updates

  3. Caffe on Spark Updates (Deep Learning)

  4. Model as a Service

  5. Best Practices for Spark R

  6. Data Profiling With Apache Calcite

  7. Connecting the Drops With Apache NiFi and Apache MiNiFi

  8. Complex Streaming Application in Under 10 Minutes With No Code

  9. Apache Hadoop YARN: Past, Present, and Future

  10. Meet HBase 2.0 and Phoenix 5.0

If you are in Australia, check out the next DataWorks Summit there. Otherwise, the next one to wait for is Berlin in 2018. See you there!


