
This Week in Hadoop and More: Spark, NiFi, and Events

Weekly wrap up on Hadoop, Big Data, Spark, NiFi, and more.

by Tim Spann · Jun. 07, 16 · Big Data Zone · Opinion

Welcome back to "This Week in Hadoop and More". There always seems to be a lot more. Plenty is coming up: I'll have an article on Concord.io's new distributed streaming framework and another on using SnappyData with Zeppelin, Spark, and Hortonworks HDP. Both articles came out of talks I had last week with principals from those two new Big Data startups. These are definitely companies and products to watch. Following last week's snafu with Calcite, I will sit down with some contributors to this awesome Apache project to find out more and present it to you. Suffice it to say, Calcite is used by a number of other Big Data projects you know and love (Apache Drill, Apache Flink, Apache Hive, Apache Kylin, Apache Phoenix, Apache Samza, and Apache Storm).

Also coming soon is my new meetup in Princeton.  Hopefully I can meet some DZone readers in person and we can continue sharing information!  Join us at the amazing TigerLabs.

There are a number of great new presentations on Apache Spark and specifically streaming with Spark Streaming.  These are coming from Strata Hadoop London, which gives us a ton of great content this week.

  • Spark Streaming Tips for Devs and Ops

  • Real-Time Processing with Kafka and Spark

  • Lambda Architecture with Apache Spark

  • Another Great Advanced Spark Deck by Chris Fregly

  • Strata Hadoop London has released many of the videos from its recent event. These are all worth watching to learn what's going on and what's coming in the world of Big Data.

  • Here are some great keynotes from Strata/Hadoop 2016 London, along with a number of slide decks available for viewing or download.

  • This talk, which uses the excellent machine learning library from H2O, is a must-watch: "Introduction to generalized low-rank models and missing values" by Jo-fai Chow (H2O.ai)

Why is my Hadoop job slow?
Bikas Saha (Hortonworks Inc)
This is a cool talk about finding out why your job is not performing well.  

Watermarks: Time and progress in streaming dataflow and beyond
Slava Chernyak (Google Inc.)
This is a really interesting streaming talk by Google!

Triggers in Apache Beam (incubating): User-controlled balance of completeness, latency, and cost in streaming big data pipelines (Google Cloud Dataflow).
Kenneth Knowles (Google)

For Hadoop Programmers

Apache Storm Debugging
Storm 1.0 Enhanced Debugging. For die-hard Storm developers, this will be a godsend for finding problems more easily.
http://hortonworks.com/blog/whats-new-apache-storm-1-0-part-1-enhanced-debugging/

Open BDRE
Bigdata Ready Enterprise Open Source Software is an interesting new open source workload management tool that works with Spark and Hadoop. It seems like a great tool to try in most enterprises.
http://wiproopensourcepractice.github.io/openbdre/

Hadoop Development Testing 
Bite Sized HDP Clusters in your IDE for development
https://github.com/sakserv/hadoop-mini-clusters
Genius! This is a great way to do integration testing for Hadoop projects.

Apache NiFi for Rapid DataFlow
Excellent Tutorial on using Apache NiFi
http://hortonworks.com/blog/apache-nifi-not-scratch/

More Great Recent Presentations

  • Streaming SQL

  • XLDB 2016 Conference Presentations
