
MongoDB Aggregation: How to Work with 30 Years of NBA Data

If you've been waiting for the day when MongoDB and basketball would finally intersect, I have good news for you: a recent post by Valeri Karpov at The Code Barbarian crunches 30 years' worth of NBA data with MongoDB aggregation.

Karpov begins by exploring the structure of the NBA data - the dataset alone would excite any basketball fan - and then walks through MongoDB's aggregation framework and its methods, with sanity checks along the way. Using these techniques, Karpov produces a few statistics:

  • Teams with the most wins in a given season
  • Correlations between wins and particular stats
  • Comparisons of multiple stats against win percentage
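
To make the first of these concrete, here is a minimal sketch of what a "most wins in a season" aggregation might look like. It is not Karpov's own pipeline. The document shape is an assumption: each game is taken to have a `date` field and a `teams` array whose entries carry a `name` and a `won` flag (1 for the winner). The database name `nba`, the collection name `games`, and the season boundary dates are likewise hypothetical.

```python
from datetime import datetime


def most_wins_pipeline(season_start, season_end, limit=5):
    """Build a pipeline ranking teams by wins between two dates.

    Assumed (hypothetical) schema: {date, teams: [{name, won, ...}, ...]}.
    """
    return [
        # Keep only games played inside the requested date range.
        {"$match": {"date": {"$gte": season_start, "$lt": season_end}}},
        # Emit one document per team per game.
        {"$unwind": "$teams"},
        # Keep only the winning side of each game.
        {"$match": {"teams.won": 1}},
        # Count wins per team name.
        {"$group": {"_id": "$teams.name", "wins": {"$sum": 1}}},
        # Highest win totals first, then trim to the top few.
        {"$sort": {"wins": -1}},
        {"$limit": limit},
    ]


# Season boundaries here are illustrative guesses, not taken from the post.
pipeline = most_wins_pipeline(datetime(1995, 8, 1), datetime(1996, 8, 1))

# With a running server and pymongo installed, the pipeline could be run as:
#   from pymongo import MongoClient
#   rows = list(MongoClient().nba.games.aggregate(pipeline))
```

The pipeline is plain Python data, so it can be inspected or adjusted before it is ever sent to a server.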

Ultimately, it's all just for fun: a way to test techniques for finding gems in the data. For example:

An interesting factoid: the team that recorded the fewest defensive rebounds in a win was the 1995-96 Toronto Raptors, who beat the Milwaukee Bucks 93-87 on 12/26/1995 despite recording only 14 defensive rebounds.
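
A factoid like this one could plausibly come from a short sort-and-limit pipeline. The sketch below is an illustration rather than the query from the original post, and it assumes a hypothetical schema in which each game has a `teams` array and each team entry has a `won` flag and a `results.drb` field holding defensive rebounds.

```python
def fewest_drb_in_win_pipeline(limit=1):
    """Build a pipeline finding winning teams with the fewest defensive rebounds.

    Assumed (hypothetical) schema: {date, teams: [{name, won, results: {drb}}]}.
    """
    return [
        # One document per team per game.
        {"$unwind": "$teams"},
        # Only teams that won their game.
        {"$match": {"teams.won": 1}},
        # Ascending sort puts the lowest rebound totals first.
        {"$sort": {"teams.results.drb": 1}},
        {"$limit": limit},
        # Keep just the fields needed to report the result.
        {"$project": {
            "_id": 0,
            "date": 1,
            "team": "$teams.name",
            "drb": "$teams.results.drb",
        }},
    ]


drb_pipeline = fewest_drb_in_win_pipeline()
```

Sorting after `$unwind` touches every team-game document, so on a large collection a real query would likely want an earlier `$match` or a supporting index.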

Karpov's experiments nonetheless provide a fantastic overview of MongoDB's aggregation framework: how it works, some key methods, and what it can do. Check out Karpov's full post to find some truly obscure NBA statistics.

Opinions expressed by DZone contributors are their own.