Big Data and Python: Utilizing Python for Large-scale Datasets


This recent video from NewCircle Training discusses using Python to query massive amounts of data. Although Python is slow compared with languages like C++, AdRoll demonstrates that it can still perform very efficiently on large-scale datasets.

Ville Tuulos, Principal Engineer at AdRoll, a company that produces enormous volumes of data, demonstrates how AdRoll uses Python to squeeze every bit of performance out of a single high-end server. They manage it with Numba, a new NumPy-aware dynamic Python compiler based on LLVM, and because the system is built on Python, it can offer a very expressive and developer-friendly API. Find out more in this informative talk from the San Francisco Python Meetup Group.
- YouTube Page  

You can also check out the slides by themselves if you don't have time for the whole video.
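
For readers who have not used Numba, here is a minimal sketch of the general idea: a plain Python loop over a NumPy array is compiled to machine code with the `njit` decorator. This is not AdRoll's code. The function name `count_matching`, the synthetic `prices` array, and the threshold are all hypothetical, chosen only for illustration. The `ImportError` fallback is also an assumption, added so the snippet still runs (uncompiled and much slower) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # compiles the decorated function via LLVM
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs
    # (much more slowly) when Numba is not installed.
    def njit(func):
        return func


@njit
def count_matching(values, threshold):
    """Count the entries of a 1-D array that exceed a threshold."""
    count = 0
    for i in range(values.shape[0]):
        if values[i] > threshold:
            count += 1
    return count


# Hypothetical data: one million synthetic values in [0, 1).
rng = np.random.default_rng(0)
prices = rng.random(1_000_000)
result = count_matching(prices, 0.5)
```

The first call includes compilation time when Numba is available, so a fair timing comparison against the pure Python version should measure a second call.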
