
Blaze: A Python Compiler for Big Data


Python developers working with NumPy or Big Data in general might be interested in Blaze, a Python library created by Continuum Analytics and described by Stephen Diehl as "the next generation of NumPy." Blaze extends NumPy's array model with a variety of table- and array-like structures and supports a number of new features. According to Diehl:

...Blaze is designed to handle out-of-core computations on large datasets that exceed the system memory capacity, as well as on distributed and streaming data. Blaze is able to operate on datasets transparently as if they behaved like in-memory NumPy arrays.

We aim to allow analysts and scientists to productively write robust and efficient code, without getting bogged down in the details of how to distribute computation, or worse, how to transport and convert data between databases, formats, proprietary data warehouses, and other silos.

In short, Blaze looks like NumPy but aims to be more flexible and efficient. If you're looking for something different in the world of Python and Big Data, check out the project's GitHub repository and its documentation.
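To make the idea of "out-of-core" computation concrete, here is a minimal sketch using plain NumPy rather than Blaze itself. It assumes a flat binary file of float64 values and uses `np.memmap` to process the file one chunk at a time, so only a small slice is held in memory at once. The function name `chunked_mean`, the file name, and the chunk size are hypothetical choices for illustration and are not part of Blaze's API.

```python
import os
import tempfile

import numpy as np


def chunked_mean(path, dtype, length, chunk_size=1_000_000):
    """Mean of a 1-D on-disk array, reading one chunk at a time.

    Hypothetical helper for illustration; assumes length > 0.
    """
    # np.memmap maps the file lazily; slices are read from disk on access.
    data = np.memmap(path, dtype=dtype, mode="r", shape=(length,))
    total = 0.0
    for start in range(0, length, chunk_size):
        chunk = data[start:start + chunk_size]
        # Accumulate in float64 so the running sum does not lose precision.
        total += float(chunk.sum(dtype=np.float64))
    del data  # release the mapping before the file is removed
    return total / length


# Small demo: write ten values to disk, then average them in chunks of three.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")
    np.arange(10, dtype=np.float64).tofile(path)
    demo_mean = chunked_mean(path, np.float64, length=10, chunk_size=3)

print(demo_mean)
```

With a real dataset, the same loop works on files far larger than RAM. According to the description quoted above, Blaze aims to hide this kind of chunking behind a NumPy-like interface.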


