MXNet and Gluon: Modern Deep Learning at Scale
Today, machine learning happens at scale and is deep, distributed, and multidimensional. As a result, requirements for modern deep learning frameworks have evolved, too.
Machine learning is no longer just a buzzword or hype — it is everywhere. Speech recognition, image classification, image understanding, natural language processing, and autonomous vehicles are not the future; they are our present. Today, machine learning happens at scale, and it is deep, distributed, and multidimensional. As a result, the requirements for modern deep learning frameworks have evolved, as well.
Selection criteria include not only how easy it is for an engineer to create deep learning models but also how well a framework minimizes concerns around parallelization and scalability (with multiple GPUs, multiple machines, or both). Engineers do not want to deal explicitly with the communication overhead of data parallelization. They expect a framework to scale up and out across GPUs and machines and deliver that efficiency out of the box. In other words, they want near-linear scalability: when the number of GPUs doubles, throughput should double, too.
State-of-the-art neural networks have made great strides recently, as well. They have become much harder to define, since they can consist of 100+ hidden layers, so a modern deep learning framework should make it easy to express such complex architectures.
And last, but not least, a modern deep learning framework needs to support bindings to multiple programming languages. Python is the de facto standard for deep learning front-end programming; however, the more languages a framework supports, the easier the choice becomes. The more engineers can work in a language they already know, like R, C++, Scala, Go, JavaScript, or Julia, the shorter the on-ramp to getting down to the real work.
MXNet is such a framework, in this engineer's opinion. It is a modern deep learning framework that meets all the selection criteria I have laid out here. MXNet embraces a mixed style of programming, both imperative and declarative (symbolic). In the imperative style, engineers write a program the way they think, i.e. line by line. There is nothing wrong with that; it is very straightforward and flexible, since it can take advantage of a language's native features, such as loops. But it is the engineer's responsibility to parallelize and optimize execution, so good performance cannot be expected out of the box.
Furthermore, all engineers know that writing parallel programs can be painful. On the other hand, neural networks are well suited to parallel computation. In the declarative, or symbolic, style, engineers define their computation flow as a graph: declare variables, the relations between them, and so on. The declarative style has more room for optimization, but it is obviously less flexible, since an engineer must stick to a platform-specific DSL. Luckily, MXNet supports both styles, together with an interesting feature: auto-parallelization. It traverses the symbolic graph, works out which parts of the program can be executed in parallel, and performs that parallelization behind the scenes instead of the engineer.
In addition to that, there is Gluon: a clear, concise, and simple high-level API with an MXNet back-end that combines the best features of previous frameworks. Now, engineers can write simple imperative code, and Gluon automatically generates a symbolic program with a single .hybridize() call, seamlessly performing many automatic optimizations for free, such as operator fusion and memory optimizations (for instance, an intermediate result that does not require a memory allocation). Gluon combines the best of both worlds: the simplicity of the imperative style and the efficiency of the symbolic style. Initialization on multiple devices is very simple, too: it is a one-liner achieved by passing in an array of device contexts. Have a look at some high-quality Gluon Notebooks for more details.
Keep in mind that Amazon has chosen MXNet as its deep learning framework of choice at AWS, which means easier deployment of large-scale deep learning on the AWS cloud, predefined CloudFormation templates, etc. Also, existing Keras/TensorFlow users have a migration path from TensorFlow to MXNet, as Keras can now run with an MXNet back-end.
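If I recall the Keras-MXNet setup correctly, switching back-ends does not require changing model code: after installing the MXNet-enabled Keras package, you point Keras at MXNet in its configuration file (typically `~/.keras/keras.json`), along the lines of:

```json
{
    "backend": "mxnet",
    "image_data_format": "channels_first"
}
```

Existing Keras model definitions should then run on the MXNet engine; check the keras-mxnet project documentation for the exact settings it supports.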
Summing up, Gluon and MXNet democratize modern deep learning, making it accessible not only for expert scientists but also for non-experts to design, train, and re-use state-of-the-art deep neural networks. Nowadays, the options for a deep learning framework are plentiful: Keras/TensorFlow, Gluon/MXNet, Keras/MXNet, and many others. The choice is yours, so give it a try.