
Artificial Intelligence and Neural Nets on Small GPUs

How neural net simulations are becoming easier and less power-hungry to perform. So easy, in fact, that they can be implemented on a new mobile device chip called Eyeriss.


Lately we've heard a lot about convolutional neural networks, which are large virtual networks of simple processing units. Most of you are familiar with implementing neural nets on GPUs, taking advantage of their massively parallel structure to simulate a large number of neurons. Today even your mobile phone may have as many as 200 GPU cores, all processing in parallel. And while these computations are much more efficient and speedy on a GPU than on a CPU, it is now possible to run these neural net simulations at one-tenth the power of a GPU, which makes them ideal for moving artificial intelligence onto the mobile device. Researchers from MIT have unveiled a new chip dedicated to neural network simulation. They call their new chip "Eyeriss".
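To see why GPUs suit this workload, note that a layer of simulated neurons is just a matrix-vector product in which every neuron's output can be computed independently. The Python sketch below illustrates that data-parallel structure; the layer size, input size, and ReLU activation are my own placeholder choices, and NumPy merely stands in for what a GPU would spread across its cores.

    import numpy as np

    # One fully connected layer of 200 simulated neurons, evaluated at once.
    # On a GPU, each neuron's dot product would map to its own core; NumPy
    # stands in here purely to illustrate the data-parallel structure.
    rng = np.random.default_rng(0)

    inputs  = rng.standard_normal(64)          # one 64-feature input vector
    weights = rng.standard_normal((200, 64))   # one weight row per neuron
    biases  = rng.standard_normal(200)

    # Every neuron "fires" in parallel: a single matrix-vector product.
    activations = np.maximum(weights @ inputs + biases, 0.0)  # ReLU
    print(activations.shape)                   # (200,)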

Neural nets have been around for some time. Marvin Minsky, the recently lost and sorely missed artificial intelligence pioneer at MIT, was a seminal figure in the development of early neural nets. His 1969 book "Perceptrons" (co-authored with Seymour Papert) was at the center of the research at that time. Note: the original perceptron algorithm was invented in 1957 by Frank Rosenblatt, and it wasn't a program; it was a most ingenious piece of hardware, so in some sense it was more like a real neuron.
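For readers who haven't encountered it, the perceptron learning rule itself fits in a few lines. The Python below is a generic textbook rendering, not Rosenblatt's hardware; the toy dataset, learning rate, and epoch count are placeholder assumptions.

    import numpy as np

    def train_perceptron(X, y, epochs=20, lr=1.0):
        """Classic perceptron rule: nudge weights toward misclassified points.
        X: (n_samples, n_features) inputs; y: labels in {-1, +1}."""
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (w @ xi + b) <= 0:   # misclassified (or on boundary)
                    w += lr * yi * xi        # rotate the decision plane toward xi
                    b += lr * yi
        return w, b

    # Toy linearly separable data: an AND-like split of four points.
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([-1, -1, -1, 1])
    w, b = train_perceptron(X, y)
    print(np.sign(X @ w + b))  # matches y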

Today complex neural nets are used for object, speech, and face recognition, but as this capability becomes more ubiquitous it will certainly be applied in many other domains. Developers are already suggesting applications in vehicles, appliances, civil-engineering structures, manufacturing equipment, and even on the farm. The idea is that smaller or remote devices would be delegated higher-level decisions that they would make locally, rather than sending large amounts of data through the Internet. I imagine it's something like mission control for a rocket launch: the launch director is not deluged with a flood of raw data but instead receives trusted "go/no-go" decisions from each of the agents monitoring the complex subsystems. It is my belief that assembling collections of these simpler agents is what will lead us closer to Artificial General Intelligence (AGI). This too was imagined by Marvin Minsky, in his next book, "The Society of Mind".

With the large memory complements of today's digital devices, the greatest energy expense on a chip is the movement of data. On modern GPU chips, a very large array of data is available for access by all of the GPU cores. But for neuron simulation, most of the data, and the control of that data, is local, just as in a real neuron: the real neuron does its job for the most part completely oblivious to what's going on around it. The new Eyeriss chip (presented at the International Solid-State Circuits Conference in San Francisco this week) takes advantage of this. Each core has its own memory; each core can communicate only with its nearby neighbors (like a real neuron); and all transmissions of data, even at the local level, are compressed, so less data is moved. A further characteristic of neural nets is that they manage two types of data:

  1. Input data from the environment to be evaluated.

  2. Scalar data representing the input weights to and from neighboring neurons.

This new chip has specialized allocation methods for each type of data. For example, a simulated neuron may be used several times with different input data, so retaining the weights locally while testing various inputs avoids unnecessary data movement.
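To make the energy argument concrete, here is a back-of-the-envelope model of that weight-retention idea. This is my own illustrative sketch in Python, not the actual Eyeriss dataflow, and the relative costs of a far (shared-memory) fetch versus a local scratchpad read are invented numbers chosen only to show why reuse pays off.

    # Illustrative model of weight reuse (not the real Eyeriss dataflow).
    # Energy costs are made-up relative units: fetching from far, shared
    # memory is assumed far costlier than reading a core's local scratchpad.
    FAR_FETCH_COST  = 200   # hypothetical: off-core shared-memory access
    LOCAL_READ_COST = 1     # hypothetical: local scratchpad access

    def naive_cost(n_weights, n_inputs):
        # Re-fetch every weight from shared memory for every input evaluated.
        return n_inputs * n_weights * FAR_FETCH_COST

    def weight_retained_cost(n_weights, n_inputs):
        # Fetch the weights into local memory once, then reuse them locally.
        load_once = n_weights * FAR_FETCH_COST
        reuse = n_inputs * n_weights * LOCAL_READ_COST
        return load_once + reuse

    n_weights, n_inputs = 64, 1000
    print(naive_cost(n_weights, n_inputs))           # 12,800,000 units
    print(weight_retained_cost(n_weights, n_inputs)) # 76,800 units

Even with these made-up constants the point survives: once the weights sit in local memory, the expensive fetches happen once per weight rather than once per weight per input.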

An actual demonstration, based on an image-recognition task, was given at the conference. “This work is very important, showing how embedded processors for deep learning can provide power and performance optimizations that will bring these complex computations from the cloud to mobile devices,” says Mike Polley, a senior vice president at Samsung’s Mobile Processor Innovations Lab.

[Author's note: Just a final mention of Marvin Minsky. I had the honor of interacting a bit with him, in the context of being a student/fanboy. I remember him fondly as a friendly, helpful fellow; he even loaned me a few books on occasion. He was obviously smart and insightful, but he was also very open and collaborative. I admired that he released the texts of his books to the community before publishing them and always accepted observations and suggestions warmly. If you haven't read "The Society of Mind" or "The Emotion Machine", I would highly recommend them. Marvin, you are missed.]
