Making Deep Learning More Efficient
Researchers are finding new ways to power the AI revolution, including a technique that speeds up data lookup and saves time and energy when performing deep learning.
I wrote recently about the work undertaken by supercomputing giant Cray to develop machines specifically for powering the kind of data-intensive AI algorithms so popular today. Whilst providing more computational grunt is one way to power our AI revolution, researchers from Rice University believe more efficient coding can also play a big role.
In a recent paper, they describe a new technique for making rapid data lookup faster and more efficient, thus saving a lot of time and energy when performing deep learning.
“This applies to any deep-learning architecture, and the technique scales sublinearly, which means that the larger the deep neural network to which this is applied, the more the savings in computations there will be,” the researchers say.
Making AI Smarter
The approach adapts the traditional hashing method used in data indexing to significantly reduce the computational overhead required for deep learning. Hashing is a process whereby hash functions convert data into compact codes, known as hashes, which then act as a kind of index for the data, allowing items to be found without scanning everything.
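To make the indexing idea concrete, here is a minimal toy sketch (not the researchers' method) of hashing for fast lookup: a simple hash function maps each item to a bucket, so a lookup only inspects one small bucket rather than the whole dataset.

```python
# Toy illustration of hashing as an index: the hash function maps each
# item to a bucket number, so a lookup checks one small bucket instead
# of scanning every stored item.
NUM_BUCKETS = 8

def bucket_of(item: str) -> int:
    # Simple deterministic hash: sum of character codes modulo bucket count.
    return sum(ord(c) for c in item) % NUM_BUCKETS

buckets = {i: [] for i in range(NUM_BUCKETS)}
for word in ["cat", "dog", "fish", "bird"]:
    buckets[bucket_of(word)].append(word)

def lookup(item: str) -> bool:
    # Only the one bucket the hash points at is searched.
    return item in buckets[bucket_of(item)]
```

Real hash functions are far more careful about distributing items evenly, but the payoff is the same: lookup cost depends on bucket size, not on the total amount of data.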
“Our approach blends two techniques — a clever variant of locality-sensitive hashing and sparse backpropagation — to reduce computational requirements without significant loss of accuracy,” the team says. “For example, in small-scale tests, we found we could reduce computation by as much as 95 percent and still be within 1 percent of the accuracy obtained with standard approaches.”
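The quote mentions a variant of locality-sensitive hashing (LSH). The team's exact variant isn't spelled out here, but a standard signed-random-projection LSH conveys the core idea: similar inputs tend to receive the same hash code, so only the neurons filed under that code need to be computed. The vectors `a`, `b`, and `c` below are made up for illustration.

```python
import random

random.seed(0)
DIM, NUM_PLANES = 4, 3

# Random hyperplanes; each contributes one bit of the hash code.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_PLANES)]

def lsh_code(vec):
    # The sign of the dot product with each hyperplane gives one bit.
    # Nearby vectors usually fall on the same side of every plane and
    # therefore receive the same code (i.e. land in the same bucket).
    bits = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

a = [1.0, 0.9, 0.0, 0.1]
b = [0.9, 1.0, 0.1, 0.0]     # close to a, so likely the same bucket
c = [-1.0, -0.9, 0.0, -0.1]  # points the opposite way: different bucket
```

Unlike an ordinary hash, collisions here are the point: the function is deliberately tuned so that similar vectors collide, which is what lets the network skip neurons whose codes don't match the input's.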
Most AI is based on "neurons" that become specialized as they are trained on vast quantities of data. Low-level neurons usually perform very simple tasks, with their output then passed to the next layer, which performs its own specialized analysis. Current models require only a few of these layers to perform impressive feats such as image recognition.
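The layered structure described above can be sketched in a few lines. This is a deliberately tiny forward pass with made-up weights, not a trained model: each neuron takes a weighted sum of the previous layer's output and applies a nonlinearity, and only some neurons "fire."

```python
def relu(x):
    # Standard rectifier nonlinearity: negative sums produce no output.
    return max(0.0, x)

def layer(inputs, weights):
    # Each row of weights defines one neuron: a weighted sum of the
    # previous layer's outputs, passed through the nonlinearity.
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# Two tiny layers with hypothetical weights.
W1 = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]
W2 = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]

hidden = layer([0.8, 0.2], W1)  # low-level features
output = layer(hidden, W2)      # higher-level combination of them
```

Note that for this input the second hidden neuron stays silent (its weighted sum is negative), a small preview of the sparsity the Rice researchers exploit at scale.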
In theory, there is no limit to the scale of these neural networks, but the larger they get, the more computational power is required to run them.
“Most machine learning algorithms in use today were developed 30-50 years ago,” the authors say. “They were not designed with computational complexity in mind. But with ‘big data,’ there are fundamental limits on resources like compute cycles, energy, and memory. Our lab focuses on addressing those limitations.”
With the more efficient method outlined in the paper, however, it will be possible for researchers to work with extremely large deep networks.
“The savings increase with scale because we are exploiting the inherent sparsity in big data,” they conclude. “For instance, let’s say a deep net has a billion neurons. For any given input — like a picture of a dog — only a few of those will become excited. In data parlance, we refer to that as sparsity, and because of sparsity, our method will save more as the network grows in size. So while we’ve shown a 95 percent savings with 1,000 neurons, the mathematics suggests we can save more than 99 percent with a billion neurons.”
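The sublinear-savings argument is easy to check with back-of-envelope arithmetic. Assuming (hypothetically) that roughly a constant number of neurons fire for any one input, the fraction of the network that must actually be computed shrinks as the network grows:

```python
def active_fraction(total_neurons, active_neurons=50):
    # If roughly a fixed number of neurons fire per input, the share of
    # the network you must compute falls as the network grows -- the
    # source of the sublinear saving described in the article.
    return active_neurons / total_neurons

for n in (1_000, 1_000_000, 1_000_000_000):
    saved = 1.0 - active_fraction(n)
    print(f"{n:>13,} neurons: skip {saved:.4%} of the work")
```

With 50 active neurons out of 1,000 the saving is 95 percent, matching the small-scale figure quoted above; at a billion neurons the same fixed activity would mean skipping well over 99 percent of the computation.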
Published at DZone with permission of Adi Gaskell, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.