
Neuroph on Hadoop: Massive Parallel Neural Network System?

On the Apache mailing list there is an interesting Google Summer of Code project proposal: to implement neural networks with backpropagation learning on Hadoop. The idea is to create support for a massively parallel neural network system that can work with huge amounts of data. Possible applications would be typical neural network problems involving:

  • classification
  • prediction
  • recognition
  • association
  • statistical modeling

All of these could be useful in personalization and improving search technologies!

Apache Hadoop is a Java software framework that supports data-intensive distributed applications and enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.

The proposal is inspired by the design of the existing neural network framework Neuroph, since it is intuitive, easy to use, and provides great flexibility for extensions:

"This architecture is inspired from that of the open source Neuroph neural network framework (http://imgur.com/gDIOe.jpg). This design of the base architecture allows for great flexibility in deriving newer NNs and learning rules. All that needs to be done is to derive from the NeuralNetwork class, provide the method for network creation, create a new training method by deriving from LearningRule, and then add that learning rule to the network during creation."
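
The extension pattern described in the quote can be sketched in a few lines of Java. This is a minimal illustration only: the `NeuralNetwork` and `LearningRule` classes below are hypothetical stand-ins written for this example (as are `LinearNeuron`, `DeltaRule`, and every method name), not the actual Neuroph classes, whose signatures may differ. It assumes a single linear neuron trained with the delta rule.

```java
// Hypothetical base classes mirroring the quoted description; NOT the real Neuroph API.
abstract class LearningRule {
    // Adjusts the weights in place using one pass over the training data.
    abstract void learn(double[] weights, double[][] inputs, double[] targets);
}

abstract class NeuralNetwork {
    protected double[] weights;          // last element is the bias
    protected LearningRule learningRule;

    // Subclasses build their structure and attach a learning rule here.
    protected abstract void createNetwork(int inputCount);

    abstract double calculate(double[] input);

    void train(double[][] inputs, double[] targets, int epochs) {
        for (int i = 0; i < epochs; i++) {
            learningRule.learn(weights, inputs, targets);
        }
    }
}

// A new training method: the delta (LMS) rule for one linear neuron.
class DeltaRule extends LearningRule {
    private final double learningRate;

    DeltaRule(double learningRate) { this.learningRate = learningRate; }

    @Override
    void learn(double[] weights, double[][] inputs, double[] targets) {
        int bias = weights.length - 1;
        for (int n = 0; n < inputs.length; n++) {
            double output = weights[bias];
            for (int j = 0; j < inputs[n].length; j++) output += weights[j] * inputs[n][j];
            double error = targets[n] - output;
            // Move each weight in proportion to the error and its input.
            for (int j = 0; j < inputs[n].length; j++) weights[j] += learningRate * error * inputs[n][j];
            weights[bias] += learningRate * error;
        }
    }
}

// A new network type: derive, provide network creation, add the learning rule.
class LinearNeuron extends NeuralNetwork {
    LinearNeuron(int inputCount) { createNetwork(inputCount); }

    @Override
    protected void createNetwork(int inputCount) {
        weights = new double[inputCount + 1];
        learningRule = new DeltaRule(0.1); // assumed learning rate
    }

    @Override
    double calculate(double[] input) {
        double sum = weights[weights.length - 1];
        for (int j = 0; j < input.length; j++) sum += weights[j] * input[j];
        return sum;
    }
}
```

Swapping in a different training method would then only require another `LearningRule` subclass and a one-line change in `createNetwork`, which is the flexibility the proposal is after.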

An interesting approach could be to extract an interface from the existing Neuroph code and provide implementations of it on top of Hadoop. That way, all neural network models and learning rules that Neuroph currently supports, as well as those developed in the future, could be easily ported. This approach would also keep Neuroph itself available as a lightweight development environment, where algorithms could first be tested and tweaked locally.
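
One way to picture this idea, as a rough sketch rather than anything taken from the proposal or from Neuroph: put the gradient computation behind a small interface, give it a plain local implementation, and give it a second implementation shaped like a MapReduce job, where each "map" computes a partial gradient on one data shard and the "reduce" sums the partial results. The names `GradientStep`, `LocalGradientStep`, and `ShardedGradientStep` are hypothetical, the model is assumed to be a single linear neuron with squared error, and Java parallel streams stand in for Hadoop so the example runs without a cluster.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical interface standing in for one extracted from the framework.
interface GradientStep {
    // Summed squared-error gradient of a linear neuron over all samples.
    // Each input row is assumed to carry a constant 1.0 column for the bias.
    double[] gradient(double[] weights, double[][] inputs, double[] targets);
}

// Lightweight local implementation, useful for testing and tweaking.
class LocalGradientStep implements GradientStep {
    @Override
    public double[] gradient(double[] weights, double[][] inputs, double[] targets) {
        double[] grad = new double[weights.length];
        for (int n = 0; n < inputs.length; n++) {
            double output = 0.0;
            for (int j = 0; j < weights.length; j++) output += weights[j] * inputs[n][j];
            double error = output - targets[n];
            for (int j = 0; j < weights.length; j++) grad[j] += error * inputs[n][j];
        }
        return grad;
    }
}

// Same interface, MapReduce-shaped: map = per-shard gradient, reduce = sum.
class ShardedGradientStep implements GradientStep {
    private final int shardCount;
    private final GradientStep perShard = new LocalGradientStep();

    ShardedGradientStep(int shardCount) { this.shardCount = shardCount; }

    @Override
    public double[] gradient(double[] weights, double[][] inputs, double[] targets) {
        int shardSize = (inputs.length + shardCount - 1) / shardCount; // ceiling division
        return IntStream.range(0, shardCount).parallel()
            .mapToObj(s -> {
                // "Map" phase: each shard computes its own partial gradient.
                int from = Math.min(s * shardSize, inputs.length);
                int to = Math.min(from + shardSize, inputs.length);
                return perShard.gradient(weights,
                        Arrays.copyOfRange(inputs, from, to),
                        Arrays.copyOfRange(targets, from, to));
            })
            // "Reduce" phase: element-wise sum of the partial gradients.
            .reduce(new double[weights.length], (a, b) -> {
                double[] sum = new double[a.length];
                for (int j = 0; j < a.length; j++) sum[j] = a[j] + b[j];
                return sum;
            });
    }
}
```

Because the batch gradient is a sum over samples, both implementations return the same vector (up to floating-point rounding), so code written against the interface would not need to know which one it is using.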

So far the proposal has received some positive comments, but we'll see whether it is going to be accepted!

One interesting thing to note is that the IDE for Hadoop is based on the NetBeans Platform. Since it was recently announced that Neuroph will also be ported to the NetBeans Platform, it looks like some powerful tools are coming out in this area. This is a very good example of how the NetBeans Platform can provide synergy between different tools and projects.
