
Using Apache Ignite for Hadoop Acceleration

Learn how to speed up Hadoop data access with in-memory caching.

Hadoop is fast, but in-memory is faster. To speed up data access across the board, you need to speed up HDFS file access (which also helps Hive, HBase, and the rest of the stack). You can upgrade your hardware with fast EMC Isilon storage or high-end servers from HPE, Dell, or IBM, but the easiest option, nearly free if you have spare RAM, is the excellent open-source project Apache Ignite. I will also look at Apache Geode, Redis, SnappyData, and other in-memory accelerators in future how-to articles.

To get started, follow the instructions from the project site. I installed mine on the freely available Hortonworks HDP 2.4 Sandbox. Make sure you download the "In-Memory Hadoop Accelerator" build, as that is the edition meant to be used with Hadoop:

wget https://dist.apache.org/repos/dist/release/ignite/1.7.0/apache-ignite-hadoop-1.7.0-bin.zip

unzip apache-ignite-hadoop-1.7.0-bin.zip 

Create the /etc/default/hadoop configuration file, and make sure the Java, Ignite, and Hadoop environment variables are set up properly for your environment.

[root@sandbox apache-ignite-hadoop-1.7.0-bin]# cat /etc/default/hadoop

export JAVA_HOME=/usr/lib/jvm/java-1.7.0

export IGNITE_HOME=/opt/demo/ignite/apache-ignite-hadoop-1.7.0-bin

export HDP=/usr/hdp/current

export HADOOP_HOME=$HDP/hadoop-client/

export HADOOP_COMMON_HOME=$HDP/hadoop-client/

export HADOOP_HDFS_HOME=$HDP/hadoop-hdfs-client/

export HADOOP_MAPRED_HOME=$HDP/hadoop-mapreduce-client/
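
As a quick sanity check (the listing above is from my sandbox; your paths will differ), you can source the file and confirm the variables resolve to real installations:

source /etc/default/hadoop
ls $IGNITE_HOME/libs | head -3
$HADOOP_HOME/bin/hadoop version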

To start the accelerator, you merely need to run:

cd  /opt/demo/ignite/apache-ignite-hadoop-1.7.0-bin

bin/ignite.sh
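
By default, ignite.sh starts a node in the foreground using the bundled config/default-config.xml. A minimal sketch for passing that config explicitly and keeping the node running in the background (the log path here is my own choice, not anything Ignite requires):

nohup bin/ignite.sh config/default-config.xml > /var/log/ignite-node.log 2>&1 &
tail -f /var/log/ignite-node.log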

You will also need to update some YARN and HDFS configuration from Ambari, following the included instructions, which work as described. After restarting the affected services from Ambari, your calls will be faster.
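
For reference, those included instructions essentially have you point Hadoop at Ignite's file system (IGFS) and MapReduce engine. A rough sketch of the properties involved, as they appear in the Ignite 1.7 Hadoop Accelerator documentation; double-check the names and values (and your IGFS endpoint) against your version before entering them in Ambari:

core-site.xml (fs.default.name, or fs.defaultFS on newer Hadoop):
fs.default.name = igfs://igfs@localhost
fs.igfs.impl = org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem
fs.AbstractFileSystem.igfs.impl = org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem

mapred-site.xml:
mapreduce.framework.name = ignite
mapreduce.jobtracker.address = localhost:11211

Once the services have restarted, a quick smoke test is to list the IGFS root and run one of the stock example jobs against it (the jar path below assumes the HDP sandbox layout):

hadoop fs -ls igfs://igfs@localhost/
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar wordcount igfs://igfs@localhost/input igfs://igfs@localhost/output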


Topics:
hadoop, hortonworks, ignite, cache, in-memory

