How to Fix an Alluxio Error for Spark, MapReduce, and Hive Jobs

This article explains the reason for the 'Class alluxio.hadoop.FileSystem not found' error and the solution to the issue when it occurs.

by Bin Fan · Feb. 20, 19 · Tutorial

From time to time, a question pops up on the Alluxio user mailing list referencing job failures with the error message "java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found". This article explains the reason for the failure and the solution to the issue when it occurs.

Why Does This Happen?

This error indicates that the Alluxio client is not available at runtime. The exception is thrown when the job tries to access the Alluxio filesystem but cannot find the Alluxio client implementation it needs to connect to the service.

The Alluxio client is a Java library that defines the class alluxio.hadoop.FileSystem, which invokes Alluxio services on behalf of user requests (such as creating a file or listing a directory). It is typically pre-compiled into a jar file named alluxio-1.8.1-client.jar (for v1.8.1) and distributed with the Alluxio tarball. To work with applications, this file must be on the JVM classpath so that it can be discovered and loaded into the JVM process. If the application JVM cannot find this file on the classpath, it has no implementation of the class alluxio.hadoop.FileSystem and therefore throws the exception.
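
One quick way to check this diagnosis is to list the classpath a framework actually uses and look for the client jar. The following is a minimal sketch, not part of the Alluxio distribution: find_alluxio_client_jar is a hypothetical helper name, and the file-name pattern assumes the jar keeps its default alluxio-<version>-client.jar name. Wildcard entries such as /some/dir/* are not expanded, so a miss here is a hint rather than proof.

```shell
# Hypothetical helper: print every entry of a colon-separated classpath ($1)
# whose file name looks like an Alluxio client jar.
# Returns non-zero when no entry matches.
find_alluxio_client_jar() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -E '(^|/)alluxio-[^/]*client[^/]*\.jar$'
}

# Example: inspect the classpath reported by Hadoop, if hadoop is installed.
if command -v hadoop >/dev/null 2>&1; then
  if ! find_alluxio_client_jar "$(hadoop classpath)"; then
    echo "No Alluxio client jar found on the Hadoop classpath" >&2
  fi
fi
```

The same function can be pointed at any other colon-separated classpath string, for example the value of $HADOOP_CLASSPATH.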

How to Address This Problem

The solution is to ensure that the Alluxio client jar is on the classpath of every application that accesses Alluxio. Several factors should be considered when troubleshooting.

If the application is distributed across multiple nodes, the jar must be available on all of these nodes. How to configure this differs considerably between compute frameworks:

  • For MapReduce or YARN applications, append the path of the Alluxio client jar to mapreduce.application.classpath or yarn.application.classpath so that each task can find it. Alternatively, supply the path through the -libjars argument:
     $ bin/hadoop jar \
     libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount \
     -libjars /<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar \
     <INPUT FILES> <OUTPUT DIRECTORY>
    Depending on the Hadoop distribution, it may also help to set $HADOOP_CLASSPATH:
     export HADOOP_CLASSPATH=/<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar:${HADOOP_CLASSPATH}
  • For Spark applications, set the following in spark/conf/spark-defaults.conf on every node running Spark, and restart any long-running Spark server processes:
     spark.driver.extraClassPath /<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar
     spark.executor.extraClassPath /<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar
  • For Hive, set the environment variable HIVE_AUX_JARS_PATH in conf/hive-env.sh:
     export HIVE_AUX_JARS_PATH=/<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar:${HIVE_AUX_JARS_PATH}
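
Whichever framework setting is used, it only helps if the configured path really points at a jar that contains the class. The sketch below checks that on one node. It is an illustration under assumptions: jar_has_alluxio_fs is a hypothetical helper name, the unzip tool is assumed to be installed (jar tf from a JDK lists entries as well), and the entry name alluxio/hadoop/FileSystem.class follows from the standard mapping of Java class names to jar entries.

```shell
# Hypothetical helper: succeed only if the jar at $1 is readable and
# contains the class file for alluxio.hadoop.FileSystem.
jar_has_alluxio_fs() {
  jar_path="$1"
  if [ ! -r "$jar_path" ]; then
    echo "missing or unreadable: $jar_path" >&2
    return 1
  fi
  # List the archive entries and look for the class file.
  unzip -l "$jar_path" | grep -q 'alluxio/hadoop/FileSystem\.class'
}

# Usage, to be run on each node with the path from the configuration:
#   jar_has_alluxio_fs /<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar && echo OK
```

Running the check on every node that executes tasks catches the common case where the jar was copied to the master but not to the workers.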

In some cases, one compute framework relies on another. For example, a Hive service can use MapReduce as the execution engine for distributed queries. In this case, the classpath must be configured correctly for both Hive and MapReduce.
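
For the Hive-on-MapReduce case, both variables shown earlier can be set from one place so that they cannot drift apart. This is a sketch under assumptions: prepend_entry and ALLUXIO_CLIENT_JAR are hypothetical names introduced here, and the two exports mirror the HIVE_AUX_JARS_PATH and HADOOP_CLASSPATH settings above. Where the lines belong (for example conf/hive-env.sh) depends on the installation.

```shell
# Hypothetical helper: prepend entry $1 to the colon-separated list $2
# without leaving a dangling ':' when the list is empty.
prepend_entry() {
  if [ -n "$2" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Placeholder path; replace <PATH_TO_ALLUXIO> with the real install location.
ALLUXIO_CLIENT_JAR="/<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar"

# Hive itself and the MapReduce jobs it launches both need the client jar.
export HIVE_AUX_JARS_PATH="$(prepend_entry "$ALLUXIO_CLIENT_JAR" "${HIVE_AUX_JARS_PATH:-}")"
export HADOOP_CLASSPATH="$(prepend_entry "$ALLUXIO_CLIENT_JAR" "${HADOOP_CLASSPATH:-}")"
```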

Summary

  • For applications to work with Alluxio, the Alluxio client jar file must be on their classpath.
  • How to add the Alluxio client jar file to the classpath varies case by case, depending on the compute framework.