Analyze Streaming Data With Nashorn in Java 9
Create an app that streams data records read from a CSV file to a custom JavaScript file while enabling at-will logic changes and using Java 9 features.
Nashorn, the JDK's built-in JavaScript engine, has been around for some time now. It was first released as part of Java 8 in March 2014. While Nashorn can address a broad range of use cases, its usage falls primarily under three areas:
Command line interfaces (CLI) and scripting. CLIs were traditionally written using shell scripts or other dynamic, interpreted languages like Perl and Python. With Nashorn, you can use JavaScript for your scripting needs while seamlessly tapping into the Java ecosystem.
Isomorphic JavaScript application development. Isomorphic JavaScript applications are ones that share code across the server and the client. This is another area in which Nashorn is gaining traction.
Dynamic code execution. Oftentimes you want to run a custom piece of logic that can be changed without recompiling and redeploying the application. Business rules, application configuration, and manipulation of streaming data are all examples of dynamic code that can be modified on the fly.
In this article, we will focus on creating an application that uses Nashorn to execute dynamic code. The application is able to read a given CSV file and stream its data to a user-defined JavaScript file. The JavaScript file will hold the business logic to analyze these streaming records. The advantage of using this approach is the ability to change the logic at will without recompiling the app.
To build and package the application, we will use the new module system and the jlink tool, both of which were introduced in Java 9. You can head over to GitHub to get the source code and the build script that were used in this article.
How it Works
The application expects two arguments that represent full paths to a CSV file and the user-defined JavaScript file respectively. Consider the following sample CSV file that represents the walking and running activities tracked by a fitness device:
Day, Activity, Calories, Miles, Start Time, End Time
Monday, Running, 120.0, 1.2, 06:00 AM, 06:20 AM
Monday, Walking, 80.50, 1.5, 04:30 PM, 05:15 PM
Tuesday, Running, 100.0, 1.0, 06:00 AM, 06:15 AM
Tuesday, Walking, 112.20, 1.65, 04:45 PM, 05:20 PM
The first line in the CSV file is expected to represent the field headers. As you can see, the first line in the sample file above defines six fields. The field names are used to create the JavaScript objects that are passed to a user-defined streaming function called onRecord.
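The article does not show how a header and a data line are turned into a record, so here is a minimal sketch of the idea in plain Java. It assumes that header names are converted to camelCase property names (which would match the record.miles and record.calories properties used by the sample script later on) and that fields never contain quoted commas. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RecordSketch {

    // Converts a header such as "Start Time" into a property name such as "startTime".
    // Assumption: camelCase naming; the real application may use a different rule.
    static String toFieldName(String header) {
        String[] words = header.trim().split("\\s+");
        StringBuilder name = new StringBuilder(words[0].toLowerCase());
        for (int i = 1; i < words.length; i++) {
            name.append(Character.toUpperCase(words[i].charAt(0)))
                .append(words[i].substring(1).toLowerCase());
        }
        return name.toString();
    }

    // Pairs each header with the value in the same column of a data line.
    // A plain split on commas is used, so quoted fields are not supported.
    static Map<String, String> toRecord(String headerLine, String dataLine) {
        String[] headers = headerLine.split(",");
        String[] values = dataLine.split(",", -1);
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < headers.length && i < values.length; i++) {
            record.put(toFieldName(headers[i]), values[i].trim());
        }
        return record;
    }

    public static void main(String[] args) {
        String header = "Day, Activity, Calories, Miles, Start Time, End Time";
        String line = "Monday, Running, 120.0, 1.2, 06:00 AM, 06:20 AM";
        System.out.println(toRecord(header, line));
    }
}
```

Values are kept as strings here, which is why the sample script wraps them in Number(...) before adding them up.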
In order to run dynamic JavaScript code from within a Java application, you have to make use of the ScriptEngine interface provided by Java.
ScriptEngine nashorn = new ScriptEngineManager().getEngineByName("nashorn");
Once you get the script engine (Nashorn in our case), you can execute a dynamic piece of code by calling the eval method on this interface.
nashorn.eval(new FileReader(new File("myfile.js")));
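Two details are worth handling around these calls. First, getEngineByName returns null when no engine is registered under that name, which is the case on JDK 15 and later, where Nashorn is no longer bundled. Second, the FileReader above is never closed. The following is a minimal sketch of one way to deal with both, not the code used by the application; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class EvalSketch {

    // Looks up Nashorn; empty when the running JDK does not provide it.
    static Optional<ScriptEngine> findNashorn() {
        return Optional.ofNullable(new ScriptEngineManager().getEngineByName("nashorn"));
    }

    // Evaluates a script file through the Reader overload and closes the reader afterwards.
    static Object evalFile(ScriptEngine engine, Path script) throws IOException, ScriptException {
        try (BufferedReader reader = Files.newBufferedReader(script, StandardCharsets.UTF_8)) {
            return engine.eval(reader);
        }
    }

    // Evaluates source text through the String overload.
    // Empty when Nashorn is unavailable or the script yields no value.
    static Optional<Object> evalText(String source) {
        return findNashorn().map(engine -> {
            try {
                return engine.eval(source);
            } catch (ScriptException ex) {
                throw new IllegalStateException("Script failed: " + source, ex);
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(evalText("1 + 2")
                .map(Object::toString)
                .orElse("Nashorn is not available on this JDK"));
    }
}
```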
The eval method is overloaded and can accept a string or a reader. The next steps are to stream a CSV file, create a JavaScript object for each line in the CSV file and invoke the user-defined logic in the evaluated JavaScript file. To invoke JavaScript methods, we have to cast the script engine to the Invocable interface and call its invokeFunction method.
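Invocable invocable = (Invocable) nashorn;
bufferedReader
    .lines()
    .skip(1) // skip the header line
    .forEach((String line) -> {
        try {
            // Turn the raw CSV line into a JavaScript object, then hand it to the user's logic
            Object record = invocable.invokeFunction("createRecord", line);
            invocable.invokeFunction("onRecord", record);
        } catch (Exception ex) {
            // Logger.severe only accepts a message, so pass the exception through log()
            logger.log(Level.SEVERE, "Failed to process line: " + line, ex);
        }
    });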
As you can see from the code snippet above, each line in the CSV file is converted into a record object, and the onRecord function defined in the JavaScript file is called with it. The users of the application have full control over how these records are analyzed. For example, to calculate the total miles and total calories, one can use the following logic:
var totalMiles = 0;
var totalCalories = 0;
function onRecord(record) {
    totalMiles += Number(record.miles);
    totalCalories += Number(record.calories);
}
function onEnd() {
    print(' Miles: ' + totalMiles);
    print('Calories: ' + totalCalories);
}
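The script above defines an onEnd function, but the streaming snippet shown earlier does not include the call to it. Presumably the application invokes it once the stream is exhausted. The sketch below shows one way this could be done while tolerating scripts that leave the hook out. It is an assumption about the design, not the application's actual code, and the class name, the method names, and the onStart hook are hypothetical.

```java
import java.util.Optional;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class EndHookSketch {

    // Invokes a top-level script function without arguments.
    // Returns false when the script does not define a function with that name.
    static boolean invokeIfDefined(Invocable invocable, String name) throws ScriptException {
        try {
            invocable.invokeFunction(name);
            return true;
        } catch (NoSuchMethodException ex) {
            return false;
        }
    }

    // Runs a tiny inline script and reports which hooks were found.
    // Empty when Nashorn is not available on the running JDK.
    static Optional<String> demo() {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            return Optional.empty();
        }
        try {
            engine.eval("var ended = false; function onEnd() { ended = true; }");
            Invocable invocable = (Invocable) engine;
            boolean endCalled = invokeIfDefined(invocable, "onEnd");     // defined by the script
            boolean startCalled = invokeIfDefined(invocable, "onStart"); // hypothetical, not defined
            return Optional.of(endCalled + "," + startCalled);
        } catch (ScriptException ex) {
            throw new IllegalStateException("Inline script failed", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().orElse("Nashorn is not available on this JDK"));
    }
}
```

In the streaming code, a call such as invokeIfDefined(invocable, "onEnd") would go right after the forEach, once every record has been passed to onRecord.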
Java Module System and Custom Runtime Image
I consider Project Jigsaw to be one of the most exciting features of Java 9. Under this project, the Java Platform Module System (JPMS) was designed and implemented. Modularizing code is not a new concept, and it is not specific to Java. Modularization entails two approaches: decomposing a large system into self-contained modules that can be connected in meaningful ways, or designing and building a system from individual, independent modules that communicate with each other over well-defined public interfaces. Modules promote maintainability because implementation details are hidden. Java 9 provides first-class support for developing and maintaining modular libraries and applications while improving security and enabling better application performance. For more information on the Java module system, see the Project Jigsaw page on the OpenJDK site.
The first step in defining a Java module is to create a module-info.java file. This file declares the module name along with all its dependencies. It can also declare which of the module's packages are exported, that is, the public API that other modules are allowed to use. The module-info file for this application looks like the one shown below. It declares the JDK's scripting, logging, and Nashorn modules as its dependencies and exports nothing, since no other module depends on it.
module tech.mubee.nashornDataStreamer {
    requires java.scripting;
    requires java.logging;
    requires jdk.scripting.nashorn;
}
The next step is to use the new options available for creating modular JAR files. The latest versions of Maven and Gradle provide options and plugins to do this for us. Since we use Gradle as the build tool for this application, we can use the experimental Jigsaw plugin to build modular JARs.
plugins {
    id 'java-library'
    id 'org.gradle.java.experimental-jigsaw' version '0.1.1'
}
group 'tech.mubee'
version '1.0-SNAPSHOT'
javaModule.name = 'tech.mubee.nashornDataStreamer'
sourceCompatibility = 1.9
When you run "gradlew clean build", Gradle creates a modular JAR file. For more information, check out this Gradle guide.
When you start developing applications using the Java module system, you can take advantage of the module resolution strategy to create special distributions of the Java Runtime Environment (JRE). These custom distributions or runtime images contain only those modules that are required to run your application. JDK 9 introduced a new linking tool called jlink that can be used to create custom runtime images. Our application's modular JAR can be packaged as a custom image using the following command:
jlink --module-path build/libs/:$JAVA_HOME/jmods \
--add-modules tech.mubee.nashornDataStreamer \
--launcher nds=tech.mubee.nashornDataStreamer/tech.mubee.nashorn.data.streamer.Main \
--output nds-image
As can be seen, jlink provides multiple options to create a custom image.
"--module-path" tells jlink which folders to search for Java modules. Here it lists the folder that holds our modular JAR and the JDK's own jmods folder.
"--add-modules" names the root modules to include in the custom image. jlink also pulls in every module they depend on, so listing the application module is enough.
"--launcher" creates a launcher script in the image's bin folder. Its value has the form name=module/mainclass: the script name (nds here), followed by the module and the fully qualified name of the class that contains the main method.
"--output" is used to specify the folder name that holds the newly created custom image.
A welcome side-effect of creating a custom runtime image is that the attack surface of your application may be reduced. Of course, it depends on the number of JDK and custom modules that your application uses.
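To see how small the resulting runtime really is, you can list the modules that the running JVM has resolved. The sketch below uses the ModuleLayer API that was added in Java 9; the class name is hypothetical and the code is not part of the application. Run with the full JDK it prints dozens of modules, whereas inside a custom image it should print only the modules that jlink included.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class ModuleListSketch {

    // Names of the modules in the boot layer of the running JVM, sorted alphabetically.
    static Set<String> bootModuleNames() {
        return ModuleLayer.boot().modules().stream()
                .map(Module::getName)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        bootModuleNames().forEach(System.out::println);
    }
}
```

The java launcher inside the image offers a similar check from the command line through its --list-modules option.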
Summary
So there you have it. We were able to create an application that streams data records read from a CSV file to a custom JavaScript file. The JavaScript logic can be changed at will without recompiling and redeploying the application. We also made use of the new features available in Java 9 to create a modular JAR and a custom runtime image. The image can be distributed to and run on machines with the same operating system and architecture it was built for, without the need to install a JDK or JRE. Happy coding!