
Finder Bot With GraalVM and TensorFlow.js


Check out a fun experiment that looks at building a command line bot with limited NLU capabilities that can convert regular English sentences into executable commands.


Introduction

Utilities like grep, awk, and sed are very powerful command line tools to have in one's arsenal when dealing with text search and text processing tasks. Having said that, I seldom use them since, most of the time, my favorite IDE gets the job done for me. On occasion, I google for grep, awk, and sed commands or consult their respective manual pages to get the command and options right. I've been wondering whether we can leverage the amazing advances made in natural language processing (NLP) and natural language understanding (NLU), add a splash of machine learning, and hide these commands behind a command line bot (CLB) that handles common use cases through natural text.

For example, it would be nice to say "Find me all JavaScript files under buildpal/ui that contain the word 'Mady'". It is equivalent to executing the following command:

grep --include=\*.js -rnwl buildpal/ui -e "Mady"

Of course, it is a matter of personal choice when it comes to using tools. The command above is very succinct and does the job well once you get all the options right.

Let's talk about some high-level aspects of building a finder command line bot. The CLB should take a simple English utterance like "Find files under foo/bar that contain 'world'", turn it into a command, execute it, and return the results. We will look at how GraalVM and TensorFlow.js can potentially be used for this experiment.

Intent Classification and Entity Extraction

The current state of ML and NLU has made intent classification and entity extraction relatively easy compared to, say, three years ago. An intent, in simple terms, is what a user is trying to accomplish: the intention behind the user's query or statement. In our finder CLB domain, one intent would be to "find" or "search" files. The job of our trained model is to classify a given utterance into one of these specific intents.
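For a flavor of what such a model learns from, here is a tiny, purely hypothetical set of labeled utterances for the finder CLB (the real data set and training are not shown here):

// Hypothetical labeled utterances; each one maps to an intent the model must predict
const samples = [
  { text: "Find all JavaScript files under buildpal/ui that contain 'Mady'", intent: 'find' },
  { text: "Search foo/bar for files containing 'world'", intent: 'find' },
  { text: "What's the weather like today?", intent: 'none' }
];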

Entity extraction is the process of identifying keywords that carry additional information related to an intent. For example, in the utterance "Find me all JavaScript files under buildpal/ui that contain 'Mady'", the entities could be the folder path "buildpal/ui", the file type "JavaScript" and the search term "Mady". Entities can be extracted using regular expressions or through more sophisticated techniques like part of speech tagging or named entity recognition. Before we begin, we need to classify the intents and train our model. In a follow-up article, I will go through the setup, training, and validation of the model. In the meantime, please refer to this great article.
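To make the regular-expression route concrete, here is a minimal sketch of entity extraction for our example utterance; the extractEntities helper and its patterns are illustrative assumptions, not the exact ones used by the bot:

// Pull out the folder path, the quoted search term, and the file type (if any)
function extractEntities(utterance) {
  const path = utterance.match(/under\s+(\S+)/i);
  const term = utterance.match(/['"]([^'"]+)['"]/);
  const type = utterance.match(/\b(javascript|java|python)\b/i);
  return {
    path: path ? path[1] : '.',
    term: term ? term[1] : null,
    fileType: type ? type[1].toLowerCase() : null
  };
}

// extractEntities("Find me all JavaScript files under buildpal/ui that contain 'Mady'")
// => { path: 'buildpal/ui', term: 'Mady', fileType: 'javascript' }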

GraalVM

Oracle has recently released a polyglot virtual machine, GraalVM, which can execute programs written in languages like Java, C, Scala, JavaScript, and Python. We will see how it can be leveraged to build the finder CLB. The first step is to install GraalVM and set up the relevant environment variables. The installed bin directory contains additional launchers for JavaScript and Node.js. If you already have Node.js installed like I do, you can use aliases to point to Graal's version.

alias gnode='/path/to/graal/bin/node'
alias gnpm='/path/to/graal/bin/npm'

TensorFlow.js

As you may know, the TensorFlow framework comes in many flavors. While we could use the TensorFlow Java API, let's tinker with the JavaScript version to see how its Node packages work with GraalVM.

First, initialize your project with "gnpm init". Next, install the required TensorFlow packages with these commands:

gnpm install @tensorflow/tfjs
npm install @tensorflow/tfjs-node

Note that I wasn't able to install the native TensorFlow package using Graal's npm. There is an issue with "libtensorflow.so". Therefore, as a workaround, I used the regular npm that was available on my system.

Finder Bot

At a higher level, let's see how we can bring the various pieces together. We load the trained model using the TensorFlow object:

const tf = require('@tensorflow/tfjs');
require('@tensorflow/tfjs-node');

tf.loadModel('file://clb/model.json').then(model => {...});

The model will be used to identify the user's intent.
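To sketch how the prediction step might look inside the loadModel callback: encodeUtterance, intents, and handleIntent below are hypothetical helpers (the input encoding depends on how the model was trained, which is covered in the follow-up article):

tf.loadModel('file://clb/model.json').then(model => {
  // Turn the sentence into the numeric features the model expects (hypothetical helper)
  const input = tf.tensor2d([encodeUtterance(utterance)]);
  // The model returns a confidence score per intent
  const scores = model.predict(input).dataSync();
  const best = scores.indexOf(Math.max(...scores));
  if (scores[best] > 0.8) {             // assumed confidence threshold
    handleIntent(intents[best], utterance);
  }
});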

Either Java or Node.js can be used to implement the remaining parts of the functionality — reading the user's sentence from the command line, finding files using NIO, or if grep is available, using the process builder to execute the grep command. In order to mix Java and JavaScript, we have to pass additional flags to Graal:

gnode --polyglot --jvm index.js
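With those flags, Java classes become reachable from JavaScript through GraalVM's Java.type() binding. As a small illustration, here is how the file-finding fallback could walk a folder with Java NIO (the folder path is just the one from our earlier example):

// Works only under gnode with --polyglot --jvm
const Files = Java.type('java.nio.file.Files');
const Paths = Java.type('java.nio.file.Paths');

// Walk the folder and keep only JavaScript files
Files.walk(Paths.get('buildpal/ui'))
     .filter(p => p.toString().endsWith('.js'))
     .forEach(p => console.log(p.toString()));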

When the model identifies the "find" intent with reasonable confidence (we can configure the confidence threshold), we can proceed to extract the entities. Simple pattern matching with regular expressions will work just fine for our use case. Once that's done, we move on to construct and execute the grep command and read its output. If grep is not available, we may choose to implement the find functionality using Java or Node.js.
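Putting it together, a minimal sketch of building and running the grep command from the extracted entities might look like this (the entity names follow the hypothetical extractEntities shown earlier, and grep's exit code 1 simply means no matches):

const { execFile } = require('child_process');

function runGrep(entities, callback) {
  const args = ['-rnwl', entities.path, '-e', entities.term];
  if (entities.fileType === 'javascript') {
    args.unshift('--include=*.js');
  }
  execFile('grep', args, (err, stdout) => {
    if (err && err.code > 1) return callback(err);   // exit code 1 = no matches, not an error
    callback(null, stdout.trim().split('\n').filter(Boolean));
  });
}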

Conclusion

That was our short, fun experiment. We discussed some aspects of building a command line bot with limited NLU capabilities that can convert regular English sentences into executable commands.

I will end with one final note. For NLU and for serving the model, either locally or remotely, there are certainly other, more mature approaches: using your framework of choice and/or a model-serving protocol like GraphPipe.

In part 2 of this article, we'll do a deep dive into the intent classification model.
