
Build a Fully Functioning App Leveraging Machine Learning With TensorFlow.js


In this article, I describe my experience building an app (recently published on the Google Play Store) with this JavaScript ML library.


TensorFlow.js is a JavaScript library for training and deploying Machine Learning (ML) models in the browser (client-side) and on Node.js (server-side). It is the library I used to build the app this article is about.

However, before going straight into how I built this app using TensorFlow.js, I want to first describe the conditions and needs that led to this choice. I also want to touch briefly on some other approaches you could take to build an app leveraging machine learning. Finally, I want to share some lessons I learned along the way, and I would love to hear your thoughts if you have been experimenting with ML in apps.

The Why

When we think of Machine Learning, we often think of very large datasets, complex algorithms, and huge computing capacity. Some real-world cases are indeed huge (health and science use cases, for example), but it is incorrect to say that if it is not large, it is not Machine Learning. Case in point: there are use cases with small datasets and a limited number of features that still qualify as ML use cases.

[A] In this case, the app in question works with a small use case: learning the preferences of an individual based on very few features.

Another common belief about Machine Learning is that it takes hundreds or thousands of iterations to learn reasonably well, which equates to very long hours of training. Such long training is often carried out on large servers in the background, rendering it unusable in an online user experience. Although this is mostly true, some training processes can be short and manageable enough not to impact the user experience (if correctly designed).

[B] In my case, I wanted to prove that with optimal hyperparameters, the model could be trained effectively in fewer iterations and saved locally on the mobile device itself.

Lastly, one concern with ML is data security and privacy. If the data is huge and training a model takes substantial computing resources and time, the most obvious way to handle the situation is to move the data to where the compute is. That calls for all sorts of engineering, cost, and compliance work.

[C] On of the important challenge I wanted to take on was to prevent any kind of data movement outside of the standard App architecture.

So, to summarize, I wanted to build an app that works effectively with small datasets [A], requires minimal training iterations, i.e., shorter training time [B], and does not need data moved outside of my standard app architecture just for training [C].

The App: Headlines

Headlines is a News App that learns from the user its likes and dislikes and eventually starts figuring out what kind of news articles the user may like. Specifically, the App provides news headlines from multiple sources on multiple topics. For example, the App provides news headlines from publishers such as CNN, TechCrunch, Bloomberg, etc. on topics such as Business, Entertainment, Science, Health, Technology, etc.

News headline showing like-ness probability along with Like Dislike buttons

Core Idea

Headlines lets you rate news headlines, represented as two labels in the dataset: "Like" and "Dislike". Based on the ratings gathered over multiple headlines, the core idea is to predict the rating of a new headline and even sort the headlines in order of their probability of being liked.

Such recommendations are generated by learning news preferences for each individual separately. One can argue that better learning could occur on a wider dataset comprising all users' preferences. However, to utilize such a model, it would also need to take into account similarities and dissimilarities among user profiles. This was de-scoped for later; hopefully, it will provide a much richer dataset and better probability results based on broader learning.
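
The sorting step described above can be sketched as follows. This is a hypothetical illustration, not the app's actual code: `predictScore` is assumed to be a wrapper around the prediction logic shown later in this article, resolving to an object like { like, dislike, verdict }.

```javascript
// Rank fetched headlines by the model's predicted like probability.
async function rankHeadlines(headlines, predictScore) {
  // Score every headline (predictions resolve asynchronously)
  const scored = await Promise.all(
    headlines.map(async (h) => ({ ...h, like: (await predictScore(h)).like }))
  );
  // Highest like probability first
  return scored.sort((a, b) => b.like - a.like);
}
```

With this in place, the app only needs to call `rankHeadlines` on the freshly fetched batch before rendering the list.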

The How

Now that we know our objectives (listed as [A], [B], and [C]) and have a sense of what the Headlines app is expected to do (serving the most relevant news headlines to the user), let's dive a little deeper into the mechanics of developing the ML model. Here we will look at the core design aspects of the ML model (not the app itself).

Dataset

The core dataset relevant for training this model is a user-specific record of news articles that the user liked or disliked. Focusing on only a specific user's data yields a smaller, manageable chunk of normalized data that maintains a consistent user context and is therefore a richer dataset to train on.

Features

In terms of useful features in the dataset, I focused on a couple of attributes that provide useful and differentiable context for each news headline. The news publication source (such as CNN, TechCrunch, or Bloomberg) and the news category (such as Business, Science, Health, or Technology) provide that well. In addition to these categorical features, I chose to add a feature with a continuous numeric value: a sentiment analysis score of the news headline between -1.0 and 1.0. Take a look at this page from Google's Machine Learning Crash Course, which discusses the qualities of a good feature. As for the target labels, they are simply one-hot encoded values of 1.0 (representing a Like) and 0.0 (representing a Dislike).
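
A minimal sketch of how a headline could be turned into the three-value feature vector [category, source, sentiment] that such a model expects. The index maps, field names, and encoding choice here are my assumptions for illustration, not the app's exact code.

```javascript
// Assumed lookup tables for the categorical features
const CATEGORIES = ['business', 'entertainment', 'science', 'health', 'technology'];
const SOURCES = ['cnn', 'techcrunch', 'bloomberg'];

function toFeatures(headline) {
  const cat = CATEGORIES.indexOf(headline.category); // categorical -> integer index
  const src = SOURCES.indexOf(headline.source);      // categorical -> integer index
  const senti = headline.sentiment;                  // continuous, already in [-1.0, 1.0]
  return [cat, src, senti];
}

// Labels: 1 = Like, 0 = Dislike (one-hot encoded later, at training time)
const toLabel = (liked) => (liked ? 1 : 0);
```

Encoding categories as plain integer indices keeps the input layer small (three nodes); one-hot encoding the categorical inputs instead would widen the input layer but avoid implying an ordering between sources.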

Neural Network and Training

Based on the features discussed above, we can quickly identify that our deep neural network will have a 3-node input layer and a 2-node output layer. Designing the rest, i.e., the hidden layers and the number of nodes in each, is where science meets art, and where a lot of experimentation occurs. Adjusting the neural network and re-running the training is an iterative process.

Source: Wikimedia Commons

Throw in hyperparameters such as learning rate, batch size, and epochs, plus the various optimizers that evaluate the loss at each iteration, and these are essentially your knobs for controlling the training process.

There are many great articles on each of these topics, so I am a bit shy to cover them in this article. However, here is a glossary of most common terms used in Machine Learning.

As far as the code goes, TensorFlow.js uses a Keras-like construct to build the neural network. As mentioned at the beginning, TensorFlow.js code can be written both in client-side JavaScript and in a server-side Node.js module, and the construct is the same in either case.

Below is an excerpt from the code leveraged in the Headlines App to Build the model...

// -- build model
let model = tf.sequential();
let optimizer = tf.train.rmsprop(0.01); // RMSProp optimizer with a learning rate of 0.01
let epochs = 32;
let batchSize = 96;

// First dense layer: takes the 3 input features into 5 nodes
model.add(tf.layers.dense({ units: 5, inputDim: 3, activation: 'relu' }));
// Hidden layer
model.add(tf.layers.dense({ units: 2, activation: 'relu' }));
// Output layer: softmax over the two classes (Dislike, Like)
model.add(tf.layers.dense({ units: 2, activation: 'softmax' }));

// Compile the model with the RMSProp optimizer defined above and the loss function
model.compile({ optimizer: optimizer, loss: 'categoricalCrossentropy', metrics: ['accuracy'] });

... and to Train the model.

// -- train
model.fit(
  // X tensors
  tf.cast(tf.tensor2d(data), "float32"),
  // Y tensors
  tf.cast(tf.oneHot(labels, 2), "float32"), {
    epochs: epochs,
    batchSize: batchSize,
    callbacks: {
      onEpochEnd: async (n, logs) => {
        logs.acc = parseFloat((logs.acc * 100).toFixed(2));
        logs.loss = parseFloat((logs.loss * 100).toFixed(3));
        console.log({
          epochs: { current: n + 1, total: epochs },
          batchSize: batchSize,
          loss: logs.loss,
          accuracy: logs.acc });
      } // onEpochEnd
    } // callbacks
  }
).then((history) => {
  // -- save the trained model in localStorage
  model.save('localstorage://MLModel')
    .catch((err) => console.log(err));
});

Prediction

As you can see above, the entire training (invoked by the .fit() method) occurs on the device itself. Once training is complete, we save the trained model in localStorage, from where it can be loaded back later when making predictions.

// -- predict
return new Promise((resolve, reject) => {
  // Load the model from localStorage
  // (newer TensorFlow.js versions name this method tf.loadLayersModel)
  tf.loadModel('localstorage://MLModel')
    .then((savedModel) => {
      let score = savedModel.predict(
        tf.cast(tf.tensor2d([[cat, src, senti]]), "float32"))
        .dataSync(); // dataSync extracts values from the tensor
      resolve({
        dislike: score[0],
        like: score[1],
        verdict: score[0] > score[1] ? 'dislike' : 'like' });
    }).catch((err) => reject(err));
}); // Promise

Continuous Learning (Transfer Learning/Fine Tuning?)

Going back to the original objective of keeping the dataset small for each subsequent training, we can leverage transfer learning, where part of an existing model and its pre-trained weights are reused to train on a new dataset. Google has a great example of transfer learning using Convolutional Neural Networks (CNNs) on images. In our case, the previously trained model can serve as the pre-trained model for the new inputs generated since the last training. I am not covering that piece here, only because the application logic to manage the delta between the last training and new data is tangential to the core topic of this article.
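
A minimal sketch of what such a fine-tuning step could look like, assuming the model saved earlier and a `tf` global from TensorFlow.js. The record shape (features, label, ratedAt), the helper names, and the epoch count are illustrative assumptions, not the app's actual logic.

```javascript
// Assumes TensorFlow.js is loaded, e.g. as a browser script exposing `tf`.

// Only the delta since the last training run needs to be learned
function newRecordsSince(records, lastTrainedAt) {
  return records.filter((r) => r.ratedAt > lastTrainedAt);
}

async function fineTune(allRecords, lastTrainedAt) {
  const delta = newRecordsSince(allRecords, lastTrainedAt);
  if (delta.length === 0) return null; // nothing new to learn

  // Load the previously trained model; its weights are the starting point
  const model = await tf.loadModel('localstorage://MLModel');
  model.compile({ optimizer: tf.train.rmsprop(0.01), loss: 'categoricalCrossentropy' });

  await model.fit(
    tf.cast(tf.tensor2d(delta.map((r) => r.features)), 'float32'),
    tf.cast(tf.oneHot(delta.map((r) => r.label), 2), 'float32'),
    { epochs: 8 } // fewer epochs, since the weights are already warm
  );

  await model.save('localstorage://MLModel'); // overwrite with the updated weights
  return model;
}
```

Training on only the delta keeps each session short, at the cost of gradually drifting away from older preferences; periodically retraining on the full user history would counteract that.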

Conclusion

My primary motivation for writing this article came from how I started learning Machine Learning. Most examples are isolated, and the process of developing ML models seems to stand on its own. Machine Learning does not have to be an isolated task; it can be integrated into the mainstream app development process.

Secondly, I think the ability to train on the client side is a very useful concept. I would, however, like to caution that because training ML models is inherently resource-intensive, you need to make sure the use case justifies this approach.

A hybrid option, where training occurs in the cloud and the trained model is downloaded from cloud storage, is a great approach as well.

Yet another approach, at the other extreme, is to train the model in the cloud and serve the prediction capability as an API endpoint. This is a great way to provide Machine Learning as a black box, or MLaaS (Machine Learning as a Service). Google, AWS, and many other cloud providers already offer such services for common use cases such as image detection, text extraction, translation, and sentiment analysis.

Overall, Machine Learning is a very wide and deep area of study. However, with modern tools such as TensorFlow.js, applying Machine Learning in everyday life is becoming a real possibility. Practically, the sky is the limit for what can be conceived with ML.



