Putting Streaming ML Into Production

What options are there for deploying ML and DL models into production?

Tim Spann · Dec. 18, 2018 · Tutorial

Productionize Streaming ML

So you've done all the work in your hosted Jupyter or Zeppelin notebooks and you are ready to deploy your Machine Learning and Deep Learning models for real production use cases. What do you need to do and think about beforehand? There are some common features that every system will need:

  • Classification/Run Your ML
  • REST API
  • Security
  • Automation
  • Data Lineage
  • Data Provenance
  • Scripting
  • Integration with Kafka

Running Dockerized workloads is optional, but it is quickly becoming a requirement.
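
To make the classification and Kafka-integration requirements from the list above concrete, here is a minimal sketch (not part of the NiFi flows described below) that consumes records from Kafka, scores them with a placeholder model call, and writes results back out. It assumes the kafka-python client; the broker address, topic names, and the classify() stub are illustrative assumptions.

```python
# Minimal sketch: consume from Kafka, classify, publish results.
# Broker address, topic names, and classify() are illustrative assumptions.
import json

from kafka import KafkaConsumer, KafkaProducer


def classify(record: dict) -> dict:
    # Placeholder for your ML/DL model call (TensorFlow, MXNet, PMML, ...).
    record["label"] = "unknown"
    record["confidence"] = 0.0
    return record


consumer = KafkaConsumer(
    "raw-events",                          # hypothetical input topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    scored = classify(message.value)
    producer.send("scored-events", scored)  # hypothetical output topic
```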

So what are my options if I don't want a proprietary, vendor-locked-in solution? Let's get beyond the black box and build a multi-cloud/hybrid-cloud solution in pure open source.

Here are some options that I have used.

Native Apache NiFi Processors to Process Machine Learning Workloads In Stream

Note: No company supports these processors yet; they are community releases by me.

TensorFlow

You can use my TensorFlow processor to easily classify images as they pass through a NiFi dataflow.
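
For context, here is a minimal sketch of the kind of image classification such a step performs, using a stock Keras MobileNetV2 model rather than the NiFi processor itself; the image filename is an illustrative assumption.

```python
# Minimal sketch of classifying one image, the way a TensorFlow step in a
# NiFi flow would. Uses a stock pretrained Keras model; "incoming.jpg" is
# an illustrative assumption.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, decode_predictions, preprocess_input)

model = MobileNetV2(weights="imagenet")

# Load and preprocess a single image from the flow.
img = tf.keras.preprocessing.image.load_img("incoming.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(tf.keras.preprocessing.image.img_to_array(img), axis=0))

# The top prediction would typically become a flowfile attribute downstream.
label = decode_predictions(model.predict(x), top=1)[0][0]
print(label[1], float(label[2]))  # e.g. "tabby" 0.82
```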

I am working on processors for Deeplearning4j and H2O.ai. They are straightforward to use if you wish to try them.

My solution meets all the basic requirements, and NiFi can run Dockerized.

Using a Library/Framework-Specific Tool

The MXNet Model Server works with Apache MXNet and ONNX models. ONNX is supported by a number of other frameworks, and there are converters available. This is an easy-to-set-up REST service that can be hosted on Kubernetes or YARN 3.1. Check out my article for an example.
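
As a rough illustration, here is a hedged sketch of calling a model server REST endpoint from Python with the requests library; the host, port, model name, and exact URL path depend on your MMS version and configuration, so treat them as assumptions.

```python
# Hedged sketch of posting an image to an MXNet Model Server endpoint.
# The URL path and model name ("squeezenet") are assumptions; check your
# MMS version's documentation for the exact inference API.
import requests

with open("incoming.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:8080/predictions/squeezenet",  # assumed endpoint
        data=f.read(),
        headers={"Content-Type": "application/octet-stream"},
    )

resp.raise_for_status()
print(resp.json())  # class/probability pairs returned by the model
```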

TensorFlow Serving

TensorFlow Serving can host classification in HDP 3.1.

The data provenance, data lineage and other features can be added through Apache Atlas or Apache NiFi.
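
For reference, here is a minimal sketch of calling TensorFlow Serving's REST API (8501 is its default REST port); the model name and the payload shape are illustrative assumptions and must match whatever SavedModel you actually serve.

```python
# Minimal sketch of a TensorFlow Serving REST prediction request.
# Model name "my_classifier" and the instance shape are assumptions.
import requests

payload = {"instances": [[0.1, 0.2, 0.3, 0.4]]}  # must match the served model's input

resp = requests.post(
    "http://localhost:8501/v1/models/my_classifier:predict",
    json=payload,
)
resp.raise_for_status()
print(resp.json()["predictions"])
```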

Run Your Classification on YARN 3.1

TensorFlow on YARN 3.1 Example

Apache NiFi — Apache Livy — Apache Spark — TensorFlow

It is very easy to use a processor in Apache NiFi to execute Spark workloads, via Apache Livy, that can run TensorFlow.
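
Under the hood, that path amounts to submitting code to Livy's REST API. Here is a hedged sketch of doing the same thing directly from Python; the Livy host, session kind, and the submitted code string are illustrative assumptions.

```python
# Hedged sketch: open a PySpark session on Apache Livy and submit a statement.
# Host/port, session kind, and the code string are assumptions; in practice
# the statement would load and apply a TensorFlow model on the executors.
import json
import time

import requests

livy = "http://localhost:8998"
headers = {"Content-Type": "application/json"}

# 1. Open a PySpark session.
session = requests.post(f"{livy}/sessions", headers=headers,
                        data=json.dumps({"kind": "pyspark"})).json()
session_url = f"{livy}/sessions/{session['id']}"

# 2. Wait for the session to become idle (simplified polling).
while requests.get(session_url, headers=headers).json()["state"] != "idle":
    time.sleep(2)

# 3. Submit a statement to the running Spark session.
code = "print(spark.range(10).count())"
stmt = requests.post(f"{session_url}/statements", headers=headers,
                     data=json.dumps({"code": code})).json()
print(stmt["id"], stmt["state"])
```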

MLeap or OpenScoring (PMML) With Apache Atlas

This looks like the best option, with governance, security, and a toolkit. Greg Keys' amazing work orchestrating this process gives it the brightest future for success.
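
As an illustration of the PMML side, here is a hedged sketch of deploying and scoring a model against an Openscoring REST endpoint; the base URL, model id, feature names, and response layout vary by Openscoring version, so treat all of them as assumptions.

```python
# Hedged sketch of deploying a PMML file to Openscoring and scoring one record.
# Base URL, model id ("iris"), PMML filename, and field names are assumptions.
import requests

base = "http://localhost:8080/openscoring/model/iris"  # assumed deployment URL

# Deploy a PMML file (assumed to exist locally) ...
with open("iris.pmml", "rb") as f:
    requests.put(base, data=f.read(),
                 headers={"Content-Type": "application/xml"}).raise_for_status()

# ... then score a single record.
resp = requests.post(base, json={"arguments": {
    "sepal_length": 5.1, "sepal_width": 3.5,
    "petal_length": 1.4, "petal_width": 0.2,
}})
resp.raise_for_status()
print(resp.json())  # response layout depends on the Openscoring version
```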

Deploy Your Model to the Edge

NVIDIA, Hortonworks, Cloudera, and other companies are making tools to smooth out this process.

If your model doesn't change often, pushing a binary to a box (or boxes) isn't rocket science. There are also specialized IoT cameras and devices that you can use. I will have an article on three affordable AI camera options: Jevois, PixyCam 2, and OpenMV H7 cam.

Deploy Your Model as PMML to Hortonworks Streaming Analytics Manager

SAM provides an easy way to include PMML execution in your complex event processing and streaming. This works very well with fast Kafka workloads.
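
Before any PMML executor can score a model, you need a PMML file. Here is a hedged sketch of exporting a scikit-learn model with the sklearn2pmml package (one option among several, not one named above); it assumes a Java runtime is available on the machine doing the export.

```python
# Hedged sketch: train a small scikit-learn model and export it to PMML.
# sklearn2pmml is an assumption (one of several PMML exporters) and needs Java.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

X, y = load_iris(return_X_y=True)

pipeline = PMMLPipeline([("classifier", DecisionTreeClassifier(max_depth=3))])
pipeline.fit(X, y)

# Writes a PMML document that a streaming PMML executor can score.
sklearn2pmml(pipeline, "iris_tree.pmml")
```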

None of these are wrong options; it depends on your environment and your needs.

Please comment with suggestions and questions.
