Building Models With AutoML in IBM Watson Studio

This article describes a technique called AutoML, which can be used by developers to build models without having to be data scientists.

by Niklas Heidloff · Sep. 13, 18 · Tutorial

Many developers, including myself, want to use AI in their applications. Building machine learning models, however, often requires a lot of expertise and time. This article describes a technique called AutoML, which developers can use to build models without having to be data scientists. Developers only have to provide the data and define the goals; AutoML figures out the best model automatically.

There are several ways for developers to use AI without having to be data scientists:

Cognitive Services

Cognitive services are provided by most cloud providers these days. For example, IBM offers services for speech recognition, natural language understanding, visual recognition, and assistants as part of the Watson Developer Cloud. Developers can use these services out of the box or customize them declaratively. The services can be accessed via REST APIs or language libraries.
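
To make the REST access pattern concrete, here is a minimal sketch of how a request to such a service might be assembled with Python's standard library. The endpoint URL, the JSON payload shape, and the bearer-token header are placeholders I made up for illustration; they are not the actual Watson API, so check the documentation of the service you use for the real URL, authentication scheme, and request format.

```python
import json
import urllib.request


def build_analyze_request(text, api_key, endpoint="https://example.com/v1/analyze"):
    """Build (but do not send) a JSON POST request to a hypothetical text-analysis service."""
    # Hypothetical payload: real services define their own request schema.
    payload = json.dumps({"text": text, "features": ["sentiment"]}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Hypothetical auth scheme: real services may use a different one.
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_analyze_request("I love this product", api_key="YOUR_API_KEY")
print(req.get_method(), req.full_url)
# Sending would be urllib.request.urlopen(req); it is left out here
# because the endpoint above is only a placeholder.
```

Language libraries typically wrap exactly this kind of call, so that you pass in the text and get back a parsed result instead of handling HTTP yourself.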

Reusable Models

Cognitive services like the Watson services cover common AI scenarios. For more specific scenarios, developers can sometimes use existing models that have been open sourced. The visual recognition models for mobile devices from Google are a good example. They can be customized via transfer learning without having to write code.

Another example is the IBM Model Asset Exchange, which comes with two types of models: models that can be reused directly and models with instructions on how to train and customize them. The models are packaged in Docker containers and can be invoked via REST APIs.

AutoML

While cognitive services and reusable models cover many scenarios, sometimes you need to build your own models for your individual requirements, and that is often not a trivial task. Personally, I took some ML/DL classes, understand the basics, and can run the tutorials, but I have a hard time creating my own models for my own specific requirements.

This is where AutoML comes in. Basically, AutoML is a set of capabilities that allows developers and data scientists to provide data, define potential features (input), and define the labels (output). AutoML takes care of the heavy lifting and figures out the best features, the best algorithms, and the best hyperparameters.
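
The core idea can be sketched in a few lines of plain Python: try several candidate algorithms and hyperparameter values, score each on held-out data, and keep the winner. Everything below is a toy of my own making (the `toy_automl` function, the two hand-written classifiers, and the synthetic data are assumptions for illustration). Real AutoML tools use far more sophisticated search strategies, so read this as an illustration of the principle rather than how any particular library works.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, point, k):
    """Classify `point` by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda sample: sq_dist(sample[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, point):
    """Classify `point` by the closest per-class mean (nearest centroid)."""
    sums, counts = {}, Counter()
    for features, label in train:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    centroids = {lbl: [v / counts[lbl] for v in acc] for lbl, acc in sums.items()}
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], point))


def accuracy(predict, train, valid):
    """Fraction of validation samples that `predict` labels correctly."""
    return sum(predict(train, x) == y for x, y in valid) / len(valid)


def toy_automl(train, valid):
    """Score every candidate (algorithm + hyperparameter) and return the best one."""
    candidates = {"nearest_centroid": centroid_predict}
    for k in (1, 3, 5):  # the hyperparameter values to try for k-NN
        candidates[f"knn(k={k})"] = lambda tr, p, k=k: knn_predict(tr, p, k)
    scores = {name: accuracy(fn, train, valid) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic data for illustration only: two Gaussian blobs labeled "a" and "b".
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(60)]
data += [((rng.gauss(4, 1), rng.gauss(4, 1)), "b") for _ in range(60)]
rng.shuffle(data)
train, valid = data[:90], data[90:]

best_name, all_scores = toy_automl(train, valid)
print("best candidate:", best_name, "validation accuracy:", all_scores[best_name])
```

The developer's part is limited to supplying the data and the goal (here, validation accuracy); the search loop decides which algorithm and which hyperparameter value to use.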

To learn more about AutoML, I encourage you to watch the TensorFlow Dev Summit 2018 keynote and the talk from Andreas Mueller. I also like the recent series of blog entries on fast.ai.

There are several different AutoML open-source libraries and commercial offerings available, which use different approaches to find the best model. For example, IBM provides Watson Machine Learning to identify the best algorithm. Additionally, with Watson Deep Learning, hyperparameters can be identified.

auto-sklearn

There seem to be several promising open-source libraries. Unfortunately, I couldn't use a lot of them for license reasons. One AutoML library that looks interesting is auto-sklearn, which won the 2016 KDnuggets competition. There seems to be an improved successor of this library, which won the 2018 competition, but I couldn't find that code, which is why my sample below uses the publicly available version.

Running auto-sklearn in IBM Watson Studio

The auto-sklearn library comes with a hello world sample. You can use a slightly different version of this sample in a notebook in Watson Studio.

First, you need to define a custom Anaconda-based environment that includes auto-sklearn (see screenshot).

When I tried to run the unmodified sample, I ran into permission issues when accessing the file system. It turned out that the library uses absolute paths to which the notebooks don't have access. Fortunately, auto-sklearn let me change these directories so that I could use relative directories to which notebooks have full access. Here is the modified code:

import os
import autosklearn.classification
import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection

# Use relative directories that the notebook is allowed to write to
os.makedirs('tmp', exist_ok=True)
os.makedirs('output', exist_ok=True)

# Load the digits dataset and hold out a test set
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, random_state=1)

# Point auto-sklearn at the relative folders and keep them after the run
automl = autosklearn.classification.AutoSklearnClassifier(
    shared_mode=True,
    tmp_folder='tmp',
    output_folder='output',
    delete_tmp_folder_after_terminate=False,
    delete_output_folder_after_terminate=False,
)
automl.fit(X_train, y_train)

# Evaluate on the held-out test set and show what the search tried
y_hat = automl.predict(X_test)
print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_hat))
print(automl.sprint_statistics())
automl.cv_results_
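
The directory handling in the sample could also be wrapped in a small helper that creates the working folders relative to a base directory and checks up front that they are writable, which would have surfaced the permission problem before the search started. The helper name `prepare_work_dirs` and its parameters are my own invention for this sketch and not part of auto-sklearn.

```python
import os
import tempfile
from pathlib import Path


def prepare_work_dirs(base=".", names=("tmp", "output")):
    """Create working directories under `base` and verify that they are writable."""
    paths = {}
    for name in names:
        path = Path(base) / name
        # exist_ok=True means re-running a notebook cell does not raise an error
        path.mkdir(parents=True, exist_ok=True)
        if not os.access(path, os.W_OK):
            raise PermissionError(f"cannot write to {path}")
        paths[name] = str(path)
    return paths


# Demo in a throwaway directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as scratch:
    dirs = prepare_work_dirs(scratch)
    print(dirs)
```

In a notebook you would call `prepare_work_dirs()` with the default base directory and pass `dirs["tmp"]` and `dirs["output"]` as the `tmp_folder` and `output_folder` arguments shown in the sample.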

[Screenshot of the notebook]

Want to run this sample yourself? All you need to do is get a free IBM Cloud account and create a notebook in Watson Studio.

Data science Machine learning Open source dev

Published at DZone with permission of Niklas Heidloff, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
