Automating Operating Thresholds With WhizzML and the BigML Python Bindings

Learn about two ways to automate the use of operating thresholds in your predictions (either single or batch) and evaluations.

This blog post, the fifth of our series of posts about operating thresholds, focuses on two ways to automate the use of operating thresholds in our predictions (single or batch) and evaluations. The first way involves WhizzML, BigML's domain-specific language for machine learning workflow automation. It allows you to execute complex tasks that are computed completely on the server side (with built-in parallelization), where all resources involved are treated as first-class citizens by BigML.

The second way to automate operating thresholds involves the set of bindings that BigML maintains, which let developers interact with the BigML platform from their favorite programming language. The Python bindings already support the platform capabilities covered here, so we'll use them as the example in this post. Bindings for other languages, such as Java, C#, or Node.js, are also available.

Operating thresholds can be used with any of your classification models: decision tree models, ensembles, logistic regressions, and Deepnets (neural networks). They can improve the quality of your single and batch predictions and your evaluations.

Let's see how to automate operating thresholds for a single model using WhizzML. In our example, we'll deal with a classifier that has two possible outputs: good or bad. We're interested in improving the classification of the instances that fall into the good category, so we'll set it as the positive class. To reduce misclassifications, we'll have the model predict good only when the probability of that class exceeds a 60% threshold. To see whether this improves the model's performance, we can create an evaluation via WhizzML.

;; creates an evaluation setting an operating point 
;; for a single model
(define my-evaluation 
  (create-evaluation {
    "model" my-model
    "dataset" test-dataset
    "operating_point" {
        "kind" "probability"
        "positive_class" "good"
        "threshold" 0.6
    }}))
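To make the effect of the operating point concrete, here is a minimal Python sketch of the decision rule described above. It assumes the positive class is predicted only when its probability is strictly greater than the threshold, and that otherwise the most probable of the remaining classes wins. The function name predict_with_threshold and the sample probabilities are hypothetical; the platform applies its own logic on the server side.

```python
def predict_with_threshold(probabilities, positive_class, threshold):
    """Return the predicted class under a probability operating point.

    probabilities maps each class name to its predicted probability.
    Assumption: the positive class must strictly exceed the threshold.
    """
    if probabilities[positive_class] > threshold:
        return positive_class
    # Otherwise fall back to the most probable of the remaining classes.
    others = {c: p for c, p in probabilities.items() if c != positive_class}
    return max(others, key=others.get)


# "good" is the most probable class here, but it does not clear 0.6.
print(predict_with_threshold({"good": 0.55, "bad": 0.45}, "good", 0.6))  # bad
print(predict_with_threshold({"good": 0.72, "bad": 0.28}, "good", 0.6))  # good
```

Note that with a threshold above 0.5, an instance can be labeled bad even though good is its most probable class, which is exactly the trade-off the evaluation measures.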

Remember that almost all BigML resources (except predictions and projects) are created asynchronously, so if you want to use the evaluation my-evaluation or its components, such as the accuracy or the precision, you need to make sure that the creation process has actually finished. That's what the next snippet of code does, this time for a deepnet.

;; creates an evaluation with an operating point for a deepnet,
;; waits for it to finish, and retrieves its precision
(define my-eval-precision 
  (get-in
    (create-and-wait-evaluation {
      "deepnet" my-deepnet
      "dataset" test-dataset
      "operating_point" {
        "kind" "probability"
        "positive_class" "good"
        "threshold" 0.6
      }}) ["result" "model" "precision"]))

If you prefer the BigML Python bindings, the equivalent code is:

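from bigml.api import BigML

# Credentials are read from the environment by default
api = BigML()

# Operating point: predict "good" only above a 0.6 probability
args = {
    "operating_point": {
        "kind": "probability",
        "positive_class": "good",
        "threshold": 0.6
    }
}

# Evaluate the deepnet against the test dataset
my_evaluation = api.create_evaluation(
    "deepnet/59b0f8c7b95b392f12000003",
    "dataset/59b0f8c7b95b392f12000001",
    args)

# Wait until the evaluation has finished before reading its result
api.ok(my_evaluation)
precision = my_evaluation["object"]["result"]["model"]["precision"]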

For more details about these and other evaluation properties, please check the dedicated API documentation.
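As a rough illustration of why the threshold changes evaluation measures such as precision, the following Python sketch recomputes precision and recall locally over a handful of made-up scored instances. The function name precision_recall_at, the sample data, and the strict greater-than comparison are all assumptions made for this example; this is not how the platform computes its evaluations internally.

```python
def precision_recall_at(scored, positive_class, threshold):
    """Compute (precision, recall) for the positive class at a threshold.

    scored is a list of (actual_class, positive_class_probability) pairs.
    """
    true_pos = false_pos = false_neg = 0
    for actual, prob in scored:
        predicted_positive = prob > threshold
        if predicted_positive and actual == positive_class:
            true_pos += 1
        elif predicted_positive:
            false_pos += 1
        elif actual == positive_class:
            false_neg += 1
    predicted = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


# Hypothetical test instances: (actual class, probability of "good")
sample = [("good", 0.9), ("good", 0.7), ("bad", 0.65),
          ("bad", 0.55), ("good", 0.52), ("bad", 0.2)]

for threshold in (0.5, 0.6):
    print(threshold, precision_recall_at(sample, "good", threshold))
```

On this toy data, raising the threshold from 0.5 to 0.6 lifts precision from 0.6 to about 0.67, while recall drops from 1.0 to about 0.67.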

Single predictions can also be computed using an operating threshold. In fact, evaluations are computed by averaging the matches of single predictions whose outputs are already known. Let's see how to set a confidence threshold of 0.66 for a prediction made with a logistic regression by using WhizzML.

;; creates a single prediction setting a confidence threshold
(define my-prediction 
  (create-prediction {
    "logisticregression" my-logistic
    "input_data" {"000001" 0.35,"000002" 1.2}
    "operating_point" {
        "kind" "confidence"
        "positive_class" "good"
        "threshold" 0.66
    }}))

These, plus other prediction properties, are also explained in the API documentation. Using the BigML Python bindings, the equivalent code would be as follows:

from bigml.api import BigML
api = BigML()
args = {
    "operating_point": {
        "kind": "confidence", 
        "positive_class": "good",
        "threshold": 0.66 
    }
}
input_data = {"000001": 0.35,"000002": 1.2}
logistic_id = "logisticregression/59b0f8c7b95b392f12000000"
my_prediction = api.create_prediction(logistic_id, input_data, args)
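Since the same args dictionary shape is reused for evaluations, predictions, and batch predictions, it can be convenient to build it in one place. The helper below is a hypothetical convenience, not part of the BigML bindings. The validation rules are assumptions drawn from the examples in this post: the three kinds shown here, a threshold between 0 and 1 for probability and confidence, and a whole number of models for votes.

```python
def build_operating_point(kind, positive_class, threshold):
    """Build the args dict that carries an operating point.

    Hypothetical helper; the checks below are assumptions based on the
    examples in this post, not rules taken from the API documentation.
    """
    allowed_kinds = ("probability", "confidence", "votes")
    if kind not in allowed_kinds:
        raise ValueError("kind must be one of %s" % (allowed_kinds,))
    if kind == "votes":
        # Votes are counted in whole models (bool is excluded on purpose).
        if isinstance(threshold, bool) or not isinstance(threshold, int):
            raise ValueError("votes threshold must be an integer")
        if threshold < 0:
            raise ValueError("votes threshold must not be negative")
    elif not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "operating_point": {
            "kind": kind,
            "positive_class": positive_class,
            "threshold": threshold,
        }
    }


args = build_operating_point("confidence", "good", 0.66)
print(args)
```

The returned dictionary has the same shape as the args passed to api.create_prediction above, so it could be handed to the bindings unchanged.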

As we mentioned above, you can also use operating thresholds for your batch predictions. Below, you can see how to create a batch prediction in WhizzML by using an ensemble of ten models, where we specify that at least six models should return the positive class in their predictions.

;; creates a batch prediction setting
;; a threshold based on the number of votes
(define my-batchprediction
  (create-batchprediction {
    "ensemble" my-ensemble
    "dataset" my-dataset
    "operating_point" {
        "kind" "votes"
        "positive_class" "good"
        "threshold" 6
    }}))

With the BigML Python bindings, the equivalent code is:

from bigml.api import BigML
api = BigML()
args = {
    "operating_point": {
        "kind": "votes", 
        "positive_class": "good",
        "threshold": 6
    }
}
ensemble = "ensemble/59b0f8c7b95b392f12000001"
dataset = "dataset/59b0f8c7b95b392f12000003"
my_batch_prediction = api.create_batch_prediction(ensemble, dataset, args)
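To see what a votes threshold means for a single row, here is a small Python sketch of the rule described above: with an ensemble of ten models, the positive class is predicted only when at least six of them vote for it. The function name predict_by_votes and the vote lists are hypothetical, and falling back to the most voted remaining class is an assumption about how the other cases are resolved.

```python
from collections import Counter


def predict_by_votes(votes, positive_class, threshold):
    """Return the predicted class under a votes operating point.

    votes is a list with one predicted class per model in the ensemble.
    Assumes at least one vote goes to another class whenever the
    positive class falls short of the threshold.
    """
    counts = Counter(votes)
    if counts[positive_class] >= threshold:
        return positive_class
    # Otherwise pick the most voted class among the remaining ones.
    others = {c: n for c, n in counts.items() if c != positive_class}
    return max(others, key=others.get)


# Five of ten models vote "good": not enough for a threshold of six.
print(predict_by_votes(["good"] * 5 + ["bad"] * 5, "good", 6))  # bad
# Six of ten models vote "good": the threshold is met.
print(predict_by_votes(["good"] * 6 + ["bad"] * 4, "good", 6))  # good
```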

Easy, right? If this is your first time using our domain-specific language for machine learning, we recommend that you check this series of blog posts to get familiar with WhizzML.

Topics:
ai, classification, automation, operating thresholds, python, bindings, machine learning, tutorial, predictive analytics

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
