AdaBoost Algorithm For Machine Learning

What is AdaBoost? Let's check out what it is, what it does, and some examples of how to train a model.

By Rinu Gour · Oct. 26, 18 · Opinion


What Is AdaBoost?

First of all, AdaBoost is short for Adaptive Boosting. It was the first really successful boosting algorithm developed for binary classification, and it is the best starting point for understanding boosting. Moreover, modern boosting methods build on AdaBoost, most notably stochastic gradient boosting machines.

Generally, AdaBoost is used with short decision trees. After the first tree is created, the tree's performance on each training instance is used to weight how much attention the next tree to be created should pay to each training instance. Training data that is hard to predict is given more weight, whereas instances that are easy to predict are given less weight.
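As a concrete starting point, here is a minimal sketch of training an AdaBoost model on decision stumps with scikit-learn. The dataset is synthetic, and the parameter name assumes a recent scikit-learn release (older releases call the estimator argument base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Decision stumps (trees of depth 1) are the classic weak learner for AdaBoost.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
model = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # "base_estimator" in older versions
    n_estimators=50,
)
model.fit(X, y)
print(model.score(X, y))  # training accuracy of the boosted ensemble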

Learning: AdaBoost Model

Learn the AdaBoost Model from Data

  • AdaBoost is best used to boost the performance of decision trees on binary classification problems.
  • AdaBoost was originally called AdaBoost.M1 by its authors. More recently, it may be referred to as discrete AdaBoost because it is used for classification rather than regression.
  • AdaBoost can be used to boost the performance of any machine learning algorithm, but it is best used with weak learners.

Each instance in the training dataset is weighted. The initial weight is set to weight(xi) = 1/n, where xi is the i'th training instance and n is the number of training instances.
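In code, this initialization is a single line; a minimal sketch with NumPy (the instance count is illustrative):

import numpy as np

n = 5                          # number of training instances (illustrative)
weights = np.full(n, 1.0 / n)  # weight(xi) = 1/n for every instance
print(weights)                 # [0.2 0.2 0.2 0.2 0.2]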

How to Train One Model

A weak classifier (decision stump) is prepared on the training data using the weighted samples. Only binary classification problems are supported, so each decision stump makes one decision on one input variable and outputs a +1.0 or -1.0 value for the first or second class value. The misclassification rate is calculated for the trained model. Traditionally, this is calculated as error = (N - correct) / N, where error is the misclassification rate, correct is the number of training instances predicted correctly by the model, and N is the total number of training instances.
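To make the decision stump concrete, here is a hypothetical sketch; the function name, feature index, and threshold are all illustrative. The stump simply thresholds one input variable and outputs +1.0 or -1.0:

import numpy as np

def stump_predict(X, feature, threshold):
    # One decision on one input variable: +1.0 for the first class,
    # -1.0 for the second.
    return np.where(X[:, feature] < threshold, 1.0, -1.0)

X = np.array([[0.2], [0.7], [0.9]])
print(stump_predict(X, feature=0, threshold=0.5))  # [ 1. -1. -1.]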

Example 1

If the model predicted 78 of 100 training instances correctly, the error (misclassification rate) would be (100 - 78) / 100, or 0.22. This is modified to use the weighting of the training instances: error = sum(w(i) * terror(i)) / sum(w), which is the weighted sum of the misclassification rates, where w(i) is the weight for training instance i and terror(i) is the prediction error for training instance i: 1 if misclassified and 0 if correctly classified.
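Both versions of the error take only a few lines of NumPy; a minimal sketch using the 78-of-100 figure above, with illustrative weights for the weighted case:

import numpy as np

correct, N = 78, 100
plain_error = (N - correct) / N      # 0.22, the unweighted misclassification rate

w = np.array([0.1, 0.3, 0.6])        # illustrative instance weights
terror = np.array([0.0, 1.0, 0.0])   # 1 if misclassified, 0 otherwise
weighted_error = np.sum(w * terror) / np.sum(w)   # 0.3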

Example 2

If we had 3 training instances with the weights 0.01, 0.5, and 0.2, the predicted values were -1, -1, and -1, and the actual output variables in the instances were -1, 1, and -1, then the terrors would be 0, 1, and 0. The misclassification rate would be calculated as: error = (0.01*0 + 0.5*1 + 0.2*0) / (0.01 + 0.5 + 0.2), or error = 0.704. A stage value is calculated for the trained model, which provides a weighting for any predictions that the model makes. The stage value for a trained model is calculated as follows: stage = ln((1 - error) / error), where stage is the stage value used to weight predictions from the model, ln() is the natural logarithm, and error is the misclassification rate for the model. The effect of the stage weight is that more accurate models have more weight. The training weights are then updated, giving more weight to incorrectly predicted instances and less weight to correctly predicted instances.
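The arithmetic in Example 2 can be checked directly. Note that because this model's weighted error is above 0.5 (worse than chance), its stage value comes out negative, so its votes would count against it:

import numpy as np

w      = np.array([0.01, 0.5, 0.2])
pred   = np.array([-1, -1, -1])
actual = np.array([-1,  1, -1])
terror = (pred != actual).astype(float)   # [0. 1. 0.]
error  = np.sum(w * terror) / np.sum(w)   # 0.5 / 0.71 ≈ 0.704
stage  = np.log((1 - error) / error)      # ≈ -0.87 (negative: worse than chance)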

Example 3

The weight of one training instance (w) is updated using: w = w * exp(stage * terror), where w is the weight for a specific training instance, exp() is Euler's number e raised to a power, stage is the stage value for the weak classifier, and terror is the error the weak classifier made predicting the output, evaluated as: terror = 0 if (y == p), otherwise 1, where y is the output variable for the training instance and p is the prediction from the weak learner. This has the effect of not changing the weight if the training instance was classified correctly, and making the weight larger if the weak learner misclassified the instance.
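A quick sketch of the update rule, using an illustrative 30% error rate so the stage value is positive: correctly classified instances keep their weight (exp(0) = 1), while the misclassified instance's weight grows:

import numpy as np

error  = 0.3                            # illustrative misclassification rate
stage  = np.log((1 - error) / error)    # ≈ 0.847
w      = np.array([0.2, 0.2, 0.2])
terror = np.array([0.0, 1.0, 0.0])      # only the second instance was wrong
w      = w * np.exp(stage * terror)     # [0.2, 0.467, 0.2]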

AdaBoost Ensemble

  • Weak models are added sequentially, trained using the weighted training data.
  • The process continues until a pre-set number of weak learners has been created.
  • Once completed, you are left with a pool of weak learners, each with a stage value (a from-scratch sketch of this loop follows below).
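Putting the pieces together, here is a minimal from-scratch sketch of the training loop. The helper build_weak_learner is hypothetical: it stands in for any routine that fits a weak classifier to the weighted data and returns a function mapping inputs to +1/-1 predictions:

import numpy as np

def adaboost_train(X, y, build_weak_learner, n_rounds=10):
    # y is expected to hold +1/-1 labels; build_weak_learner(X, y, w) is a
    # hypothetical helper returning a predict(X) -> {+1, -1} function.
    n = len(y)
    w = np.full(n, 1.0 / n)                    # weight(xi) = 1/n
    ensemble = []                              # list of (stage, predict) pairs
    for _ in range(n_rounds):
        predict = build_weak_learner(X, y, w)
        terror = (predict(X) != y).astype(float)
        error = np.sum(w * terror) / np.sum(w)
        error = np.clip(error, 1e-10, 1 - 1e-10)   # keep the log well-defined
        stage = np.log((1 - error) / error)
        w = w * np.exp(stage * terror)         # re-weight the training data
        ensemble.append((stage, predict))
    return ensemble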

Making Predictions with AdaBoost

Predictions are made by calculating the weighted average of the weak classifiers. For a new input instance, each weak learner calculates a predicted value as either +1.0 or -1.0. The predicted values are weighted by each weak learner's stage value. The prediction for the ensemble model is taken as the sum of the weighted predictions: if the sum is positive, the first class is predicted; if negative, the second class is predicted.

For example, 5 weak classifiers may predict the values 1.0, 1.0, -1.0, 1.0, -1.0. From a majority vote, it looks like the model will predict a value of 1.0, or the first class. But these same 5 weak classifiers may have the stage values 0.2, 0.5, 0.8, 0.2, and 0.9, respectively. Calculating the weighted sum of these predictions results in an output of -0.8, which would be an ensemble prediction of -1.0, or the second class.
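The example's arithmetic, verified in a few lines:

import numpy as np

preds  = np.array([1.0, 1.0, -1.0, 1.0, -1.0])   # the 5 weak learners' votes
stages = np.array([0.2, 0.5,  0.8, 0.2,  0.9])   # their stage values
weighted_sum = np.sum(stages * preds)   # 0.2 + 0.5 - 0.8 + 0.2 - 0.9 = -0.8
prediction = 1.0 if weighted_sum > 0 else -1.0   # -1.0, the second class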

Data Preparation for AdaBoost

This section lists some heuristics for best preparing your data for AdaBoost.

  • Quality data: Because the ensemble method attempts to correct misclassifications in the training data, you need to be careful that the training data is of high quality.
  • Outliers: Outliers will force the ensemble down the rabbit hole of working hard to correct cases that are unrealistic. These could be removed from the training dataset.
  • Noisy data: Noisy data, specifically noise in the output variable, can be problematic. If possible, attempt to isolate and clean it from your training dataset.

Conclusion

We have studied the boosting algorithm and worked through AdaBoost examples, from training weak learners and weighting the data to making ensemble predictions. I hope this article helps you understand the concept of boosting and AdaBoost. If you have any questions, feel free to ask in the comments section.


Published at DZone with permission of Rinu Gour. See the original article here.

Opinions expressed by DZone contributors are their own.
