SVM RBF Kernel Parameters With Code Examples

In this post, you will learn about SVM RBF (Radial Basis Function) kernel hyperparameters, with Python code examples.

By Ajitesh Kumar · Jul. 28, 20 · Tutorial

In this post, you will learn about SVM RBF (Radial Basis Function) kernel hyperparameters, with Python code examples. The following are the two hyperparameters you need to know when training a machine learning model with SVM and the RBF kernel:

  • Gamma
  • C (also called regularization parameter)

Understanding SVM parameters such as gamma and C used with the RBF kernel will enable you to select appropriate values for them and train an optimal model using the SVM algorithm. Let's first understand why we should use kernel functions such as RBF.

Why Use RBF Kernel?

When the dataset is linearly inseparable, or in other words non-linear, it is recommended to use a kernel function such as RBF. For a linearly separable dataset, one could use a linear kernel function (kernel="linear"). A good understanding of when to use kernel functions will help you train an optimal model using the SVM algorithm. In this post, we will use the Sklearn Breast Cancer dataset to understand SVM RBF kernel concepts. The scatter plot below shows that the dataset is linearly inseparable, so it may be a good idea to apply the kernel method when training the model.


Fig 1. Linearly inseparable data set


The above plot is created using the first two attributes of the sklearn breast cancer dataset, as shown in the code sample below:

Python

import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets

# Load the breast cancer dataset
bc = datasets.load_breast_cancer()
df = pd.DataFrame(data=bc.data)
df["label"] = bc.target

# Scatter plot shown in fig 1; the first two attributes
# are mean radius and mean texture
plt.scatter(df[0][df["label"] == 0], df[1][df["label"] == 0],
            color='red', marker='o', label='malignant')
plt.scatter(df[0][df["label"] == 1], df[1][df["label"] == 1],
            color='green', marker='*', label='benign')
plt.xlabel('Mean radius')
plt.ylabel('Mean texture')
plt.legend(loc='upper left')
plt.show()
Given that the dataset is non-linear, it is recommended to use the kernel method, and hence a kernel function such as RBF.
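As a quick sanity check of this recommendation, you can compare a linear-kernel SVM against an RBF-kernel SVM on the breast cancer data. This is a minimal sketch: the train/test split, standardization step, and accuracy comparison are illustrative choices, not from the original post.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load and split the data, then standardize the features
bc = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    bc.data, bc.target, test_size=0.3, random_state=1, stratify=bc.target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

# Fit one SVM per kernel and record the test accuracy
scores = {}
for kernel in ("linear", "rbf"):
    svm = SVC(kernel=kernel, random_state=1)
    svm.fit(X_train_std, y_train)
    scores[kernel] = svm.score(X_test_std, y_test)
print(scores)
```

Both kernels do well on this dataset with default hyperparameters; the plots in the rest of the post show how gamma and C can make the RBF model much better or much worse than this baseline.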

SVM RBF Kernel Function and Parameters

When using the SVM RBF kernel to train the model, one can use the following parameters:

Kernel Parameter - Gamma Values

The gamma parameter defines how far the influence of a single training example reaches, with low values meaning 'far' and high values meaning 'close'. Both very low and very high values of gamma result in models with lower accuracy; it is the intermediate values of gamma that give a model with good decision boundaries. This is shown in the plots in fig 2.

The plots below represent decision boundaries for different values of gamma, with the value of C set to 0.1 for illustration purposes. Note that as the gamma value increases, the decision boundaries classify more points correctly. However, after a certain point (gamma = 1.0 and onwards in the diagram below), the model accuracy decreases. It can thus be understood that selecting appropriate values of gamma is important. Here is the code that is used:

Python

from sklearn.svm import SVC

# X_train_std, y_train: the standardized training split of the dataset
svm = SVC(kernel='rbf', random_state=1, gamma=0.008, C=0.1)
svm.fit(X_train_std, y_train)


Fig 2. Decision boundaries for different Gamma Values for RBF Kernel


Note some of the following in the above plots:

  • When gamma is very small (0.008 or 0.01), the model is too constrained and cannot capture the complexity or "shape" of the data. The region of influence of any selected support vector includes the whole training set. The resulting model behaves similarly to a linear model with a set of hyperplanes that separate the high-density centers of any pair of classes. Compare with the diagram in the next section, where the decision boundaries for a model trained with a linear kernel are shown.
  • For intermediate values of gamma (0.05, 0.1, 0.5), it can be seen on the second plot that good models can be found.
  • For larger values of gamma (3.0, 7.0, 11.0) in the above plot, the radius of the area of influence of the support vectors only includes the support vector itself, and no amount of regularization with C will be able to prevent overfitting.
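The effect described in the bullets above can be reproduced with a short sweep over gamma. This is a sketch: the train/test split, standardization, and the exact gamma grid are assumptions, with C fixed at 0.1 as in the plots.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

bc = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    bc.data, bc.target, test_size=0.3, random_state=1, stratify=bc.target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

# Test accuracy for small, intermediate, and large gamma (C fixed at 0.1)
gamma_scores = {}
for gamma in (0.008, 0.05, 0.5, 3.0, 11.0):
    svm = SVC(kernel='rbf', random_state=1, gamma=gamma, C=0.1)
    svm.fit(X_train_std, y_train)
    gamma_scores[gamma] = svm.score(X_test_std, y_test)
print(gamma_scores)
```

With a very large gamma, each support vector's influence shrinks to itself, so test accuracy collapses toward the majority-class rate, while intermediate gamma values score much higher.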

Kernel Parameter - C Values

Simply put, the C parameter is a regularization parameter that sets the model's tolerance for misclassifying data points in order to achieve lower generalization error. The higher the value of C, the lower the tolerance, and what is trained approaches a maximum-margin classifier. The smaller the value of C, the larger the tolerance for misclassification, and what gets trained is a soft-margin classifier that generalizes better than a maximum-margin classifier. In other words, C controls the penalty for misclassification: a large value of C results in a higher penalty, and a smaller value in a lower one. With a larger value of C, a smaller margin will be accepted if the decision function is better at classifying all training points correctly, and the model may overfit the training dataset. A lower C encourages a larger margin, and therefore a simpler decision function, at the cost of training accuracy.

The diagram below represents the decision boundary for different values of C for a model trained with a linear kernel on the Sklearn Breast Cancer dataset. Take note of the decision boundary for different values of C: as the value of C increases, the model accuracy increases. This is in line with what we learned earlier, namely that a smaller value of C allows greater misclassification, and hence lower model accuracy. However, after a certain point (C = 1.0), the accuracy ceases to increase.


Fig 3. Decision boundaries for different C Values for Linear Kernel
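The behavior shown in fig 3 can be sketched with a loop over C for a linear-kernel model. The data preparation steps and the C grid here are illustrative assumptions, not from the original post.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

bc = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    bc.data, bc.target, test_size=0.3, random_state=1, stratify=bc.target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

# Test accuracy of a linear-kernel SVM for increasing values of C
c_scores = {}
for C in (0.01, 0.1, 1.0, 10.0):
    svm = SVC(kernel='linear', random_state=1, C=C)
    svm.fit(X_train_std, y_train)
    c_scores[C] = svm.score(X_test_std, y_test)
print(c_scores)
```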


Let's take a look at different values of C and the related decision boundaries when the SVM model is trained using the RBF kernel (kernel="rbf"). The diagram below represents models trained with the following code for different values of C. Note that gamma is set to 0.1 and kernel='rbf'.

Python

from sklearn.svm import SVC

# X_train_std, y_train: the standardized training split of the dataset
svm = SVC(kernel='rbf', random_state=1, gamma=0.1, C=0.02)
svm.fit(X_train_std, y_train)


Fig 4. Decision boundaries for different C Values for RBF Kernel



Conclusion

Here are some of the key points covered in this post:

  • Gamma and C are key hyperparameters used to train an optimal SVM model with the RBF kernel.
  • The gamma parameter defines how far the influence of a single training example reaches, with low values meaning 'far' and high values meaning 'close'.
  • A higher value of gamma means the radius of influence is limited to the support vectors themselves. This essentially means the model tends to overfit, and model accuracy decreases as gamma increases beyond a point.
  • A lower value of gamma means the data points have a very large radius of influence, which also results in a model with lower accuracy.
  • It is the intermediate values of gamma that result in a model with optimal accuracy.
  • The C parameter determines how tolerant the model is of misclassification.
  • A higher value of C results in a model with very high training accuracy, but one that may fail to generalize.
  • A lower value of C results in a model with very low accuracy.
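In practice, gamma and C are tuned together rather than one at a time. Here is a minimal sketch using Sklearn's GridSearchCV; the parameter grid and cv=5 are illustrative choices, not from the original post.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

bc = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    bc.data, bc.target, test_size=0.3, random_state=1, stratify=bc.target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)

# Cross-validated search over a small grid of gamma and C values
param_grid = {"gamma": [0.008, 0.05, 0.1, 0.5, 1.0],
              "C": [0.01, 0.1, 1.0, 10.0]}
gs = GridSearchCV(SVC(kernel='rbf', random_state=1), param_grid, cv=5)
gs.fit(X_train_std, y_train)
print(gs.best_params_, gs.best_score_)
```

The best parameters found by cross-validation should land in the intermediate region discussed above rather than at the extremes of the grid.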

Published at DZone with permission of Ajitesh Kumar, DZone MVB. See the original article here.
