Decision Tree Classifier Python Code Example

In this post, you will learn how to train a decision tree classifier machine learning model using Python.

By Ajitesh Kumar · Jul. 29, 20 · Tutorial

The following points are covered in this post:

  • What is a decision tree?
  • Decision tree Python code sample

What Is a Decision Tree?

Simply speaking, the decision tree algorithm splits the data points at decision nodes, resulting in a tree structure. Each decision node represents a question about a feature, and the answer determines how the data is split further into two or more child nodes. The tree keeps growing until the data points at a child node are pure (all of them belong to one class) or until a stopping condition such as a maximum depth is reached. The criterion for choosing the best question at each node is the information gain, that is, the reduction in impurity achieved by the split. The diagram below represents a sample decision tree.


Fig 1. Sample decision tree
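To make the split criterion described above concrete, here is a minimal sketch that computes the impurity reduction of a candidate split. It assumes Gini impurity as the impurity measure (entropy would work the same way); the function names gini and impurity_decrease and the toy labels are hypothetical and are not part of the original article.

```python
import numpy as np


def gini(labels):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


# Toy labels (hypothetical data): four samples of class 0 and four of class 1.
parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# A question that separates the classes perfectly yields two pure children.
perfect = impurity_decrease(parent, parent[:4], parent[4:])

# A question that leaves both children as mixed as the parent gains nothing.
useless = impurity_decrease(parent, parent[[0, 1, 4, 5]], parent[[2, 3, 6, 7]])

print(f"perfect split: {perfect:.2f}, useless split: {useless:.2f}")
```

A tree learner evaluates many candidate questions in this way and keeps the one with the largest decrease, which is why the perfect split would be preferred over the useless one.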


Training a machine learning model using a decision tree classification algorithm amounts to finding the decision boundaries that separate the classes.

Decision trees build complex decision boundaries by dividing the feature space into rectangles. Below is a sample of the decision boundaries obtained after a decision tree classifier is trained on the scikit-learn Iris dataset. The feature space consists of two features, namely petal length and petal width. The code sample is given later in this post.


Fig 2. Decision boundaries created by a decision tree classifier
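The rectangular shape of the regions follows from the fact that each decision node tests a single feature against a threshold. As a rough illustration, the sketch below fits a shallow tree and lists the thresholds it learned. The choice of max_depth=2 and the variable names are assumptions made for readability, not taken from the article.

```python
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
X = iris.data[:, 2:]  # petal length, petal width
y = iris.target
feature_names = ["petal length", "petal width"]

# A shallow tree keeps the list of splits short (max_depth=2 is arbitrary).
clf = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=1)
clf.fit(X, y)

# tree_.feature holds the feature index tested at each node (leaves are
# marked with a negative value); tree_.threshold holds the cut-off value.
splits = [
    (feature_names[f], float(t))
    for f, t in zip(clf.tree_.feature, clf.tree_.threshold)
    if f >= 0
]

for name, threshold in splits:
    print(f"{name} <= {threshold:.2f}")
```

Each printed line corresponds to one axis-parallel cut through the feature space, and the combination of such cuts produces the rectangles.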


Decision Tree Python Code Sample

Here is the code sample which can be used to train a decision tree classifier.

Python

# NumPy and Matplotlib are used by the plotting code later in this post
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the Iris data and keep only petal length and petal width
iris = datasets.load_iris()
X = iris.data[:, 2:]
y = iris.target

# Hold out 30% of the samples for testing, preserving the class proportions
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

# Train a depth-limited tree using Gini impurity as the split criterion
clf_tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
clf_tree.fit(X_train, y_train)
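The article stops at fitting the model. As a possible next step, the sketch below checks how the trained tree performs on the held-out test split. It repeats the training code so that it runs on its own; the use of accuracy_score and confusion_matrix is an addition for illustration and not part of the original post.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data preparation and model settings as in the training code above.
iris = datasets.load_iris()
X = iris.data[:, 2:]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)
clf_tree = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=1)
clf_tree.fit(X_train, y_train)

# Predict the held-out samples and compare against the true labels.
y_pred = clf_tree.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"Test accuracy: {test_accuracy:.3f}")
print("Confusion matrix (rows: true class, columns: predicted class):")
print(cm)
```

The confusion matrix shows which classes, if any, the tree confuses with each other, which a single accuracy number would hide.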



Visualizing Decision Tree Model Decision Boundaries

Here is the code that can be used to create the decision boundaries shown in Fig 2. Note that the mlxtend package is used for plotting the decision regions.

Python

from mlxtend.plotting import plot_decision_regions

# Combine the train and test splits so that every sample is plotted
X_combined = np.vstack((X_train, X_test))
y_combined = np.hstack((y_train, y_test))

# Draw the regions predicted by the trained tree on a 7x7 inch figure
fig, ax = plt.subplots(figsize=(7, 7))
plot_decision_regions(X_combined, y_combined, clf=clf_tree)
plt.xlabel('petal length [cm]')
plt.ylabel('petal width [cm]')
plt.legend(loc='upper left')
plt.tight_layout()
plt.show()
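For readers who want to see what a decision region plot is based on, here is a minimal sketch that computes the regions by hand: the classifier predicts a class for every point of a dense grid over the two features. This is a common general approach and not a description of how mlxtend is implemented; the grid step of 0.02 and the margin of 1 cm are arbitrary assumptions.

```python
import numpy as np
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
X, y = iris.data[:, 2:], iris.target
clf = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=1)
clf.fit(X, y)

# Build a grid that covers the feature space with a small margin.
step = 0.02
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(
    np.arange(x_min, x_max, step),
    np.arange(y_min, y_max, step),
)

# Predict a class for every grid point and restore the grid shape.
grid = np.c_[xx.ravel(), yy.ravel()]
Z = clf.predict(grid).reshape(xx.shape)

# Z can now be drawn as filled regions, for example with
# plt.contourf(xx, yy, Z, alpha=0.3), followed by a scatter plot of X.
print(Z.shape, np.unique(Z))
```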



Visualizing Decision Tree in the Tree Structure

Here is the code that can be used to visualize the tree structure created as part of training the model. The plot_tree function from the sklearn.tree module is used to draw the tree:

Python

from sklearn import tree

# A large figure keeps the node labels readable
fig, ax = plt.subplots(figsize=(10, 10))
tree.plot_tree(clf_tree, fontsize=10)
plt.show()



Here is how the tree looks when drawn with the code above. Note the use of plt.subplots(figsize=(10, 10)) to create a larger diagram of the tree. Otherwise, the rendered tree is very small.


Fig 3. Decision tree visualization
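When a plot is not convenient, for example in a terminal session or a log file, the same tree can also be printed as text. The sketch below uses export_text from sklearn.tree (available in scikit-learn 0.21 and later); the feature names passed to it are labels chosen here for readability, and the training code is repeated so that the example runs on its own.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = datasets.load_iris()
X = iris.data[:, 2:]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)
clf_tree = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=1)
clf_tree.fit(X_train, y_train)

# Render the fitted tree as indented if/else style rules.
rules = export_text(
    clf_tree,
    feature_names=["petal length (cm)", "petal width (cm)"],
)
print(rules)
```

Each indented line is one decision node or leaf, so the printout can be read from top to bottom in the same way as the diagram.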


In the follow-up article, you will learn how to draw nicer visualizations of a decision tree using a package. You will also learn some key concepts related to decision tree classifiers, such as information gain (entropy, Gini impurity, etc.).


Published at DZone with permission of Ajitesh Kumar, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
