YOLOv10: SOTA Real-Time Object Detection

A walkthrough of installing, running, and training YOLOv10, which Tsinghua University introduced on May 23, 2024, as a significant improvement over YOLOv9.

By 
Alex Smith
·
Jun. 12, 24 · Tutorial
YOLOv10 (You Only Look Once v10), introduced by researchers at Tsinghua University on May 23, 2024, offers a significant improvement over YOLOv9. According to its authors, YOLOv10-B has 46% lower latency and 25% fewer parameters than YOLOv9-C while delivering the same level of performance.

2. YOLOv10 Visual Object Detection: Overview

2.1 What Is YOLO?

YOLO (You Only Look Once) is an object detection algorithm based on deep neural networks, designed to identify and locate multiple objects in images or videos in real time. YOLO is renowned for its fast processing speed and high accuracy, making it ideal for applications that require rapid object detection, such as real-time video analysis, autonomous driving, and smart healthcare.

Before YOLO, the dominant approach was the R-CNN family, which works in two stages: it first generates candidate regions (region proposals) and then classifies and refines the objects within those regions. YOLO changed this with a "one-stage" design that outputs objects and their locations directly, end to end, in a single forward pass.

  • One-stage algorithms: These models treat detection as a direct regression task and output class probabilities and box coordinates in a single pass. Examples include SSD and YOLO.
  • Two-stage algorithms: These first generate multiple candidate regions and then use a convolutional neural network to predict the class probability and coordinates of the object in each region. Examples include the R-CNN series.
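
To make the one-stage idea concrete, here is a minimal sketch; it is not YOLO's actual decoding code. It assumes a hypothetical output layout in which each prediction row is [x1, y1, x2, y2, score_class0, score_class1, ...]. The function name decode_one_stage and the sample numbers are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical raw output of a one-stage detector: one row per candidate,
# laid out as [x1, y1, x2, y2, score_class0, score_class1, ...].
Detection = Tuple[Tuple[float, float, float, float], int, float]


def decode_one_stage(rows: List[List[float]], conf_threshold: float = 0.5) -> List[Detection]:
    """Turn raw rows into (box, class_id, score) tuples in a single pass."""
    detections = []
    for row in rows:
        box = tuple(row[:4])
        scores = row[4:]
        # Pick the most likely class for this candidate.
        class_id = max(range(len(scores)), key=scores.__getitem__)
        score = scores[class_id]
        # Keep only candidates the model is confident about.
        if score >= conf_threshold:
            detections.append((box, class_id, score))
    return detections


# Made-up sample predictions for two classes.
sample = [
    [10, 20, 110, 220, 0.05, 0.91],
    [50, 60, 90, 100, 0.30, 0.20],
    [200, 40, 260, 120, 0.77, 0.10],
]
detections = decode_one_stage(sample)
for box, class_id, score in detections:
    print(box, class_id, score)
```

There is no separate proposal step here: boxes and class scores come out of the same pass and are only filtered by confidence. A two-stage detector would first produce candidate regions and then run a second network on each of them.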

2.2 YOLO's Network Structure

YOLOv10 is an enhancement of YOLOv8. Let's take a brief look at the network structure of YOLOv8:

[Figure: YOLOv8 network structure]

3. YOLOv10 Visual Object Detection: Training and Inference

3.1 Installing YOLOv10

3.1.1 Clone the Repository

Start by cloning the YOLOv10 repository from GitHub:

Shell
 
git clone https://github.com/THU-MIG/yolov10.git
cd yolov10


3.1.2 Create a Conda Environment

Next, create a new Conda environment specifically for YOLOv10 and activate it:

Shell
 
conda create -n yolov10 python=3.10
conda activate yolov10


3.1.3 Download and Compile Dependencies

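Install the required dependencies from the root of the cloned repository. The commands below use the Tencent PyPI mirror, which can speed up downloads in mainland China; drop the -i option to use the default PyPI index: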

Shell
 
pip install -r requirements.txt -i https://mirrors.cloud.tencent.com/pypi/simple
pip install -e . -i https://mirrors.cloud.tencent.com/pypi/simple
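
After installation, a quick check like the following can confirm that the key packages are visible in the active environment. This is a minimal sketch: the module names torch, cv2, and ultralytics are assumptions about what requirements.txt and the editable install provide, and check_modules is a hypothetical helper.

```python
import importlib.util
import sys
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report which top-level modules can be found in the active environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Module names assumed to be installed by the two pip commands above.
status = check_modules(["torch", "cv2", "ultralytics"])
print("Python", sys.version.split()[0])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If any module is reported as missing, make sure the yolov10 Conda environment is active and rerun the pip commands from the repository root.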


3.2 Model Inference With YOLOv10

3.2.1 Model Download

To get started with YOLOv10, you can download the pre-trained models:

  • YOLOv10-N
  • YOLOv10-S
  • YOLOv10-M
  • YOLOv10-B
  • YOLOv10-L
  • YOLOv10-X

3.2.2 WebUI Inference

To perform inference using the WebUI, follow these steps:

  1. Navigate to the root directory of the YOLOv10 project. Run the following command to start the application:

    Shell
     
    python app.py
  2. Once the server starts successfully, you will see a message indicating that the application is running and ready for use.

3.2.3 Command Line Inference

For command line inference, use the yolo command inside your Conda environment. Here's how to set it up and run it:

Activate the YOLOv10 Conda environment: Ensure you have activated the environment you created earlier for YOLOv10.

Shell
 
conda activate yolov10


Run inference using the command line: Use the yolo predict command to perform predictions. You need to specify the model, device, and source image path as follows:

Shell
 
yolo predict model=yolov10n.pt device=2 source=/aigc_dev/yolov10/ultralytics/assets


  • model: Specifies the path to the downloaded model file (e.g., yolov10n.pt).
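  • device: Specifies which GPU to use. Indices start at 0, so device=2 selects the third GPU; use device=0 on a single-GPU machine or device=cpu to run without a GPU.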
  • source: Specifies the path to the images you want to detect objects in.

Default paths and results:

  • If you omit source, the sample images in the yolov10/ultralytics/assets directory are used.
  • After detection, the results are saved under yolov10/runs/detect/ in a directory named predict for the first run, then predict2, predict3, and so on for later runs.
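
When several runs have accumulated, a small helper can locate the most recent results directory. The sketch below assumes the run suffix is a plain number (predict, predict2, predict3, ...) and that the command was run from the repository root; latest_predict_dir is a hypothetical helper, not part of the yolo tooling.

```python
import re
from pathlib import Path
from typing import Optional


def latest_predict_dir(detect_root: Path) -> Optional[Path]:
    """Return the predict* directory with the highest numeric suffix, if any."""
    if not detect_root.is_dir():
        return None
    best, best_index = None, 0
    for child in detect_root.glob("predict*"):
        match = re.fullmatch(r"predict(\d*)", child.name)
        if not child.is_dir() or match is None:
            continue
        # A bare "predict" counts as run 1; "predict2" is run 2, and so on.
        index = int(match.group(1) or 1)
        if index > best_index:
            best, best_index = child, index
    return best


# Assumes the current directory is the repository root.
print(latest_predict_dir(Path("runs/detect")))
```

The function returns None when no run directory exists yet, so it is safe to call before the first prediction.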

Benchmark on the COCO dataset:

[Figures: YOLOv10 benchmark results on the COCO dataset]

3.3 Training the YOLOv10 Model

In addition to inference, YOLOv10 supports training on custom datasets. To start training from the command line, use the following command:

Shell
 
yolo detect train data=coco.yaml model=yolov10s.yaml epochs=100 batch=128 imgsz=640 device=2


Here's a breakdown of the command options:

  • detect train: This specifies that you want to perform training for object detection.
  • data=coco.yaml: Specifies the dataset configuration file. The default dataset (COCO) is downloaded and stored in the ../datasets/coco directory.
  • model=yolov10s.yaml: Specifies the configuration file for the model you want to train.
  • epochs=100: Sets the number of epochs, i.e., complete passes over the training set.
  • batch=128: Specifies the batch size, i.e., the number of images processed in each training step.
  • imgsz=640: Sets the image size to which all input images are resized during training.
  • device=2: Specifies which GPU to use for training. Indices start at 0, so device=2 selects the third GPU.

Example Explanation

Assuming you have set up the YOLOv10 environment and dataset properly, running the above command will start the training process on the specified GPU. The model will be trained for 100 epochs with a batch size of 128, and the input images will be resized to 640x640 pixels.
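
To see how epochs and batch interact, the following sketch counts the optimizer steps these settings imply. It assumes a single GPU and that the last, partially filled batch of each epoch is still processed; 118,287 is the number of images in the COCO train2017 split, and training_steps is a hypothetical helper.

```python
import math
from typing import Tuple


def training_steps(num_images: int, batch_size: int, epochs: int) -> Tuple[int, int]:
    """Return (steps per epoch, total steps) for a dataset and batch size."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# COCO train2017 contains 118,287 images.
per_epoch, total = training_steps(num_images=118_287, batch_size=128, epochs=100)
print(per_epoch, total)  # 925 92500
```

Halving the batch size to fit a smaller GPU roughly doubles the number of steps per epoch, which is worth keeping in mind when estimating training time.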

Steps To Train YOLOv10

Prepare Your Dataset

  • Ensure your dataset is properly formatted and described in the coco.yaml file (or your own custom dataset configuration file).
  • The dataset configuration file includes paths to your training and validation data, as well as the number of classes.
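
As an illustration, the sketch below renders a minimal dataset configuration for a custom dataset. The keys path, train, val, nc, and names follow the Ultralytics dataset YAML layout as commonly used; verify them against the coco.yaml shipped in the repository. The dataset location and class names are hypothetical.

```python
from typing import List


def dataset_yaml(root: str, train_dir: str, val_dir: str, class_names: List[str]) -> str:
    """Render a minimal dataset configuration as YAML text."""
    lines = [
        f"path: {root}",        # dataset root directory
        f"train: {train_dir}",  # training images, relative to path
        f"val: {val_dir}",      # validation images, relative to path
        f"nc: {len(class_names)}",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical two-class dataset.
config_text = dataset_yaml(
    "../datasets/helmets", "images/train", "images/val", ["helmet", "no_helmet"]
)
print(config_text)
```

Saving this text to a file such as helmets.yaml (a made-up name) and passing it as data=helmets.yaml would point the training command at the custom dataset instead of COCO.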

Configure the Model

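  • The model configuration file (e.g., yolov10s.yaml) defines the architecture of the YOLOv10 variant you are training. Training from a .yaml file starts from randomly initialized weights; to fine-tune pre-trained weights, pass a .pt checkpoint (e.g., yolov10s.pt) as the model instead.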

Run the Training Command

  • Use the command provided above to start the training process. Adjust parameters like epochs, batch, imgsz, and device based on your hardware capabilities and training requirements.

Monitor and Evaluate

  • During training, monitor the progress through logs or a visual tool if available.
  • After training, evaluate the model performance on a validation set to ensure it meets your expectations.
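
One way to review a finished run is to scan its per-epoch metrics for the best epoch. The sketch below assumes the training run writes a results.csv with an epoch column and metric columns such as metrics/mAP50-95(B), possibly padded with spaces; check the header of your own file, since the exact column names are an assumption here. The sample numbers are made up.

```python
import csv
import io
from typing import Tuple


def best_epoch(csv_text: str, metric: str) -> Tuple[int, float]:
    """Return (epoch, value) of the row where `metric` is highest."""
    reader = csv.reader(io.StringIO(csv_text))
    # Header names may be padded with spaces, so strip them.
    header = [name.strip() for name in next(reader)]
    epoch_col, metric_col = header.index("epoch"), header.index(metric)
    best = None
    for row in reader:
        if not row:
            continue
        epoch, value = int(float(row[epoch_col])), float(row[metric_col])
        if best is None or value > best[1]:
            best = (epoch, value)
    if best is None:
        raise ValueError("no data rows found")
    return best


# Made-up numbers in the assumed results.csv layout.
sample = """epoch, metrics/mAP50(B), metrics/mAP50-95(B)
1, 0.312, 0.181
2, 0.455, 0.290
3, 0.441, 0.301
"""
print(best_epoch(sample, "metrics/mAP50-95(B)"))
```

With a real run, read the file contents with Path(...).read_text() and pass them to best_epoch in place of the sample string.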

The following demo uses YOLOv10 for real-time object detection on a live webcam stream:

Python
 
import cv2
from ultralytics import YOLOv10

# Load the pre-trained YOLOv10-S weights.
model = YOLOv10("yolov10s.pt")

# Open the default camera (index 0).
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break  # Stop when no frame could be read.

    # Run detection on the current frame.
    results = model.predict(frame)
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cls = int(box.cls[0])
            conf = float(box.conf[0])
            label = f"{model.names[cls]} {conf:.2f}"
            # Draw the bounding box and its label.
            cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)
            cv2.putText(frame, label, (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)

    cv2.imshow("YOLOv10", frame)
    # Press q to quit.
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# Release the camera and close the display window.
cap.release()
cv2.destroyAllWindows()
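
To check whether the demo actually keeps up with the camera, a rolling frame-rate counter can be added to the loop. The class below is a hypothetical helper, not part of OpenCV or the YOLOv10 package; it averages over the most recent frames, and the simulated timestamps only stand in for real frames so the snippet runs without a camera.

```python
import time
from collections import deque
from typing import Optional


class FpsMeter:
    """Average frames per second over the most recent `window` frames."""

    def __init__(self, window: int = 30) -> None:
        self.timestamps = deque(maxlen=window)

    def tick(self, now: Optional[float] = None) -> float:
        """Record one frame and return the current FPS estimate."""
        self.timestamps.append(time.perf_counter() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed if elapsed > 0 else 0.0


# Simulated timestamps, 50 ms apart, stand in for real frames here.
meter = FpsMeter(window=5)
for i in range(10):
    fps = meter.tick(now=i * 0.05)
print(f"{fps:.1f}")
```

In the demo, create the meter before the loop, call meter.tick() once per iteration without arguments, and draw the returned value on the frame with cv2.putText.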


You can build your own workstation to run and train AI models. To save money, you can also find inexpensive components such as GPUs online.
