The Most Insightful Computer Vision Project
Check out this open source project to start learning computer vision and machine learning.
YOLO Multi-Object Detection and Classification
I am a tech enthusiast and I love exploring new technologies, so I was looking forward to learning new ideas and innovations in computer vision and machine learning.
Computer vision is a field traditionally reserved for researchers and engineers with advanced degrees, and because of that, I felt very intimidated at the beginning. As I researched, I became a lot more comfortable with the topic. Uncovering some of the mystery behind the “magic” gave me a lot of insight, and I think many people could benefit from knowing a bit more about how computer vision works and, especially, from knowing that there is no reason to be intimidated. That’s when I decided to kick off this little project: an open source project containing a collection of solutions to real problems. The project has one major goal: to share insights and empower developers to use computer vision in their own projects.
The core application of computer vision is image understanding. This also covers video, since a video is technically a sequence of images (frames). Understanding an image is quite a complex and lengthy problem.
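To make the "video is a collection of images" idea concrete, here is a minimal sketch, assuming NumPy is installed. It is my own illustration rather than code taken from the project, and the names `image`, `video`, and `mean_brightness_per_frame` are made up for this example. It shows an image as a grid of numbers and a video as a stack of such grids.

```python
import numpy as np

# A tiny 4x4 RGB "image": height x width x 3 colour channels, values 0-255.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, :2] = (255, 0, 0)   # left half is pure red
image[:, 2:] = (0, 0, 255)   # right half is pure blue

# A "video" is just a sequence of frames. Stacking them adds a leading
# time axis, so the shape becomes (frames, height, width, channels).
frames = [image, image[:, ::-1]]  # second frame is the first one mirrored
video = np.stack(frames)


def mean_brightness_per_frame(video):
    """Return the average pixel value of every frame in the video."""
    return video.reshape(video.shape[0], -1).mean(axis=1)


if __name__ == "__main__":
    print("image shape:", image.shape)
    print("video shape:", video.shape)
    print("mean brightness per frame:", mean_brightness_per_frame(video))
```

Anything that works on a single image can then be applied frame by frame to a video.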
There are several tasks in image understanding; some are low-level tasks that are used in various others, while some are high-level tasks.
Some of the low-level tasks are:
- Image cleaning
- Image segmentation
- Histogram analysis
- Image color space translation
- Image transformation
- Image edge detection and contours, lines approximation
- Image convolutions
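Three of these low-level tasks (color space translation, histogram analysis, and convolution used for edge detection) can be sketched in a few lines. This is a simplified illustration assuming NumPy, not the project's implementation. The helper names are hypothetical, the grayscale weights are the common ITU-R BT.601 luma coefficients, and the edge filter is a Sobel kernel.

```python
import numpy as np


def to_grayscale(rgb):
    """Colour space translation: RGB -> luminance (BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights


def histogram(gray, bins=4):
    """Histogram analysis: count pixels per intensity bin over 0-255."""
    counts, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return counts


def convolve2d(image, kernel):
    """Plain 'valid' 2D convolution: flip the kernel, then slide it."""
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out


# Sobel kernel that responds to horizontal intensity changes,
# which means it highlights vertical edges.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)

# Test image: dark left half, bright right half, so one vertical edge.
gray = np.zeros((5, 6))
gray[:, 3:] = 200

edges = np.abs(convolve2d(gray, SOBEL_X))

if __name__ == "__main__":
    print("histogram:", histogram(gray))
    print("edge response, first row:", edges[0])
```

The explicit loops are slow but keep every step visible. Real code would normally rely on an optimized library routine instead.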
Some of the high-level tasks (which usually use the low-level ones) are:
- Object detection
- Object recognition
- Object segmentation and localization
- Object tracking
- Feature extraction
- Feature, colour correction
- Feature reconstruction, approximation
Applications of CV are usually built on high-level tasks. Some examples are:
- Face detection in cameras
- Pedestrians, cars, road detection in smart (self-driving) cars
- Terrain detection in drones and airplanes
- Vehicle license plate scanners at security checkpoints
However, thanks to advances in machine learning, most CV applications now use deep learning to get better accuracy. Even then, CV’s low-level tasks are still used for image pre-processing (processing images before feeding them into deep learning networks).
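A typical pre-processing step might look like the sketch below, again assuming NumPy. The function names, the 8x8 target size, and the `mean`/`std` values of 0.5 are placeholders I picked for illustration. A real network expects whatever input size and normalization it was trained with.

```python
import numpy as np


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize: pick the closest source pixel."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]


def preprocess(image, size=(8, 8), mean=0.5, std=0.5):
    """Resize, scale to [0, 1], normalize, and add a batch dimension.

    `mean` and `std` are placeholder values, not those of any real model.
    """
    resized = resize_nearest(image, *size)
    scaled = resized.astype(np.float32) / 255.0
    normalized = (scaled - mean) / std
    return normalized[np.newaxis, ...]  # shape: (1, height, width, channels)


if __name__ == "__main__":
    photo = np.full((16, 12, 3), 255, dtype=np.uint8)  # stand-in for a photo
    batch = preprocess(photo)
    print(batch.shape, batch.dtype, batch.min(), batch.max())
```

The output is a small batch of floating-point values, which is the form most deep learning frameworks expect as input.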
The Open Source Project
This project addresses all the points mentioned above, and it is structured to help you learn the potential of computer vision and how to use it in your own projects. It begins by solving real problems using image processing and then moves on to tasks that require machine learning models.
Along the way, we will also have projects exploring some critical concepts of computer vision and ML, such as: what an image is; what convolutions are; how to implement a vanilla neural network; how back-propagation works; how to use transfer learning; and more.
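To give a taste of what a vanilla neural network with back-propagation looks like, here is a minimal sketch assuming NumPy. It is not the project's notebook code. It trains a tiny two-layer network on the XOR problem, and the function name `train_xor`, the hidden size, learning rate, and step count are all arbitrary choices for this illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_xor(hidden=4, lr=0.5, steps=2000, seed=0):
    """Train a 2-layer network on XOR with hand-written back-propagation."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
    y = np.array([[0], [1], [1], [0]], dtype=np.float64)

    W1 = rng.normal(0.0, 1.0, (2, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, (hidden, 1))
    b2 = np.zeros(1)

    eps = 1e-12  # avoids log(0)
    losses = []
    for _ in range(steps):
        # Forward pass: tanh hidden layer, sigmoid output.
        h = np.tanh(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(float(loss))

        # Backward pass: apply the chain rule layer by layer.
        dz2 = (p - y) / len(X)       # gradient at the output pre-activation
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0)
        dh = dz2 @ W2.T
        dz1 = dh * (1.0 - h ** 2)    # derivative of tanh
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)

        # Gradient descent update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    return losses, (W1, b1, W2, b2)


if __name__ == "__main__":
    losses, _ = train_xor()
    print("first loss:", losses[0], "last loss:", losses[-1])
```

The forward pass computes predictions, the backward pass computes how much each weight contributed to the error, and the update step nudges the weights in the opposite direction.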
All examples are written in Python in Jupyter notebooks, with plenty of comments to help you follow the implementation. Even if you don’t know Python well, you will be able to follow the code and learn from the examples.
The advanced part of this project requires a GPU, but don’t worry: those examples are ready to run on Google Colab with just one click, and it is free! You only need a Google account. Because some examples require access to your camera (video), we cannot use Colab for all of them; for those, you will need to set up a Python environment on your own machine.
In the notebooks, you will find links to some articles, and I have also prepared some videos providing an overview of the project. Everything to make your life easier!
One of the most powerful types of AI is computer vision, which you’ve almost surely experienced in any number of ways without even knowing it. Thanks to advances in artificial intelligence and innovations in deep learning and neural networks, the field has been able to take great leaps.
My mission with this project is to make AI accessible and to raise digital awareness and competence.
Following this project to the end will give you insights, and you will feel empowered to use the recent innovations in this field to improve your own projects.
Published at DZone with permission of Alexsandro Souza. See the original article here.