Recognizing Hand-Written Shapes Programmatically: Find the Key Points of a Rectangle

A walkthrough on using computers to recognize drawn shapes.

A far-fetched goal of mine is to use sketches on a whiteboard as a way to define programs. I mean formal programs that you can execute. Of course, through your sketches you would define programs in a high-level domain-specific language (for example, one describing a state machine or an Entity-Relationship diagram).

To do so, I would like to start by recognizing rectangles. Then I will move on to recognizing other shapes, connecting lines, and the text present in the diagram. For now, let’s focus on recognizing rectangles.

My general approach would be the following:

  1. recognize the meaningful lines
  2. recognize key points among those lines
  3. classify those key points using AI
  4. find shapes by combining the classified key points

OK. This is not going to be something I complete over a weekend.

The Input Images

We will use three images: two of them were drawn by me on a whiteboard, under different lighting conditions. The third one was found on the Internet. Its peculiarity is that the sketch was done on graph paper (i.e., there is a grid on the paper).

Whiteboard (natural light)

Whiteboard (artificial light)

Graph paper

Let’s see how we can process these images. We will use Java and the BoofCV image processing library.

Gray Scale

As a first step, we convert the image to gray scale. Here we run into a problem with the image taken under artificial light:

Screenshot from 2016-04-02 14-51-23
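To make the gray scale step concrete, here is a minimal sketch in plain Java. It does not use BoofCV: the class `GrayScaleSketch` and its method are hypothetical helpers written for illustration, and averaging the three color channels is an assumption, not necessarily what the library does.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper (not part of BoofCV): turns an RGB image into a grid of gray values. */
final class GrayScaleSketch {

    /** Returns gray[y][x] in the range 0..255, computed as the plain average of R, G and B. */
    static int[][] toGray(BufferedImage image) {
        int width = image.getWidth();
        int height = image.getHeight();
        int[][] gray = new int[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = image.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Assumption: unweighted average; a weighted luma formula would also work.
                gray[y][x] = (r + g + b) / 3;
            }
        }
        return gray;
    }
}
```

An uneven light source shows up in this grid as a slow change of gray values across a large region, which is exactly the kind of blob visible in the screenshot.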

We want to remove that giant gray blob in the bottom-right corner. To do so, we will use derivatives.


We blur the image to reduce the effect of noise, and then calculate the derivatives. This is a way to capture the sharp variations in color that happen vertically or horizontally.

We get something like this for the image taken under natural light:

Screenshot from 2016-04-02 14-44-43

However, for the image taken under artificial light, the noise is clearly visible:

Screenshot from 2016-04-02 14-53-42
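The blur and derivative step can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the code used for the screenshots: it uses a 3x3 box blur and Sobel kernels, while the actual blur and derivative operators (and their parameters) are not specified here. The class `DerivativeSketch` is hypothetical and works on a `gray[y][x]` grid of intensities.

```java
/** Hypothetical helper: 3x3 box blur followed by Sobel derivatives on a gray[y][x] grid. */
final class DerivativeSketch {

    /** Replaces each pixel with the average of its 3x3 neighborhood (borders are clamped). */
    static int[][] blur(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        sum += at(gray, x + dx, y + dy);
                    }
                }
                out[y][x] = sum / 9;
            }
        }
        return out;
    }

    /** Horizontal Sobel derivative: large where intensity changes from left to right. */
    static int[][] derivX(int[][] g) {
        int h = g.length, w = g[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int right = at(g, x + 1, y - 1) + 2 * at(g, x + 1, y) + at(g, x + 1, y + 1);
                int left = at(g, x - 1, y - 1) + 2 * at(g, x - 1, y) + at(g, x - 1, y + 1);
                out[y][x] = right - left;
            }
        }
        return out;
    }

    /** Vertical Sobel derivative: large where intensity changes from top to bottom. */
    static int[][] derivY(int[][] g) {
        int h = g.length, w = g[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int below = at(g, x - 1, y + 1) + 2 * at(g, x, y + 1) + at(g, x + 1, y + 1);
                int above = at(g, x - 1, y - 1) + 2 * at(g, x, y - 1) + at(g, x + 1, y - 1);
                out[y][x] = below - above;
            }
        }
        return out;
    }

    /** Reads a pixel, clamping coordinates to the image so border pixels can be processed. */
    private static int at(int[][] g, int x, int y) {
        int cy = Math.max(0, Math.min(g.length - 1, y));
        int cx = Math.max(0, Math.min(g[0].length - 1, x));
        return g[cy][cx];
    }
}
```

A marker stroke produces a large derivative because the intensity jumps within a pixel or two, whereas a smooth lighting gradient produces small derivatives everywhere. That is why derivatives help separate the drawing from the gray blob.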

At this point we take each point of the image and check whether many of the points around it have a high derivative (either horizontal or vertical). We keep the points that satisfy this condition and set all the other points to white. We repeat this a couple of times.

This is the result:

Screenshot from 2016-04-02 14-56-12
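A rough sketch of this kind of density filter in plain Java is shown below. Several details are assumptions made for illustration: the neighborhood is a square window of a given radius, a point survives only if it is itself a high-derivative point, and the `threshold`, `radius`, `minCount`, and `passes` parameters are hypothetical names. The class `DensityFilterSketch` is not part of any library.

```java
/** Hypothetical helper: keeps only high-derivative points that have enough similar neighbors. */
final class DensityFilterSketch {

    /** Marks the points whose horizontal or vertical derivative exceeds the threshold in magnitude. */
    static boolean[][] strongPoints(int[][] dx, int[][] dy, int threshold) {
        int h = dx.length, w = dx[0].length;
        boolean[][] strong = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                strong[y][x] = Math.abs(dx[y][x]) > threshold || Math.abs(dy[y][x]) > threshold;
            }
        }
        return strong;
    }

    /**
     * One pass: a marked point is kept only if at least minCount marked points
     * (itself included) lie in the (2 * radius + 1) square window centered on it.
     */
    static boolean[][] filterOnce(boolean[][] strong, int radius, int minCount) {
        int h = strong.length, w = strong[0].length;
        boolean[][] kept = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!strong[y][x]) {
                    continue; // already background ("white")
                }
                int count = 0;
                for (int ny = Math.max(0, y - radius); ny <= Math.min(h - 1, y + radius); ny++) {
                    for (int nx = Math.max(0, x - radius); nx <= Math.min(w - 1, x + radius); nx++) {
                        if (strong[ny][nx]) {
                            count++;
                        }
                    }
                }
                kept[y][x] = count >= minCount;
            }
        }
        return kept;
    }

    /** Applies the pass several times; each pass works on the survivors of the previous one. */
    static boolean[][] filter(boolean[][] strong, int radius, int minCount, int passes) {
        boolean[][] current = strong;
        for (int i = 0; i < passes; i++) {
            current = filterOnce(current, radius, minCount);
        }
        return current;
    }
}
```

Isolated noisy points have few marked neighbors and disappear, while points along a drawn line support each other and survive. Note that each pass also erodes the ends of a line slightly, so the window size and count would need tuning on real images.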


We do some additional filtering and then we invoke a function to find the contours inside the image. We draw the external contours in red and the internal ones in blue.

Screenshot from 2016-04-02 14-58-19

We then remove the short contours:

Screenshot from 2016-04-02 14-59-01
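Removing short contours could look like the following sketch. It assumes each contour is available as an ordered list of points and measures "short" as the total length of the polyline, which is one possible definition (counting points would be another). The class `ContourLengthSketch` and the `minLength` parameter are hypothetical.

```java
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: drops contours whose total polyline length is below a minimum. */
final class ContourLengthSketch {

    /** Sums the distances between consecutive points (the contour is treated as an open polyline). */
    static double length(List<Point2D> contour) {
        double total = 0;
        for (int i = 1; i < contour.size(); i++) {
            total += contour.get(i - 1).distance(contour.get(i));
        }
        return total;
    }

    /** Returns a new list with only the contours that are at least minLength long. */
    static List<List<Point2D>> removeShort(List<List<Point2D>> contours, double minLength) {
        List<List<Point2D>> kept = new ArrayList<>();
        for (List<Point2D> contour : contours) {
            if (length(contour) >= minLength) {
                kept.add(contour);
            }
        }
        return kept;
    }
}
```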

Key Points

The contours we get are drawn as a list of very short segments. Let’s draw the endpoints of the segments in blue.

Screenshot from 2016-04-02 15-00-19

Yes, they are very short: you just see a continuous set of endpoints, very close to each other. We want fewer, much longer segments.

To do that, we use two basic strategies:

  1. We simply merge consecutive endpoints that are very close to each other.
  2. We take sequences of three consecutive points: A, B, C. If B is very close to the line between A and C, we just remove B.

We apply both strategies twice and get much simpler contours. This is the final result.

Screenshot from 2016-04-02 15-03-35

Screenshot from 2016-04-02 15-03-19

Screenshot from 2016-04-02 15-05-50
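The two simplification strategies can be sketched in plain Java as follows. This is an illustration, not the exact code behind the screenshots: "merging" is implemented here by dropping a point that is too close to the last kept one, point B is compared against the segment from the last kept point A to the next point C, and the `minDistance`, `tolerance`, and `rounds` parameters are hypothetical names. Only standard `java.awt.geom` classes are used.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: simplifies a contour given as an ordered list of points. */
final class ContourSimplifySketch {

    /** Strategy 1: drops every point closer than minDistance to the previously kept point. */
    static List<Point2D> mergeClose(List<Point2D> points, double minDistance) {
        List<Point2D> out = new ArrayList<>();
        for (Point2D p : points) {
            if (out.isEmpty() || out.get(out.size() - 1).distance(p) >= minDistance) {
                out.add(p);
            }
        }
        return out;
    }

    /** Strategy 2: for each triple A, B, C removes B when it lies within tolerance of segment AC. */
    static List<Point2D> dropCollinear(List<Point2D> points, double tolerance) {
        if (points.size() < 3) {
            return new ArrayList<>(points);
        }
        List<Point2D> out = new ArrayList<>();
        out.add(points.get(0));
        for (int i = 1; i < points.size() - 1; i++) {
            Point2D a = out.get(out.size() - 1); // last point we decided to keep
            Point2D b = points.get(i);
            Point2D c = points.get(i + 1);
            double d = Line2D.ptSegDist(a.getX(), a.getY(), c.getX(), c.getY(), b.getX(), b.getY());
            if (d > tolerance) {
                out.add(b);
            }
        }
        out.add(points.get(points.size() - 1));
        return out;
    }

    /** Applies both strategies in sequence, repeated for the given number of rounds. */
    static List<Point2D> simplify(List<Point2D> points, double minDistance, double tolerance, int rounds) {
        List<Point2D> current = points;
        for (int i = 0; i < rounds; i++) {
            current = dropCollinear(mergeClose(current, minDistance), tolerance);
        }
        return current;
    }
}
```

For example, calling `simplify(points, 1.0, 0.5, 2)` on a slightly wobbly L-shaped contour should leave little more than its two ends and the corner. The right thresholds depend on the image resolution and on how thick the strokes are.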

What Next

Now we have a reasonable number of relevant points. Next, I want to classify them using machine learning techniques. For example, I want to recognize a single point as the top-left corner of a rectangle, or as part of an arrow. Then I will combine those recognized points to obtain entire shapes (my rectangles!).

Right now I am generating the images to classify, and I am thinking about which features to use for machine learning. I have some ideas, but we will look at them in one of the next posts.

The training images look like this:

Image title

Published at DZone with permission of Federico Tomassetti, DZone MVB. See the original article here.
