Extracting Text from Images: Google a Notch Better than Azure and AWS!

While all three platforms offer similar services, how do they compare with one another? We take a look using a few tests.

Text extraction from images has been an active area of work for many years and has applications in domains such as banking, legal, healthcare, education, and entertainment.

With the advent of machine learning, text extraction from images is now offered as a cognitive API by many AI/ML providers, including AWS Rekognition, Azure Computer Vision, and Google Cloud Vision.
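As a rough illustration of what working with one of these responses looks like, the sketch below pulls line-level text out of a response shaped like the one Rekognition's DetectText operation returns (a `TextDetections` list whose entries carry `DetectedText`, `Type`, and `Confidence`). The helper name `rekognition_lines`, the `min_confidence` parameter, and the sample response are hypothetical and hand-written for this example; they are not output from the tests in this article.

```python
from typing import Any, Dict, List


def rekognition_lines(response: Dict[str, Any], min_confidence: float = 0.0) -> List[str]:
    """Return the text of LINE-level detections at or above min_confidence.

    Hypothetical helper: it only reads a response dictionary and makes no API call.
    """
    lines = []
    for detection in response.get("TextDetections", []):
        is_line = detection.get("Type") == "LINE"
        confident = detection.get("Confidence", 0.0) >= min_confidence
        if is_line and confident:
            lines.append(detection["DetectedText"])
    return lines


# Hand-written sample in the DetectText response shape (illustrative only).
sample_response = {
    "TextDetections": [
        {"DetectedText": "NO PARKING", "Type": "LINE", "Id": 0, "Confidence": 98.5},
        {"DetectedText": "NO", "Type": "WORD", "Id": 1, "ParentId": 0, "Confidence": 99.0},
        {"DetectedText": "PARKING", "Type": "WORD", "Id": 2, "ParentId": 0, "Confidence": 98.0},
        {"DetectedText": "smudge", "Type": "LINE", "Id": 3, "Confidence": 41.0},
    ]
}

print(rekognition_lines(sample_response, min_confidence=80.0))
```

Each provider uses its own response format, so a comparison needs a small adapter like this per provider before the extracted text can be compared side by side.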

While all three do a good job with default text detection, we used the Cognitive API Integrator to compare the responses of these three major cognitive API providers on three parameters for the English language:

  • Different orientation
  • Different fonts
  • Reverse order text
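The comparisons in this article were made by inspecting responses in the Cognitive API Integrator. For readers who want to put a number on a similar comparison, here is a minimal sketch that scores each provider's extracted text against the text known to be in the image. It assumes the text has already been pulled out of each provider's response as a plain string; the function names and the choice of `difflib.SequenceMatcher` as the similarity measure are assumptions made for this example, not the method used by the tool.

```python
from difflib import SequenceMatcher
from typing import Dict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences are ignored."""
    return " ".join(text.lower().split())


def similarity(expected: str, detected: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for the two strings."""
    return SequenceMatcher(None, normalize(expected), normalize(detected)).ratio()


def score_providers(expected: str, outputs: Dict[str, str]) -> Dict[str, float]:
    """Score each provider's detected text against the expected text."""
    return {name: round(similarity(expected, text), 2) for name, text in outputs.items()}


# Hypothetical outputs for one test image; replace with real extracted text.
example_outputs = {
    "provider_a": "Hello   World",
    "provider_b": "",
}
print(score_providers("hello world", example_outputs))
```

A character-level ratio is only one possible measure. An exact-match check or a word-level comparison may suit some test images better.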

While there is no clear winner here, Google does perform a notch better than Azure and AWS on the three parameters we compared.

Here is a brief summary:

  • Google does a great job at detecting vertical text, whether the orientation is top-down or bottom-up.
  • Google and Azure both give reverse-order (upside-down) text a good shot, whereas AWS was unable to decipher it in our tests.
  • AWS does a great job detecting text written in different fonts.
  • Azure needs handwritten mode turned on in order to detect different fonts.

Let's take a look at a few examples.

Example 1: Vertical Text in Bottom-Up Orientation


  • AWS misses the vertical text entirely.
  • Google and Azure both detect the text correctly.

Example 2: Vertical Text in Top-Down Orientation


  • Google gives the best result.
  • AWS again misses the text.
  • Azure is also unable to read vertical text in top-down orientation.

Example 3: Bottom-Up Text



  • Google clearly does the best job here.
  • Azure makes an attempt, and AWS misses the text completely.

Example 4: Mixed Orientation



  • None of the three providers handles mixed orientations correctly. Google plays it safe: it reads only one orientation, but reads that one correctly.
  • Azure tries to read all the orientations and reads one of the two incorrectly.
  • AWS can only read the default orientation correctly.

Example 5: Mixed Fonts



  • While all providers detect different fonts, AWS seems to do a better job than the other two.

Check out the Findings page for various similar conclusions drawn by the community while working with these APIs. Send us your findings and feedback at daksh@cennest.com.

About the Cognitive API Integrator

The Cognitive API Integrator aggregates cognitive services across major providers (currently Microsoft Azure, Amazon Web Services, and Google Cloud). Use it to compare responses for various cognitive APIs before selecting a provider to integrate with.

Note: The Cognitive API Integrator does not aim to promote or downplay any cognitive API provider. Cognitive analysis is a machine learning exercise in which results are bound to improve with more data and usage. Conclusions drawn here can be subjective, and users are encouraged to use the tool to form their own conclusions.

Published at DZone with permission of Anshulee Asthana, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
