Improve Data Accuracy With OCR

There are many OCR solutions available, but not all of them are sufficient in terms of data accuracy. See the difference between general and advanced OCR.

By David Hoffman · Nov. 13, 18 · AI Zone · Opinion

General OCR solutions have an accuracy of up to around 98%, and while that may sound like a lot, it means that out of 1,000 characters, about 20 of them will be incorrect. More advanced OCR solutions can help.
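To make the arithmetic concrete, here is a minimal sketch of how per-character accuracy translates into errors. The function names are hypothetical, and the word-level estimate assumes character errors are independent, which is a simplification.

```python
def expected_errors(accuracy: float, num_chars: int) -> float:
    """Expected number of misread characters at a given per-character accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return (1.0 - accuracy) * num_chars


def word_accuracy(char_accuracy: float, word_length: int) -> float:
    """Chance that every character in a word is read correctly.

    Assumes character errors are independent of each other (a simplification).
    """
    return char_accuracy ** word_length


# At 98% accuracy, roughly 20 of every 1,000 characters are wrong.
print(round(expected_errors(0.98, 1000)))
# The chance of reading a whole five-letter word correctly is lower still.
print(round(word_accuracy(0.98, 5), 3))
```

Because errors compound across a word, a keyword search over scanned documents can miss more matches than the per-character figure suggests.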

The Difference Between General and Advanced OCR

Basic OCR solutions identify characters through character recognition and pattern matching. They compare each scanned character to known characters to determine which one it most likely is, and then compare the word as a whole to a dictionary. For most applications, this is more than enough to get an idea of the content.
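The dictionary step can be sketched roughly as follows. This is only an illustration, not how any particular OCR product works: the word list, the table of confusable characters, and the `correct_word` helper are all assumptions, and the fuzzy matching comes from Python's standard `difflib` module.

```python
import difflib

# Hypothetical, tiny word list; a real system would load a full dictionary.
DICTIONARY = ["contract", "invoice", "payment", "signature", "total"]

# Characters that OCR commonly confuses (illustrative, not exhaustive).
CONFUSABLE = {"0": "o", "1": "l", "5": "s", "8": "b"}


def correct_word(raw, dictionary=DICTIONARY, cutoff=0.75):
    """Return the dictionary word that best explains an OCR reading."""
    word = raw.lower()
    if word in dictionary:
        return word
    # First try swapping digits for the letters they are often mistaken for.
    normalized = "".join(CONFUSABLE.get(ch, ch) for ch in word)
    if normalized in dictionary:
        return normalized
    # Otherwise fall back to the closest dictionary word, if one is close enough.
    matches = difflib.get_close_matches(normalized, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw  # leave unknown words untouched


print(correct_word("inv0ice"))
print(correct_word("paymemt"))
```

Returning the raw reading when nothing is close enough avoids "correcting" names, numbers, and other words that are simply missing from the dictionary.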

But this type of general OCR isn't always the best fit for a company. Some companies, such as legal firms, may need accurate OCR in order to scan documents for keywords. Companies that use OCR for more complex tasks, such as mobile solutions that read characters on the fly, may need more accurate results faster. This is where more advanced OCR solutions come in.

OCR and Image Appearance

General OCR solutions can be confused by the appearance of characters, which is often indistinct. In real-time video environments, OCR solutions may falter in low light or during movement. In still photos, OCR may not be able to complete its recognition because of low-contrast images or obscured text.

Altering image appearance through workflows and advanced image transformations makes it possible to pre-process images before OCR, which increases the chances of a more accurate reading.
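As a rough illustration of what such pre-processing can look like, the sketch below stretches the contrast of a grayscale image and then converts it to black and white. It uses plain Python lists of 0-255 pixel values so that it stays self-contained; the function names and the threshold of 128 are assumptions, and a production workflow would normally rely on an imaging library.

```python
def stretch_contrast(image):
    """Rescale pixel values so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


def binarize(image, threshold=128):
    """Map each pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in image]


def preprocess(image, threshold=128):
    """Stretch contrast first, then binarize."""
    return binarize(stretch_contrast(image), threshold)


# A dim, low-contrast patch: every pixel is below the threshold.
dim_patch = [[100, 104], [116, 120]]
print(binarize(dim_patch))    # thresholding alone loses the pattern
print(preprocess(dim_patch))  # stretching first keeps light and dark apart
```

In this example, thresholding the raw patch turns every pixel black, while stretching the contrast first preserves the difference between the darker and lighter rows.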

OCR and Content Specialization

As mentioned, OCR solutions will generally confirm their results against a dictionary or even against grammatical syntax, thereby increasing the likelihood that they'll have an accurate reading of any given text. However, that does not always work if the text is specialized in nature, such as text from an engineering company or a medical company.

Advanced OCR solutions are able to handle specialized text. Organizations can modify their dictionaries, and over time the machine learning algorithms learn to detect the most likely text for the specific company rather than for general-purpose use. This significantly cuts down on the number of errors that occur when analyzing text, and the OCR only improves over time.
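A very simplified way to picture a custom dictionary that improves with use is a word list that counts how often each confirmed term appears and prefers frequent terms when a reading is ambiguous. The `DomainVocabulary` class below is hypothetical, and simple frequency counting stands in for the machine learning a real product would use.

```python
import difflib
from collections import Counter


class DomainVocabulary:
    """Hypothetical custom word list that learns term frequencies over time."""

    def __init__(self, terms=()):
        # Every seed term starts with a count of one.
        self.counts = Counter({t.lower(): 1 for t in terms})

    def observe(self, confirmed_text):
        """Record the words of a text that a person has confirmed as correct."""
        self.counts.update(confirmed_text.lower().split())

    def best_match(self, raw, cutoff=0.7):
        """Pick the most likely known term for an OCR reading."""
        word = raw.lower()
        if word in self.counts:
            return word
        candidates = difflib.get_close_matches(
            word, list(self.counts), n=5, cutoff=cutoff
        )
        if not candidates:
            return raw  # nothing similar enough; keep the reading as-is
        # Among similar candidates, prefer the term seen most often.
        return max(candidates, key=lambda c: self.counts[c])


vocab = DomainVocabulary(["stent", "stenosis"])
vocab.observe("stenosis of the artery stenosis confirmed")
print(vocab.best_match("stenos1s"))
```

Each call to `observe` nudges later matches toward the organization's own terminology, which is the sense in which such a dictionary gets better over time.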

OCR is an extremely powerful tool, but not all OCR tools are made equal. General purpose OCR tools provide some basic OCR capabilities, but they may not be as accurate as an organization really needs.

Published at DZone with permission of David Hoffman, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.