DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Enhance Your Communication Strategy: Deliver Multimedia Messages With AWS Pinpoint
  • Accelerate Innovation by Shifting Left FinOps: Part 4
  • Exploring the Architecture of Amazon SQS
  • Serverless Patterns: Web

Trending

  • Why Stable RAG Answers Can Still Hide Unstable Evidence
  • Identity in Action
  • Building Threat Intelligence Pipelines Using Python, APIs, and Elasticsearch
  • Building a Zero-Cost Approval Workflow With AWS Lambda Durable Functions
  1. DZone
  2. Coding
  3. Tools
  4. Leverage AWS Textract Service for Intelligent Invoice Processing in Account Payables Testing

Leverage AWS Textract Service for Intelligent Invoice Processing in Account Payables Testing

In modern ERP systems, invoice creation and processing is a well-defined process and automated to an extent across organizations.

By 
Sridhar Potladurti user avatar
Sridhar Potladurti
·
Jan. 04, 24 · Opinion
Likes (1)
Comment
Save
Tweet
Share
2.9K Views

Join the DZone community and get the full member experience.

Join For Free

In modern ERP systems, invoice creation and processing is a well-defined process and automated to an extent across organizations. Despite that, every organization has its own set of standard operating procedures that require tweaks/enhancements. There are intelligent document recognition software like Kofax, Oracle Intelligent Document Recognition (IDR), which offer the built features using AI and ML capabilities to identify the text and images on the invoices sent by the vendors to be paid, which are usually in the pdf format. There is no specific invoice layout or template that is followed, which leads to multiple variations in terms of layout and template differences, adding up the complexity for the built-in features not able to identify certain fonts, data inside the HTML tables, etc., leading to partial data entry causing the invoice entry operator to manually review and update the missing fields for each invoice. 

Invoice process workflow

Invoices are emailed as PDF attachments to a dedicated Inbox setup to receive the incoming invoices from the vendors directly. It is advisable to segregate the inboxes based on the geographic location of the vendor and business units so they can be scheduled to auto-import and create invoices with the invoice details auto-populated.

QA teams, while testing this process, can’t automate the end-to-end testing due to challenges in the invoice text extracts, and even the semi-automated testing yields inconsistent results due to the varied invoice formats across the different vendors. The confidence rate achieved from the machine learning of the invoices does take a significant amount of time to reach the desired level.

AI and ML tools can be leveraged to perform the end-to-end testing without manual intervention using a mix of traditional automation tools like Selenium to automate the UI screens and Amazon Textract service, which automatically extracts information from printed text, forms and tables in the pdf document and the invoice details are stored as strings which are fed to the selenium script as an input data to check if the auto invoice details populated by the Kofax or IDR is correct and also anything that was not identified or left blank can be filled with the selenium auto script.

To analyze invoice and receipt documents, use the AnalyzeExpense API operations and pass a document file as input. AnalyzeExpense is a synchronous operation that returns a JSON structure that contains the analyzed text.

The results will vary depending on the number of different invoices that are used for testing as the AWS Textract service model will improve over a period of time in identifying even difficult fonts, data from tables, and geo-specific information. There is an associated license font for the AWS Textract service, but considering the effectiveness and accuracy in extracting the text from the invoices, it is a good model to utilize both in the testing and also as part of the business process improvement. Even the payables invoice entry users would save a lot of time in terms of manual review and reconciliation of the entered data. This is also a definite business use case to save on the manual data invoice entry hours and improve the operational efficiencies of the organization.

AWS PDF Processing

Opinions expressed by DZone contributors are their own.

Related

  • Enhance Your Communication Strategy: Deliver Multimedia Messages With AWS Pinpoint
  • Accelerate Innovation by Shifting Left FinOps: Part 4
  • Exploring the Architecture of Amazon SQS
  • Serverless Patterns: Web

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook