Text Extraction - Case Study
The Client
A Canadian process outsourcing and data services company.
Industry
Data Services
The Challenge
Xilligence was asked to build a text recognition product that could pull out both text and data from physical document records.
What made the requirement additionally challenging was that there was a lot of data that were in forms and tables and the expectation was that this data would then be able to be organized post digitization which meant that the ability to insert and extract data into a database would be necessary and to subsequently be used in applications. Key elements and contextual relationships would have to be maintained
The Solution
Xilligence built a process to scan and digitize the data. From there we created a process that used image recognition and machine learning (ML) technologies to extract the data and then digitize and populate into databases. On top of that we used Natural Language Processing (NLP) to ensure the contextual and data relationship information remained intact. Subsequent manual checks ensured that the data was digitized correctly.
Key Features
- Table and Form Data Maintained
- Contextual Data Extraction
- Manual Review
Technologies Used
- AWS Cloud Services
- AI/ML
- Amazon Comprehend
- Amazon Textract
- Database
- MySQL
Xilligence Services Used
Mobile + Apps
Cloud + Web
QA + Support
Design + UI / UX