Triumph Business Capital ("Triumph") is a leading transportation factoring business that needed help with its workflow and document processing systems. Triumph was processing almost 15,000 images per day but struggled to grow without a technology solution for handling these submissions. Its existing process relied on team members manually processing each incoming invoice, which was tedious and repetitive work.
We assessed the situation and deployed a lightweight Rails application that uses OCR and machine learning to predict each document's type and identify the debtor from the raw image file. The application combines a number of open-source and commercial tools to produce consistent, high-quality OCR results regardless of document quality.
Triumph's team relies on an extensive set of existing IT tools for its core workflow. We built a simple JSON API that lets their developers send new documents and retrieve prediction results. Because OCR and prediction take time to complete, we added webhooks so that results are communicated to Triumph asynchronously.
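The submit-then-notify flow described above can be illustrated with a minimal in-memory sketch. This is not the production Rails code: the class, method, and field names (`PredictionService`, `submit`, `complete`, `job_id`, `callback_url`) are hypothetical, and the `deliver` function stands in for an HTTP POST to the client's webhook URL.

```python
import json
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class DocumentJob:
    job_id: str
    callback_url: str
    status: str = "pending"
    result: Optional[dict] = None


class PredictionService:
    """Toy stand-in for the document API; every name here is hypothetical."""

    def __init__(self, deliver: Callable[[str, str], None]):
        # deliver(url, json_body) stands in for an HTTP POST to a webhook.
        self.deliver = deliver
        self.jobs: Dict[str, DocumentJob] = {}

    def submit(self, image_bytes: bytes, callback_url: str) -> dict:
        """Accept a document and return immediately, before OCR has run."""
        job = DocumentJob(job_id=str(uuid.uuid4()), callback_url=callback_url)
        self.jobs[job.job_id] = job
        return {"job_id": job.job_id, "status": job.status}

    def complete(self, job_id: str, document_type: str, debtor: str) -> None:
        """Called by a background worker once OCR and prediction finish."""
        job = self.jobs[job_id]
        job.status = "complete"
        job.result = {"document_type": document_type, "debtor": debtor}
        body = json.dumps(
            {"job_id": job.job_id, "status": job.status, "result": job.result}
        )
        # Push the result to the client instead of making the client poll.
        self.deliver(job.callback_url, body)
```

In this sketch the caller gets a `job_id` back right away and later receives the prediction at its callback URL, so a slow OCR step never blocks the submitting request.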
For the machine learning service, we used Amazon Web Services ("AWS") because it provides a powerful framework at a manageable price. We trained the models on 50,000 historical images and experimented with several approaches to which features to expose to the ML model. We settled on providing the OCR text split into thirds (giving higher priority to text at the top of the page) along with relative scores for keyword terms found on the page (e.g. 'Invoice', 'Rate Confirmation').
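As a rough illustration of that feature design, the sketch below splits OCR text into top, middle, and bottom sections and computes a relative score per keyword. Several details are assumptions rather than the project's actual method: the split is done by non-empty lines, "relative score" is taken to mean each keyword's share of all keyword hits, 'bill of lading' is a hypothetical extra keyword, and all function and feature names are invented for the example. How the top section was prioritized is not specified, so the sketch only keeps the sections separate.

```python
import re
from typing import Dict, List

# 'invoice' and 'rate confirmation' come from the text; the third is hypothetical.
KEYWORDS = ["invoice", "rate confirmation", "bill of lading"]


def split_into_thirds(ocr_text: str) -> List[str]:
    """Split OCR text into top, middle, and bottom sections by non-empty lines."""
    lines = [ln for ln in ocr_text.splitlines() if ln.strip()]
    n = len(lines)
    # Ceiling division, so any leftover lines land in the earlier sections.
    a = -(-n // 3)
    b = -(-2 * n // 3)
    return ["\n".join(lines[:a]), "\n".join(lines[a:b]), "\n".join(lines[b:])]


def keyword_scores(ocr_text: str, keywords: List[str] = KEYWORDS) -> Dict[str, float]:
    """Score each keyword as its share of all keyword hits on the page (assumed)."""
    text = ocr_text.lower()
    counts = {
        kw: len(re.findall(r"\b" + re.escape(kw) + r"\b", text)) for kw in keywords
    }
    total = sum(counts.values())
    return {kw: (c / total if total else 0.0) for kw, c in counts.items()}


def build_features(ocr_text: str) -> Dict[str, object]:
    """Combine the three text sections and keyword scores into one feature row."""
    top, middle, bottom = split_into_thirds(ocr_text)
    features: Dict[str, object] = {
        "text_top": top,
        "text_middle": middle,
        "text_bottom": bottom,
    }
    for kw, score in keyword_scores(ocr_text).items():
        features["kw_" + kw.replace(" ", "_")] = score
    return features
```

Keeping the sections as separate features lets the model learn that a term such as 'Invoice' near the top of a page is a stronger signal than the same term in a footer.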
Overall, the project took six weeks to complete. The resulting app accurately predicts document types 95% of the time and assigns images to their respective invoices with over 90% accuracy. The scalability of the solution lets Triumph process more invoices faster and offer shorter funding cycles to its clients, and the application's performance has allowed Triumph to reposition valuable team members to more important workflows.