Abstract: Extracting key fields from a variety of document types remains a challenging problem. Services such as AWS, Google Cloud and open-source alternatives provide text extraction to "digitize" images or PDFs, returning phrases, words and characters. Processing these outputs is error-prone and does not scale, as varied documents require different heuristics, rules or models, and new document types are uploaded daily. In addition, a performance ceiling exists because downstream models rely on good yet imperfect OCR algorithms upstream.
We propose an end-to-end solution that uses image-based deep learning to automatically extract important text fields from documents of various templates and sources. Deep-learning computer vision algorithms achieve state-of-the-art classification accuracy and generalizability through training on millions of images. We compare the accuracy, processing time and cost of our in-house model with third-party services and find favorable results for automatically extracting important fields from documents.
Bill.com is working to build a paperless future. We process millions of documents a year, including invoices, contracts, receipts and a variety of others. Understanding those documents is critical to building intelligent products for our users.
Bio: Eitan is the Chief Data Scientist at Bill.com and has many years of experience as a researcher. His recent focus is on machine learning, deep learning, applied statistics and software engineering. Previously, he was a Postdoctoral Scholar at Lawrence Berkeley National Lab. He received his PhD in Physics from Boston University and his B.S. in Astrophysics from the University of California, Santa Cruz. Eitan holds 4 patents, has 11 publications to date and has spoken about data at various conferences around the world.