FuseExtract
Intelligent Character Recognition system for automated document processing
2020
FuseExtract is an Intelligent Character Recognition (ICR) system built at Fusemachines. It automates the extraction of structured data from handwritten and printed documents using deep learning.
Problem
Manual data entry from physical documents is slow, error-prone, and expensive. Financial institutions, government offices, and enterprises in Nepal process thousands of forms daily.
Approach
Built a three-stage pipeline: document detection, text region segmentation, and sequence-to-sequence character recognition. Trained on locally sourced Devanagari and English document datasets.
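To make the shape of such a pipeline concrete, here is a minimal sketch of how the three stages could be chained. It is not the FuseExtract implementation: the function names (`detect_document`, `segment_text_regions`, `run_pipeline`), the toy binary-image representation, and the simple cropping and row-splitting heuristics are all assumptions chosen for illustration. The recognizer is passed in as a callable, standing in for a trained sequence-to-sequence model.

```python
from typing import Callable, List

# Toy binary image (assumption for illustration): 1 = ink, 0 = background.
Image = List[List[int]]


def detect_document(page: Image) -> Image:
    """Stand-in for document detection: crop to the bounding box of all ink."""
    rows = [r for r, row in enumerate(page) if any(row)]
    if not rows:
        return []  # blank or empty page: nothing to extract
    cols = [c for c in range(len(page[0])) if any(row[c] for row in page)]
    return [row[cols[0]:cols[-1] + 1] for row in page[rows[0]:rows[-1] + 1]]


def segment_text_regions(doc: Image) -> List[Image]:
    """Stand-in for segmentation: split into bands separated by blank rows."""
    regions: List[Image] = []
    current: Image = []
    for row in doc:
        if any(row):
            current.append(row)
        elif current:
            regions.append(current)
            current = []
    if current:
        regions.append(current)
    return regions


def run_pipeline(page: Image, recognize: Callable[[Image], str]) -> List[str]:
    """Chain the stages: detect, segment, then recognize each region."""
    doc = detect_document(page)
    return [recognize(region) for region in segment_text_regions(doc)]


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    # Hypothetical recognizer: reports the ink count instead of real text.
    print(run_pipeline(page, lambda region: str(sum(map(sum, region)))))
```

Keeping the recognizer as a parameter means the detection and segmentation stages can be tested on their own, and a real model can be swapped in without touching the rest of the chain.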
Outcome
Deployed as an internal service at Fusemachines, significantly reducing manual data entry time for pilot clients. Demonstrated the feasibility of document AI for Nepali-language documents.