INDUSTRY

FuseExtract

Intelligent Character Recognition system for automated document processing

2020

FuseExtract is an Intelligent Character Recognition (ICR) system built at Fusemachines. It automates the extraction of structured data from handwritten and printed documents using deep learning.

Problem

Manual data entry from physical documents is slow, error-prone, and expensive. Financial institutions, government offices, and enterprises in Nepal process thousands of forms daily.

Approach

Built a pipeline combining document detection, text region segmentation, and sequence-to-sequence character recognition. Trained on locally sourced Devanagari and English document datasets.
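
The three stages can be pictured as a simple chain: crop the document, split it into text regions, then pass each region to a recognizer. The sketch below is illustrative only and is not the FuseExtract code. The names detect_document, segment_text_regions, Region, and run_pipeline are hypothetical, the first two stages are trivial rule-based stand-ins for trained models, and the recognizer is passed in as a plain callable where a sequence-to-sequence model would sit.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: a page is a grayscale grid, 0 = blank, nonzero = ink.
Image = List[List[int]]


@dataclass
class Region:
    top: int       # first row of the region (inclusive)
    bottom: int    # last row of the region (exclusive)
    pixels: Image  # cropped pixels for this region


def detect_document(page: Image) -> Image:
    """Stand-in for document detection: crop to the bounding box of ink."""
    rows = [r for r, row in enumerate(page) if any(row)]
    if not rows:
        return []
    cols = [c for row in page for c, value in enumerate(row) if value]
    top, bottom = rows[0], rows[-1] + 1
    left, right = min(cols), max(cols) + 1
    return [row[left:right] for row in page[top:bottom]]


def segment_text_regions(doc: Image) -> List[Region]:
    """Stand-in for segmentation: split into bands separated by blank rows."""
    regions: List[Region] = []
    start = None
    for r, row in enumerate(doc):
        has_ink = any(row)
        if has_ink and start is None:
            start = r
        elif not has_ink and start is not None:
            regions.append(Region(start, r, doc[start:r]))
            start = None
    if start is not None:
        regions.append(Region(start, len(doc), doc[start:]))
    return regions


def run_pipeline(page: Image, recognize: Callable[[Region], str]) -> List[str]:
    """Chain the stages; `recognize` is where a seq2seq model would plug in."""
    doc = detect_document(page)
    return [recognize(region) for region in segment_text_regions(doc)]


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    # Placeholder recognizer: reports the row span instead of reading text.
    print(run_pipeline(page, lambda r: f"rows {r.top}-{r.bottom}"))
```

Keeping the recognizer as an injected callable means the detection and segmentation stages can be exercised without a trained model, and a real recognizer can be swapped in without touching them.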

Outcome

Deployed as an internal service at Fusemachines, significantly reducing manual data entry time for pilot clients. Demonstrated the feasibility of document AI for Nepali-language documents.