Indexed by:Conference paper
Date of Publication:2006-08-16
Included Journals:EI, CPCI-S
Volume:4113
Page Number:274-279
Abstract:An OCR system is presented to understand mathematical formulas in binary images of printed documents. The system uses a novel component-labeling algorithm to extract local maximum components from the image, and uses these components to locate the mathematical formulas. A character recognition algorithm based on neural networks is then applied. To segment merged characters in the image, a novel segmentation algorithm based on a modified SOM neural network is introduced into the system. Using an LL(1) grammar, the system converts the recognition results into a LaTeX file.
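
The abstract does not give the grammar itself, so the following is only an illustrative sketch of how a one-token-lookahead (LL(1)-style) parser could turn recognized symbols into LaTeX. The toy grammar, the token format (a flat list in which superscript and subscript relations are already marked as `^` and `_`), and the function name `to_latex` are all assumptions made for illustration, not the paper's method.

```python
# Toy grammar (an assumption, not taken from the paper):
#   expr -> term (('+' | '-') term)*
#   term -> atom (('^' | '_') atom)*
#   atom -> SYMBOL | '(' expr ')'
# Each decision is made by looking at exactly one upcoming token.

def to_latex(tokens):
    """Convert a flat token list into a LaTeX math string."""
    pos = 0

    def peek():
        # Return the next token without consuming it, or None at end of input.
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        # Consume one token, optionally checking that it matches `expected`.
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def expr():
        out = term()
        while peek() in ("+", "-"):
            op = take()
            out += f" {op} " + term()
        return out

    def term():
        out = atom()
        while peek() in ("^", "_"):
            op = take()
            # Wrap the script in braces so multi-character scripts stay grouped.
            out += f"{op}{{{atom()}}}"
        return out

    def atom():
        if peek() == "(":
            take("(")
            inner = expr()
            take(")")
            return f"({inner})"
        tok = take()
        if tok in ("+", "-", "^", "_", ")"):
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return f"${result}$"


print(to_latex(["x", "^", "2", "+", "y", "_", "i"]))
```

A real system would derive the superscript and subscript markers from the spatial layout of the recognized characters; here they are assumed to be supplied by an earlier stage.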