Release Time: 2019-03-12
Indexed by: Conference Paper
Date of Publication: 2006-08-16
Included Journals: CPCI-S, EI
Volume: 4113
Page Number: 274-279
Abstract: An OCR system is presented for understanding mathematical formulas in binary images of printed documents. The system uses a novel component-labeling algorithm to extract local maximum components from the image, and uses these components to locate the mathematical formulas. A character recognition algorithm based on neural networks is then applied. To segment merged characters in the image, a novel segmentation algorithm based on a modified SOM (self-organizing map) neural network is introduced into the system. Using an LL(1) grammar, the system converts the recognition results into a LaTeX file.
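The abstract does not give the grammar itself, so the following is only a minimal sketch of how an LL(1) parser might turn a sequence of recognized symbols into LaTeX. The toy grammar, the token format (a flat list of strings), and the function name `to_latex` are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical toy grammar (an assumption, not the paper's grammar):
#   Expr -> Term (('+' | '-') Term)*
#   Term -> Atom ('/' Atom)*
#   Atom -> Base (('^' | '_') Base)?
#   Base -> SYMBOL | '(' Expr ')'
# Each choice is decided by looking at exactly one token, which is what
# makes the grammar LL(1).

RESERVED = {"+", "-", "/", "^", "_", "(", ")"}


def to_latex(tokens):
    """Parse a flat list of symbol strings and return a LaTeX string."""
    pos = 0

    def peek():
        # One token of lookahead, or None at end of input.
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def expr():
        out = term()
        while peek() in ("+", "-"):
            op = take()
            out += f" {op} " + term()
        return out

    def term():
        out = atom()
        while peek() == "/":
            take("/")
            denominator = atom()
            out = "\\frac{" + out + "}{" + denominator + "}"
        return out

    def atom():
        out = base()
        if peek() in ("^", "_"):
            op = take()
            out += op + "{" + base() + "}"
        return out

    def base():
        if peek() == "(":
            take("(")
            inner = expr()
            take(")")
            return "(" + inner + ")"
        tok = take()
        if tok in RESERVED:
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


if __name__ == "__main__":
    print(to_latex(["x", "^", "2", "+", "a", "/", "b"]))
```

In a real system the tokens would presumably come from the character recognizer together with spatial relations (superscript, subscript, fraction bar); here those relations are assumed to be already encoded as the `^`, `_`, and `/` tokens.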