Automatic part-of-speech tagging for Oromo language using Maximum Entropy Markov Model (MEMM)

Indexed by:Journal article

Date of Publication:2014-07-01

Journal:Journal of Information and Computational Science

Included Journals:EI, Scopus

Volume:11

Issue:10

Page Number:3319-3334

ISSN No.:1548-7741

Abstract:Part-of-speech (POS) tagging is a fundamental task in natural language processing and computational linguistics, and it has to be addressed for every natural language. In this paper, we present experimental results on one of the state-of-the-art probabilistic models for sequence classification, the Maximum Entropy Markov Model (MEMM), applied to tagging the Oromo language. The model assigns a part-of-speech tag to each word or token of a sentence, taking many features and contexts into account. The MEMM was found to be the best way to estimate the word classes of Oromo text. Experiments were conducted on a manually annotated Oromo corpus of 452 sentences (6094 words in total). The results show that the approach performs well, with an accuracy of 93.01% under tenfold cross-validation. These results suggest that the MEMM has some advantages over Hidden Markov Models for sequence tagging, since it offers greater freedom in choosing features to represent observations for POS tagging of the Oromo language. 1548-7741/Copyright © 2014 Binary Information Press.
