张立勇

Personal Information

Associate Professor

Master's Supervisor

Gender: Male

Alma mater: Dalian University of Technology

Degree: Master's

Affiliation: School of Control Science and Engineering

Email: zhly@dlut.edu.cn


Publications


Research on a Chinese Word Segmentation Method for Ambiguous Fields Based on an Improved BP Network


Publication date: 2007-01-01

Journal: Journal of Dalian University of Technology

Issue: 1

Pages: 131-135

ISSN: 1000-8608

Abstract: In text mining, automatic Chinese word segmentation is a difficult problem. Chinese is written continuously, with no spaces between words, and ambiguous fields are especially hard to segment. To address this, the grammatical phenomena underlying typical ambiguities are summarized, and a code library covering the different parts of speech is built. On this basis, the words in ambiguous fields that follow particular grammatical rules are assigned codes and converted into input vectors that a neural network can accept. The samples are then used for training, and the grammatical rules are acquired through the self-learning of an improved BP neural network. After extensive training of the BP network, the algorithm reaches a training accuracy of 93.13% and a test accuracy of 92.50% on the segmentation of ambiguous fields.
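The pipeline described in the abstract (encode the parts of speech in an ambiguous field as an input vector, then train a BP network on labeled samples) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `POS_CODES` table, the one-hot encoding, the toy `SAMPLES`, the network size, and the learning rate are all hypothetical. The sketch also uses plain backpropagation, since the abstract does not say what the paper's improvement to BP consists of.

```python
import math
import random

# Hypothetical part-of-speech code library (illustrative, not the paper's).
POS_CODES = {"n": 0, "v": 1, "a": 2, "d": 3, "p": 4}

# Toy samples: POS tags of three words in an ambiguous field, with a label
# standing for one of two candidate segmentations (0 or 1). Invented data.
SAMPLES = [
    (("n", "v", "n"), 1),
    (("v", "n", "n"), 0),
    (("a", "n", "v"), 1),
    (("d", "v", "n"), 0),
]


def encode(pos_tags):
    """Concatenate one one-hot code per POS tag into a single input vector."""
    vec = []
    for tag in pos_tags:
        one_hot = [0.0] * len(POS_CODES)
        one_hot[POS_CODES[tag]] = 1.0
        vec.extend(one_hot)
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BPNetwork:
    """One hidden layer, one sigmoid output, trained by plain backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = x + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        xb, hb = x + [1.0], h + [1.0]
        # Error terms for squared error with sigmoid activations.
        delta_out = (y - target) * y * (1.0 - y)
        delta_hidden = [delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                        for j in range(len(h))]
        for j in range(len(hb)):
            self.w2[j] -= lr * delta_out * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * delta_hidden[j] * xb[i]
        return 0.5 * (y - target) ** 2


def train(samples, epochs=2000, lr=0.5, n_hidden=4):
    """Train on (pos_tags, label) pairs; return the network and per-epoch loss."""
    data = [(encode(tags), float(label)) for tags, label in samples]
    net = BPNetwork(len(data[0][0]), n_hidden)
    history = []
    for _ in range(epochs):
        history.append(sum(net.train_step(x, t, lr) for x, t in data))
    return net, history
```

Calling `train(SAMPLES)` returns the trained network together with the summed squared error for each epoch, and `net.forward(encode(tags))[1]` gives the output for a new field, which can be thresholded at 0.5 to pick a segmentation. The accuracy figures reported in the abstract come from the paper's own data and network and cannot be reproduced from this toy setup.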

Remarks: newly added retrospective data