Automatic word segmentation for Chinese classics of tea based on tree-pruning

Indexed by:Conference paper

Date of Publication:2009-11-30

Included Journals:EI, CPCI-S, Scopus

Volume:1

Page Number:438-+

Key Words:classics of tea; segmentation; tree-pruning

Abstract:Automatic word segmentation is vital for the reading, comprehension, and translation of classics. However, the large number of special terms, allusions, and proper names in the classics makes word segmentation difficult. Taking classics of tea as the subject of research, a method is proposed that uses likelihood ratio statistics to select two-character, three-character, and multi-character word candidates, and then segments classics of tea automatically with a tree-pruning algorithm. The computational complexity of the tree-pruning algorithm is O(LN), where L is the number of Chinese characters in the longest word. Experiments show that the method gives better word-segmentation results.
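
The abstract does not spell out the tree-pruning procedure, so the following is only a minimal sketch of one way a segmenter could reach O(LN) work, not the authors' algorithm. It assumes N is the length of the text in characters (the abstract does not define N), that each candidate word already carries a score such as a likelihood ratio value, and that "pruning" means keeping only the best-scoring partial segmentation ending at each position. The function name `segment`, the `scores` dictionary, and the sample scores are all hypothetical.

```python
def segment(text, scores, max_len):
    """Segment text using candidate words of at most max_len characters.

    scores maps each candidate word to a positive score (hypothetical values
    standing in for likelihood ratio statistics). Single characters are always
    allowed as a fallback with score 0.
    """
    n = len(text)
    # best[i]: best total score of any segmentation of text[:i]
    best = [float("-inf")] * (n + 1)
    # back[i]: start index of the last word in that best segmentation
    back = [0] * (n + 1)
    best[0] = 0.0

    for end in range(1, n + 1):
        # Only the last max_len characters can form a word ending here,
        # so each position considers at most max_len branches.
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            word = text[start:end]
            if length == 1:
                gain = 0.0
            elif word in scores:
                gain = scores[word]
            else:
                continue  # not a candidate word: this branch is dropped
            total = best[start] + gain
            if total > best[end]:
                # Keep only the best branch ending at this position.
                best[end] = total
                back[end] = start

    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    words.reverse()
    return words


# Hypothetical candidate scores, for illustration only.
example_scores = {"陆羽": 4.0, "茶经": 5.0}
print(segment("陆羽著茶经", example_scores, 2))  # ['陆羽', '著', '茶经']
```

Because each of the N positions examines at most L candidate lengths, the number of dictionary lookups is bounded by L times N, which matches the order of growth quoted in the abstract under the assumptions above.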
