Indexed by:Journal article
Date of Publication:2010-02-01
Journal:ICIC Express Letters
Included Journals:EI, Scopus
Volume:4
Issue:1
Page Number:189-196
ISSN No.:1881-803X
Abstract:In this paper, a bootstrapping method for automatically extracting foreign person names (F-names) from Chinese web pages is presented. Starting from a small set of F-name characters, the method iteratively extracts text segments containing F-name characters from the web. A context cue-word set is used to improve the efficiency of extraction. Statistical information is used to recognize F-names in these text segments. A confidence measure is assigned to each possible F-name candidate, and a segmentation digraph is constructed for selecting F-names from the candidates. The method is used to extract 10,000 F-names from the Internet, and the recognition precision is about 87%. The results show that the proposed method is effective. ICIC International © 2010.
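
The bootstrapping loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy corpus, the cue words, and the acceptance rule (the fraction of already-known name characters in a candidate, with a threshold of 0.5) are all assumptions made for illustration. The sketch also leaves out the confidence measure and the segmentation digraph mentioned in the abstract.

```python
import re

def candidates(text, cue_words, delimiters="，。、 "):
    """Yield the character run that follows each cue word, up to a delimiter.

    Hypothetical helper: the paper's actual segment extraction is not specified here.
    """
    cue = "|".join(map(re.escape, cue_words))
    stop = re.escape(delimiters)
    for match in re.finditer(f"(?:{cue})([^{stop}]+)", text):
        yield match.group(1)

def bootstrap(corpus, seed_chars, cue_words, threshold=0.5, max_iters=10):
    """Grow a set of name characters and accepted names from a small seed set.

    Assumed acceptance rule: a candidate is kept when at least `threshold`
    of its characters are already known name characters.
    """
    name_chars = set(seed_chars)
    names = set()
    for _ in range(max_iters):
        grew = False
        for text in corpus:
            for cand in candidates(text, cue_words):
                known = sum(ch in name_chars for ch in cand) / len(cand)
                if known >= threshold and cand not in names:
                    names.add(cand)
                    name_chars.update(cand)  # newly seen characters feed later rounds
                    grew = True
        if not grew:  # stop once a full pass adds nothing new
            break
    return names, name_chars

# Toy corpus (illustrative only, not data from the paper).
corpus = [
    "总统奥巴马，发表讲话",
    "先生马克，到访",
    "总统克林顿，出席",
    "总统办公室，开会",
]
names, name_chars = bootstrap(corpus, seed_chars={"奥", "巴"}, cue_words=["总统", "先生"])
print(sorted(names))
```

In this toy run the seed characters admit one name, whose new character then admits a second name, which is the bootstrapping effect. Candidates with too few known characters stay rejected under the assumed threshold.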