Indexed by:Conference paper
Date of Publication:2015-01-01
Included Journals:CPCI-S
Page Number:448-452
Key Words:text structural information; information extraction; conditional random fields; XML expression
Abstract:Faced with a tremendous volume of semi-structured XML and unstructured free text, network information retrieval has become a major research hotspot for processing these data more efficiently, precisely, and uniformly. Many traditional IR methods ignore text semantics, and their labeling results usually have only one level and lack contextual expression. This work therefore studies structure extraction from free text and its conversion to XML format, and provides a CRF-based algorithm, SIECRF. Experimental results are analyzed, showing the algorithm's efficiency in extracting text structure and its good application prospects.