孟军

Personal Information

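Professor

Doctoral Supervisor

Master's Supervisor

Gender: Female

Alma Mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Computer Science and Technology

Disciplines: Computer Application Technology; Computer Software and Theory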

Publications

Dimension reduction of latent semantic indexing extracting from local feature space

Paper Type: Journal Article

Publication Date: 2008-06-01

Journal: Journal of Computational Information Systems

Indexed by: EI

Volume: 4

Issue: 3

Pages: 915-922

ISSN: 1553-9105

Abstract: Latent Semantic Indexing (LSI) is a successful information retrieval technique that attempts to uncover the latent semantics implied by a query or a document by representing it in a dimension-reduced space. It is not, however, an optimal representation for text classification. When applied to the whole training set, LSI degrades classification performance, because this completely unsupervised method concentrates on representation and ignores class discrimination. We propose an improved method, Local Feature Latent Semantic Indexing (LFLSI), which considers the local features of each word that forms a dimension of a text. It clarifies the meaning of each word in a specific text, so that the most discriminative basis vectors can be selected iteratively from the training data. kNN and SVM are used for training and classification. Experiments on the Reuters-21578 dataset indicate that the method classifies considerably better than traditional methods while using a more representative and effective reduced dimension.
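To make the baseline in the abstract concrete, the following is a minimal sketch of plain, global LSI followed by kNN classification. It is not the LFLSI method, whose details are not given on this page. The function names (`fit_lsi`, `project`, `knn_predict`), the six-word vocabulary, and the toy documents are all hypothetical and invented for illustration; they are not taken from the paper or from Reuters-21578.

```python
import numpy as np

def fit_lsi(term_doc, k):
    """Truncated SVD of a term-by-document matrix; returns the top-k left singular vectors."""
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]

def project(term_doc, U_k):
    """Project documents (columns) into the k-dimensional latent space; one row per document."""
    return (U_k.T @ term_doc).T

def knn_predict(train_vecs, train_labels, query_vecs, n_neighbors=1):
    """Classify each query by majority vote among its cosine-nearest training documents."""
    def normalize(m):
        norms = np.linalg.norm(m, axis=1, keepdims=True)
        return m / np.where(norms == 0, 1.0, norms)

    sims = normalize(query_vecs) @ normalize(train_vecs).T
    nearest = np.argsort(-sims, axis=1)[:, :n_neighbors]
    preds = []
    for row in nearest:
        votes = {}
        for idx in row:
            votes[train_labels[idx]] = votes.get(train_labels[idx], 0) + 1
        preds.append(max(votes, key=votes.get))
    return preds

# Hypothetical vocabulary (rows): oil, crude, barrel, wheat, grain, harvest.
# Columns are toy training documents with raw term counts.
train = np.array([
    [1, 0, 0, 0],   # oil
    [1, 1, 0, 0],   # crude
    [0, 1, 0, 0],   # barrel
    [0, 0, 1, 0],   # wheat
    [0, 0, 1, 1],   # grain
    [0, 0, 0, 1],   # harvest
], dtype=float)
train_labels = ["energy", "energy", "agri", "agri"]

# Two unseen toy documents: "oil barrel" and "wheat harvest".
queries = np.array([
    [1, 0],
    [0, 0],
    [1, 0],
    [0, 1],
    [0, 0],
    [0, 1],
], dtype=float)

U_k = fit_lsi(train, k=2)              # unsupervised: class labels are never used here
train_vecs = project(train, U_k)
query_vecs = project(queries, U_k)
predictions = knn_predict(train_vecs, train_labels, query_vecs)
print(predictions)
```

Note that `fit_lsi` never sees `train_labels`: the basis is chosen only to represent the data well, which is the limitation the abstract attributes to applying LSI to the whole training set.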