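Personal Information
Professor
Doctoral Supervisor
Master's Supervisor
Gender: Female
Alma Mater: Dalian University of Technology
Degree: Ph.D.
Affiliation: School of Computer Science and Technology
Disciplines: Computer Application Technology; Computer Software and Theory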
Dimension reduction of latent semantic indexing extracting from local feature space
Paper Type: Journal Article
Publication Date: 2008-06-01
Journal: Journal of Computational Information Systems
Indexed by: EI
Volume: 4
Issue: 3
Pages: 915-922
ISSN: 1553-9105
Abstract: Latent Semantic Indexing (LSI) is a successful information retrieval technique that explores the latent semantics implied by a query or a document by representing them in a dimension-reduced space. It is not, however, an optimal representation for text classification. Applied to the whole training set, it degrades classification performance, because this completely unsupervised method ignores class discrimination and concentrates only on representation. We propose an improved method, Local Feature Latent Semantic Indexing (LFLSI), which considers the local features of each word that represents a dimension of a text. It clarifies the meaning of each word in a specific text, so that the most discriminative basis vectors can be selected iteratively from the training data. We adopt kNN and SVM for training and classification. Experiments on the Reuters-21578 dataset indicate that the method substantially outperforms traditional methods on classification while using a more representative and effective reduced dimension.
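To make the baseline described in the abstract concrete, the sketch below shows plain (global, unsupervised) LSI followed by kNN classification. It is not the LFLSI method itself, whose details the abstract does not give. The six-term vocabulary, the toy count vectors, the labels, and every function name are hypothetical, and power iteration with deflation stands in for a proper SVD routine so that the example has no dependencies.

```python
import math
from collections import Counter

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(rows, v):
    return [dot(row, v) for row in rows]

def top_right_singular_vectors(X, k, iters=200):
    """Approximate the k leading right singular vectors of X (docs x terms)
    by power iteration on X^T X, deflating against vectors already found."""
    Xt = [list(col) for col in zip(*X)]
    basis = []
    for j in range(k):
        # Deterministic, strictly positive starting vector (an arbitrary choice).
        v = [1.0 + 0.1 * ((i + j) % 3) for i in range(len(Xt))]
        for _ in range(iters):
            w = matvec(Xt, matvec(X, v))          # w = X^T (X v)
            for b in basis:                       # keep w orthogonal to earlier vectors
                proj = dot(w, b)
                w = [wi - proj * bi for wi, bi in zip(w, b)]
            norm = math.sqrt(dot(w, w))
            v = [wi / norm for wi in w]
        basis.append(v)
    return basis

def project(doc, basis):
    """Map a term-count vector into the reduced latent space."""
    return [dot(doc, b) for b in basis]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def knn_predict(query, points, labels, k=3):
    """Majority vote among the k training points most similar to the query."""
    ranked = sorted(range(len(points)), key=lambda i: cosine(query, points[i]), reverse=True)
    return Counter(labels[i] for i in ranked[:k]).most_common(1)[0][0]

# Hypothetical vocabulary: oil, crude, barrel, wheat, grain, harvest
train_docs = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [0, 0, 0, 3, 1, 1],
    [0, 0, 0, 1, 2, 0],
]
train_labels = ["energy", "energy", "grain", "grain"]

basis = top_right_singular_vectors(train_docs, k=2)      # 6 terms reduced to 2 dimensions
train_points = [project(d, basis) for d in train_docs]

def predict(doc):
    return knn_predict(project(doc, basis), train_points, train_labels)

print(predict([0, 1, 1, 0, 0, 0]))
print(predict([0, 0, 0, 0, 1, 1]))
```

Because the basis here is computed from all training documents without looking at `train_labels`, it illustrates the limitation the abstract points to: the reduced space is chosen for representation, not for separating classes.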