Tripartite-replicated softmax model for document representations

Indexed by:Conference Paper

Date of Publication:2017-07-13

Included Journals:EI

Volume:10390 LNCS

Page Number:109-121

Abstract:Text mining tasks based on machine learning require inputs to be represented as fixed-length vectors, and effective vector representations of words, phrases, sentences, and even documents can greatly improve performance on these tasks. Recently, distributed word representations based on neural networks have proven powerful in many tasks because they encode abundant semantic and linguistic information. However, learning document representations remains a great challenge because of the complex semantic structures found in different documents. To meet this challenge, we propose two novel tripartite graphical models for document representations that incorporate word representations into the Replicated Softmax model; we name them the Tripartite-Replicated Softmax model (TRPS) and the directed Tripartite-Replicated Softmax model (d-TRPS), respectively. We also introduce several optimization strategies for training the proposed models to learn better document representations. The proposed models capture linear relationships among words and latent semantic information within documents simultaneously, thus learning both linear and nonlinear document representations. We evaluate the learned document representations on a document classification task and a document retrieval task. Experimental results show that the representations learned by our models outperform those of state-of-the-art models on both tasks. © Springer International Publishing AG 2017.
