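Personal Information
Professor
Doctoral supervisor
Master's supervisor
Position: Director, Intelligent Computing Teaching and Research Section
Gender: Male
Alma mater: Jilin University
Degree: Ph.D.
Affiliation: School of Computer Science and Technology
Disciplines: Computer Application Technology; Computer Software and Theory
Office: Room A820, Innovation Park Building (创新园大厦)
Contact: 13304609362
Email: lucos@dlut.edu.cn
Publications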
Faculty profile: 姚念民
Title: Generating word and document matrix representations for document classification
Paper type: Journal article
Publication date: 2020-07-01
Journal: NEURAL COMPUTING & APPLICATIONS
Indexed in: SCIE
Volume: 32
Issue: 14
Pages: 10087-10108
ISSN: 0941-0643
Keywords: Document-level classification; Word matrix; Document matrix; Subwindows
Abstract: We present doc2matrix, an effective word and document matrix representation architecture based on a linear operation, which learns representations for document-level classification. Unlike the traditional vector representation, it represents each word or document as a matrix. Doc2matrix defines suitable subwindows as the scale of text, and generates a word matrix and a document matrix by stacking the information from these subwindows. The resulting document matrix not only contains more fine-grained semantic and syntactic information than the original representation but also introduces abundant two-dimensional features. Experiments on four document-level classification tasks demonstrate that the proposed architecture generates higher-quality word and document representations and outperforms previous models based on linear operations. Among the classifiers compared, a convolution-based classifier is the best suited to our document matrix. Furthermore, both theoretical and experimental analysis show that the convolution operation better captures the two-dimensional features of the proposed document matrix.
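Illustrative sketch (not from the paper): the abstract describes building a document matrix by stacking information from subwindows instead of collapsing a document into a single vector. The toy Python below shows that general idea only. It assumes fixed, equal-sized subwindows and plain averaging of pre-given word vectors; the names `document_matrix`, `mean_vector`, and `toy_embeddings` are hypothetical, and the actual doc2matrix model learns its representations rather than averaging them.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def mean_vector(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean of a list of vectors (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def document_matrix(tokens: List[str],
                    embeddings: Dict[str, Vector],
                    num_subwindows: int,
                    dim: int) -> Matrix:
    """Split a document into contiguous subwindows and stack one row per subwindow.

    Assumption: each row is the plain average of the word vectors in that
    subwindow. This is a stand-in for the learned representations in the paper.
    """
    rows: Matrix = []
    n = len(tokens)
    for k in range(num_subwindows):
        # Integer boundaries that cover the document evenly, left to right.
        start = k * n // num_subwindows
        end = (k + 1) * n // num_subwindows
        window = [embeddings[t] for t in tokens[start:end] if t in embeddings]
        rows.append(mean_vector(window, dim))
    return rows


# Hypothetical 2-dimensional word vectors, chosen by hand for the example.
toy_embeddings = {
    "good": [1.0, 0.0],
    "movie": [0.0, 1.0],
    "bad": [-1.0, 0.0],
    "ending": [0.0, -1.0],
}
doc = ["good", "movie", "bad", "ending"]

matrix = document_matrix(doc, toy_embeddings, num_subwindows=2, dim=2)
flat = mean_vector([toy_embeddings[t] for t in doc], dim=2)

print(matrix)  # [[0.5, 0.5], [-0.5, -0.5]]
print(flat)    # [0.0, 0.0]
```

In this toy case the single averaged vector cancels to zero, while the two-row matrix still separates the first half of the document from the second. This kind of two-dimensional structure is what a convolutional classifier could exploit, under the assumptions stated above.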