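Personal Information
Associate Professor
Doctoral Supervisor
Master's Supervisor
Main positions: None
Gender: Male
Alma mater: Dalian University of Technology
Degree: Doctorate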
Affiliation: School of Software; International School of Information and Software
Discipline: Software Engineering
Office: Room 417, Comprehensive Building, School of Software
Contact: liangzhao@dlut.edu.cn
A Hybrid Method for Incomplete Data Imputation
Paper type: Conference paper
Publication date: 2015-08-24
Indexed in: EI, CPCI-S, Scopus
Page range: 1725-1730
Keywords: missing values; data imputation; stacked auto-encoder; incremental clustering
Abstract: With the explosive growth of data volume, research on data quality and data usability has drawn extensive attention. In this work, we focus on one aspect of data usability, incomplete data imputation, and present a novel missing value imputation method using a stacked auto-encoder and incremental clustering (SAICI). Specifically, SAICI's functionality rests on four pillars: (i) a distinctive value assigned to impute missing values initially, (ii) a stacked auto-encoder (SAE) applied to locate principal features, (iii) a new incremental clustering method used to partition the incomplete data set, and (iv) the weighted values of the top k% nearest neighbors used to refill the missing values. Most importantly, stages (ii) through (iv) iterate until the convergence condition is satisfied. Experimental results demonstrate that the proposed scheme not only imputes missing data values effectively but also has better time performance. Moreover, this work is suitable for distributed data processing frameworks and can be applied to the imputation of incomplete big data.
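
The four stages named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the abstract does not specify the initial "distinctive value", the SAE architecture, the incremental clustering rule, or the neighbor weighting. The sketch therefore assumes a column-mean initial fill, swaps in a PCA projection (via SVD) in place of the stacked auto-encoder, uses a simple one-pass "leader" clustering with a distance threshold, and weights neighbors by inverse distance. All function names and parameters (`impute`, `radius`, `k_percent`, and so on) are hypothetical.

```python
import numpy as np

def pca_features(X, n_components):
    # Assumption: a linear PCA projection stands in for the stacked auto-encoder.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def leader_clustering(Z, radius):
    # Assumption: one-pass incremental clustering. Each point joins the nearest
    # centroid if it lies within `radius`, otherwise it starts a new cluster.
    centroids, counts = [], []
    labels = np.empty(len(Z), dtype=int)
    for i, z in enumerate(Z):
        if centroids:
            d = np.linalg.norm(np.asarray(centroids) - z, axis=1)
            j = int(d.argmin())
        if centroids and d[j] <= radius:
            counts[j] += 1
            centroids[j] = centroids[j] + (z - centroids[j]) / counts[j]
            labels[i] = j
        else:
            centroids.append(z.astype(float).copy())
            counts.append(1)
            labels[i] = len(centroids) - 1
    return labels

def impute(X, n_components=2, radius=1.0, k_percent=20.0, max_iter=20, tol=1e-4):
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    # (i) Initial fill. Assumption: column means of the observed entries.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    for _ in range(max_iter):
        prev = filled.copy()
        Z = pca_features(filled, n_components)      # (ii) feature extraction
        labels = leader_clustering(Z, radius)       # (iii) partition the data
        for i in np.nonzero(missing.any(axis=1))[0]:
            # (iv) refill from the top k% nearest neighbors in the same cluster
            peers = np.nonzero(labels == labels[i])[0]
            peers = peers[peers != i]
            if peers.size == 0:
                continue  # no neighbors in this cluster: keep the current value
            d = np.linalg.norm(Z[peers] - Z[i], axis=1)
            k = max(1, int(np.ceil(peers.size * k_percent / 100.0)))
            nearest = np.argsort(d)[:k]
            w = 1.0 / (d[nearest] + 1e-8)           # inverse-distance weights
            neighbor_vals = prev[peers[nearest]][:, missing[i]]
            filled[i, missing[i]] = (w @ neighbor_vals) / w.sum()
        # Stop once the imputed entries no longer change noticeably.
        if np.abs(filled - prev)[missing].max(initial=0.0) < tol:
            break
    return filled
```

A call such as `impute(X, radius=8.0, k_percent=100.0)` on an array with `np.nan` marking missing entries returns a completed copy and leaves observed entries unchanged. The `radius` threshold has to be tuned to the scale of the data; if it is too small, rows with missing values end up in singleton clusters and retain their initial fill.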