Lai Xiaochen (赖晓晨)

Personal Information

Professor

Master's Supervisor

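Gender: Male

Alma Mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Software; International School of Information and Software

Email: laixiaochen@dlut.edu.cn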

Publications

A Hierarchical Missing Value Imputation Method by Correlation-Based K-Nearest Neighbors

Paper Type: Conference paper

Publication Date: 2020-01-01

Indexed In: EI

Volume: 1037

Pages: 486-496

Keywords: Missing value imputation; K-nearest neighbors; Correlation analysis; Incomplete record division

Abstract: Missing values are common in real-world datasets, and many methods have been proposed to handle them. Among these methods, KNN imputation has attracted considerable attention because it is simple to implement, easy to understand, and relatively accurate. However, it ignores how correlations between attributes affect the similarity of records. In this paper, we take these correlations into account when selecting the nearest neighbors, and impute the incomplete records successively according to the number of missing values in each record. During imputation, the correlation coefficients are first calculated from the complete records and then updated using the union of the complete records and the imputed records. As data utilization improves, the correlations between attributes therefore become more accurate, which makes the selected nearest neighbors more appropriate. Experimental results demonstrate that the improved method is more effective in missing value imputation.
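The procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `hierarchical_corr_knn_impute`, the parameter `k`, the use of absolute Pearson correlations as distance weights, the fewest-missing-first ordering, and the mean of the neighbors as the imputed value are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def hierarchical_corr_knn_impute(X, k=3):
    """Fill NaNs level by level (hypothetical helper, not from the paper).

    Assumptions: records with fewer missing values are imputed first, and
    the distance to a candidate neighbor is weighted by the absolute Pearson
    correlation between each observed attribute and the attribute to fill.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)
    n_missing = missing.sum(axis=1)
    n_attrs = X.shape[1]

    # Reference pool: starts with the complete records only.
    pool = list(np.flatnonzero(n_missing == 0))
    if len(pool) < 2:
        raise ValueError("need at least two complete records to start")

    # Levels 1 .. n_attrs-1; records with every attribute missing stay NaN.
    for level in range(1, n_attrs):
        rows = np.flatnonzero(n_missing == level)
        if rows.size == 0:
            continue

        # Correlations are recomputed from complete + already imputed records.
        ref = filled[pool]
        corr = np.nan_to_num(np.abs(np.corrcoef(ref, rowvar=False)))

        for i in rows:
            obs = ~missing[i]
            for j in np.flatnonzero(missing[i]):
                w = corr[j, obs]
                if w.sum() == 0:
                    w = np.ones_like(w)  # fall back to plain Euclidean distance
                diff = ref[:, obs] - X[i, obs]
                dist = np.sqrt((w * diff ** 2).sum(axis=1))
                nearest = np.argsort(dist)[:k]
                filled[i, j] = ref[nearest, j].mean()

        # Records imputed at this level join the pool for the next level.
        pool.extend(rows.tolist())

    return filled


data = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, 12.0],
    [2.0, np.nan, 6.0],     # one missing value: imputed first
    [np.nan, np.nan, 9.0],  # two missing values: imputed second
])
imputed = hierarchical_corr_knn_impute(data, k=1)
print(imputed)
```

Within a level, all records are imputed against the same reference pool, so the correlation matrix is refreshed once per level rather than once per record. Whether the original method updates it more often is not stated in the abstract.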