Chen Zhikui (陈志奎), Dalian University of Technology


Incomplete Big Data Clustering Algorithm Using Feature Selection and Partial Distance

Release Time: 2019-03-11

Indexed by: Conference Paper

Date of Publication: 2014-11-28

Included Journals: CPCI-S, EI, Scopus

Page Number: 263-266

Key Words: big data; incomplete data clustering; feature subset selection; cluster analysis

Abstract: Incomplete data clustering plays an important role in big data analysis and processing. Existing algorithms for clustering incomplete high-dimensional big data perform poorly in both efficiency and effectiveness. This paper proposes a clustering algorithm for incomplete high-dimensional big data based on feature selection and a partial distance strategy. First, a hierarchical clustering-based feature subset selection algorithm is designed to reduce the dimensionality of the data set. Next, a parallel k-means algorithm based on partial distance is derived to cluster the data restricted to the feature subset selected in the first step. Experimental results demonstrate that the proposed algorithm achieves better clustering accuracy than existing algorithms and takes significantly less time to cluster high-dimensional big data.
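The second step of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it is serial rather than parallel, it omits the feature selection step, and it assumes the commonly used form of partial distance, in which the squared differences over observed features are rescaled by (total features / observed features). The names `partial_distance` and `kmeans_partial` are hypothetical, missing values are assumed to be encoded as NaN, and centroids are assumed to be initialised from fully observed samples.

```python
import math
import random


def partial_distance(x, c):
    """Squared partial distance between sample x (may contain NaN) and centroid c.

    Only observed coordinates of x contribute; the sum is rescaled by
    d / (number of observed coordinates). This form is an assumption here.
    """
    observed = [(xi - ci) ** 2 for xi, ci in zip(x, c) if not math.isnan(xi)]
    if not observed:
        return math.inf  # no observed feature: the distance is undefined
    return len(x) / len(observed) * sum(observed)


def kmeans_partial(data, k, iters=50, seed=0):
    """Serial k-means that tolerates missing values via partial distance."""
    rng = random.Random(seed)
    d = len(data[0])
    # Assumption of this sketch: at least k samples are fully observed,
    # and the initial centroids are drawn from them.
    complete = [row for row in data if not any(math.isnan(v) for v in row)]
    centroids = [list(row) for row in rng.sample(complete, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid under partial distance.
        new_labels = [
            min(range(k), key=lambda j: partial_distance(x, centroids[j]))
            for x in data
        ]
        # Update step: per-feature mean over the observed values only.
        for j in range(k):
            members = [x for x, lab in zip(data, new_labels) if lab == j]
            for f in range(d):
                vals = [x[f] for x in members if not math.isnan(x[f])]
                if vals:
                    centroids[j][f] = sum(vals) / len(vals)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids
```

Because each sample's distance depends only on its own observed features and the current centroids, the assignment step can be split across workers, which is presumably what makes a parallel variant natural; the sketch keeps everything in one process for clarity.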

