Indexed by:Journal Article
Date of Publication:2017-12-01
Journal:IEEE SYSTEMS JOURNAL
Included Journals:SCIE
Volume:11
Issue:4
Page Number:2160-2169
ISSN No.:1932-8184
Key Words:Feature learning; incomplete multimedia data; possibilistic C-means (PCM) algorithm; tensor distance; vector outer product
Abstract:Clustering is a commonly used technique for multimedia organization, analysis, and retrieval. However, most multimedia clustering methods struggle to capture the high-order nonlinear correlations among multimodal features, resulting in low clustering accuracy. Furthermore, they cannot extract features from multimedia data with missing values, so they fail to cluster the incomplete multimedia data that are widespread in practical applications. In this paper, we propose a high-order possibilistic C-means algorithm (HOPCM) for clustering incomplete multimedia data. HOPCM extends the basic autoencoder model to learn features from multimedia data with missing values. Furthermore, HOPCM uses the tensor distance rather than the Euclidean distance as the distance metric, in order to capture the unknown high-dimensional distribution of multimedia data as fully as possible. Extensive experiments are carried out on three representative multimedia data sets: NUS-WIDE, CUAVE, and SNAE. The results demonstrate that HOPCM achieves significantly better clustering performance than many existing algorithms. More importantly, HOPCM is able to cluster both high-quality multimedia data and incomplete multimedia data effectively, while other existing methods can only cluster high-quality multimedia data.
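
Illustrative sketch (not taken from the paper): the abstract's idea of replacing the Euclidean distance with a tensor distance inside possibilistic C-means can be pictured roughly as follows. This assumes the commonly used tensor distance, in which a Gaussian weight on the positions of tensor elements couples neighbouring entries, and the standard PCM membership update. The function names, the `sigma` and `eta` parameters, and the toy data are hypothetical; the autoencoder-based feature learning for missing values is not reproduced here, and the paper's exact formulation may differ.

```python
import numpy as np

def tensor_metric(shape, sigma=1.0):
    """Metric matrix G with G[l, m] = exp(-||p_l - p_m||^2 / (2 sigma^2)) / (2 pi sigma^2),
    where p_l is the index position of element l in a tensor of the given shape."""
    # Coordinates of every element, ordered consistently with ravel() (C order).
    coords = np.indices(shape).reshape(len(shape), -1).T.astype(float)
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)

def tensor_distance_sq(x, y, G):
    """Squared tensor distance (x - y)^T G (x - y) on the flattened tensors."""
    d = (x - y).ravel()
    return float(d @ G @ d)

def pcm_memberships(X, centers, G, eta, m=2.0):
    """Standard PCM typicality update u = 1 / (1 + (d^2 / eta)^(1/(m-1))),
    with d^2 computed by the tensor distance instead of the Euclidean one.
    X: (n, *shape), centers: (c, *shape), eta: (c,) per-cluster scale (assumed given)."""
    n, c = X.shape[0], centers.shape[0]
    D = X.reshape(n, 1, -1) - centers.reshape(1, c, -1)   # (n, c, p) differences
    d2 = np.einsum("ncp,pq,ncq->nc", D, G, D)             # quadratic form per pair
    d2 = np.maximum(d2, 0.0)                              # guard against rounding
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# Toy usage: two 2x2 "objects" and a single cluster centre.
G = tensor_metric((2, 2))
x = np.arange(4.0).reshape(2, 2)
X = np.stack([x, x + 1.0])
centers = np.stack([x])
U = pcm_memberships(X, centers, G, eta=np.array([1.0]))
```

Unlike the Euclidean case, where `G` would be the identity matrix, the off-diagonal entries of `G` let differences at nearby tensor positions reinforce each other, which is the intuition behind using it for high-order data.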