Corresponding author: Chen, ZK (reprint author), Dalian Univ Technol, Sch Software Technol, Dalian 116620, Peoples R China.
Co-authors: Chen, Zhikui; Ning, Zhaolong; Min, Geyong; Hu, Yueming
Journal: JOURNAL OF COMPUTATIONAL SCIENCE
Keywords: Multimodal integration optimization; Deep learning; Internet of things; Image classification; Stacked autoencoders
Abstract: Large numbers of physical devices and distributed information systems deployed in the Internet of Things (IoT) are collecting ever more images. Recognizing these images poses an important optimization challenge in the IoT. In particular, most existing methods adopt only shallow learning models to integrate the various features of an image for recognition, which limits classification accuracy. In this paper, we propose a multimodal deep learning (MMDL) approach that integrates heterogeneous visual features by treating each type of visual feature as one modality, for image recognition optimization in the IoT. In our scheme, a stacked autoencoder extracts the high-level abstraction of each modality. Furthermore, we design a back-propagation algorithm with shared weights learned from a softmax layer to update the pretrained parameters of the multiple stacked autoencoders simultaneously. The integration is performed by concatenating the last hidden layers of the multimodal stacked autoencoder architecture. Extensive experiments are carried out on three datasets, i.e., Animal with Attributes, NUS-WIDE-OBJECT, and Handwritten Numerals, in comparison with SVM, SAE, and AMMSS. The results demonstrate that our scheme achieves superior performance in integrating heterogeneous visual features for image recognition optimization in the IoT. (C) 2016 Elsevier B.V. All rights reserved.
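The integration step described in the abstract (one encoder stack per modality, last hidden layers concatenated, softmax on top) can be sketched roughly as follows. This is only an illustrative forward pass, not the authors' implementation: the layer sizes, sigmoid activations, random weights, and all function names (`init_encoder`, `encode`, `mmdl_forward`) are assumptions, and neither autoencoder pretraining nor the shared-weight back-propagation fine-tuning is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_encoder(rng, sizes):
    """Random (W, b) pairs for one encoder stack; sizes = [input, hidden1, ...].
    In the paper these would come from autoencoder pretraining (not shown)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def encode(x, layers):
    """Pass x through every encoder layer and return the last hidden layer."""
    h = x
    for W, b in layers:
        h = sigmoid(h @ W + b)
    return h

def mmdl_forward(inputs, encoders, W_out, b_out):
    """Encode each modality, concatenate the top hidden layers, apply softmax."""
    top_layers = [encode(x, enc) for x, enc in zip(inputs, encoders)]
    joint = np.concatenate(top_layers, axis=1)
    return softmax(joint @ W_out + b_out), joint

rng = np.random.default_rng(0)
n_samples, n_classes = 4, 3
feature_dims = [20, 12]   # hypothetical sizes of two visual feature types
hidden = [8, 5]           # hypothetical hidden-layer sizes per stack

encoders = [init_encoder(rng, [d] + hidden) for d in feature_dims]
joint_dim = hidden[-1] * len(feature_dims)
W_out = rng.normal(0.0, 0.1, (joint_dim, n_classes))
b_out = np.zeros(n_classes)

# Random stand-ins for the per-modality feature vectors of four images.
inputs = [rng.normal(size=(n_samples, d)) for d in feature_dims]
probs, joint = mmdl_forward(inputs, encoders, W_out, b_out)
print(probs.shape, joint.shape)
```

Because the softmax layer sits on the concatenated representation, a gradient computed at its output would flow back into every modality's encoder at once, which is one plausible reading of how a single fine-tuning pass could update all stacks simultaneously.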