Indexed by: Journal Papers
Date of Publication: 2020-11-01
Journal: PATTERN RECOGNITION
Included Journals: SCIE
Volume: 107
ISSN No.: 0031-3203
Key Words: Cross-modal hashing; Semantic gap; Semantic augmentation; Cross-modal retrieval
Abstract: Hashing methods for cross-modal retrieval have drawn increasing research interest and have been widely studied in recent years due to the explosive growth of multimedia big data. However, a significant but largely ignored phenomenon is that, in most cases, there is a large gap between the results of the two retrieval directions in cross-modal hashing. For example, the results of Text-to-Image frequently outperform those of Image-to-Text by a large margin. In this paper, we propose a strategy named semantic augmentation to improve and balance the results of cross-modal hashing. An intermediate semantic space is constructed to re-align feature representations embedded with weak semantic information. Through this intermediate semantic space, the semantic information of visual features can be further augmented before the features are sent to cross-modal hashing algorithms. Extensive experiments are carried out on four datasets with seven state-of-the-art cross-modal hashing methods. Compared with the results without semantic augmentation, the Image-to-Text results of these methods are improved considerably, which demonstrates the effectiveness of the proposed semantic augmentation strategy in bridging the gap between the two directions of cross-modal retrieval. Additional experiments are conducted on real-valued, semi-supervised, semi-paired, partial-paired, and unpaired cross-modal retrieval methods; the results further indicate the effectiveness of our strategy in improving the performance of cross-modal retrieval. (C) 2020 Elsevier Ltd. All rights reserved.
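The abstract does not specify how the intermediate semantic space is built; a minimal sketch of the general idea, assuming a simple linear ridge-regression mapping from visual features into a semantic (e.g., label-embedding) space, might look as follows. All function names and the toy data here are hypothetical illustrations, not the paper's actual method:

```python
import numpy as np

def learn_projection(X, S, reg=1.0):
    """Learn a linear map W from visual features X (n x d) into a
    semantic space S (n x k) via ridge regression:
    W = (X^T X + reg*I)^{-1} X^T S."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ S)

def augment(X, W):
    """Project visual features into the semantic space; the projected
    features would then be fed to a cross-modal hashing algorithm."""
    return X @ W

# Toy example with synthetic data standing in for image features (X)
# and semantic label embeddings (S).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))                     # 100 images, 32-d features
true_W = rng.normal(size=(32, 8))                  # hidden ground-truth mapping
S = X @ true_W + 0.01 * rng.normal(size=(100, 8))  # 8-d semantic embeddings
W = learn_projection(X, S, reg=0.1)
X_aug = augment(X, W)                              # semantically augmented features
```

The intent is only to show the shape of such a pipeline: visual features pass through a learned semantic projection before hashing, so that both modalities carry comparable semantic information.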