Language-aware weak supervision for salient object detection

Indexed by:Journal Papers

Date of Publication:2019-12-01

Journal:PATTERN RECOGNITION

Included Journals:EI, SCIE

Volume:96

ISSN No.:0031-3203

Key Words:Saliency detection; Natural language; Textual-visual pairwise; Self-supervision

Abstract:Natural language processing has achieved remarkable performance on numerous tasks, but the potential of textual information has not been fully explored in visual saliency detection. In this paper, we learn to detect salient objects from natural language by addressing two essential issues: finding the semantic content that matches the corresponding linguistic concept, and recovering fine details without any pixel-level annotations. We first propose the Feature Matching Network (FMN) to explore the internal relation between the linguistic concept and the visual image in the semantic space. The FMN simultaneously establishes the textual-visual pairwise affinities and generates a language-aware coarse saliency map. To refine the coarse map, the Recurrent Fine-tune Network (RFN) is proposed to progressively enhance the prediction through self-supervision. Our approach leverages only the caption to provide cues about the salient object, yet generates a finely detailed foreground map at 72 FPS without any post-processing. Extensive experiments demonstrate that our method takes full advantage of the textual information in natural language for saliency detection, and performs favorably against state-of-the-art approaches on most existing datasets. (C) 2019 Elsevier Ltd. All rights reserved.
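
The abstract describes a two-stage design: the FMN aligns caption and image features in a shared semantic space to produce a language-aware coarse saliency map, and the RFN then refines that map recurrently through self-supervision. The sketch below illustrates that pipeline in PyTorch; all module names, layer sizes, and the cosine-affinity matching step are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch of the two-stage pipeline from the abstract:
# FMN (text-image matching -> coarse map) followed by RFN (recurrent refinement).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureMatchingNet(nn.Module):
    """Project image and caption features into a shared semantic space and
    derive a language-aware coarse saliency map from their pairwise affinities."""
    def __init__(self, img_channels=512, text_dim=300, embed_dim=256):
        super().__init__()
        self.img_proj = nn.Conv2d(img_channels, embed_dim, kernel_size=1)
        self.txt_proj = nn.Linear(text_dim, embed_dim)

    def forward(self, img_feat, txt_feat):
        # img_feat: (B, C, H, W) backbone features; txt_feat: (B, D) caption embedding
        v = F.normalize(self.img_proj(img_feat), dim=1)   # (B, E, H, W)
        t = F.normalize(self.txt_proj(txt_feat), dim=1)   # (B, E)
        # Cosine similarity between the caption and every spatial location.
        affinity = torch.einsum('behw,be->bhw', v, t)      # (B, H, W)
        return torch.sigmoid(affinity).unsqueeze(1)        # coarse map (B, 1, H, W)

class RecurrentFinetuneNet(nn.Module):
    """Refine the coarse map over several recurrent steps; each step is
    conditioned on the previous prediction (self-supervised refinement)."""
    def __init__(self, img_channels=512, steps=3):
        super().__init__()
        self.steps = steps
        self.refine = nn.Sequential(
            nn.Conv2d(img_channels + 1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1))

    def forward(self, img_feat, coarse):
        pred = coarse
        for _ in range(self.steps):
            pred = torch.sigmoid(self.refine(torch.cat([img_feat, pred], dim=1)))
        return pred

if __name__ == "__main__":
    img_feat = torch.randn(2, 512, 28, 28)   # placeholder convolutional features
    txt_feat = torch.randn(2, 300)           # placeholder caption embedding
    coarse = FeatureMatchingNet()(img_feat, txt_feat)
    fine = RecurrentFinetuneNet()(img_feat, coarse)
    print(coarse.shape, fine.shape)          # both torch.Size([2, 1, 28, 28])
```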
