Xinchen Ye's personal homepage: http://faculty.dlut.edu.cn/yexinchen/zh_CN/index.htm
Deep Joint Depth Estimation and Color Correction from Monocular Underwater
Images based on Unsupervised Adaptation Networks
Xinchen Ye1, Zheng Li1, Baoli Sun1, Zhihui Wang*1, Rui Xu1, Haojie Li1, Xin Fan1
1 Dalian University of Technology
* Corresponding author
Paper: Underwater DE CC.pdf
Abstract
Degraded visibility and geometric distortion typically make underwater vision more intractable than open-air vision, which impedes the development of underwater machine vision and robotic perception. This paper therefore addresses the problem of joint depth estimation and color correction from monocular underwater images, aiming to exploit the mutual benefits between these two related tasks from a multi-task perspective. Our core ideas lie in a new deep learning architecture. Because effective underwater training data are scarce, and models trained on synthetic data generalize poorly to real-world underwater images, we consider the problem from a novel perspective of style-level and feature-level adaptation, and propose an unsupervised adaptation network to deal with the joint learning problem. Specifically, a style adaptation network (SAN) is first proposed to learn a style-level transformation that adapts in-air images to the style of the underwater domain. We then formulate a task network (TN) to jointly estimate scene depth and correct the color of a single underwater image by learning domain-invariant representations. The whole framework can be trained end-to-end in an adversarial learning manner. Extensive experiments are conducted under air-to-water domain adaptation settings, and show that the proposed method performs favorably against state-of-the-art methods on both the depth estimation and color correction tasks.
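To make the two-stage pipeline concrete, below is a minimal PyTorch sketch of the data flow: SAN renders an in-air image into the underwater style, and TN then predicts depth and a color-corrected image from the rendered result. All module names, layer choices, and shapes here are illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn

class StyleAdaptationNet(nn.Module):
    """SAN (sketch): translates an in-air RGB image into the underwater style."""
    def __init__(self):
        super().__init__()
        # hypothetical encoder-decoder body; the paper's SAN is GAN-based
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )
    def forward(self, in_air_rgb):
        return self.body(in_air_rgb)

class TaskNet(nn.Module):
    """TN (sketch): predicts scene depth and a color-corrected image
    from a single underwater image through a shared backbone."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.depth_head = nn.Conv2d(64, 1, 3, padding=1)  # scene depth
        self.color_head = nn.Conv2d(64, 3, 3, padding=1)  # corrected color
    def forward(self, underwater_rgb):
        feat = self.backbone(underwater_rgb)
        return self.depth_head(feat), self.color_head(feat)

# data flow: render a synthetic underwater image from an in-air one, then
# run the task network on it (the in-air ground-truth depth can supervise
# the depth head, since style rendering preserves scene geometry)
san, tn = StyleAdaptationNet(), TaskNet()
in_air = torch.randn(1, 3, 256, 256)
synthetic_underwater = san(in_air)
depth, corrected = tn(synthetic_underwater)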
Method
Fig. 1. Our overall network architecture for joint depth estimation and color correction. It consists of two networks: the style adaptation network (SAN) and the task network (TN). Domain adaptation is applied in both networks to adapt the low-level appearance style and the high-level feature representation simultaneously. The whole framework can be trained end-to-end in an adversarial learning manner. For easy observation, depth maps are colored red for farther distances and blue for closer ones.
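The feature-level part of this adaptation can be sketched as a standard adversarial scheme: a domain discriminator tries to tell whether task-network features come from synthetic or real underwater images, while the feature extractor is trained to fool it, yielding domain-invariant representations. The discriminator design and loss form below are assumptions for illustration, not the paper's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainDiscriminator(nn.Module):
    """Judges whether a feature map comes from the synthetic (label 0)
    or the real (label 1) underwater domain; the architecture is a guess."""
    def __init__(self, feat_channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_channels, 64, 3, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),  # per-location domain logit
        )
    def forward(self, feat):
        return self.net(feat)

def adversarial_da_losses(disc, feat_synthetic, feat_real):
    """GAN-style losses: the discriminator separates the two domains,
    while the feature extractor pushes synthetic features toward the
    'real' label, encouraging domain-invariant representations."""
    logits_syn = disc(feat_synthetic.detach())   # detach: update disc only
    logits_real = disc(feat_real.detach())
    d_loss = (F.binary_cross_entropy_with_logits(logits_syn, torch.zeros_like(logits_syn))
              + F.binary_cross_entropy_with_logits(logits_real, torch.ones_like(logits_real)))
    logits_fool = disc(feat_synthetic)           # gradients flow to the features
    g_loss = F.binary_cross_entropy_with_logits(logits_fool, torch.ones_like(logits_fool))
    return d_loss, g_loss

# usage with dummy feature maps from the two domains
disc = DomainDiscriminator()
d_loss, g_loss = adversarial_da_losses(disc, torch.randn(2, 64, 64, 64), torch.randn(2, 64, 64, 64))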
Fig. 2. Our stacked conditional GAN architecture for joint depth estimation and color correction. Gc is sketched only briefly, and the domain adaptation modules on both generators are omitted to save space.
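A hedged sketch of the stacked structure: a first generator maps the underwater image to depth, and a second generator (Gc in the figure) produces the color-corrected image conditioned on the input and the predicted depth. The architectures and the concatenation-based conditioning are assumptions; the conditional discriminators and domain adaptation modules are omitted, as in the figure.

import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class DepthGenerator(nn.Module):
    """Stage 1 (sketch): underwater image -> depth map."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(conv_block(3, 64), nn.Conv2d(64, 1, 3, padding=1))
    def forward(self, x):
        return self.body(x)

class ColorGenerator(nn.Module):
    """Stage 2, Gc (sketch): image + predicted depth -> corrected color.
    Conditioning by channel concatenation is an assumption."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(conv_block(4, 64), nn.Conv2d(64, 3, 3, padding=1))
    def forward(self, x, depth):
        return self.body(torch.cat([x, depth], dim=1))

g_d, g_c = DepthGenerator(), ColorGenerator()
x = torch.randn(1, 3, 256, 256)
depth = g_d(x)                 # first-stage prediction
corrected = g_c(x, depth)      # second stage conditioned on depth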
Results
Fig. 3. Training datasets: from left to right, real underwater images captured under three different lighting conditions (bluish, greenish, shallow), and the in-air NYU dataset.
Fig. 4. Evaluation of SAN in terms of training behavior and rendering results. (a) and (b) show the training loss curves of WaterGAN and our method, respectively. (c-e) present three visual examples of the rendering results. From top to bottom: WaterGAN, ours, and the real underwater images.
Fig. 5. Qualitative comparison on real underwater images under different module configurations: (a) DESN or CCSN separately; (b) DESN + CCSN; (c) DESN with DA + CCSN; (d) DESN with DA + CCSN with DA; (e) DESN + CCSN with DA. Depth maps are colored red for farther distances and blue for closer ones. Red rectangles direct readers to the specific areas where the differences between configurations are most visible.
Citation
Xinchen Ye, Zheng Li, Baoli Sun, Zhihui Wang*, Rui Xu, Haojie Li, and Xin Fan, "Deep Joint Depth Estimation and Color Correction from Monocular Underwater Images based on Unsupervised Adaptation Networks," IEEE Transactions on Circuits and Systems for Video Technology, 30(11): 3995-4008, 2020.
@article{Ye2020tcsvt,
  author  = {Xinchen Ye and Zheng Li and Baoli Sun and Zhihui Wang and Rui Xu and Haojie Li and Xin Fan},
  title   = {Deep Joint Depth Estimation and Color Correction from Monocular Underwater Images based on Unsupervised Adaptation Networks},
  journal = {IEEE Transactions on Circuits and Systems for Video Technology},
  year    = {2020},
  volume  = {30},
  number  = {11},
  pages   = {3995--4008},
}