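  • Xinchen Ye (Associate Professor)

    Personal homepage: http://faculty.dlut.edu.cn/yexinchen/zh_CN/index.htm

  • Associate Professor, Doctoral Supervisor, Master's Supervisor
  • Main positions: IEEE member, ACM member
  • Other positions: IEEE member, ACM member, CCF (China Computer Federation) member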
StereoDistill (TIP21)



Unsupervised Monocular Depth Estimation via Recursive Stereo Distillation



Xinchen Ye, Xin Fan*, Mingliang Zhang, Wei Zhong, Rui Xu

Dalian University of Technology

* Corresponding author

Code: https://github.com/goldenwoman/Recursive_Stereo_Disill

Paper: bare_jrnl.pdf



Abstract

Existing unsupervised monocular depth estimation methods use stereo image pairs, rather than ground-truth depth maps, as supervision for predicting scene depth. Because the network must accept a monocular input at test time, these methods cannot fully exploit the stereo information during training, which leads to unsatisfactory depth estimation performance. We therefore propose a novel architecture consisting of a monocular network (Mono-Net), which infers depth maps from monocular inputs, and a stereo network (Stereo-Net), which further exploits the stereo information by taking stereo pairs as input. During training, the more sophisticated Stereo-Net guides the learning of Mono-Net and improves its performance without changing its network structure or increasing its computational burden. At test time, only the lightweight Mono-Net is used, so monocular depth estimation with superior performance and fast runtime can be achieved. The framework raises two core questions: 1) how to design the Stereo-Net so that it accurately estimates depth maps by fully exploiting the stereo information; and 2) how to use the Stereo-Net to improve the performance of Mono-Net. To this end, we propose a recursive estimation and refinement strategy for Stereo-Net to boost its depth estimation performance. Meanwhile, a multi-space knowledge distillation scheme is designed to help Mono-Net absorb the knowledge and expertise of Stereo-Net in a multi-scale fashion. Experiments demonstrate that our method achieves superior monocular depth estimation performance compared with other state-of-the-art methods.
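To make the idea of stereo supervision concrete, the following is a minimal sketch of the standard formulation for rectified stereo pairs, and not necessarily the exact loss used in the paper. A predicted disparity warps one row of the right image toward the left view, and the L1 reconstruction error serves as the training signal in place of ground-truth depth. The function names, the 1-D row simplification, and the focal length and baseline values in the usage note are illustrative assumptions.

```python
def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity in pixels to metric depth: depth = f * B / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


def warp_row(right_row, disparity_row):
    """Reconstruct a left-image row by sampling the right row at x - d(x).

    Uses linear interpolation between the two nearest pixels; sample
    positions are clamped to the row bounds.
    """
    width = len(right_row)
    warped = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), width - 1.0)
        x0 = int(src)                # src >= 0, so int() acts as floor
        x1 = min(x0 + 1, width - 1)
        w = src - x0                 # interpolation weight for x1
        warped.append((1.0 - w) * right_row[x0] + w * right_row[x1])
    return warped


def photometric_l1(left_row, right_row, disparity_row):
    """Mean absolute error between the left row and the warped right row."""
    warped = warp_row(right_row, disparity_row)
    return sum(abs(a - b) for a, b in zip(left_row, warped)) / len(left_row)


if __name__ == "__main__":
    right = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    left = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # right shifted by 2 px
    print(photometric_l1(left, right, [2.0] * 8))      # correct disparity
    print(photometric_l1(left, right, [0.0] * 8))      # wrong disparity
    print(disparity_to_depth(2.0, focal_px=720.0, baseline_m=0.54))
```

In this toy case the correct disparity of 2 pixels reconstructs the left row exactly, while a disparity of 0 leaves a nonzero error, which is the signal that drives training without depth labels. The focal length and baseline passed to `disparity_to_depth` are placeholder values chosen only for illustration.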


Method

framework.png

Figure 1. Network overview. The framework includes a Mono-Net M and a Stereo-Net S, where M is a lightweight network that takes a single image as input, while S takes a stereo image pair as input. S contains a recursive estimation strategy and a feature-driven adaptive refinement module to further improve the accuracy of depth estimation. The multi-space knowledge distillation scheme is designed to distill knowledge from S into M.


twonet.png

Figure 2. Structures of Mono-Net M, Stereo-Net S, and the multi-space knowledge distillation scheme. We propose to cascade the feature-driven adaptive refinement module with S and update the network weights in a recursive manner. The multi-space knowledge distillation scheme is designed to transfer knowledge from S to M in terms of output space, feature space, and long-range dependencies, based on multi-scale feature extraction.
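The three distillation spaces named in the caption can be sketched as a combined loss. This is a hedged illustration under assumed loss forms, not the formulation from the paper: an L1 term on predicted depth (output space), a mean squared error on features (feature space), and a match between pairwise cosine-similarity matrices (a common stand-in for long-range dependencies), summed over scales. All function names, the loss weights, and the choice of cosine affinity are assumptions.

```python
import math


def l1_output_loss(student_depth, teacher_depth):
    """Output space: mean absolute difference between depth predictions."""
    pairs = zip(student_depth, teacher_depth)
    return sum(abs(s - t) for s, t in pairs) / len(student_depth)


def l2_feature_loss(student_feat, teacher_feat):
    """Feature space: mean squared error over per-position feature vectors."""
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student_feat, teacher_feat):
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    return total / count


def affinity(feat):
    """Cosine similarity between every pair of positions (N x N matrix)."""
    # A zero-norm vector falls back to norm 1.0 to avoid division by zero.
    norms = [math.sqrt(sum(v * v for v in vec)) or 1.0 for vec in feat]
    return [[sum(a * b for a, b in zip(fi, fj)) / (ni * nj)
             for fj, nj in zip(feat, norms)]
            for fi, ni in zip(feat, norms)]


def affinity_loss(student_feat, teacher_feat):
    """Long-range dependencies: match pairwise affinities of S and M."""
    a_s, a_t = affinity(student_feat), affinity(teacher_feat)
    n = len(a_s)
    diff = sum((a_s[i][j] - a_t[i][j]) ** 2
               for i in range(n) for j in range(n))
    return diff / (n * n)


def distillation_loss(student, teacher, w_out=1.0, w_feat=1.0, w_aff=1.0):
    """Sum the three terms over scales.

    `student` and `teacher` are lists of (depth, feat) tuples, one per scale.
    """
    total = 0.0
    for (s_depth, s_feat), (t_depth, t_feat) in zip(student, teacher):
        total += w_out * l1_output_loss(s_depth, t_depth)
        total += w_feat * l2_feature_loss(s_feat, t_feat)
        total += w_aff * affinity_loss(s_feat, t_feat)
    return total


if __name__ == "__main__":
    teacher = [([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])]   # one scale
    student = [([1.5, 2.0], [[1.0, 0.0], [1.0, 0.0]])]
    print(distillation_loss(student, teacher))
```

In an actual training loop the Stereo-Net outputs would be treated as fixed targets, so that gradients from this loss update only the Mono-Net. The affinity term compares relationships between positions rather than the features themselves, which is why it can capture structure that the per-position feature term misses.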



Results


method_c.png

Figure 3. Qualitative comparison of different methods on the KITTI dataset. (a) Color image, (b) Ground truth, (c) Xu et al., (d) Godard et al., (e) Zhan et al., (f) Pilzer et al., (g) Wong et al., (h) Ours.





Citation

Xinchen Ye, Xin Fan*, Mingliang Zhang, Wei Zhong, Rui Xu, Unsupervised Monocular Depth Estimation via Recursive Stereo Distillation, IEEE Trans. Image Processing, accepted, 2021.

 

@article{Ye2021tip,
  author = {Xinchen Ye and Xin Fan and Mingliang Zhang and Wei Zhong and Rui Xu},
  title = {Unsupervised Monocular Depth Estimation via Recursive Stereo Distillation},
  journal = {IEEE Trans. Image Processing (TIP)},
  year = {2021},
  note = {Accepted}
}







