
[CVPR 2024] From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon, Munchurl Kim
Korea Advanced Institute of Science and Technology (KAIST), South Korea

This repository is the official PyTorch implementation of "From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior".

Data Preparation

Download the dynamic object masks for the Cityscapes dataset from the DynamicDepth GitHub repository.

Pretrained Models

Pretrained models can be downloaded from here.

Place the model checkpoints (mono_encoder.pth and mono_depth.pth) in /checkpoints/MonoViT/.
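A minimal shell sketch of the expected checkpoint layout, assuming the path is resolved relative to the repository root:

```shell
# Create the directory the test config is assumed to read checkpoints from
mkdir -p checkpoints/MonoViT
# After downloading, the layout should be:
#   checkpoints/MonoViT/mono_encoder.pth
#   checkpoints/MonoViT/mono_depth.pth
```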

Depth estimation results on Cityscapes.

WIR: Whole Image Region / DOR: Dynamic Object Region

| Method       | Input size | Region | abs rel | a1    |
|--------------|------------|--------|---------|-------|
| Ours-MonoViT | 192 × 640  | WIR    | 0.087   | 0.921 |
| Ours-MonoViT | 192 × 640  | DOR    | 0.099   | 0.910 |
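For reference, abs rel and a1 are the standard monocular depth metrics: the mean absolute relative error, and the fraction of pixels whose depth ratio max(gt/pred, pred/gt) falls below the 1.25 threshold. A minimal NumPy sketch of how they are typically computed (the function name and sample values are illustrative, not from this repository):

```python
import numpy as np

def compute_depth_metrics(gt, pred):
    """Standard monocular depth metrics over valid ground-truth pixels."""
    # abs rel: mean absolute relative error
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    # a1: fraction of pixels with max(gt/pred, pred/gt) < 1.25
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = np.mean(thresh < 1.25)
    return abs_rel, a1

# Toy example with three depth values (meters)
gt = np.array([2.0, 4.0, 10.0])
pred = np.array([2.2, 3.8, 10.0])
abs_rel, a1 = compute_depth_metrics(gt, pred)  # abs_rel = 0.05, a1 = 1.0
```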

Precomputed results (disparity maps and error maps) can be downloaded from here.

Test

```shell
# Test pretrained MonoViT with our proposed method on the Cityscapes dataset
python test.py --config ./configs/test_monovit_cs.yaml
```

Citation

If you find this work useful, please consider citing:

@inproceedings{moon2024ground,
  title={From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior},
  author={Moon, Jaeho and Bello, Juan Luis Gonzalez and Kwon, Byeongjun and Kim, Munchurl},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10519--10529},
  year={2024}
}

Reference

Monodepth2: https://github.com/nianticlabs/monodepth2

MonoViT: https://github.com/zxcqlf/MonoViT

DynamicDepth: https://github.com/AutoAILab/DynamicDepth

Acknowledgement

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT): No. 2021-0-00087 (Development of high-quality conversion technology for SD/HD low-quality media) and No. RS2022-00144444 (Deep Learning Based Visual Representational Learning and Rendering of Static and Dynamic Scenes).