Luginov, A. and Shahzad, M. ORCID: https://orcid.org/0009-0002-9394-343X (2025) NimbleD: enhancing self-supervised monocular depth estimation with pseudo-labels and large-scale video pre-training. In: Del Bue, A., Canton, C., Pont-Tuset, J. and Tommasi, T. (eds.) Computer Vision – ECCV 2024 Workshops Proceedings, Part II. Lecture Notes in Computer Science (15624). Springer, Cham, pp. 235-251. ISBN 9783031923869 doi: 10.1007/978-3-031-92387-6_18
Abstract/Summary
We introduce NimbleD, an efficient self-supervised monocular depth estimation learning framework that incorporates supervision from pseudo-labels generated by a large vision model. This framework does not require camera intrinsics, enabling large-scale pre-training on publicly available videos. Our straightforward yet effective learning strategy significantly enhances the performance of fast and lightweight models without introducing any overhead, allowing them to achieve performance comparable to state-of-the-art self-supervised monocular depth estimation models. This advancement is particularly beneficial for virtual and augmented reality applications requiring low-latency inference. The source code, model weights, and acknowledgments are available at https://github.com/xapaxca/nimbled.
| Item Type | Book or Report Section |
| URI | https://centaur.reading.ac.uk/id/eprint/118622 |
| Identification Number/DOI | 10.1007/978-3-031-92387-6_18 |
| Refereed | Yes |
| Divisions | Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science |
| Publisher | Springer |