Monocular self-supervised depth estimation with a low-cost sensor is the mainstream solution for obtaining dense depth maps for robots and autonomous driving. In this paper, following the philosophy "less is more" (i.e., focusing only on the valid pixels in sparse LiDAR), we propose a novel framework, Efficient Sparse Depth (EffisDepth), for predicting dense depth. The Sparse Feature Extractor (SFE) embedded in the proposed framework handles sparse LiDAR efficiently by forming sparse tensors. The Slender Group Block (SGB), the main building block of the SFE, extracts features from sparse tensors through a two-branch structure. Extensive experiments show that our method achieves state-of-the-art performance on the KITTI benchmark and demonstrate the effectiveness of each proposed component and of the self-supervised learning framework.
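To make the idea of "focusing only on valid pixels" concrete, the following is a minimal sketch of how a sparse depth map could be turned into a coordinate-list (COO) sparse representation. It is an illustration only, not the SFE itself: the function name `to_sparse_coo`, the toy depth map, and the convention that a value of 0 marks a pixel without a LiDAR return are all assumptions made for this example.

```python
from typing import List, Tuple


def to_sparse_coo(
    depth: List[List[float]],
) -> Tuple[List[Tuple[int, int]], List[float]]:
    """Keep only valid pixels as (row, col) coordinates plus depth values."""
    coords: List[Tuple[int, int]] = []
    values: List[float] = []
    for row, line in enumerate(depth):
        for col, d in enumerate(line):
            # Assumption: 0.0 marks a pixel with no LiDAR measurement.
            if d > 0.0:
                coords.append((row, col))
                values.append(d)
    return coords, values


# Hypothetical 3x4 sparse depth map (metres); most pixels are empty.
depth_map = [
    [0.0, 0.0, 12.5, 0.0],
    [0.0, 7.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 30.2],
]

coords, values = to_sparse_coo(depth_map)
# Fraction of pixels that actually carry a measurement.
density = len(values) / (len(depth_map) * len(depth_map[0]))
print(coords, values, density)
```

Only the three valid pixels are stored, so any later computation on this representation scales with the number of measurements rather than with the full image size.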