Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries

Purdue University

CVPR 2025

[Teaser: depth estimation from noisy, defocused inputs]

Abstract

Extracting depth information from photon-limited, defocused images is challenging because depth from defocus (DfD) relies on accurate estimation of defocus blur, which is fundamentally sensitive to image noise. We present a novel approach to robustly measure object depths from photon-limited images along the defocused boundaries. It is based on a new image patch representation, Blurry-Edges, that explicitly stores and visualizes a rich set of low-level patch information, including boundaries, color, and smoothness. We develop a deep neural network architecture that predicts the Blurry-Edges representation from a pair of differently defocused images, from which depth can be analytically calculated using a novel DfD relation we derive. Our experiment shows that our method achieves the highest depth estimation accuracy on photon-limited images compared to a broad range of state-of-the-art DfD methods.

Representation

Blurry-Edges models each patch with a set of parameters, $\boldsymbol{\Psi} = \left( \{\boldsymbol{p}_i, \boldsymbol{\theta}_i, \boldsymbol{c}_i, \eta_i\}_{i=1}^{l}, \boldsymbol{c}_0\right)$. The tuple $(\boldsymbol{p}_i, \boldsymbol{\theta}_i, \boldsymbol{c}_i, \eta_i)$ parameterizes the $i$th wedge in the patch, with $\boldsymbol{p}_i = (x_i, y_i)$ representing the vertex, $\boldsymbol{\theta}_i = (\theta_{i1}, \theta_{i2})$ denoting the starting and ending angles, $\boldsymbol{c}_i$ indicating the RGB color, and $\eta_i$ recording the smoothness of the boundary. A wedge with a larger index occludes those with smaller indices. The vector $\boldsymbol{c}_0$ represents the RGB color of the background. This representation can model a wide variety of boundary structures and smoothness levels with only two wedges.
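To make the parameterization concrete, here is a minimal Python sketch of how a patch could be stored and rasterized. The class and function names, the patch size, and the hard (non-smooth) wedge indicator are our illustrative assumptions, not the paper's implementation; the actual method renders smooth boundaries controlled by $\eta_i$.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Wedge:
    p: tuple        # vertex (x_i, y_i)
    theta: tuple    # starting and ending angles (theta_i1, theta_i2)
    c: tuple        # RGB color c_i
    eta: float      # boundary smoothness eta_i (unused in this hard-edge sketch)

def render_patch(wedges, c0, size=21):
    """Rasterize a Blurry-Edges patch: start from the background color c0,
    then paint each wedge in index order so larger-index wedges end up in front."""
    ys, xs = np.mgrid[0:size, 0:size]
    img = np.broadcast_to(np.asarray(c0, float), (size, size, 3)).copy()
    for w in wedges:  # later wedges overwrite earlier ones (occlusion order)
        phi = np.arctan2(ys - w.p[1], xs - w.p[0])
        a0, a1 = w.theta
        # a pixel belongs to the wedge if its angle from the vertex lies on the
        # counterclockwise arc from a0 to a1 (with wrap-around)
        inside = (phi - a0) % (2 * np.pi) <= (a1 - a0) % (2 * np.pi)
        img[inside] = w.c
    return img
```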

Model architecture

The framework has two stages. The local stage, built from residual blocks, predicts a Blurry-Edges representation for each patch independently. The global stage, a Transformer encoder, refines the Blurry-Edges representations of all patches jointly. Finally, the framework combines the per-patch representations into a global boundary map, color map, and depth map.
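The two-stage design could be sketched in PyTorch roughly as follows. This is a schematic under stated assumptions, not the paper's configuration: we stack the two defocused RGB patches into 6 input channels, use 19 output parameters (two wedges at $2+2+3+1 = 8$ parameters each, plus 3 for the background color), and pick illustrative layer counts and widths.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class LocalStage(nn.Module):
    """Residual CNN predicting an initial Blurry-Edges parameter
    vector for each patch independently (illustrative sketch)."""
    def __init__(self, in_ch=6, dim=64, n_params=19):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, padding=1),
            *[ResBlock(dim) for _ in range(4)],
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dim, n_params))
    def forward(self, patch_pairs):  # (B*N, 6, h, w): two defocused RGB patches stacked
        return self.net(patch_pairs)

class GlobalStage(nn.Module):
    """Transformer encoder refining the per-patch parameters
    jointly across all patches (illustrative sketch)."""
    def __init__(self, n_params=19, dim=128, layers=4, heads=8):
        super().__init__()
        self.proj = nn.Linear(n_params, dim)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.out = nn.Linear(dim, n_params)
    def forward(self, params):       # (B, N_patches, n_params)
        return self.out(self.encoder(self.proj(params)))
```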

Comparison

[Figure: qualitative comparison of estimated depth maps against the ground truth]

Method         RMSE (cm)
Ours           1.922
Ours-W         2.962
Ours-PP        1.748
Tang et al.    2.838
Focal Track    2.898
DefocusNet     5.480
PhaseCam3D     5.984
DEReD          6.611
DFV-DFF        9.604

* The notations Ours, Ours-W, and Ours-PP refer, respectively, to the sparse depth maps, the dense depth maps produced directly from Blurry-Edges, and the dense depth maps generated from the sparse ones by a U-Net post-processing step. The numbers are the RMSE (cm) of the predicted depth values.
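For reference, the RMSE metric reported above can be computed as follows. This is a minimal sketch; the optional mask argument is our addition, meant to restrict evaluation to valid pixels (e.g., boundary pixels for the sparse depth maps).

```python
import numpy as np

def depth_rmse_cm(pred_cm, gt_cm, mask=None):
    """Root-mean-square error between predicted and ground-truth depth, in cm.
    For sparse maps, pass a boolean mask selecting the pixels with predictions."""
    if mask is None:
        mask = np.ones_like(gt_cm, dtype=bool)
    err = pred_cm[mask] - gt_cm[mask]
    return float(np.sqrt(np.mean(err ** 2)))
```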

Real-captured images

BibTeX

@misc{xu2025blurryedges,
  title={Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries},
  author={Wei Xu and Charles James Wagner and Junjie Luo and Qi Guo},
  year={2025},
  eprint={2503.23606},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2503.23606},
}
[Depth color bar for the visualized depth maps: 75 cm to 118 cm]