Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/48125

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Huang, Yiheng | - |
| dc.contributor.author | CHEN, Junhong | - |
| dc.contributor.author | Ning, Anqi | - |
| dc.contributor.author | Liang, Zhanhong | - |
| dc.contributor.author | MICHIELS, Nick | - |
| dc.contributor.author | CLAESEN, Luc | - |
| dc.contributor.author | Liu, Wenyin | - |
| dc.date.accessioned | 2026-01-15T13:16:59Z | - |
| dc.date.available | 2026-01-15T13:16:59Z | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-12-21T10:30:13Z | - |
| dc.identifier.citation | IEEE Robotics and Automation Letters, 11 (2), p. 2074-2081 | - |
| dc.identifier.uri | http://hdl.handle.net/1942/48125 | - |
| dc.description.abstract | Self-supervised monocular depth estimation has achieved notable success under daytime conditions. However, its performance deteriorates markedly at night due to low visibility and varying illumination: insufficient light creates textureless areas, and moving objects introduce blurry regions. To address this, we propose a self-supervised framework named DASP that leverages spatiotemporal priors for nighttime depth estimation. DASP consists of an adversarial branch for extracting spatiotemporal priors and a self-supervised branch for learning. In the adversarial branch, we design an adversarial network whose discriminator is composed of four spatiotemporal priors learning blocks (SPLB) that exploit daytime priors. Each SPLB contains a spatial-based temporal learning module (STLM), which uses orthogonal differencing to extract motion-related variations along the time axis, and an axial spatial learning module (ASLM), which adopts local asymmetric convolutions with global axial attention to capture multiscale structural information. By combining STLM and ASLM, our model acquires sufficient spatiotemporal features to restore textureless areas and estimate the blurry regions caused by dynamic objects. In the self-supervised branch, we propose a 3D consistency projection loss that bilaterally projects the target frame and source frame into a shared 3D space and uses the 3D discrepancy between the two projected frames as a loss to optimize 3D structural consistency and the daytime priors. Extensive experiments on the Oxford RobotCar and nuScenes datasets demonstrate that our approach achieves state-of-the-art performance for nighttime depth estimation. Ablation studies further validate the effectiveness of each component. | - |
| dc.description.sponsorship | The work of Chen Junhong was supported by China Scholarship Council under Grant 202208440309. This work was supported in part by the National Natural Science Foundation of China under Grant 91748107, in part by the Special Research Fund (BOF) of Hasselt University under Grant BOF23DOCBL11, and in part by the Guangdong Innovative Research Team Program under Grant 2014ZT05G157. | - |
| dc.language.iso | en | - |
| dc.publisher | IEEE | - |
| dc.subject.other | Deep Learning for Visual Perception | - |
| dc.subject.other | Deep Learning Methods | - |
| dc.subject.other | Semantic Scene Understanding | - |
| dc.title | DASP: Self-Supervised Nighttime Monocular Depth Estimation With Domain Adaptation of Spatiotemporal Priors | - |
| dc.type | Journal Contribution | - |
| dc.identifier.epage | 2081 | - |
| dc.identifier.issue | 2 | - |
| dc.identifier.spage | 2074 | - |
| dc.identifier.volume | 11 | - |
| local.format.pages | 8 | - |
| local.bibliographicCitation.jcat | A1 | - |
| local.publisher.place | New York | - |
| local.type.refereed | Refereed | - |
| local.type.specified | Article | - |
| dc.identifier.doi | 10.1109/LRA.2025.3644148 | - |
| dc.identifier.isi | WOS:001651966100007 | - |
| local.provider.type | CrossRef | - |
| local.uhasselt.international | yes | - |
| item.fulltext | With Fulltext | - |
| item.contributor | Huang, Yiheng | - |
| item.contributor | CHEN, Junhong | - |
| item.contributor | Ning, Anqi | - |
| item.contributor | Liang, Zhanhong | - |
| item.contributor | MICHIELS, Nick | - |
| item.contributor | CLAESEN, Luc | - |
| item.contributor | Liu, Wenyin | - |
| item.accessRights | Open Access | - |
| item.fullcitation | Huang, Yiheng; CHEN, Junhong; Ning, Anqi; Liang, Zhanhong; MICHIELS, Nick; CLAESEN, Luc & Liu, Wenyin (2025) DASP: Self-Supervised Nighttime Monocular Depth Estimation With Domain Adaptation of Spatiotemporal Priors. In: IEEE Robotics and Automation Letters, 11 (2), p. 2074-2081. | - |
| crisitem.journal.issn | 2377-3766 | - |
| crisitem.journal.eissn | 2377-3766 | - |
Appears in Collections: Research publications
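The 3D consistency projection loss described in the abstract lends itself to a brief illustration. The sketch below is a minimal PyTorch rendition of the general idea, not the authors' implementation: both frames are lifted into a shared 3D space via the inverse intrinsics `K_inv` and camera-to-shared-space poses `T_t` and `T_s` (all names here are hypothetical), and the mean L1 distance between the two projected point sets serves as the loss. The pointwise comparison assumes per-pixel correspondence between the two frames, a simplification of whatever matching the paper actually performs.

```python
import torch

def backproject(depth, K_inv, T):
    """Lift a depth map into a shared 3D frame.

    depth: (B, 1, H, W) predicted depth
    K_inv: (B, 3, 3) inverse camera intrinsics
    T:     (B, 4, 4) camera-to-shared-space pose
    returns (B, 3, H*W) 3D points expressed in the shared frame
    """
    B, _, H, W = depth.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype, device=depth.device),
        torch.arange(W, dtype=depth.dtype, device=depth.device),
        indexing="ij",
    )
    # Homogeneous pixel grid (1, 3, H*W), shared across the batch
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(1, 3, -1)
    rays = K_inv @ pix.expand(B, -1, -1)          # per-pixel viewing rays
    cam = rays * depth.reshape(B, 1, -1)          # scale rays by depth
    ones = torch.ones(B, 1, cam.shape[-1], dtype=cam.dtype, device=cam.device)
    cam_h = torch.cat([cam, ones], dim=1)         # homogeneous coordinates
    return (T @ cam_h)[:, :3]                     # transform into shared frame

def consistency_3d_loss(depth_t, depth_s, K_inv, T_t, T_s):
    """Mean L1 discrepancy between the two projected point sets."""
    pts_t = backproject(depth_t, K_inv, T_t)
    pts_s = backproject(depth_s, K_inv, T_s)
    return (pts_t - pts_s).abs().mean()
```

Under this reading, the loss is symmetric in the two frames: both depth predictions are constrained against the same shared-space geometry, which is what lets it optimize 3D structural consistency rather than only photometric agreement.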
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| DASP_Nighttime_Depth_Estimation_RAL__Version_final.pdf | Peer-reviewed author version | 9.15 MB | Adobe PDF |
| DASP_Self-Supervised_Nighttime_Monocular_Depth_Estimation_With_Domain_Adaptation_of_Spatiotemporal_Priors.pdf (Restricted Access) | Published version | 11.46 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.