Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/43644
Full metadata record
DC Field: Value
dc.contributor.author: Huang, Yiheng
dc.contributor.author: CHEN, Junhong
dc.contributor.author: MICHIELS, Nick
dc.contributor.author: Asim, Muhammad
dc.contributor.author: CLAESEN, Luc
dc.contributor.author: Liu, Wenyin
dc.date.accessioned: 2024-09-02T13:18:19Z
dc.date.available: 2024-09-02T13:18:19Z
dc.date.issued: 2024
dc.date.submitted: 2024-08-23T06:39:59Z
dc.identifier.citation: IEEE ROBOTICS AND AUTOMATION LETTERS, 9(10), pp. 8945-8952
dc.identifier.issn: 2377-3766
dc.identifier.uri: http://hdl.handle.net/1942/43644
dc.description.abstract: Due to the visual properties of reflection and refraction, RGB-D cameras cannot accurately capture the depth of transparent objects, leading to incomplete depth maps. To fill in the missing points, recent studies tend to explore new visual features and design complex networks to reconstruct the depth; however, these approaches tremendously increase computation, and the correlation of different visual features remains a problem. To this end, we propose an efficient depth completion network named DistillGrasp, which distills knowledge from the teacher branch to the student branch. Specifically, in the teacher branch, we design a position correlation block (PCB) that leverages RGB images as the query and key to search for the corresponding values, guiding the model to establish correct correspondences between the two features and transfer them to the transparent areas. For the student branch, we propose a consistent feature correlation module (CFCM) that retains the reliable regions of the RGB images and depth maps according to their consistency and adopts a CNN to capture the pairwise relationship for depth completion. To prevent the student branch from learning only regional features from the teacher branch, we devise a distillation loss that considers not only the distance loss but also object structure and edge information. Extensive experiments conducted on the ClearGrasp dataset demonstrate that our teacher network outperforms state-of-the-art methods in terms of accuracy and generalization, and that the student network achieves competitive results at a higher speed of 48 FPS. In addition, the significant improvement in a real-world robotic grasping system illustrates the effectiveness and robustness of the proposed system.
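The PCB described in the abstract is, in effect, cross-attention in which RGB features supply the query and key and depth features supply the value. Below is a minimal sketch under that reading; the class name, single-head design, and 1x1-convolution projections are illustrative assumptions, as the record does not include the authors' implementation.

```python
import torch
import torch.nn as nn

class PositionCorrelationBlock(nn.Module):
    """Illustrative single-head cross-attention: RGB features provide the
    query and key, depth features provide the value, so correspondences
    learned on the intact RGB image transfer to the unreliable depth stream."""
    def __init__(self, channels: int):
        super().__init__()
        self.to_q = nn.Conv2d(channels, channels, kernel_size=1)  # query from RGB
        self.to_k = nn.Conv2d(channels, channels, kernel_size=1)  # key from RGB
        self.to_v = nn.Conv2d(channels, channels, kernel_size=1)  # value from depth
        self.scale = channels ** -0.5

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb_feat.shape
        q = self.to_q(rgb_feat).flatten(2).transpose(1, 2)    # (B, HW, C)
        k = self.to_k(rgb_feat).flatten(2)                    # (B, C, HW)
        v = self.to_v(depth_feat).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)      # (B, HW, HW)
        return (attn @ v).transpose(1, 2).reshape(b, c, h, w)
```

Because the attention map is computed purely from RGB, a correspondence that holds in the RGB image can be applied to depth regions corrupted by reflection and refraction. The abstract similarly describes a distillation loss combining a distance term with object-structure and edge information; a hedged sketch of just the distance and edge terms follows (the structure term, e.g. an SSIM-style measure, is omitted, and all weights are assumed):

```python
import torch.nn.functional as F

def distillation_loss(student, teacher, w_dist=1.0, w_edge=0.5):
    """Illustrative distance + edge distillation loss on (B, 1, H, W) depth maps;
    the paper's full loss also includes an object-structure term."""
    # Distance term: pixel-wise L1 between student and teacher depth maps.
    l_dist = F.l1_loss(student, teacher)
    # Edge term: match finite-difference depth gradients so the student
    # reproduces object boundaries rather than only smooth depth values.
    l_edge = (F.l1_loss(student[..., :, 1:] - student[..., :, :-1],
                        teacher[..., :, 1:] - teacher[..., :, :-1])
              + F.l1_loss(student[..., 1:, :] - student[..., :-1, :],
                          teacher[..., 1:, :] - teacher[..., :-1, :]))
    return w_dist * l_dist + w_edge * l_edge
```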
dc.language.iso: en
dc.publisher: IEEE
dc.subject.other: Distillation learning
dc.subject.other: depth completion
dc.subject.other: transparent object grasping
dc.title: DistillGrasp: Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects
dc.type: Journal Contribution
dc.identifier.epage: 8952
dc.identifier.issue: 10
dc.identifier.spage: 8945
dc.identifier.volume: 9
local.bibliographicCitation.jcat: A1
local.publisher.place: New York
local.type.refereed: Refereed
local.type.specified: Article
dc.identifier.doi: 10.1109/lra.2024.3455849
dc.identifier.isi: 001313351400002
dc.identifier.eissn: 2377-3766
local.provider.type: Pdf
local.uhasselt.international: yes
item.fullcitation: Huang, Yiheng; CHEN, Junhong; MICHIELS, Nick; Asim, Muhammad; CLAESEN, Luc & Liu, Wenyin (2024) DistillGrasp: Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects. In: IEEE ROBOTICS AND AUTOMATION LETTERS, 9(10), pp. 8945-8952.
item.fulltext: With Fulltext
item.contributor: Huang, Yiheng
item.contributor: CHEN, Junhong
item.contributor: MICHIELS, Nick
item.contributor: Asim, Muhammad
item.contributor: CLAESEN, Luc
item.contributor: Liu, Wenyin
item.accessRights: Restricted Access
crisitem.journal.issn: 2377-3766
crisitem.journal.eissn: 2377-3766
Appears in Collections: Research publications
Files in This Item:
File: DistillGrasp_Integrating_Features_Correlation_With_Knowledge_Distillation_for_Depth_Completion_of_Transparent_Objects.pdf (Restricted Access)
Description: Published version
Size: 11.36 MB
Format: Adobe PDF