Exploring Model Architectures for Real-Time Lung Sound Event Detection

JACOBS, Michiel; Vuegen, Lode; Verresen, Tom; Schouterden, Marie; RUTTENS, David; Karsmakers, Peter

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/48197

Title:	Exploring Model Architectures for Real-Time Lung Sound Event Detection
Authors:	JACOBS, Michiel; Vuegen, Lode; Verresen, Tom; Schouterden, Marie; RUTTENS, David; Karsmakers, Peter
Advisors:	Karsmakers
Issue Date:	2025
Source:	ESANN 2025 - Proceedings 33rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning First Edition, p. 735-740
Series/Report:	Proceedings of the 33rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Series/Report no.:	33
Abstract:	Computerized detection of relevant lung sound events has the potential to assist physicians during auscultation and to monitor the severity of pulmonary diseases in ambulatory settings. In some cases, real-time detection of adventitious lung sounds is required to provide instant feedback to physicians, e.g. during autogenic drainage therapy. State-of-the-art solutions for this task leverage deep learning models, which vary significantly in complexity. For real-time applications on resource-constrained devices, such as stethoscope-integrated hardware, both detection accuracy and model complexity are important to consider. While most existing research focusses primarily on accuracy, this work evaluates both accuracy and computational complexity. The contributions of this work are threefold. First, the effect of using a full breathing cycle as input is studied to assess its impact on event detection performance. This approach introduces a computational cost due to the required segmentation process. Second, a transformer-based architecture is compared with two relatively simple convolutional models, each utilizing different input horizons. Evaluations are conducted on both public and in-house lung sound datasets. Third, recognizing that the event detection task aligns better with a multi-label setting than the commonly used multi-class setup, this study compares both approaches. We conclude that a multi-label output outperforms a multi-class approach, that inputs segmented per breathing cycle are preferred, and that the high complexity models have similar performance to the models with low complexity on unseen data.
Document URI:	http://hdl.handle.net/1942/48197
Link to publication/dataset:	https://www.esann.org/sites/default/files/proceedings/2025/ES2025-201.pdf
ISBN:	9782875870926
Category:	C1
Type:	Proceedings Paper
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
ES2025-201.pdf (Restricted Access)	Published version	1.98 MB	Adobe PDF
