Clustering multiply imputed multivariate high-dimensional longitudinal profiles

BRUCKERS, Liesbeth; MOLENBERGHS, Geert; DENDALE, Paul

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/24959

Title:	Clustering multiply imputed multivariate high-dimensional longitudinal profiles
Authors:	BRUCKERS, Liesbeth; MOLENBERGHS, Geert; DENDALE, Paul
Issue Date:	2017
Publisher:	WILEY
Source:	BIOMETRICAL JOURNAL, 59(5), p. 998-1015
Abstract:	In this paper, we propose a method to cluster multivariate functional data with missing observations. Analysis of functional data often involves dimension reduction techniques such as principal component analysis (PCA), which require complete data matrices. The data are therefore completed by means of multiple imputation, and each imputed data set is subsequently submitted to a cluster procedure. The final partition of the data, summarizing the partitions obtained for the imputed data sets, is obtained by means of ensemble clustering. The uncertainty in cluster membership due to missing data is characterized by the agreement between the members of the ensemble and by the fuzziness of the consensus clustering. The potential of the method is illustrated on heart failure (HF) data. Daily measurements of four biomarkers (heart rate, diastolic and systolic blood pressure, and weight) were used to cluster the patients. To normalize the distributions of the longitudinal outcomes, the data were transformed with a natural logarithm function. A cubic spline basis with 69 basis functions was employed to smooth the profiles. The proposed algorithm indicates the existence of a latent structure and divides the HF patients into two clusters that show a different evolution in blood pressure values and weight. In general, cluster results are sensitive to the choices made. The same holds for the proposed approach: alternative choices for the distance measure, the procedure to optimize the objective function, the scree-test threshold, or the number of principal components used in the approximation of the surrogate density could all influence the final partition. For the HF data set, the final partition depends on the number of principal components used in the procedure.
Notes:	[Bruckers, Liesbeth; Molenberghs, Geert; Dendale, Paul] Univ Hasselt, I BioStat, Agoralaan, B-3590 Diepenbeek, Belgium. [Molenberghs, Geert] Katholieke Univ Leuven, I BioStat, B-3000 Leuven, Belgium. [Dendale, Paul] Jessa Hosp, Heart Ctr Hasselt, B-3500 Hasselt, Belgium.
Keywords:	cluster analysis; data reduction; functional data analysis; missing data
Document URI:	http://hdl.handle.net/1942/24959
ISSN:	0323-3847
e-ISSN:	1521-4036
DOI:	10.1002/bimj.201500027
ISI #:	000408988700020
Rights:	(C) 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Category:	A1
Type:	Journal Contribution
Validations:	ecoom 2018
Appears in Collections:	Research publications
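
The workflow summarized in the abstract (impute several times, cluster each completed data set, then combine the partitions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the crude stochastic imputation in `impute_once`, the plain 2-means routine in `two_means`, and the simple voting-based consensus in `consensus` are stand-ins for the imputation model, cluster procedure, and ensemble method the authors actually used. The log transformation, spline smoothing, and PCA steps are omitted.

```python
import numpy as np


def impute_once(X, rng):
    """Fill NaNs with draws from N(column mean, column sd).

    A crude stand-in for a proper multiple-imputation model (assumption).
    """
    X_imp = X.copy()
    col_mean = np.nanmean(X, axis=0)
    col_sd = np.nanstd(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X_imp[rows, cols] = rng.normal(col_mean[cols], col_sd[cols])
    return X_imp


def two_means(X, n_iter=50):
    """Plain 2-means (Lloyd's algorithm) with deterministic starting centers."""
    score = X.sum(axis=1)
    # Start from the rows with the smallest and the largest row sum.
    centers = X[[score.argmin(), score.argmax()]].astype(float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels


def consensus(partitions):
    """Voting-based consensus for two-cluster partitions (illustrative only)."""
    P = np.asarray(partitions)  # shape (M, n)
    ref = P[0]
    # Align each partition to the first one by flipping labels if needed.
    aligned = np.array([p if np.mean(p == ref) >= 0.5 else 1 - p for p in P])
    # Share of imputations that assign each subject to cluster 1.
    membership = aligned.mean(axis=0)
    labels = (membership >= 0.5).astype(int)
    # Co-association: share of imputations in which two subjects co-cluster.
    coassoc = (aligned[:, :, None] == aligned[:, None, :]).mean(axis=0)
    return labels, membership, coassoc


# Toy data: two well-separated groups, about 15% of the entries set to missing.
rng = np.random.default_rng(0)
n_per_group, n_features = 20, 6
X_full = np.vstack([
    rng.normal(0.0, 1.0, (n_per_group, n_features)),
    rng.normal(4.0, 1.0, (n_per_group, n_features)),
])
X_miss = X_full.copy()
X_miss[rng.random(X_full.shape) < 0.15] = np.nan

M = 10  # number of imputed data sets
partitions = [two_means(impute_once(X_miss, rng)) for _ in range(M)]
labels, membership, coassoc = consensus(partitions)

# Fuzziness per subject: 0 when all imputations agree, 0.5 for an even split.
fuzziness = np.minimum(membership, 1 - membership)
print(labels)
print(np.round(fuzziness, 2))
```

In this sketch, subjects whose cluster label changes across imputed data sets end up with a `fuzziness` value above zero, which mirrors the idea of expressing membership uncertainty through the agreement between ensemble members.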

Files in This Item:

File	Description	Size	Format
bruckers 1.pdf (Restricted Access)	Published version	755.58 kB	Adobe PDF
508.pdf	Peer-reviewed author version	250.64 kB	Adobe PDF

SCOPUS Citations:	11 (checked on Jun 27, 2026)
WEB OF SCIENCE Citations:	10 (checked on Jun 25, 2026)