Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/45933
Title: Can machine learning support survival model selection to inform economic evaluations? Exploring K-Fold cross validation based model selection in seven datasets
Authors: BERMEJO DELGADO, Inigo; Grimm, S.
Issue Date: 2024
Publisher: ELSEVIER SCIENCE INC
Source: Value in health, 27 (12) (Art N° MSR17)
Abstract: from the intent-to-treat population. Estimated HRs and 95% CIs of ivosidenib versus placebo were calculated. Results: The previously published RPSFTM-adjusted results showed that ivosidenib was associated with mortality risk reduction (MRR) versus placebo (HR=0.49 [95% CI: 0.34; 0.70]). The external analysis reported here showed that ivosidenib was associated with MRR using the RPSFTM 'treatment group' (not re-censored HR=0.52 [95% CI: 0.37; 0.75]) and 'on-treatment' approaches (re-censored HR=0.49 [95% CI: 0.28; 0.87]; not re-censored HR=0.52 [95% CI: 0.36; 0.74]). The IPCW-adjusted Cox proportional hazards regression analysis also showed that ivosidenib was associated with MRR (HR=0.74 [95% CI: 0.35; 1.56]). Conclusions: All three crossover adjustment methods applied in this external re-analysis of ClarIDHy data showed that ivosidenib was associated with MRR, consistent with previously published RPSFTM-adjusted results.

Objectives: The selection of survival models for informing economic evaluations of innovative therapies with limited long-term data traditionally relies on metrics of statistical goodness of fit in the full trial data. However, models selected based on full trial data might underperform in the target population due to overfitting. K-fold cross validation (CV), commonly used in machine learning, splits the data, allowing better estimation of fit in unseen data. We explore whether k-fold CV improves model selection. Methods: We used seven publicly available long-term survival datasets covering a range of diseases. We simulated 100 artificial data locks by sampling 250 patients without replacement and right-censoring once median survival was reached. We fitted standard parametric and flexible survival models to each simulated dataset and selected the models with the lowest AIC/BIC, as estimated using 10-fold CV and using traditional methods.
We then estimated the restricted mean survival time (RMST) error of the best-fitting models relative to the RMST calculated from the full dataset's Kaplan-Meier curve. Results: K-fold CV led to lower mean RMST errors than traditional model selection methods in six of the seven datasets when selecting models based on AIC, and in all seven when selecting based on BIC. On average, the RMST error was 27% (AIC) and 40% (BIC) higher with traditional model selection than with CV-based model selection. CV never selected complex models (3+ parameters), whilst the traditional method selected complex models in 51% (AIC) and 12% (BIC) of simulations. Conclusions: In the first study exploring k-fold CV for survival model selection, we show that it can regularly outperform traditional methods. Notably, k-fold CV favors less complex models compared to traditional methods, which may hint at their better generalizability. We conclude that k-fold CV may be an important addition to the modeler's toolbox when performing survival analysis. Further research should explore whether these findings hold in additional settings.

Objectives: The purpose of the present study was to assess how the eight dimensions of the SF-36 HRQoL profile instrument impact the utility scores derived from the major multiattribute utility instruments (MAUIs). Methods: We employed the ordinary least squares (OLS) estimator to estimate models that analyze the relationship between SF-36 dimensions and various MAUIs using data from the multi-instrument comparison (MIC) study (Richardson et al., 2015). We focused on the sensitivity of six major MAUIs (AQoL-4D, AQoL-8D, 15D, EQ-5D, SF-6D, and HUI3) to changes in the eight SF-36 dimensions. Results: Our analysis shows that the AQoL-8D demonstrates greater sensitivity to mental health (MH) compared to AQoL-4D, 15D, EQ-5D, and HUI3. The EQ-5D showed higher sensitivity to bodily pain (BP) than all other MAUIs.
Additionally, the 15D was more sensitive to physical functioning (PF) compared to AQoL-4D and AQoL-8D. Finally, the SF-6D exhibited greater sensitivity to the role emotional (RE) dimension than 15D, AQoL-4D, and AQoL-8D. Conclusions: Our study highlights that HRQoL utility scores are affected differently by the eight dimensions measured by the SF-36 survey, depending on the MAUI used. These findings make it possible to deduce which dimensions of the SF-36 have the greatest influence on the utility scores generated by a specific MAUI. Thus, the selection of a MAUI for research may be informed by its sensitivity to the health dimensions of particular interest.
Document URI: http://hdl.handle.net/1942/45933
ISSN: 1098-3015
e-ISSN: 1524-4733
ISI #: 001457486000270
Category: M
Type: Journal Contribution
Appears in Collections: Research publications
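
The k-fold CV model selection described in the abstract's Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how AIC/BIC were estimated under CV, so this sketch scores each model by its held-out deviance (minus twice the out-of-fold log-likelihood) as a stand-in. It compares only two candidates (exponential and Weibull) on synthetic right-censored data, and every function name, the censoring time, and the simulation parameters are hypothetical.

```python
import math
import random


def loglik_weibull(times, events, shape, scale):
    """Right-censored Weibull log-likelihood (shape=1 is the exponential)."""
    ll = 0.0
    for t, d in zip(times, events):
        if d:  # observed event: add the log hazard
            ll += math.log(shape / scale) + (shape - 1) * math.log(t / scale)
        ll -= (t / scale) ** shape  # log survival, contributed by everyone
    return ll


def fit_scale(times, events, shape):
    """Closed-form MLE of the scale for a fixed shape."""
    return (sum(t ** shape for t in times) / sum(events)) ** (1.0 / shape)


def fit_exponential(times, events):
    return 1.0, fit_scale(times, events, 1.0)


def fit_weibull(times, events):
    """MLE via bisection on the profile score for the shape parameter."""
    d = sum(events)
    sum_log_ev = sum(math.log(t) for t, e in zip(times, events) if e)

    def score(k):
        num = sum(t ** k * math.log(t) for t in times)
        den = sum(t ** k for t in times)
        return d / k + sum_log_ev - d * num / den

    lo, hi = 0.05, 10.0  # assumed bracket for the shape
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    return shape, fit_scale(times, events, shape)


def cv_deviance(times, events, fit, k_folds=10, seed=0):
    """Sum of held-out deviances over k folds (lower is better)."""
    idx = list(range(len(times)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k_folds] for i in range(k_folds)]
    total = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        shape, scale = fit([times[i] for i in train],
                           [events[i] for i in train])
        total += loglik_weibull([times[i] for i in held_out],
                                [events[i] for i in held_out], shape, scale)
    return -2.0 * total


# Synthetic data (assumption): Weibull with scale 10 and shape 1.5,
# administratively censored at t = 12, for 250 patients.
rng = random.Random(42)
raw = [rng.weibullvariate(10.0, 1.5) for _ in range(250)]
times = [min(t, 12.0) for t in raw]
events = [1 if t <= 12.0 else 0 for t in raw]

models = {"exponential": fit_exponential, "weibull": fit_weibull}
cv_scores = {name: cv_deviance(times, events, fit)
             for name, fit in models.items()}
best = min(cv_scores, key=cv_scores.get)
print(cv_scores, "selected:", best)
```

Because each fold is scored on patients that were not used for fitting, extra parameters are rewarded only when they improve prediction on unseen data, which is one plausible reason the abstract reports CV favoring simpler models.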
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.