Detection of atypical data in multicenter clinical trials using unsupervised statistical monitoring

Trotta, Laura; Kabeya, Yuusuke; BUYSE, Marc; Doffagne, Erik; Venet, David; Desmet, Lieven; BURZYKOWSKI, Tomasz; Tsuburaya, Akira; Yoshida, Kazuhiro; Miyashita, Yumi; Morita, Satoshi; Sakamoto, Junichi; Praveen, Paurush; Oba, Koji

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/29082

Title:	Detection of atypical data in multicenter clinical trials using unsupervised statistical monitoring
Authors:	Trotta, Laura; Kabeya, Yuusuke; BUYSE, Marc; Doffagne, Erik; Venet, David; Desmet, Lieven; BURZYKOWSKI, Tomasz; Tsuburaya, Akira; Yoshida, Kazuhiro; Miyashita, Yumi; Morita, Satoshi; Sakamoto, Junichi; Praveen, Paurush; Oba, Koji
Issue Date:	2019
Publisher:	SAGE PUBLICATIONS LTD
Source:	CLINICAL TRIALS, 16 (5), p. 512-522
Abstract:	Background/Aims: A risk-based approach to clinical research may include a central statistical assessment of data quality. We investigated the operating characteristics of unsupervised statistical monitoring aimed at detecting atypical data in multicenter experiments. The approach is premised on the assumption that, save for random fluctuations and natural variations, data coming from all centers should be comparable and statistically consistent. Unsupervised statistical monitoring consists of performing as many statistical tests as possible on all trial data, in order to detect centers whose data are inconsistent with data from other centers.
Methods: We conducted simulations using data from a large multicenter trial conducted in Japan for patients with advanced gastric cancer. The actual trial data were contaminated in computer simulations for varying percentages of centers, percentages of patients modified within each center and numbers and types of modified variables. The unsupervised statistical monitoring software was run by a blinded team on the contaminated data sets, with the purpose of detecting the centers with contaminated data. The operating characteristics (sensitivity, specificity and Youden's J-index) were calculated for three detection methods: one using the p-values of individual statistical tests after adjustment for multiplicity, one using a summary of all p-values for a given center, called the Data Inconsistency Score, and one using both of these methods.
Results: The operating characteristics of the three methods were satisfactory in situations of data contamination likely to occur in practice, specifically when a single or a few centers were contaminated. As expected, the sensitivity increased for increasing proportions of patients and increasing numbers of variables contaminated. The three methods showed a specificity better than 93% in all scenarios of contamination. The method based on the Data Inconsistency Score and individual p-values adjusted for multiplicity generally had slightly higher sensitivity at the expense of a slightly lower specificity.
Conclusions: The use of brute force (a computer-intensive approach that generates large numbers of statistical tests) is an effective way to check data quality in multicenter clinical trials. It can provide a cost-effective complement to other data-management and monitoring techniques.
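The operating characteristics named in the abstract can be illustrated with a minimal sketch. It assumes each simulation run yields a set of flagged centers and a known set of contaminated centers; the function name, argument names and example center IDs are hypothetical and are not taken from the article or from any monitoring software. Youden's J-index is computed by its standard definition, sensitivity + specificity - 1.

```python
def operating_characteristics(flagged, contaminated, all_centers):
    """Sensitivity, specificity and Youden's J for one detection run.

    flagged      -- centers the detection method reported as atypical
    contaminated -- centers whose data were actually modified (ground truth)
    all_centers  -- every center in the trial
    All names here are illustrative assumptions, not the authors' code.
    """
    flagged = set(flagged)
    contaminated = set(contaminated)
    clean = set(all_centers) - contaminated

    true_positives = len(flagged & contaminated)  # contaminated and flagged
    true_negatives = len(clean - flagged)         # clean and not flagged

    # Guard against empty denominators (e.g. a scenario with no contamination).
    sensitivity = true_positives / len(contaminated) if contaminated else float("nan")
    specificity = true_negatives / len(clean) if clean else float("nan")

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_j": sensitivity + specificity - 1,
    }


# Hypothetical run: 10 centers, centers 2 and 5 contaminated,
# the method flags center 2 (correct) and center 7 (false alarm).
result = operating_characteristics(
    flagged={2, 7}, contaminated={2, 5}, all_centers=range(1, 11)
)
```

In this made-up run, one of two contaminated centers is detected and seven of eight clean centers are left unflagged, so the sketch reports a sensitivity of 0.5 and a specificity of 0.875.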
Notes:	[Trotta, Laura; Doffagne, Erik; Praveen, Paurush] CluePoints SA, Ave Albert Einstein 2a, B-1348 Louvain La Neuve, Belgium. [Kabeya, Yuusuke; Oba, Koji] Univ Tokyo, Dept Biostat, Tokyo, Japan. [Kabeya, Yuusuke] EPS Corp, Tokyo, Japan. [Buyse, Marc] IDDI, San Francisco, CA USA. [Buyse, Marc] CluePoints, Wayne, PA USA. [Venet, David] Univ Brussels, IRIDIA, Brussels, Belgium. [Desmet, Lieven] Univ Louvain, Inst Stat Biostat & Actuarial Sci ISBA, Louvain La Neuve, Belgium. [Burzykowski, Tomasz] IDDI, Louvain La Neuve, Belgium. [Burzykowski, Tomasz] Univ Hasselt, Interuniv Inst Biostat & Stat Bioinformat I BioSt, Hasselt, Belgium. [Tsuburaya, Akira] Jizankai Med Fdn, Tsuboi Canc Ctr Hosp, Dept Surg, Koriyama, Fukushima, Japan. [Yoshida, Kazuhiro] Gifu Univ, Grad Sch Med, Dept Surg Oncol, Gifu, Japan. [Miyashita, Yumi; Sakamoto, Junichi] ECRIN, Okazaki, Aichi, Japan. [Morita, Satoshi] Kyoto Univ, Grad Sch Med, Dept Biomed Stat & Bioinformat, Kyoto, Japan. [Sakamoto, Junichi] Tokai Cent Hosp, Kakamigahara, Japan. [Oba, Koji] Univ Tokyo, Interfac Initiat Informat Studies, Tokyo, Japan.
Keywords:	Data quality;central statistical monitoring;risk-based monitoring;simulations;operating characteristics;fraud detection
Document URI:	http://hdl.handle.net/1942/29082
ISSN:	1740-7745
e-ISSN:	1740-7753
DOI:	10.1177/1740774519862564
ISI #:	000478323700001
Rights:	The Author(s) 2019. Article reuse guidelines: sagepub.com/journals-permissions
Category:	A1
Type:	Journal Contribution
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
Detection of atypical data in multicenter clinical trials using unsupervised statistical monitoring.pdf (Restricted Access)	Published version	1.99 MB	Adobe PDF


Scopus citations:	15 (checked on Jan 14, 2026)
Web of Science citations:	14 (checked on Jan 21, 2026)
