Statistical evaluation methodology for surrogate endpoints in clinical studies

VAN DER ELST, Wim

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/22898

Title:	Statistical evaluation methodology for surrogate endpoints in clinical studies
Authors:	VAN DER ELST, Wim
Advisors:	MOLENBERGHS, Geert; Abad, Ariel Alonso
Issue Date:	2016
Abstract:	In clinical studies it is not always possible to use the most relevant indicator of treatment success. For example, the clinical endpoint may only be measurable a long time after the start of the study (e.g., survival time in oncology). Using this clinical endpoint would then imply that the new treatment can only be evaluated after a long time. When such a situation arises, it can be worthwhile to replace the clinical endpoint with a so-called ‘surrogate endpoint’. This is an endpoint that can be measured sooner than the clinical endpoint and that allows the effect of the treatment on the clinical endpoint to be estimated accurately. A possible surrogate for survival time in oncology could be, e.g., change in tumour volume. Before a clinical endpoint (K) can be replaced by a surrogate endpoint (S), it is necessary to verify empirically that S is ‘suitable’ for this purpose, i.e., that S allows the effect of the treatment on K to be estimated accurately. Over the last 40 years, a large number of statistical procedures have been developed to assess whether a candidate S is truly suitable to replace K. These methods differ with respect to two important aspects: (i) some methods assume that data from a single clinical study are available whereas others assume that data from multiple clinical studies are available, and (ii) some methods focus on individual causal effects whereas others focus on expected causal effects. When the focus is on individual causal effects, it is assumed that each patient j has two potential outcomes for K: a value K0j that would be observed if the patient receives the control treatment, and a value K1j that would be observed if the patient receives the experimental treatment (and likewise for S).
Individual causal effects are then defined as ∆Kj = K1j − K0j and ∆Sj = S1j − S0j. Chapter 3 discusses the theoretical frameworks used to evaluate surrogates based on individual and expected causal effects in settings where data from one and from multiple clinical studies are available. New indices are proposed that quantify the accuracy with which a candidate S predicts K. In addition, simulation studies are carried out, which show that surrogacy indices based on individual and expected causal effects are related to each other. Chapter 4 looks in more detail at scenarios where both S and K are binary outcomes. Two new surrogacy indices are proposed. A first measure, the ‘individual causal association’, quantifies the accuracy with which the individual causal treatment effect on K can be predicted from the individual causal treatment effect on S. A second measure, the ‘surrogate predictive function’, allows the relationship between ∆Kj and ∆Sj to be studied in a more fine-grained way, namely by examining the most likely outcome for ∆Kj given ∆Sj. This makes it possible to answer important scientific questions, e.g., how likely it is that the treatment will have a negative impact on K when it has a positive impact on S, that is, the probability that the surrogate will yield a false-positive result. Various other topics related to the evaluation of surrogates are discussed in the second part of this thesis. In Chapter 5, the focus is on ‘personalized medicine’.
This concept refers to the idea that a treatment should ideally be tailored to the individual patient, in contrast to the currently dominant paradigm in medicine, in which all patients with the same disease receive the same treatment (namely, the treatment that works best ‘on average’ in the population). The current methods used to identify predictors of treatment success are based on correlational models. In Chapter 5, an alternative approach based on individual causal models is proposed. In situations where surrogates have to be evaluated based on multiple clinical studies (the so-called ‘meta-analytic setting’), mixed-effects models are often used to quantify surrogacy. Fitting such models is often not straightforward, and convergence problems occur frequently. In Chapter 6, simulation studies are used to examine which factors influence the convergence of the models. A method based on multiple imputation is also proposed to avoid the convergence problems. In the meta-analytic setting, the correlation between S and K (corrected for possible trial and treatment effects) is often used to quantify the quality of a candidate surrogate. A psychometric concept that is closely related to this is reliability. Reliability quantifies the repeatability of two or more measurements within the same person. In Chapter 7, some methods are proposed to estimate reliability in a flexible way based on mixed-effects models. The methodology developed in this thesis is implemented in three R packages (namely, the Surrogate, EffectTreat, and CorrMixed packages), which can be downloaded from CRAN. All example analyses described in this thesis can be replicated with these R packages.
It is sometimes not feasible to use the true endpoint (i.e., the most credible indicator of the therapeutic response) in a clinical trial. For example, the true endpoint may require a long follow-up time (e.g., survival time in oncology), so evaluating the new therapy using this endpoint would take a long time. In such a situation, it can be an appealing strategy to replace the true endpoint with another endpoint that can be measured earlier (e.g., change in tumour volume in oncology) and that can predict the treatment effect on the true endpoint. Such a replacement outcome for the true endpoint is referred to as a ‘surrogate endpoint’. Before the candidate surrogate endpoint can replace the true endpoint, it has to be formally evaluated. This is not a trivial task, and over the last decades a number of statistical procedures to achieve this aim have been proposed. These methods can be classified along two dimensions: (i) whether they use information from a single or from multiple clinical trials, and (ii) whether they focus on individual or on expected causal effects. When the focus is on individual causal effects, it is assumed that each patient j has two outcomes for the true endpoint T: an outcome T0j that would be observed under the control treatment and an outcome T1j that would be observed under the experimental treatment, and similarly for the surrogate endpoint S. These are called ‘potential outcomes’, because they represent the outcomes that could have been observed if the patient had received the control treatment or the experimental treatment. Individual causal effects can then be defined as ∆Tj = T1j − T0j and ∆Sj = S1j − S0j. Expected causal treatment effects essentially refer to the averaged individual causal treatment effects.
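The definitions of individual and expected causal effects can be illustrated with a small simulation. The sketch below is not taken from the thesis or its R packages: the data-generating model, the function names, and all numeric values are assumptions chosen for illustration. It also uses the Pearson correlation between ∆Tj and ∆Sj as one simple, assumed summary of how well the effect on T can be predicted from the effect on S. In a real trial only one potential outcome per endpoint is observed for each patient, so a table like this is available only in simulation.

```python
import random
import statistics


def simulate_potential_outcomes(n, seed=0):
    """Draw hypothetical potential outcomes (S0, S1, T0, T1) for n patients."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)              # shared patient-level factor (assumed)
        s0 = u + rng.gauss(0.0, 1.0)         # surrogate under control
        e_s = rng.gauss(0.0, 1.0)            # patient-specific part of the effect on S
        s1 = s0 + 1.0 + e_s                  # surrogate under experimental treatment
        t0 = u + rng.gauss(0.0, 1.0)         # true endpoint under control
        t1 = t0 + 0.5 + 0.8 * e_s + rng.gauss(0.0, 0.5)  # effect on T tracks effect on S
        patients.append((s0, s1, t0, t1))
    return patients


def individual_causal_effects(patients):
    """Return the lists of Delta S_j = S1j - S0j and Delta T_j = T1j - T0j."""
    delta_s = [s1 - s0 for s0, s1, _, _ in patients]
    delta_t = [t1 - t0 for _, _, t0, t1 in patients]
    return delta_s, delta_t


def pearson(x, y):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


patients = simulate_potential_outcomes(5000)
delta_s, delta_t = individual_causal_effects(patients)

# Expected causal effects are the averages of the individual causal effects.
expected_delta_s = statistics.fmean(delta_s)
expected_delta_t = statistics.fmean(delta_t)

# Assumed summary of individual-level predictability: corr(Delta T, Delta S).
corr = pearson(delta_s, delta_t)

print(f"E[Delta S] = {expected_delta_s:.3f}, E[Delta T] = {expected_delta_t:.3f}")
print(f"corr(Delta T, Delta S) = {corr:.3f}")
```

Under this assumed model the averages should land near 1.0 and 0.5, the values built into the simulation, and the correlation should be high because the effect on T was constructed to follow the effect on S.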
In Chapter 3, the conceptual frameworks that underlie the surrogate evaluation methodology based on individual and expected causal effects in the single- and the multiple-trial settings are detailed for scenarios where both endpoints are normally distributed variables. Even though the causal inference paradigm is typically framed in the single-trial setting, it is shown that this methodology can also be embedded in the multiple-trial setting. Further, new metrics of surrogacy based on individual causal effects in the single- and multiple-trial settings are proposed, the so-called individual and meta-analytic individual causal associations, respectively. Both metrics essentially quantify the accuracy with which the individual causal treatment effect on T can be predicted based on the individual causal treatment effect on S. Simulation studies showed that the metrics of surrogacy based on individual and expected causal effects were related, but in a rather complex way. In Chapter 4, the focus is on a surrogate evaluation scenario in the single-trial setting where both endpoints are binary. Two new metrics of surrogacy based on individual causal effects are proposed. As in the normal-normal scenario, the individual causal association quantifies the overall accuracy with which the individual causal treatment effect on T can be predicted based on the individual causal treatment effect on S. It has an appealing interpretation in terms of uncertainty reduction in the prediction of ∆Tj based on ∆Sj. The so-called ‘surrogate predictive function’ supplements this metric in the sense that it allows for a more fine-grained assessment of how ∆Tj and ∆Sj are related. This function basically allows us to determine what the most likely outcome of ∆Tj will be for a given outcome of ∆Sj. In this way it allows for the evaluation of some important scientific questions that cannot be explicitly addressed using the individual causal association.
For example, one may be interested in quantifying the conditional probability that the treatment has a negative impact on T given that the treatment has a beneficial impact on S (i.e., the probability that the surrogate will produce a false positive result). Several other topics that are related to surrogate evaluation methods are discussed in the second part of this thesis. In Chapter 5, the focus is on ‘personalized medicine’. Personalized medicine refers to the idea that a medical treatment should be tailored to the individual patient’s specific characteristics, as opposed to the practice where all patients who have the same disease receive the same treatment (i.e., the treatment that works best ‘on average’ in the population). It is argued that the commonly used correlational approaches to identify predictors of treatment success are not sufficient to answer the relevant scientific question. A new metric that quantifies the extent to which therapeutic success can be predicted based on pretreatment predictors is proposed, the so-called predictive causal association. In the meta-analytic surrogate evaluation framework (which assesses surrogacy based on expected causal effects in a multi-trial setting), linear mixed-effects models are typically fitted to estimate the surrogacy metrics of interest. Unfortunately, in real-life surrogate evaluation settings, model convergence problems are prevalent. In Chapter 6, simulation studies are used to examine the factors that affect model convergence, and a multiple imputation-based approach to reduce model convergence issues is proposed. In the meta-analytic surrogate evaluation approach, one of the metrics that assesses surrogacy is the coefficient of individual-level surrogacy. This metric essentially quantifies the treatment- and trial-corrected correlation between S and T. A psychometric concept that is closely related to individual-level surrogacy is reliability.
Reliability quantifies the reproducibility (or predictability) of two or more outcomes that are repeatedly measured within the same person. In Chapter 7, some methods are proposed to estimate reliability in a flexible way using linear mixed-effects models. The methodology that was developed in this thesis has been implemented in three R packages, i.e., the Surrogate, EffectTreat, and CorrMixed packages (available for download at CRAN). A detailed account of how the results of the case study analyses that are described throughout this thesis can be replicated using these R packages is available in an online Appendix that accompanies this thesis.
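The false-positive question raised in the abstract for the binary-binary setting can be made concrete with a toy calculation. With binary S and T, the individual causal effects ∆Sj and ∆Tj can only take the values −1, 0, and 1. The sketch below assumes a made-up joint distribution over (∆S, ∆T); the probabilities and function names are hypothetical and are not results or code from the thesis. It only shows how a conditional probability and a "most likely outcome of ∆T given ∆S" would be read off such a distribution, and it is not the estimation procedure developed in Chapter 4.

```python
# Hypothetical joint probabilities P(Delta S = ds, Delta T = dt); they sum to 1.
joint = {
    (-1, -1): 0.10, (-1, 0): 0.05, (-1, 1): 0.01,
    (0, -1): 0.05,  (0, 0): 0.40,  (0, 1): 0.09,
    (1, -1): 0.02,  (1, 0): 0.08,  (1, 1): 0.20,
}


def conditional_distribution(joint, delta_s):
    """Return P(Delta T = dt | Delta S = delta_s) as a dict keyed by dt."""
    row = {dt: p for (ds, dt), p in joint.items() if ds == delta_s}
    total = sum(row.values())
    if total == 0:
        raise ValueError(f"Delta S = {delta_s} has zero probability")
    return {dt: p / total for dt, p in row.items()}


def most_likely_delta_t(joint, delta_s):
    """Return the value of Delta T with the highest conditional probability."""
    cond = conditional_distribution(joint, delta_s)
    return max(cond, key=cond.get)


# Probability of harm on T given benefit on S: the 'false positive' case.
false_positive = conditional_distribution(joint, 1)[-1]

print(f"P(Delta T = -1 | Delta S = 1) = {false_positive:.4f}")
for ds in (-1, 0, 1):
    print(f"most likely Delta T given Delta S = {ds}: {most_likely_delta_t(joint, ds)}")
```

A dictionary keyed by (∆S, ∆T) pairs keeps the example short; with only nine cells a full table is easy to inspect by hand.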
Keywords:	surrogate endpoints; statistics
Document URI:	http://hdl.handle.net/1942/22898
Category:	T1
Type:	Theses and Dissertations
Appears in Collections:	PhD theses; Research publications

Files in This Item:

File	Description	Size	Format
Thesis_v2 formaat 17x24.pdf (Restricted Access)		2.4 MB	Adobe PDF
