Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/26069
Title: Quantile regression in heteroscedastic varying coefficient models: testing and variable selection
Authors: IBRAHIM, Mohammed Abdulkerim 
Advisors: VERHASSELT, Anneleen
GIJBELS, Irène
Issue Date: 2018
Abstract: In this dissertation, quantile regression in varying coefficient models is investigated using a nonparametric technique called P-splines. In mean regression, we study the influence of the covariates on the conditional mean of the response. An alternative way to study the central location is median regression, which is robust to heavy-tailed distributions. Quantile regression generalizes median regression to investigate the influence of the covariates on the quantiles/percentiles (and hence the entire distribution) of the response. It allows for a wide range of applications. For instance, the 25th percentile of the response (e.g. the weight of a child) might be of interest in studying severe malnutrition in children. To find the estimates, we need to minimize a quantile objective function. In contrast to that of mean regression, the objective function for quantile regression is not differentiable everywhere. Hence, the coefficient estimates have no explicit expression. A varying coefficient model is an extension of the classical linear regression model in which each coefficient varies with another variable. This model is important in complex data settings such as longitudinal data. In such a data scheme, it is intuitive to allow the coefficients to vary with 'time'. We consider, in particular, a location-scale varying coefficient model. The key statistical tools are introduced in Chapter 1. Population conditional quantiles cannot cross for different quantile levels (percentiles). However, individually estimated conditional quantiles can cross each other. To avoid these crossings, we use an approach called 'AHe' (based on two assumptions). This approach enables us to estimate the scale (variability function) and, by doing so, to estimate several quantiles with less computational time. In Chapter 2, we show the consistency of the proposed estimator theoretically and illustrate it in a simulation study.
Since estimation of quantiles other than the median relies on the variability function, it is important to identify the correct structure of the variability function (e.g. homoscedastic or time-varying). Under a homoscedastic variability structure, we can conclude that the influence of the covariates on all other quantiles of the response is similar to that on the median. We develop a testing procedure (a likelihood-ratio-type test) to investigate the structure of the variability function. The performance of the testing procedure is shown in a simulation study. The estimation and testing procedures are illustrated on data examples. In Chapter 3, we focus on testing the shape of a coefficient function (e.g. constant, monotone or convex). Several testing procedures are proposed. A monotonicity test is important to check, for example, whether the weight of a child is decreasing at some point in time. The performance of the testing procedures is illustrated in a simulation study. An application to a data example also illustrates the use of the procedures. To further simplify the model, in Chapter 4, two variable selection techniques, called 'grouped Adaptive Lasso' and 'NonNegative Garrote', are proposed. The performance of these techniques is illustrated in a simulation study. In Chapter 5, we propose a further, two-stage variable selection technique called NNG_SIS for the ultrahigh-dimensional setting (when the number of coefficients to estimate is possibly much larger than the number of observations). Simulation studies illustrate the performance of the procedures. We also demonstrate the use of the methods on data examples. Chapter 6 draws some conclusions and discusses possible future perspectives. The R code implementing the methodology developed in this dissertation is presented in Appendix A.
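Illustrative note: the quantile objective function mentioned in the abstract is commonly written with the check loss rho_tau(u) = u * (tau - 1{u < 0}), which has a kink at u = 0 and is therefore not differentiable everywhere. The following is a minimal Python sketch of that loss for a plain sample (no covariates, no P-splines); it is not the R code from Appendix A of the dissertation, and the function names `check_loss` and `quantile_by_minimization` are hypothetical names chosen for this illustration.

```python
import random


def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}); kinked at u = 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_by_minimization(y, tau):
    """Return the sample point minimizing the summed check loss.

    The objective is piecewise linear and convex in the candidate value,
    so a minimizer is always attained at one of the observations.
    This brute-force search is for illustration only (quadratic cost).
    """
    return min(y, key=lambda c: sum(check_loss(v - c, tau) for v in y))


# Simulated data, purely for illustration.
random.seed(0)
y = [random.gauss(0.0, 1.0) for _ in range(201)]

# Estimate of the 25th percentile obtained by minimizing the check loss.
q25 = quantile_by_minimization(y, 0.25)
```

With 201 observations and tau = 0.25, the minimizer coincides with the 51st order statistic of the sample. In a regression setting the constant `c` would be replaced by a function of the covariates, and the minimization is typically carried out with linear programming or another numerical method because no closed-form solution exists.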
Keywords: adaptive lasso; heteroscedasticity; likelihood-ratio test; non-negative garrote; penalized splines; qualitative shape testing; quantile regression; varying coefficient models
Document URI: http://hdl.handle.net/1942/26069
Category: T1
Type: Theses and Dissertations
Appears in Collections: PhD theses
Research publications

Files in This Item:
File: Doctoralthesis_MAIbrahim_digital.pdf (Size: 1.34 MB; Format: Adobe PDF)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.