Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/42948
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Valkenborg, Dirk | -
dc.contributor.advisor | Khamiakova, Tatsiana | -
dc.contributor.advisor | Burzykowski, Tomasz | -
dc.contributor.author | PROSTKO, Piotr | -
dc.date.accessioned | 2024-05-14T14:20:50Z | -
dc.date.available | 2024-05-14T14:20:50Z | -
dc.date.issued | 2024 | -
dc.date.submitted | 2024-04-24T14:16:31Z | -
dc.identifier.uri | http://hdl.handle.net/1942/42948 | -
dc.description.abstract | The acronym in the dissertation's title, SPEED, stands for SPectral Evaluation and Enhanced Deconvolution. Two words are context-specific and thus call for disambiguation: spectral and deconvolution. Explaining their intended meaning should, at the same time, throw light on the subjects tackled in this dissertation.

The term spectrum can be informally defined as a range of values or characteristics within a particular category. As such, this concept is in use in various scientific disciplines, for example:
• mathematics (spectral decomposition of square matrices based on eigenvalues and eigenvectors),
• physics (the range of electromagnetic waves),
• chemistry (studying molecules' chemical properties with spectroscopic techniques that output spectra),
• signal processing and statistics (frequency-domain analysis),
• medicine (autism spectrum, generalised anxiety spectrum, antimicrobial spectrum with regard to antibiotics, etc.).

In fact, the spectra analysed throughout this doctoral project are, to some extent, connected to all of these areas except mathematics: they have been acquired with Nuclear Magnetic Resonance (NMR) spectroscopy and mass spectrometry (MS) and subsequently analysed with signal processing and statistical methods to characterise physico-chemical attributes of medicines.

The second ambiguous word, deconvolution, builds on convolution, an operation on two functions, say, f(t) and g(t), given by

y(t) = f(t) ∗ g(t) = ∫ g(t − τ) f(τ) dτ.   (1.1)

The interpretation of this equation is that the signal of interest, f(t), becomes intertwined with a nuisance signal, g(t), giving rise to the observed information y(t). Extracting the functional form of f(t) from y(t), based on prior information about the form of g(t), is the essence of deconvolution.

Besides the mathematical definition above, distinct and discipline-specific meanings of convolution and deconvolution exist (see Introduction in Ciach et al. (2020)). For example, charge deconvolution in mass spectrometry means converting mass-to-charge ratio (m/z) values of observed isotope patterns into mass values. We prefer to introduce an informal explanation of convolution (Figure 1.1) and use it throughout the rest of this dissertation. In Figure 1.1, the observed signal is a function of the information of interest blended with nuisance information (e.g., from the sample matrix or impurities). A reconstruction algorithm that uses a known form of the underlying function can be applied to remove the nuisance component and estimate (deconvolute) the unobserved signals of interest. Therefore, deconvolution is, in essence, a statistical modelling procedure that operates on the observed data and requires prior information about the unobserved individual signals.

Figure 1.1. Due to various physical, chemical, or biological influences ('physicochemical aberration' in the figure), the initial individual signals (data x - data z) become convoluted and lead to convoluted data measurements. Reversing this process by exploiting prior information and estimating the underlying individual signals is the main task of deconvolution.

Since deconvolution attempts to extract the unobserved signal(s), the quality of that reconstruction depends on the underlying assumptions. First, the type of available prior information regarding the signals of interest and the aberration/nuisance process strongly impacts the deconvolution outcome.
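Before turning to the second remark below, the convolution/deconvolution idea of Eq. (1.1) can be made concrete with a minimal numerical sketch. This sketch is purely illustrative and is not part of the dissertation's methodology: the peak positions, the broadening kernel, and the noise-free setting are assumptions chosen for the example, and the exact polynomial division used here only succeeds because the nuisance kernel is known exactly and no noise is present; real spectra call for the statistical modelling described above.

```python
# Minimal illustration of Eq. (1.1) in discrete form: convolve a sparse
# "signal of interest" with a known nuisance kernel, then deconvolve it back.
# All values below are illustrative assumptions, not taken from the thesis.
import numpy as np
from scipy.signal import deconvolve

# Signal of interest f: two idealised, well-separated spectral peaks.
f = np.zeros(50)
f[10], f[30] = 1.0, 0.6

# Nuisance kernel g: a short, symmetric peak-broadening function (area = 1).
g = np.array([0.1, 0.25, 0.3, 0.25, 0.1])

# Forward step: the observed data y is the discrete convolution y = f * g.
y = np.convolve(f, g)

# Inverse step: with g known exactly and no noise, polynomial long division
# recovers f up to floating-point error; noisy data would instead require
# regularisation or an explicit statistical model.
f_hat, remainder = deconvolve(y, g)

print(np.allclose(f_hat, f))    # True: the original peaks are recovered
print(np.abs(remainder).max())  # ~0: nothing left unexplained in this idealised case
```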
Second, the signal processing conducted before or concurrently with deconvolution may also affect the quality of the final results. These two remarks should be taken to heart when designing a deconvolution process, which is possible only with profound subject-matter knowledge. In this application-oriented research, that knowledge has been abundantly available via the industrial partner, Janssen Pharmaceutica, operating in Beerse, Belgium. During our collaboration, Janssen Pharmaceutica's expert scientists have provided the necessary data for developing (and validating) deconvolution models and have given indispensable insights into spectral data acquisition and the analysis results. They have also provided prior information that greatly assisted our deconvolution solutions. As such, the problems and solutions presented in the upcoming dissertation chapters are well grounded in the day-to-day operations of a large pharmaceutical organisation.

A major pharmaceutical company can house a variety of statistical activities, not only the commonly known clinical trials. We therefore briefly review the steps of drug research and development (R&D) to understand better where precisely in this process this doctoral project is rooted and what type of statistics we have been occupied with. In the book Nonclinical Statistics for Pharmaceutical and Biotechnology Industries, Zhang (2016) distinguishes four development parts: discovery, nonclinical development, clinical development, and post-approval development (Figure 1.2). The information provided in the next couple of paragraphs closely follows that book.

The discovery stage focuses on identifying a disease-related target, typically a gene or protein, that can be modified to influence disease progression. This involves, for example, differential expression testing to come up with target candidates; to be deemed useful, the candidates have to pass validation analyses. Once the target is in place, the next stage is to find molecules that interact with that target and exhibit drug-like characteristics; this process is called lead generation. Boosting selected properties of the generated leads by introducing various modifications is lead optimisation. The optimised leads proceed to the next stage, i.e., nonclinical development.

In the nonclinical phase, the optimised leads (a.k.a. drug candidates) undergo various studies to check their overall safety, toxicity, and pharmacological profiles. These studies can be in vitro or in vivo (animal models). After favourable safety readouts, the drug candidates may enter clinical development. The clinical phases are not in the scope of this research, and therefore we only remark that results from clinical development may influence nonclinical development activities and vice versa.

From discovery until post-approval development (again Figure 1.2), statistical thinking supports and drives many development processes. Depending on the current development stage and the type of studies of interest, the recommended statistical tools may differ, as well as the terminology used to describe those statistical efforts. Figure 1.3 attempts a classification of the branches of statistics typically utilised in drug R&D. This dissertation focuses on nonclinical statistics, and on Chemistry, Manufacturing and Control (CMC) in particular.
CMC encompasses varied activities for ensuring the uninterrupted supply of quality pharmaceutical products through the entire development cycle and after regulatory approval. It is important to realise that the CMC efforts and the demand for the drug compound evolve together with the progression through the development stages. For example, the drug amount needed for initial testing in the discovery step is small and can be synthesised by discovery scientists; the purity of the synthesised drug is assessed, but only minimal effort is invested in identifying and quantifying its impurities. In the preclinical stage (a part of nonclinical development), however, the demand grows, and hence an initial manufacturing process with quality control is established. The quality control aspect necessitates setting up analytical methods (e.g., based on MS or NMR measurements) for the characterisation and impurity profiling of the drug candidate substance. Next, for clinical trials in humans, a 'nearly optimal' and validated drug formulation should be in place. Note that a drug formulation here covers all manufacturing steps, from combining the Active Pharmaceutical Ingredient (API) with other pharmaceutical substances (excipients) to the final drug product (Figure 1.4). The clinical stage is also the right moment for conducting stability studies under various storage conditions to determine the shelf life of the drug product.

After this short excursion into drug R&D and CMC operations, one should be able to imagine that a diverse set of analytical techniques and laboratory equipment is needed to accomplish the earlier-mentioned goals at each stage of drug development. NMR and MS are two of the many frequently used technologies for CMC tasks. However, the vast amounts of NMR- and MS-generated data suffer from multiple ambiguities, among others signal overlap. This means that the observed data are a mixture of signals, e.g., of an active substance manufactured according to the specification and a variant containing an impurity. Therefore, disentangling (deconvoluting) and quantifying these different sources is an important objective in drug development in general, and, for instance, in determining the compound's purity or conducting stability studies.

Eventually, we hope that the achieved results, together with the ideas for future research, will promote further development and usage of statistical and numerical methods across departments in pharmaceutical businesses. Specifically, we expect that the statistics-based deconvolution of NMR and MS spectral information in drug manufacturing will increase the confidence in and reliability of the interpreted analytical data. In turn, this should speed up selected CMC activities in pharmaceutical drug development and the control of marketed products.
dc.language.iso | en | -
dc.title | SPEED: SPectral Evaluation and Enhanced Deconvolution | -
dc.type | Theses and Dissertations | -
local.format.pages | 314 | -
local.bibliographicCitation.jcat | T1 | -
local.type.refereed | Non-Refereed | -
local.type.specified | Phd thesis | -
local.provider.type | Pdf | -
local.uhasselt.international | no | -
item.fullcitation | PROSTKO, Piotr (2024) SPEED: SPectral Evaluation and Enhanced Deconvolution. | -
item.embargoEndDate | 2029-05-14 | -
item.accessRights | Embargoed Access | -
item.fulltext | With Fulltext | -
item.contributor | PROSTKO, Piotr | -
Appears in Collections: Research publications

Files in This Item:
File | Description | Size | Format
Binder1.pdf (under embargo until 2029-05-14) | - | 13.44 MB | Adobe PDF