Constrained Standardization of Count Data from Massive Parallel Sequencing

VAN HOUTVEN, Joris; Cuypers, B; Meysman, P; HOOYBERGHS, Jef; Laukens, K; VALKENBORG, Dirk

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/36014

Title:	Constrained Standardization of Count Data from Massive Parallel Sequencing
Authors:	VAN HOUTVEN, Joris; Cuypers, B; Meysman, P; HOOYBERGHS, Jef; Laukens, K; VALKENBORG, Dirk
Issue Date:	2021
Publisher:	ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
Source:	Journal of Molecular Biology, 433 (11) (Art N°166966)
Abstract:	In high-throughput omics disciplines like transcriptomics, researchers face a need to assess the quality of an experiment prior to an in-depth statistical analysis. To efficiently analyze such voluminous collections of data, researchers need triage methods that are both quick and easy to use. Such a normalization method for relative quantitation, CONSTANd, was recently introduced for isobarically-labeled mass spectra in proteomics. It transforms the data matrix of abundances through an iterative, convergent process enforcing three constraints: (I) identical column sums; (II) each row sum is fixed (across matrices) and (III) identical to all other row sums. In this study, we investigate whether CONSTANd is suitable for count data from massively parallel sequencing, by qualitatively comparing its results to those of DESeq2. Further, we propose an adjustment of the method so that it may be applied to identically balanced but differently sized experiments for joint analysis. We find that CONSTANd can process large data sets at well over 1 million count records per second whilst mitigating unwanted systematic bias and thus quickly uncovering the underlying biological structure when combined with a PCA plot or hierarchical clustering. Moreover, it allows joint analysis of data sets obtained from different batches, with different protocols and from different labs but without exploiting information from the experimental setup other than the delineation of samples into identically processed sets (IPSs). CONSTANd's simplicity and applicability to proteomics as well as transcriptomics data make it an interesting candidate for integration in multi-omics workflows. (C) 2021 Elsevier Ltd. All rights reserved.
Keywords:	normalization;RNA-seq;transcriptomics;proteomics;multi-omics
Document URI:	http://hdl.handle.net/1942/36014
ISSN:	0022-2836
e-ISSN:	1089-8638
DOI:	10.1016/j.jmb.2021.166966
ISI #:	000648520800028
Rights:	2021 Elsevier Ltd. All rights reserved.
Category:	A1
Type:	Journal Contribution
Validations:	ecoom 2022
Appears in Collections:	Research publications
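
The iterative process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it simply alternates row and column rescaling (iterative proportional fitting) until all row sums are identical and all column sums are identical. The function name `constand_sketch`, the targets (every row and every column has mean 1), the tolerance, the iteration cap, and the example counts are all assumptions made for illustration, and the input is assumed to contain strictly positive values only.

```python
def constand_sketch(matrix, tol=1e-9, max_iter=1000):
    """Alternately rescale rows and columns until both sets of sums match.

    Hypothetical illustration only; assumes strictly positive entries.
    Returns the rescaled matrix and the number of iterations used.
    """
    data = [[float(v) for v in row] for row in matrix]
    if any(v <= 0 for row in data for v in row):
        raise ValueError("this sketch assumes strictly positive entries")

    n_rows, n_cols = len(data), len(data[0])
    row_target = float(n_cols)  # assumption: every row has mean 1
    col_target = float(n_rows)  # assumption: every column has mean 1

    for iteration in range(1, max_iter + 1):
        # Row step: scale each row so that it sums to row_target.
        for row in data:
            factor = row_target / sum(row)
            for j in range(n_cols):
                row[j] *= factor

        # Column step: scale each column so that it sums to col_target.
        for j in range(n_cols):
            factor = col_target / sum(row[j] for row in data)
            for row in data:
                row[j] *= factor

        # Columns are now on target; stop once the rows are as well.
        row_error = max(abs(sum(row) - row_target) for row in data)
        if row_error < tol:
            return data, iteration

    return data, max_iter


# Made-up counts: 3 features (rows) measured in 4 samples (columns).
counts = [[10, 20, 30, 40], [5, 10, 20, 25], [100, 80, 120, 90]]
normalized, iterations = constand_sketch(counts)
for row in normalized:
    print([round(v, 4) for v in row])
```

Because the two targets imply the same grand total (rows times columns), both constraints can hold at once, which is why the alternating rescaling can settle on a single matrix.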

Files in This Item:

File	Description	Size	Format
Constrained Standardization of Count Data from Massive Parallel Sequencing.pdf (Restricted Access)	Published version	1.23 MB	Adobe PDF
CONSTANd RNAseq rev1.pdf	Peer-reviewed author version	680.85 kB	Adobe PDF

Scopus citations: 2 (checked on Feb 23, 2026)
Web of Science citations: 1 (checked on Feb 22, 2026)
