TY - JOUR
T1 - Improvement of quantitative structure–retention relationship models for chromatographic retention prediction of peptides applying individual local partial least squares models
AU - Andries, Jan P.M.
AU - Goodarzi, Mohammad
AU - Heyden, Yvan Vander
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2020/11/1
Y1 - 2020/11/1
N2 - In Reversed-Phase Liquid Chromatography, Quantitative Structure–Retention Relationship (QSRR) models for retention prediction of peptides can be built, starting from large sets of theoretical molecular descriptors. Good predictive QSRR models can be obtained after selecting the most informative descriptors. Reliable retention prediction may be an aid in the correct identification of proteins/peptides in proteomics and in chromatographic method development. Traditionally, global QSRR models are built, using a calibration set containing a representative range of analytes. In this study, a strategy is presented to build individual local Partial Least Squares (PLS) models for peptides, based on selected local calibration samples that are most similar to the specific query peptide to be predicted. Similar local calibration peptides are selected from a possible calibration set. The calibration samples with the lowest Euclidean distances to the query peptide are considered the most similar. Two Euclidean distances are investigated as similarity parameters: (i) in the autoscaled descriptor space and (ii) in the PLS factor space of the global calibration samples, both after variable selection by the Final Complexity Adapted Models (FCAM) method. The predictive abilities of individual local QSRR PLS models for peptides, developed with both Euclidean distances, are found to be significantly better than those of two global models, i.e. before and after FCAM variable selection. The predictive abilities of the local models, developed with distances calculated in the PLS factor space, were best.
AB - In Reversed-Phase Liquid Chromatography, Quantitative Structure–Retention Relationship (QSRR) models for retention prediction of peptides can be built, starting from large sets of theoretical molecular descriptors. Good predictive QSRR models can be obtained after selecting the most informative descriptors. Reliable retention prediction may be an aid in the correct identification of proteins/peptides in proteomics and in chromatographic method development. Traditionally, global QSRR models are built, using a calibration set containing a representative range of analytes. In this study, a strategy is presented to build individual local Partial Least Squares (PLS) models for peptides, based on selected local calibration samples that are most similar to the specific query peptide to be predicted. Similar local calibration peptides are selected from a possible calibration set. The calibration samples with the lowest Euclidean distances to the query peptide are considered the most similar. Two Euclidean distances are investigated as similarity parameters: (i) in the autoscaled descriptor space and (ii) in the PLS factor space of the global calibration samples, both after variable selection by the Final Complexity Adapted Models (FCAM) method. The predictive abilities of individual local QSRR PLS models for peptides, developed with both Euclidean distances, are found to be significantly better than those of two global models, i.e. before and after FCAM variable selection. The predictive abilities of the local models, developed with distances calculated in the PLS factor space, were best.
KW - Final complexity adapted models (FCAM)
KW - Local models
KW - Molecular descriptors
KW - Partial least squares
KW - Peptides
KW - Quantitative Structure–Retention relationships (QSRR)
UR - http://www.scopus.com/inward/record.url?scp=85086826385&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086826385&partnerID=8YFLogxK
U2 - 10.1016/j.talanta.2020.121266
DO - 10.1016/j.talanta.2020.121266
M3 - Article
C2 - 32887157
AN - SCOPUS:85086826385
SN - 0039-9140
VL - 219
JO - Talanta
JF - Talanta
M1 - 121266
ER -