Prediction of the acidic dissociation constant (pKa) of some organic compounds using linear and nonlinear QSPR methods

Nasser Goudarzi, Mohammad Goodarzi

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

In this work, several chemometric methods were applied to model and predict the acidic dissociation constants (pKa) of a set of organic compounds, using descriptors calculated from the molecular structure alone. Stepwise multiple linear regression was used to select the descriptors responsible for the pKa of these compounds. Support vector machine (SVM), principal component regression (PCR), partial least squares (PLS) and multiple linear regression (MLR) were then used to construct nonlinear and linear quantitative structure-property relationship (QSPR) models. Comparing the SVM results with those of PLS, PCR and MLR showed that the SVM model performed much better than the other models. For the SVM model, the root-mean-square errors of the training set and the test set were 0.2551 and 0.6139, and the correlation coefficients were 0.9936 and 0.9919, respectively. This paper provides a new and effective method for predicting the pKa of organic compounds and shows that SVM can be used as a powerful chemometric tool for QSPR studies. Finally, the results show that SVM gives markedly better predictive ability in QSAR studies than multiple linear regression, principal component regression and partial least squares.
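The two figures of merit quoted in the abstract, root-mean-square error and correlation coefficient, can be illustrated with a minimal sketch. It assumes the usual definitions (RMSE as the square root of the mean squared residual, and the Pearson correlation between observed and predicted values); the abstract does not state which exact formulas the authors used. The function names and the pKa values below are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, assuming division by n (not n - 1)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted pKa values, invented for illustration only.
observed_pka = [4.76, 9.25, 3.75, 10.00, 4.20]
predicted_pka = [4.60, 9.40, 3.90, 9.80, 4.35]

print("RMSE:", rmse(observed_pka, predicted_pka))
print("r:", pearson_r(observed_pka, predicted_pka))
```

Computing both metrics separately on the training set and on the test set, as the abstract reports, indicates how well a model generalizes to compounds it was not fitted on.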

Original language: English (US)
Pages (from-to): 1495-1503
Number of pages: 9
Journal: Molecular Physics
Volume: 107
Issue number: 14
State: Published - Jan 2009

Keywords

  • MLR
  • PLS
  • Principal component regression
  • Quantitative structure-property relationship
  • Support vector machines

ASJC Scopus subject areas

  • Biophysics
  • Molecular Biology
  • Condensed Matter Physics
  • Physical and Theoretical Chemistry
