pKa modeling and prediction of a series of pH indicators through genetic algorithm-least square support vector regression

Mohammad Goodarzi, Matheus P. Freitas, Chih H. Wu, Pablo R. Duchowicz

Research output: Contribution to journal › Article

36 Scopus citations


The pKa values of a series of 107 pH indicators were modeled with a quantitative structure-property relationship (QSPR) approach based on physicochemical descriptors, comparing several variable selection and regression methods. A genetic algorithm-least squares support vector regression (GA-LSSVR) model gave the most accurate estimations and predictions, with squared correlation coefficients of 0.90 and 0.89 for the training and test set compounds, respectively. The predictive ability of this model was superior to that of support vector machine regression alone, which shows the importance of selecting suitable descriptors in QSPR modeling. The GA-LSSVR model also showed higher predictive capability than linear methods, indicating that nonlinearity matters when modeling pKa, an extremely useful parameter in the analytical sciences.
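The combination named in the abstract, a genetic algorithm choosing descriptors for a least squares support vector regressor, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for descriptors and pKa values, the function names (`lssvr_fit`, `ga_select`, and so on) are hypothetical, and the RBF kernel, the hyperparameters `gamma` and `sigma`, the GA settings, and the use of a held-out coefficient of determination as fitness are all assumed rather than taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    # LS-SVR training reduces to one linear system:
    # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def r2(y_true, y_pred):
    # Coefficient of determination (an assumed choice of score).
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fitness(mask, Xtr, ytr, Xva, yva):
    # Score a descriptor subset by held-out R^2; empty subsets are invalid.
    if not mask.any():
        return -np.inf
    b, alpha = lssvr_fit(Xtr[:, mask], ytr)
    return r2(yva, lssvr_predict(Xtr[:, mask], b, alpha, Xva[:, mask]))

def ga_select(Xtr, ytr, Xva, yva, pop_size=20, generations=15, p_mut=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_feat = Xtr.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5  # random binary masks
    # Start from the all-descriptors model so the result is never worse than it.
    best_mask = np.ones(n_feat, dtype=bool)
    best_fit = fitness(best_mask, Xtr, ytr, Xva, yva)
    for _ in range(generations):
        fits = np.array([fitness(m, Xtr, ytr, Xva, yva) for m in pop])
        i = int(np.argmax(fits))
        if fits[i] > best_fit:
            best_fit, best_mask = fits[i], pop[i].copy()
        children = [best_mask.copy()]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            parents = []
            for _ in range(2):  # binary tournament selection
                a, c = rng.integers(pop_size, size=2)
                parents.append(pop[a] if fits[a] >= fits[c] else pop[c])
            cross = rng.random(n_feat) < 0.5  # uniform crossover
            child = np.where(cross, parents[0], parents[1])
            child ^= rng.random(n_feat) < p_mut  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return best_mask, best_fit

# Synthetic data: 8 candidate "descriptors", only the first two are informative.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 8))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=120)
Xtr, ytr, Xva, yva = X[:90], y[:90], X[90:], y[90:]

baseline = fitness(np.ones(8, dtype=bool), Xtr, ytr, Xva, yva)
mask, score = ga_select(Xtr, ytr, Xva, yva)
print("selected descriptors:", np.flatnonzero(mask))
print("held-out R^2, all descriptors vs. GA subset:", baseline, score)
```

In this sketch the GA can only match or improve on the all-descriptors baseline, because the search is seeded with that model's score. A real study would also keep a separate external test set that plays no role in descriptor selection, as the training/test split in the abstract implies.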

Original language: English (US)
Pages (from-to): 102-109
Number of pages: 8
Journal: Chemometrics and Intelligent Laboratory Systems
Issue number: 2
State: Published - Apr 1 2010



Keywords

  • pH indicators
  • Quantitative structure-property relationships
  • Support vector machines
  • pKa

ASJC Scopus subject areas

  • Analytical Chemistry
  • Software
  • Process Chemistry and Technology
  • Spectroscopy
  • Computer Science Applications
