The pKa values of a series of 107 indicators were modeled by means of a quantitative structure-property relationship (QSPR) approach based on physicochemical descriptors and different variable selection and regression methods. A genetic algorithm/least-squares support vector regression (GA-LSSVR) model gave the most accurate estimations and predictions, with squared correlation coefficients of 0.90 and 0.89 for the training and test set compounds, respectively. The predictive ability of this model was superior to that of support vector machine regression alone, revealing the importance of selecting suitable descriptors in QSPR modeling. Moreover, the GA-LSSVR model showed higher predictive capability than linear methods, demonstrating the influence of nonlinearity on the modeling of pKa values, an extremely useful parameter in the analytical sciences.
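The squared correlation coefficients quoted above compare observed and model-predicted pKa values. As a minimal sketch, assuming the statistic is the squared Pearson correlation (the abstract does not give the exact definition), it could be computed as follows. The function name `squared_correlation` and the sample values are hypothetical and are not data from the study.

```python
def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_o * var_p)


# Made-up pKa values for illustration only; not taken from the study.
observed = [3.5, 4.8, 7.1, 8.9, 9.6]
predicted = [3.9, 4.5, 7.4, 8.2, 9.9]
r2 = squared_correlation(observed, predicted)
print(round(r2, 3))
```

Evaluating the same function separately on training and test compounds gives a pair of values of the kind reported here; a test-set value close to the training-set value suggests the model is not badly overfitted.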
- pH indicators
- Quantitative structure-property relationships
- Support vector machines
ASJC Scopus subject areas
- Analytical Chemistry
- Process Chemistry and Technology
- Computer Science Applications