TY - JOUR
T1 - Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regressions
AU - Goodarzi, Mohammad
AU - Freitas, Matheus P.
AU - Jensen, Richard
N1 - Funding Information:
CNPq is gratefully acknowledged for the fellowship (to M.P.F.), as is the Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) of Argentina for financial support.
PY - 2009/10/15
Y1 - 2009/10/15
AB - Quantitative structure-activity relationship (QSAR) modeling was carried out for the anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives. The ant colony optimization (ACO) strategy was used as a feature selection (descriptor selection) and model development method. The relationship between the selected molecular descriptors and pEC50 data was modeled by linear (multiple linear regression, MLR, and partial least squares regression, PLS) and nonlinear (support-vector machine regression, SVMR) methods. The QSAR models were validated by cross-validation and by predicting the activities of an external set of compounds. Both the linear and nonlinear methods were found to be better than a PLS-based method using forward stepwise (FS) selection, resulting in accurate predictions, especially for the SVM regression. The squared correlation coefficients of experimental versus predicted activities for the test set obtained by the MLR, PLS and SVMR models using ACO feature selection were 0.942, 0.945 and 0.991, respectively.
KW - 3-(3,5-Dimethylbenzyl)uracil derivatives
KW - Ant colony optimization
KW - Anti-HIV-1 activities
KW - Linear and nonlinear regression methods
KW - QSAR
UR - http://www.scopus.com/inward/record.url?scp=69349104350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=69349104350&partnerID=8YFLogxK
U2 - 10.1016/j.chemolab.2009.05.005
DO - 10.1016/j.chemolab.2009.05.005
M3 - Article
AN - SCOPUS:69349104350
SN - 0169-7439
VL - 98
SP - 123
EP - 129
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
IS - 2
ER -