Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regressions

Mohammad Goodarzi, Matheus P. Freitas, Richard Jensen

Research output: Contribution to journal (Article)

57 Citations (Scopus)

Abstract

Quantitative structure-activity relationship (QSAR) modeling was carried out for the anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives. The ant colony optimization (ACO) strategy was used for feature selection (descriptor selection) and model development. The relationship between the selected molecular descriptors and pEC50 data was modeled by linear methods (multiple linear regression, MLR, and partial least squares regression, PLS) and a nonlinear method (support vector machine regression, SVMR). The QSAR models were validated by cross-validation and by predicting the activities of an external set of compounds. Both the linear and nonlinear models outperformed a PLS-based method using forward stepwise (FS) selection and gave accurate predictions, especially with SVM regression. The squared correlation coefficients of experimental versus predicted activities for the test set obtained by the MLR, PLS and SVMR models using ACO feature selection were 0.942, 0.945 and 0.991, respectively.
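
The descriptor-selection idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state their ACO variant, parameters, or descriptors, so the function names (`cv_q2`, `aco_select`), the pheromone update rule, the cross-validated Q2 fitness, and the synthetic data are all assumptions chosen for illustration. Only an MLR scorer is shown, and NumPy is assumed to be available.

```python
import numpy as np

def cv_q2(X, y, n_folds=5):
    """Cross-validated Q2 (1 - PRESS/TSS) of an ordinary least-squares (MLR) model.

    Folds are contiguous blocks of rows, so the rows are assumed to be
    in random order already.
    """
    n = len(y)
    preds = np.empty(n)
    for test_idx in np.array_split(np.arange(n), n_folds):
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Fit intercept + slopes on the training rows.
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        # Predict the held-out rows.
        B = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        preds[test_idx] = B @ coef
    press = np.sum((y - preds) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / tss

def aco_select(X, y, n_select=3, n_ants=20, n_iter=30, rho=0.2, seed=0):
    """Hypothetical ACO-style search for a subset of n_select descriptors.

    Each ant samples a subset with probability proportional to the
    pheromone on each descriptor; pheromone evaporates at rate rho and
    is reinforced in proportion to the subset's cross-validated Q2.
    """
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    tau = np.ones(n_feat)  # pheromone per descriptor
    best_subset, best_score = None, -np.inf
    for _ in range(n_iter):
        deposits = np.zeros(n_feat)
        for _ in range(n_ants):
            subset = rng.choice(n_feat, size=n_select, replace=False,
                                p=tau / tau.sum())
            score = cv_q2(X[:, subset], y)
            deposits[subset] += max(score, 0.0)  # never deposit negative pheromone
            if score > best_score:
                best_subset, best_score = np.sort(subset), score
        tau = (1.0 - rho) * tau + deposits / n_ants
    return best_subset, best_score

# Synthetic stand-in for a descriptor matrix: 60 "compounds", 10 "descriptors",
# with the response depending on descriptors 0, 3 and 7 only.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 10))
y = 3.0 * X[:, 0] - 1.5 * X[:, 3] + 1.0 * X[:, 7] + 0.1 * rng.normal(size=60)

subset, score = aco_select(X, y)
print("selected descriptors:", subset, "cross-validated Q2:", round(score, 3))
```

In a setting like the one the abstract describes, the selected columns would then be passed to the final MLR, PLS or SVM regression and checked against an external test set; swapping `cv_q2` for a PLS- or SVM-based scorer is one possible design choice, not something this record confirms.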

Original language: English (US)
Pages (from-to): 123-129
Number of pages: 7
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 98
Issue number: 2
DOI: 10.1016/j.chemolab.2009.05.005
State: Published - Oct 15 2009

Fingerprint

Ant colony optimization
Feature extraction
Derivatives
Linear regression
Support vector machines
3-(3,5-dimethylbenzyl)uracil

Keywords

  • 3-(3,5-Dimethylbenzyl)uracil derivatives
  • Ant colony optimization
  • Anti-HIV-1 activities
  • Linear and nonlinear regression methods
  • QSAR

ASJC Scopus subject areas

  • Analytical Chemistry
  • Computer Science Applications
  • Software
  • Process Chemistry and Technology
  • Spectroscopy

Cite this

@article{93d79c9802034d7c9857946d204a8bb9,
title = "Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regressions",
abstract = "A quantitative structure-activity relationship (QSAR) modeling was carried out for the anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives. The ant colony optimization (ACO) strategy was used as a feature selection (descriptor selection) and model development method. Modeling of the relationship between selected molecular descriptors and pEC50 data was achieved by linear (multiple linear regression-MLR, and partial least squares regression-PLS) and nonlinear (support-vector machine regression; SVMR) methods. The QSAR models were validated by cross-validation, as well as through the prediction of activities of an external set of compounds. Both linear and nonlinear methods were found to be better than a PLS-based method using forward stepwise (FS) selection, resulting in accurate predictions, especially for the SVM regression. The squared correlation coefficients of experimental versus predicted activities for the test set obtained by MLR, PLS and SVMR models using ACO feature selection were 0.942, 0.945 and 0.991, respectively.",
keywords = "3-(3,5-Dimethylbenzyl)uracil derivatives, Ant colony optimization, Anti-HIV-1 activities, Linear and nonlinear regression methods, QSAR",
author = "Mohammad Goodarzi and Freitas, {Matheus P.} and Richard Jensen",
year = "2009",
month = "10",
day = "15",
doi = "10.1016/j.chemolab.2009.05.005",
language = "English (US)",
volume = "98",
pages = "123--129",
journal = "Chemometrics and Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier",
number = "2",

}

TY - JOUR

T1 - Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regressions

AU - Goodarzi, Mohammad

AU - Freitas, Matheus P.

AU - Jensen, Richard

PY - 2009/10/15

Y1 - 2009/10/15

AB - A quantitative structure-activity relationship (QSAR) modeling was carried out for the anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives. The ant colony optimization (ACO) strategy was used as a feature selection (descriptor selection) and model development method. Modeling of the relationship between selected molecular descriptors and pEC50 data was achieved by linear (multiple linear regression-MLR, and partial least squares regression-PLS) and nonlinear (support-vector machine regression; SVMR) methods. The QSAR models were validated by cross-validation, as well as through the prediction of activities of an external set of compounds. Both linear and nonlinear methods were found to be better than a PLS-based method using forward stepwise (FS) selection, resulting in accurate predictions, especially for the SVM regression. The squared correlation coefficients of experimental versus predicted activities for the test set obtained by MLR, PLS and SVMR models using ACO feature selection were 0.942, 0.945 and 0.991, respectively.

KW - 3-(3,5-Dimethylbenzyl)uracil derivatives

KW - Ant colony optimization

KW - Anti-HIV-1 activities

KW - Linear and nonlinear regression methods

KW - QSAR

UR - http://www.scopus.com/inward/record.url?scp=69349104350&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=69349104350&partnerID=8YFLogxK

U2 - 10.1016/j.chemolab.2009.05.005

DO - 10.1016/j.chemolab.2009.05.005

M3 - Article

AN - SCOPUS:69349104350

VL - 98

SP - 123

EP - 129

JO - Chemometrics and Intelligent Laboratory Systems

JF - Chemometrics and Intelligent Laboratory Systems

SN - 0169-7439

IS - 2

ER -