Feature selection and linear/nonlinear regression methods for the accurate prediction of glycogen synthase kinase-3β inhibitory activities

Mohammad Goodarzi, Matheus P. Freitas, Richard Jensen

Research output: Contribution to journal, Article

52 Citations (Scopus)

Abstract

Few variables were selected from a pool of calculated Dragon descriptors through three different feature selection methods, namely genetic algorithm (GA), successive projections algorithm (SPA), and fuzzy rough set ant colony optimization (fuzzy rough set ACO). Each set of selected descriptors was regressed against the bioactivities of a series of glycogen synthase kinase-3β (GSK-3β) inhibitors, through linear and nonlinear regression methods, namely multiple linear regression (MLR), artificial neural network (ANN), and support vector machines (SVM). The fuzzy rough set ACO/SVM-based model gave the best estimation/prediction results, demonstrating the nonlinear nature of this analysis and suggesting fuzzy rough set ACO, first introduced in chemistry here, as an improved variable selection method in QSAR for the class of GSK-3β inhibitors.
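The workflow in the abstract (pick a few descriptors from a large pool, then regress them against activity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than Dragon descriptors, a plain greedy forward selection stands in for GA, SPA, and fuzzy rough set ACO, and only the MLR step is shown (no ANN or SVM). The helper names `solve`, `fit_mlr`, `rmse`, and `forward_select` are hypothetical and do not come from the paper.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_mlr(X, y, cols):
    """Multiple linear regression (intercept + chosen columns) via normal equations."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def rmse(X, y, cols, coef):
    """Root-mean-square error of the fitted model on (X, y)."""
    preds = [coef[0] + sum(w * x[c] for w, c in zip(coef[1:], cols)) for x in X]
    return (sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)) ** 0.5

def forward_select(X_tr, y_tr, X_va, y_va, k):
    """Greedy stand-in for GA/SPA/ACO: add the column that most lowers validation RMSE."""
    chosen = []
    for _ in range(k):
        candidates = [c for c in range(len(X_tr[0])) if c not in chosen]
        best = min(
            candidates,
            key=lambda c: rmse(X_va, y_va, chosen + [c],
                               fit_mlr(X_tr, y_tr, chosen + [c])),
        )
        chosen.append(best)
    return chosen

# Synthetic "descriptor pool": 8 columns, only columns 1 and 4 drive the activity.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
y = [2.0 * row[1] - 3.0 * row[4] + rng.gauss(0.0, 0.1) for row in X]

X_train, y_train = X[:140], y[:140]
X_valid, y_valid = X[140:], y[140:]

selected = forward_select(X_train, y_train, X_valid, y_valid, k=2)
coef = fit_mlr(X_train, y_train, selected)
valid_rmse = rmse(X_valid, y_valid, selected, coef)
print("selected descriptor columns:", selected)
print("validation RMSE:", round(valid_rmse, 3))
```

Scoring candidates on a held-out split rather than the training data mirrors the estimation/prediction distinction in the abstract. A nonlinear learner would replace `fit_mlr` and `rmse` in the same loop, which is where the paper reports SVM performing best.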

Original language: English (US)
Pages (from-to): 824-832
Number of pages: 9
Journal: Journal of Chemical Information and Modeling
Volume: 49
Issue number: 4
DOI: 10.1021/ci9000103
State: Published - Apr 27 2009

Fingerprint

Glycogen Synthase Kinase 3
Support vector machines
Feature extraction
regression
Ant colony optimization
Bioactivity
Linear regression
Genetic algorithms
Neural networks
projection
chemistry
Glycogen

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)
  • Computer Science Applications
  • Library and Information Sciences

Cite this

Feature selection and linear/nonlinear regression methods for the accurate prediction of glycogen synthase kinase-3β inhibitory activities. / Goodarzi, Mohammad; Freitas, Matheus P.; Jensen, Richard.

In: Journal of Chemical Information and Modeling, Vol. 49, No. 4, 27.04.2009, p. 824-832.

@article{037768d1c1f9462fa53235f7a2dba0d3,
title = "Feature selection and linear/nonlinear regression methods for the accurate prediction of glycogen synthase kinase-3β inhibitory activities",
abstract = "Few variables were selected from a pool of calculated Dragon descriptors through three different feature selection methods, namely genetic algorithm (GA), successive projections algorithm (SPA), and fuzzy rough set ant colony optimization (fuzzy rough set ACO). Each set of selected descriptors was regressed against the bioactivities of a series of glycogen synthase kinase-3β (GSK-3β) inhibitors, through linear and nonlinear regression methods, namely multiple linear regression (MLR), artificial neural network (ANN), and support vector machines (SVM). The fuzzy rough set ACO/SVM-based model gave the best estimation/prediction results, demonstrating the nonlinear nature of this analysis and suggesting fuzzy rough set ACO, first introduced in chemistry here, as an improved variable selection method in QSAR for the class of GSK-3β inhibitors.",
author = "Goodarzi, Mohammad and Freitas, {Matheus P.} and Jensen, Richard",
year = "2009",
month = "4",
day = "27",
doi = "10.1021/ci9000103",
language = "English (US)",
volume = "49",
pages = "824--832",
journal = "Journal of Chemical Information and Modeling",
issn = "1549-9596",
publisher = "American Chemical Society",
number = "4",

}

TY - JOUR
T1 - Feature selection and linear/nonlinear regression methods for the accurate prediction of glycogen synthase kinase-3β inhibitory activities
AU - Goodarzi, Mohammad
AU - Freitas, Matheus P.
AU - Jensen, Richard
PY - 2009/4/27
Y1 - 2009/4/27
AB - Few variables were selected from a pool of calculated Dragon descriptors through three different feature selection methods, namely genetic algorithm (GA), successive projections algorithm (SPA), and fuzzy rough set ant colony optimization (fuzzy rough set ACO). Each set of selected descriptors was regressed against the bioactivities of a series of glycogen synthase kinase-3β (GSK-3β) inhibitors, through linear and nonlinear regression methods, namely multiple linear regression (MLR), artificial neural network (ANN), and support vector machines (SVM). The fuzzy rough set ACO/SVM-based model gave the best estimation/prediction results, demonstrating the nonlinear nature of this analysis and suggesting fuzzy rough set ACO, first introduced in chemistry here, as an improved variable selection method in QSAR for the class of GSK-3β inhibitors.
UR - http://www.scopus.com/inward/record.url?scp=66149118918&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=66149118918&partnerID=8YFLogxK
U2 - 10.1021/ci9000103
DO - 10.1021/ci9000103
M3 - Article
C2 - 19338295
AN - SCOPUS:66149118918
VL - 49
SP - 824
EP - 832
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
SN - 1549-9596
IS - 4
ER -