Towards better understanding of feature-selection or reduction techniques for Quantitative Structure-Activity Relationship models

Mohammad Goodarzi, Yvan Vander Heyden, Simona Funar-Timofei

Research output: Contribution to journal › Review article

39 Citations (Scopus)

Abstract

A Quantitative Structure-Activity Relationship (QSAR) is a linear or non-linear model that relates variations in molecular descriptors to variations in the biological activity of a series of active and/or inactive molecules. In this article, different feature-selection or reduction methods were each coupled with Partial Least Squares (PLS) modeling during the selection of features. A PLS model built with the entire set of molecular descriptors served as a reference for checking the reliability and performance of the different feature-selection methods. The methods were evaluated on two data sets.
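
The comparison described in the abstract (a reference PLS model on all descriptors versus PLS on a selected subset) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the data are synthetic, the correlation filter is a stand-in for the selection methods (which the abstract does not name), and every function name and parameter value here is hypothetical. PLS1 is fitted with the NIPALS algorithm using the Python standard library only.

```python
import math
import random


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_pls1(X, y, n_components):
    """Fit single-response PLS (PLS1) with the NIPALS algorithm."""
    x_mean, y_mean = column_means(X), sum(y) / len(y)
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centered X
    f = [v - y_mean for v in y]                               # centered y
    components = []
    for _ in range(n_components):
        w = [sum(e * r for e, r in zip(col, f)) for col in zip(*E)]  # X'y
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # residual y is already fully explained
            break
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        p = [sum(e * s for e, s in zip(col, t)) / tt for col in zip(*E)]
        q = sum(r * s for r, s in zip(f, t)) / tt
        # Deflate X and y before extracting the next component.
        E = [[e - s * pj for e, pj in zip(row, p)] for row, s in zip(E, t)]
        f = [r - q * s for r, s in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, row):
    x_mean, y_mean, components = model
    x = [v - m for v, m in zip(row, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(x, w))
        y_hat += q * t
        x = [a - t * pj for a, pj in zip(x, p)]
    return y_hat


def top_k_by_correlation(X, y, k):
    """Stand-in filter: keep the k descriptors with the largest |Pearson r|."""
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in yc))
    scores = []
    for j, col in enumerate(zip(*X)):
        m = sum(col) / len(col)
        xc = [v - m for v in col]
        denom = math.sqrt(sum(v * v for v in xc)) * y_norm
        r = abs(sum(a * b for a, b in zip(xc, yc)) / denom) if denom > 0 else 0.0
        scores.append((r, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def cv_rmse(X, y, n_components, k_features=None, n_folds=5):
    """Cross-validated RMSE; selection is redone inside each training fold."""
    n, sq_err = len(y), 0.0
    for fold in range(n_folds):
        test_idx = set(range(fold, n, n_folds))
        train = [i for i in range(n) if i not in test_idx]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        if k_features is None:
            cols = list(range(len(X[0])))  # reference model: all descriptors
        else:
            cols = top_k_by_correlation(X_tr, y_tr, k_features)
        model = fit_pls1([[r[j] for j in cols] for r in X_tr], y_tr, n_components)
        for i in test_idx:
            pred = predict_pls1(model, [X[i][j] for j in cols])
            sq_err += (pred - y[i]) ** 2
    return math.sqrt(sq_err / n)


# Synthetic example: 60 "molecules", 30 descriptors, 3 of them informative.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(60)]
y = [2.0 * r[0] - 1.5 * r[1] + r[2] + rng.gauss(0, 0.3) for r in X]

rmse_full = cv_rmse(X, y, n_components=2)
rmse_selected = cv_rmse(X, y, n_components=2, k_features=5)
print(f"RMSECV, all descriptors:      {rmse_full:.3f}")
print(f"RMSECV, selected descriptors: {rmse_selected:.3f}")
```

Repeating the selection inside each training fold keeps the held-out molecules from influencing which descriptors are chosen, so the two cross-validated errors can be compared on equal terms.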

Original language: English (US)
Pages (from-to): 49-63
Number of pages: 15
Journal: TrAC - Trends in Analytical Chemistry
Volume: 42
DOI: 10.1016/j.trac.2012.09.008
State: Published - Jan 1 2013

Fingerprint

Feature extraction
Bioactivity
Molecules
structure-activity relationship
modeling
method

Keywords

  • Aldose-reductase inhibitor
  • Biological activity
  • Biological property
  • Feature reduction
  • Feature selection
  • Molecular descriptor
  • Multiple Linear Regression (MLR)
  • Partial Least Squares (PLS)
  • Quantitative Structure-Activity Relationship (QSAR)
  • Rho kinase (ROCK) inhibitor

ASJC Scopus subject areas

  • Analytical Chemistry
  • Environmental Chemistry
  • Spectroscopy

Cite this

Towards better understanding of feature-selection or reduction techniques for Quantitative Structure-Activity Relationship models. / Goodarzi, Mohammad; Vander Heyden, Yvan; Funar-Timofei, Simona.

In: TrAC - Trends in Analytical Chemistry, Vol. 42, 01.01.2013, p. 49-63.

@article{58d786fca0bf4a1596466eec328f60ab,
title = "Towards better understanding of feature-selection or reduction techniques for Quantitative Structure-Activity Relationship models",
abstract = "A Quantitative Structure-Activity Relationship (QSAR) is a linear or non-linear model, which relates variations in molecular descriptors to variations in the biological activity of a series of active and/or inactive molecules. For this article, different feature-selection or reduction methods were all coupled with Partial Least Squares (PLS) modeling during the selection of features. A PLS model was also built with the entire set of molecular descriptors and was used as a reference to check the reliability and the performance of the different feature-selection methods. To evaluate the ability of the different feature-selection methods, they were performed on two data sets.",
keywords = "Aldose-reductase inhibitor, Biological activity, Biological property, Feature reduction, Feature selection, Molecular descriptor, Multiple Linear Regression (MLR), Partial Least Squares (PLS), Quantitative Structure-Activity Relationship (QSAR), Rho kinase (ROCK) inhibitor",
author = "Mohammad Goodarzi and {Vander Heyden}, Yvan and Simona Funar-Timofei",
year = "2013",
month = "1",
day = "1",
doi = "10.1016/j.trac.2012.09.008",
language = "English (US)",
volume = "42",
pages = "49--63",
journal = "TrAC - Trends in Analytical Chemistry",
issn = "0165-9936",
publisher = "Elsevier",

}
