Interpretable linear and nonlinear quantitative structure-selectivity relationship (QSSR) modeling of a biomimetic catalytic system by particle swarm optimization based sparse regression

Lu Xu, Hai Yan Fu, Qiao Bo Yin, Yao Fan, Mohammad Goodarzi, Yuan Bin She

Research output: Contribution to journal (article)

2 Scopus citations


A particle swarm optimization (PSO) based sparse regression (PSO-SR) strategy was proposed to study the quantitative structure-selectivity relationship (QSSR) of a biomimetic catalytic system, in which the selectivity of the mild oxidation of o-nitrotoluene to o-nitrobenzaldehyde was related to the molecular descriptors of 48 metalloporphyrin catalysts. PSO was used to find an optimal combination of variables for linear or nonlinear models. For nonlinear modeling, a set of 44 nonlinear transforms was developed for each single descriptor. To keep the models interpretable and reduce the risk of overfitting, the full set of descriptors was divided into subclasses and the selected variables were forced to be sparsely distributed within each subclass. Model complexity was controlled by adjusting the maximum total number of variables included. Accurate linear and nonlinear PSO-SR models were developed using multiple linear regression (MLR) and partial least squares (PLS), and were validated by randomly splitting the data into training and test objects 500 times. The best predictions were obtained with 10 variables for both the linear (Q² = 0.9460) and nonlinear (Q² = 0.9505) models. The results indicate that PSO-SR could provide an effective and useful strategy for modeling and interpreting complex QSSR problems. The proposed nonlinear modeling method could provide more information for model interpretation by probing and capturing the unknown nonlinear relationship between a descriptor and the observed selectivity.
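The two ingredients of the abstract that a PSO search would evaluate for each candidate variable subset, the subclass sparsity constraint and the repeated random-split validation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `is_sparse` and `q2_repeated_splits`, the 25% test fraction, the per-subclass cap, and the pooled definition of Q² used here are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def is_sparse(selected, subclass_of, max_per_subclass, max_total):
    """Check the (assumed) sparsity rules: a cap on the total number of
    selected variables and a cap on how many come from each subclass."""
    selected = np.asarray(selected, dtype=int)
    if len(selected) > max_total:
        return False
    counts = np.bincount(subclass_of[selected], minlength=subclass_of.max() + 1)
    return bool(np.all(counts <= max_per_subclass))

def q2_repeated_splits(X, y, n_splits=500, test_fraction=0.25, seed=0):
    """Pooled Q2 of an MLR model over repeated random training/test splits.

    Q2 is taken here as 1 - PRESS / SS, with SS measured around the training
    mean of each split; the paper may define it differently."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    press = 0.0   # squared prediction errors on the test objects
    ss_tot = 0.0  # squared deviations of test responses from the training mean
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        # Ordinary least squares with an intercept column (MLR).
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        pred = A_test @ coef
        press += np.sum((y[test] - pred) ** 2)
        ss_tot += np.sum((y[test] - y[train].mean()) ** 2)
    return 1.0 - press / ss_tot

# Synthetic stand-in for 48 catalysts and 20 descriptors in 4 subclasses.
rng = np.random.default_rng(42)
X = rng.normal(size=(48, 20))
y = 2.0 * X[:, 0] - X[:, 5] + 0.5 * X[:, 12] + 0.05 * rng.normal(size=48)
subclass_of = np.repeat(np.arange(4), 5)

candidate = [0, 5, 12]  # one variable from each of three subclasses
if is_sparse(candidate, subclass_of, max_per_subclass=1, max_total=10):
    print("Q2 =", q2_repeated_splits(X[:, candidate], y))
```

In a PSO-based search, a score of this kind would serve as the fitness of each particle, with candidates that violate the sparsity check rejected or penalized; the swarm update itself is omitted here.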

Original language: English (US)
Pages (from-to): 187-195
Number of pages: 9
Journal: Chemometrics and Intelligent Laboratory Systems
State: Published - Dec 15 2016

Keywords

  • Metalloporphyrin catalysts
  • Quantitative structure-activity relationship (QSAR)
  • Quantitative structure-selectivity relationship (QSSR)
  • Selective oxidation
  • Sparse regression (SR)

ASJC Scopus subject areas

  • Analytical Chemistry
  • Software
  • Computer Science Applications
  • Process Chemistry and Technology
  • Spectroscopy
