Challenges of large-class-number classification (LCNC): A novel ensemble strategy (ES) and its application to discriminating the geographical origins of 25 green teas

Hai Yan Fu, Qiao Bo Yin, Lu Xu, Mohammad Goodarzi, Tian Ming Yang, Gang Feng Li, Feng Qiao, Yuan Bin She

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

Large-class-number classification (LCNC) poses new challenges for pattern recognition because of increased data complexity and class overlap. In this study, a novel ensemble strategy (ES) was proposed to tackle LCNC problems. ES combines the One-Versus-Rest (OVR) and One-Versus-One (OVO) strategies to design a set of classifiers with reduced class numbers, and assigns a new object to the class receiving the most votes. When two or more classes tie for the most votes, an additional OVR model is developed to discriminate among them. ES, OVR, OVO and the softmax function were compared for discriminating the geographical origins of 25 green teas using near-infrared (NIR) spectroscopy and Partial Least Squares Discriminant Analysis (PLSDA). With Standard Normal Variate (SNV) as the spectral scatter correction technique, the total accuracy was 0.6468 for OVR-PLSDA, 0.8494 for OVO-PLSDA, 0.9299 for PLSDA-softmax, and 0.9377 for ES-PLSDA. Other preprocessing methods and multiple random splits of the data set gave similar results. The poor performance of OVR can be attributed to the increased likelihood of class overlap and to high sub-model complexity. OVO was less affected by LCNC because it relies on a set of simpler two-class classifiers. PLSDA-softmax could overcome class overlap through nonlinear transformations. ES was shown to extract more useful information from the sub-models and achieved improved performance in LCNC.
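The voting and tie-breaking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how ES builds its reduced-class sub-models, so the sketch uses plain OVO voting and resolves ties with OVR models trained only on the tied classes. A least-squares linear classifier stands in for PLSDA, and every function name (`snv`, `fit_linear`, `score`, `break_tie`, `predict_by_vote`) is hypothetical. The SNV helper assumes the usual per-spectrum definition with the sample standard deviation.

```python
from itertools import combinations

import numpy as np


def snv(X):
    """Standard Normal Variate: centre and scale each spectrum (row) on its own.

    Assumption: the sample standard deviation (ddof=1) is used.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


def fit_linear(X, y):
    """Least-squares fit of a +1/-1 target (a stand-in for a PLSDA sub-model)."""
    A = np.hstack([X, np.ones((len(X), 1))])  # append an intercept column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def score(w, x):
    """Signed score of one sample; positive means the +1 class."""
    return float(np.append(x, 1.0) @ w)


def break_tie(X, labels, x, tied):
    """OVR models trained only on the tied classes; the highest score wins."""
    mask = np.isin(labels, tied)
    Xt, lt = X[mask], labels[mask]
    scores = [score(fit_linear(Xt, np.where(lt == c, 1.0, -1.0)), x) for c in tied]
    return tied[int(np.argmax(scores))]


def predict_by_vote(X, labels, x):
    """One vote per pairwise (OVO) model, then an OVR tie-break if needed."""
    classes = np.unique(labels)
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        mask = (labels == a) | (labels == b)
        y = np.where(labels[mask] == a, 1.0, -1.0)
        w = fit_linear(X[mask], y)
        votes[a if score(w, x) >= 0 else b] += 1
    top = max(votes.values())
    tied = [c for c in classes if votes[c] == top]
    return tied[0] if len(tied) == 1 else break_tie(X, labels, x, tied)


if __name__ == "__main__":
    # Synthetic toy data (not from the paper): three well-separated 2-D clusters.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
    X = np.vstack([c + 0.3 * rng.standard_normal((10, 2)) for c in centers])
    labels = np.repeat([0, 1, 2], 10)
    print(predict_by_vote(X, labels, np.array([5.1, 0.2])))
```

For real spectra, `snv` would be applied to the training and test matrices before fitting. The toy example skips it because SNV needs more than two variables per sample to retain any information.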

Original language: English (US)
Pages (from-to): 43-49
Number of pages: 7
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 157
State: Published - Oct 15 2016

Keywords

  • Ensemble strategy (ES)
  • Geographical origins of green teas
  • Large-class-number classification (LCNC)
  • Near-infrared (NIR) spectroscopy
  • One-Versus-One (OVO)
  • One-Versus-Rest (OVR)

ASJC Scopus subject areas

  • Analytical Chemistry
  • Software
  • Process Chemistry and Technology
  • Spectroscopy
  • Computer Science Applications
