An ensemble approach to multi-label classification of textual data

Karol Kurach, Krzysztof Pawłowski, Łukasz Romaszko, Marcin Tatjewski, Andrzej Janusz, Hung Son Nguyen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

In this paper, we investigate different approaches to multi-label classification of textual data, with a special focus on ensemble techniques. Commonly used classifier ensembles combine the outputs of base learning models in order to improve the learning results. The multi-label classification problem introduces new challenges for ensemble learning methods. For instance, one needs to decide in which order it is better to aggregate the base learners: at the level of individual labels first and then for whole label sets, or the other way around. We discuss this issue and experimentally compare selected approaches. In the experiments, we use data from the JRS'2012 Data Mining Competition, whose scope was topical classification of biomedical research papers, and as the base learners we use the models employed by the winners of this contest.
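The aggregation-order question raised in the abstract can be illustrated with a small sketch. This is only one possible reading of the two orders, not the authors' method: the score matrix, the 0.5 threshold, the use of averaging and majority voting, and the function names `labels_then_sets` and `sets_then_vote` are all assumptions made for illustration.

```python
# Hypothetical per-label scores from three base learners for one document.
# Rows are learners, columns are labels; the numbers are made up.
SCORES = [
    [0.9, 0.45, 0.1],
    [0.4, 0.45, 0.2],
    [0.4, 0.90, 0.3],
]


def labels_then_sets(scores, threshold=0.5):
    """Average each label's score across learners, then threshold once
    to obtain the final label set."""
    n_learners = len(scores)
    n_labels = len(scores[0])
    mean = [sum(row[j] for row in scores) / n_learners for j in range(n_labels)]
    return [int(m >= threshold) for m in mean]


def sets_then_vote(scores, threshold=0.5):
    """Threshold each learner's scores into its own label set first,
    then keep a label only if a strict majority of learners predicted it."""
    n_learners = len(scores)
    n_labels = len(scores[0])
    votes = [[int(s >= threshold) for s in row] for row in scores]
    return [int(2 * sum(v[j] for v in votes) > n_learners) for j in range(n_labels)]


if __name__ == "__main__":
    print(labels_then_sets(SCORES))  # [1, 1, 0]
    print(sets_then_vote(SCORES))    # [0, 0, 0]
```

On this toy input the two orders disagree: averaging scores lets one confident learner pull a label over the threshold, while voting on already-thresholded label sets discards that confidence. Which order works better on real data is an empirical question of the kind the paper sets out to compare.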

Original language: English (US)
Title of host publication: Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings
Pages: 306-317
Number of pages: 12
State: Published - 2012
Externally published: Yes
Event: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012 - Nanjing, China
Duration: Dec 15 2012 to Dec 18 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7713 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012
Country: China
City: Nanjing
Period: 12/15/12 to 12/18/12

Keywords

  • Data mining
  • Ensemble learning
  • Multi-label classification
  • Topical classification

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
