AR-Boost: Reducing overfitting by a robust data-driven regularization strategy

Baidya Nath Saha, Gautam Kunapuli, Nilanjan Ray, Joseph A Maldjian, Sriraam Natarajan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We introduce a novel, robust data-driven regularization strategy called Adaptive Regularized Boosting (AR-Boost), motivated by a desire to reduce overfitting. We replace AdaBoost's hard margin with a regularized soft margin that trades off a larger margin against misclassification errors. Minimizing this regularized exponential loss results in a boosting algorithm that relaxes the weak learning assumption further: it can use classifiers with error greater than 1/2. This enables a natural extension to multiclass boosting, and further reduces overfitting in both the binary and multiclass cases. We derive bounds for training and generalization errors, and relate them to AdaBoost. Finally, we show empirical results on benchmark data that establish the robustness of our approach and improved performance overall.
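The abstract describes an AdaBoost-style algorithm whose weight update is relaxed so that weak learners with error above 1/2 remain usable. The sketch below illustrates that idea with decision stumps; the specific update `alpha = 0.5 * ln(rho * (1 - err) / err)` with a fixed `rho > 1` and the acceptance threshold `rho / (1 + rho)` are illustrative assumptions, not the exact AR-Boost rule (the paper chooses its regularization adaptively).

```python
import numpy as np

def boost_stumps(X, y, n_rounds=10, rho=2.0):
    """AdaBoost-style boosting with a relaxed weight update.

    rho > 1 loosens the weak-learning requirement: a round is kept as
    long as its weighted error is below rho / (1 + rho), which exceeds
    1/2, mirroring the relaxation the abstract describes. The update
    alpha = 0.5 * ln(rho * (1 - err) / err) is an illustrative choice.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)           # uniform sample weights
    ensemble = []                      # (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best = None
        # exhaustive search over single-feature threshold stumps
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(X[:, j] <= thr, pol, -pol)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        if err >= rho / (1.0 + rho):   # relaxed stopping condition
            break
        alpha = 0.5 * np.log(rho * (1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred) # down-weight correct, up-weight wrong
        w /= w.sum()                   # renormalize the distribution
        ensemble.append((j, thr, pol, alpha))
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote of all stumps."""
    score = np.zeros(len(X))
    for j, thr, pol, alpha in ensemble:
        score += alpha * np.where(X[:, j] <= thr, pol, -pol)
    return np.sign(score)
```

Setting `rho = 1` recovers the classical AdaBoost update and its error-below-1/2 requirement, which is a quick way to see how the extra parameter relaxes the weak-learning assumption.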

Original language: English (US)
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings
Pages: 1-16
Number of pages: 16
Volume: 8190 LNAI
Edition: PART 3
DOI: 10.1007/978-3-642-40994-3_1
State: Published - Oct 31 2013
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2013 - Prague, Czech Republic
Duration: Sep 23 2013 - Sep 27 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 3
Volume: 8190 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Fingerprint

  • Adaptive boosting
  • Overfitting
  • Boosting
  • Data-driven
  • Margin
  • Regularization
  • AdaBoost
  • Multi-class
  • Misclassification Error
  • Generalization Error
  • Adaptive Strategies
  • Natural Extension
  • Classifiers
  • Trade-offs
  • Binary
  • Benchmark
  • Robustness
  • Strategy

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Saha, B. N., Kunapuli, G., Ray, N., Maldjian, J. A., & Natarajan, S. (2013). AR-Boost: Reducing overfitting by a robust data-driven regularization strategy. In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings (PART 3 ed., Vol. 8190 LNAI, pp. 1-16). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8190 LNAI, No. PART 3). https://doi.org/10.1007/978-3-642-40994-3_1

@inproceedings{1f2c4998f6e04138a7813bad4bc84960,
title = "AR-Boost: Reducing overfitting by a robust data-driven regularization strategy",
abstract = "We introduce a novel, robust data-driven regularization strategy called Adaptive Regularized Boosting (AR-Boost), motivated by a desire to reduce overfitting. We replace AdaBoost's hard margin with a regularized soft margin that trades off a larger margin against misclassification errors. Minimizing this regularized exponential loss results in a boosting algorithm that relaxes the weak learning assumption further: it can use classifiers with error greater than 1/2. This enables a natural extension to multiclass boosting, and further reduces overfitting in both the binary and multiclass cases. We derive bounds for training and generalization errors, and relate them to AdaBoost. Finally, we show empirical results on benchmark data that establish the robustness of our approach and improved performance overall.",
author = "Saha, {Baidya Nath} and Gautam Kunapuli and Nilanjan Ray and Maldjian, {Joseph A} and Sriraam Natarajan",
year = "2013",
month = "10",
day = "31",
doi = "10.1007/978-3-642-40994-3_1",
language = "English (US)",
isbn = "9783642409936",
volume = "8190 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 3",
pages = "1--16",
booktitle = "Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings",
edition = "PART 3",

}

TY - GEN
T1 - AR-Boost
T2 - Reducing overfitting by a robust data-driven regularization strategy
AU - Saha, Baidya Nath
AU - Kunapuli, Gautam
AU - Ray, Nilanjan
AU - Maldjian, Joseph A
AU - Natarajan, Sriraam
PY - 2013/10/31
Y1 - 2013/10/31
N2 - We introduce a novel, robust data-driven regularization strategy called Adaptive Regularized Boosting (AR-Boost), motivated by a desire to reduce overfitting. We replace AdaBoost's hard margin with a regularized soft margin that trades off a larger margin against misclassification errors. Minimizing this regularized exponential loss results in a boosting algorithm that relaxes the weak learning assumption further: it can use classifiers with error greater than 1/2. This enables a natural extension to multiclass boosting, and further reduces overfitting in both the binary and multiclass cases. We derive bounds for training and generalization errors, and relate them to AdaBoost. Finally, we show empirical results on benchmark data that establish the robustness of our approach and improved performance overall.
AB - We introduce a novel, robust data-driven regularization strategy called Adaptive Regularized Boosting (AR-Boost), motivated by a desire to reduce overfitting. We replace AdaBoost's hard margin with a regularized soft margin that trades off a larger margin against misclassification errors. Minimizing this regularized exponential loss results in a boosting algorithm that relaxes the weak learning assumption further: it can use classifiers with error greater than 1/2. This enables a natural extension to multiclass boosting, and further reduces overfitting in both the binary and multiclass cases. We derive bounds for training and generalization errors, and relate them to AdaBoost. Finally, we show empirical results on benchmark data that establish the robustness of our approach and improved performance overall.
UR - http://www.scopus.com/inward/record.url?scp=84886476301&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84886476301&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-40994-3_1
DO - 10.1007/978-3-642-40994-3_1
M3 - Conference contribution
AN - SCOPUS:84886476301
SN - 9783642409936
VL - 8190 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 16
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings
ER -