Kagan Tumer's Publications

Bayes Error Rate Estimation using Classifier Ensembles. K. Tumer and J. Ghosh. International Journal of Smart Engineering System Design, 5(2):95–110, 2003.

Abstract

The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is.

Classical approaches for estimating or finding bounds for the Bayes error in general yield rather weak results for small sample sizes, unless the problem has some simple characteristics such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error, with negligible extra computation.

Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information-theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that looks only at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers.

The methods are illustrated on artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets, for which the true Bayes rates are unknown.
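
As a rough illustration of the averaging idea in the abstract, the sketch below solves for the Bayes error from two measured error rates. It rests on an assumed model that this page does not state: each classifier's error is the Bayes error plus an added error, and averaging N classifiers whose added errors have correlation delta scales that added error by (1 + delta(N - 1))/N. The function name, its parameters, and the clipping at zero are hypothetical choices for this example, not the article's implementation.

```python
def bayes_error_from_averaging(e_single, e_ensemble, n_classifiers, correlation=0.0):
    """Estimate the Bayes error from single-classifier and averaged-ensemble error rates.

    Assumed model (an assumption of this sketch, not taken from this page):
        e_single   = e_bayes + e_added
        e_ensemble = e_bayes + e_added * (1 + correlation * (N - 1)) / N
    """
    n = n_classifiers
    if n < 2:
        raise ValueError("need at least two classifiers in the ensemble")
    if not 0.0 <= correlation < 1.0:
        raise ValueError("correlation must lie in [0, 1)")

    # Factor by which averaging shrinks the added error under the assumed model.
    shrink = (1.0 + correlation * (n - 1)) / n

    # Eliminate e_added from the two model equations:
    #   e_ensemble - shrink * e_single = e_bayes * (1 - shrink)
    estimate = (e_ensemble - shrink * e_single) / (1.0 - shrink)

    # An error rate cannot be negative; clip noisy estimates at zero.
    return max(estimate, 0.0)


if __name__ == "__main__":
    # Hypothetical error rates chosen only to exercise the function.
    print(bayes_error_from_averaging(e_single=0.15, e_ensemble=0.11, n_classifiers=5))
```

In this sketch the correlation argument stands in for the output-correlation measure that the second method adds; how that measure is actually computed is described in the article, not here.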

BibTeX Entry

@article{tumer-ghosh_jsesd03,
        author = {K. Tumer and J. Ghosh},
        title = {Bayes Error Rate Estimation using Classifier Ensembles},
        journal = {International Journal of Smart Engineering System Design},
        volume = {5},
        number = {2},
        pages = {95--110},
	abstract = {The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is.
Classical approaches for estimating or finding bounds for the Bayes error in general yield rather weak results for small sample sizes, unless the problem has some simple characteristics such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error, with negligible extra computation.
Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information-theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that looks only at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers.
The methods are illustrated on artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets, for which the true Bayes rates are unknown.},
	bib2html_pubtype = {Journal Articles},
	bib2html_rescat = {Classifier Ensembles, Bayes Error Estimation},
        year={2003}
}
