Out-of-bag estimation of the optimal sample size in bagging
✍ Scribed by Gonzalo Martínez-Muñoz; Alberto Suárez
- Publisher
- Elsevier Science
- Year
- 2010
- Language
- English
- File size
- 321 KB
- Volume
- 43
- Category
- Article
- ISSN
- 0031-3203
✦ Synopsis
The performance of m-out-of-n bagging with and without replacement is analyzed as a function of the sampling ratio (m/n). Standard bagging uses resampling with replacement to generate bootstrap samples of the same size as the original training set (m_wr = n). Without-replacement methods typically use half samples (m_wor = n/2). These choices of sample size are arbitrary and need not be optimal in terms of the classification performance of the ensemble. We propose to use out-of-bag estimates of the generalization accuracy to select a near-optimal value for the sampling ratio. Ensembles of classifiers trained on independent samples whose size minimizes the out-of-bag error of the ensemble generally improve on the performance of standard bagging and can be built efficiently.
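The selection idea in the synopsis can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-nearest-neighbour base learner, the function names (`nearest_neighbor_predict`, `oob_error`, `select_ratio`), the candidate grid of ratios, and the default ensemble size are all assumptions chosen to keep the example self-contained. For each candidate ratio, every ensemble member is trained on a sample of size m, each training instance is classified only by the members that did not see it, and the ratio with the lowest resulting out-of-bag error is kept.

```python
import random
from collections import Counter


def nearest_neighbor_predict(train, x):
    """Hypothetical base learner: 1-NN over (features, label) pairs."""
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]


def oob_error(data, ratio, n_estimators, with_replacement, rng):
    """Out-of-bag error of an m-out-of-n bagging ensemble with m = ratio * n."""
    n = len(data)
    m = max(1, round(ratio * n))
    votes = [Counter() for _ in range(n)]  # OOB votes collected per instance
    for _ in range(n_estimators):
        if with_replacement:
            idx = [rng.randrange(n) for _ in range(m)]
        else:
            idx = rng.sample(range(n), min(m, n))
        in_bag = set(idx)
        sample = [data[i] for i in idx]
        # Each member votes only on the instances it was not trained on.
        for i in range(n):
            if i not in in_bag:
                votes[i][nearest_neighbor_predict(sample, data[i][0])] += 1
    scored = [(v, data[i][1]) for i, v in enumerate(votes) if v]
    if not scored:
        # No instance was ever out of bag (e.g. m = n without replacement),
        # so no estimate exists; treat this ratio as the worst case.
        return 1.0
    wrong = sum(v.most_common(1)[0][0] != y for v, y in scored)
    return wrong / len(scored)


def select_ratio(data, ratios, n_estimators=25, with_replacement=True, seed=0):
    """Return the candidate ratio with the lowest OOB error, plus all errors."""
    rng = random.Random(seed)
    errors = {r: oob_error(data, r, n_estimators, with_replacement, rng)
              for r in ratios}
    return min(errors, key=errors.get), errors
```

A call such as `select_ratio(data, [0.2, 0.5, 1.0])`, where `data` is a list of `(feature_tuple, label)` pairs, returns the chosen ratio and the per-ratio error estimates. Because the out-of-bag votes reuse the training data, no separate validation set is needed, which is the efficiency argument the synopsis makes.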