𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Top-down decision tree learning as information-based boosting

✍ Scribed by Eiji Takimoto; Akira Maruoka


Publisher
Elsevier Science
Year
2003
Tongue
English
Weight
180 KB
Volume
292
Category
Article
ISSN
0304-3975


✦ Synopsis


We consider a boosting technique that can be directly applied to multiclass classification problems. Although many boosting algorithms have been proposed so far, most of them are developed essentially for binary classification problems, and in order to handle multiclass classification problems, they need to be reduced somehow to binary ones. In order to avoid such reductions, we introduce a notion of the pseudo-entropy function G that gives an information-theoretic criterion, called the conditional G-entropy, for measuring the loss of hypotheses. The conditional G-entropy turns out to be useful for defining the weakness of hypotheses that approximate, in some way, a multiclass function in general, so that we can consider the boosting problem without reduction. We show that the top-down decision tree learning algorithm using the conditional G-entropy as its splitting criterion is an efficient boosting algorithm. Namely, the algorithm intends to minimize the conditional G-entropy, rather than the classification error. In the binary case, our algorithm turns out to be identical to the error-based boosting algorithm proposed by Kearns and Mansour, and our analysis gives a simpler proof of their results.
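To make the splitting criterion concrete, here is a minimal sketch of a greedy top-down split search that minimizes a conditional entropy over the blocks of a partition. It uses Shannon entropy as one admissible choice of pseudo-entropy function G; the paper's G is more general, and the function and variable names below are illustrative, not taken from the paper.

```python
import math
from collections import Counter

def g_entropy(labels):
    # Shannon entropy of the empirical label distribution, used here
    # as one concrete example of a pseudo-entropy function G.
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_g_entropy(labels, partition):
    # Conditional G-entropy of labels given a partition of the sample:
    # the block-size-weighted average sum_b (|b|/n) * G(labels on b).
    n = len(labels)
    return sum(len(block) / n * g_entropy([labels[i] for i in block])
               for block in partition)

def best_split(X, y):
    # Greedy top-down criterion: choose the (feature, threshold) pair
    # whose induced two-block partition minimizes the conditional
    # G-entropy, rather than the classification error directly.
    best, best_h = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [i for i, x in enumerate(X) if x[f] <= t]
            right = [i for i, x in enumerate(X) if x[f] > t]
            if not left or not right:
                continue  # skip degenerate splits
            h = conditional_g_entropy(y, [left, right])
            if h < best_h:
                best_h, best = h, (f, t)
    return best, best_h
```

For example, on a one-feature sample `X = [[0], [1], [2], [3]]` with labels `y = [0, 0, 1, 1]`, the search returns the split `(0, 1)` with conditional G-entropy 0, since thresholding feature 0 at 1 separates the classes perfectly. Because the criterion is defined over arbitrary label alphabets, the same code handles multiclass labels with no binary reduction.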


📜 SIMILAR VOLUMES


On the Boosting Ability of Top-Down Decision Tree Learning Algorithms
✍ Michael Kearns; Yishay Mansour 📂 Article 📅 1999 🏛 Elsevier Science 🌐 English ⚖ 513 KB

We analyze the performance of top-down algorithms for decision tree learning, such as those employed by the widely used C4.5 and CART software packages. Our main result is a proof that such algorithms are boosting algorithms. By this we mean that if the functions that label the internal nodes of the …