
Efficient decision tree construction on streaming data

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '03), Washington, D.C., 2003.08.24-2003.08.27. ACM Press.

Scribed by Jin, Rouming; Agrawal, Gagan


Book ID: 126219337
Publisher: ACM Press
Year: 2003
Language: English
File size: 161 KB
Category: Article
ISBN-13: 9781581137378


Synopsis


Decision tree construction is a well-studied problem in data mining. Recently, there has been much interest in mining streaming data. Domingos and Hulten have presented a one-pass algorithm for decision tree construction. Their work uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the tree constructed. In this paper, we revisit this problem. We make the following two contributions: 1) We present a numerical interval pruning (NIP) approach for efficiently processing numerical attributes. Our results show an average of 39% reduction in execution times. 2) We exploit the properties of the gain function entropy (and gini) to reduce the sample size required for obtaining a given bound on the accuracy. Our experimental results show a 37% reduction in the number of data instances required.
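
To make the role of the Hoeffding inequality in the synopsis concrete, the following is a minimal sketch of the standard bound and of the two gain functions named above. It illustrates only the baseline sample-size calculation, not the reduced sample size or the NIP method that the paper proposes. The function names are hypothetical, and the value range of 1.0 used in the example is an assumption that corresponds to entropy with two classes.

```python
import math


def hoeffding_epsilon(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability at least 1 - delta, the mean of n
    independent observations of a variable with the given range is within
    epsilon of its true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def samples_needed(value_range: float, delta: float, epsilon: float) -> int:
    """Smallest n for which the Hoeffding epsilon is at most `epsilon`."""
    return math.ceil(
        value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2)
    )


def entropy(class_counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    total = sum(class_counts)
    return -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )


def gini(class_counts):
    """Gini impurity of a class distribution given as raw counts."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


if __name__ == "__main__":
    # Assumed setting: range 1.0, 95% confidence, tolerance 0.01.
    n = samples_needed(1.0, 0.05, 0.01)
    print(n, hoeffding_epsilon(1.0, 0.05, n))
    print(entropy([5, 5]), gini([5, 5]))
```

Because the required n grows with 1/epsilon squared, any tightening of the bound translates directly into fewer data instances, which is the quantity the synopsis reports as reduced.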


SIMILAR VOLUMES


[ACM Press the ninth ACM SIGKDD internat
Jin, Rouming; Agrawal, Gagan | Article | 2003 | ACM Press | English | 161 KB

This conference brings together researchers and practitioners and focuses on new developments in knowledge discovery and data mining. The challenge of extracting knowledge from data is an area of common interest to researchers in several fields, including statistics, databases, pattern recognition,

[ACM Press the ninth ACM SIGKDD internat
Last, Mark; Friedman, Menahem; Kandel, Abraham | Article | 2003 | ACM Press | English | 289 KB


[ACM Press the ninth ACM SIGKDD internat
Hsu, Wynne; Dai, Jing; Lee, Mong Li | Article | 2003 | ACM Press | English | 398 KB
