Bayesian classification using a noninformative prior and mislabeled training data
Scribed by Robert S. Lynch Jr.; Peter K. Willett
- Publisher
- Elsevier Science
- Year
- 1999
- Tongue
- English
- Weight
- 207 KB
- Volume
- 336
- Category
- Article
- ISSN
- 0016-0032
Synopsis
The average probability of error is used to demonstrate the performance of a Bayesian classification test, referred to as the Combined Bayes Test (CBT), when the training data of each class are mislabeled. The CBT combines the information in discrete training and test data to infer symbol probabilities, where a uniform Dirichlet prior (i.e., a noninformative prior of complete ignorance) is assumed for all classes. Using the CBT, classification performance is shown to degrade when mislabeling exists in the training data, with a severity that depends on the mislabeling probabilities. It is also shown that as the mislabeling probabilities increase, M*, the best quantization fineness related to the Hughes phenomenon of pattern recognition, also increases. Note that even when the actual mislabeling probabilities are known by the CBT, it is not possible to achieve the classification performance obtainable without mislabeling. However, the negative effect of mislabeling can be diminished, with more success for smaller mislabeling probabilities, if a data reduction method called the Bayesian Data Reduction Algorithm (BDRA) is applied to the training data.