We present a conceptual framework within which we can analyze simple reward schemes for classifier systems. The framework consists of a set of classifiers, a learning mechanism, and a finite automaton environment that outputs payoff. We find that many reward schemes have negative biases that degrade
Credit assignment and discovery in classifier systems
Scribed by G. E. Liepins; M. R. Hilliard; Mark Palmer; Gita Rangarajan
- Publisher: John Wiley and Sons
- Year: 1991
- Language: English
- File size: 802 KB
- Volume: 6
- Category: Article
- ISSN: 0884-8173
Synopsis
Classifier systems are "discovery" production rule systems that utilize the genetic algorithm for discovery and allocate credit through the bucket brigade. For any given problem, the success of a classifier system depends on the choice of representation, the system's ability to attain reward or punishment states (evaluation states), accurate estimation of the relative merit of individual classifiers, and the genetic algorithm's ability to use information about the current population of rules to generate better rules. This article addresses the adequacy of the bucket brigade and backward averaging for credit assignment and reviews a preliminary study of two variants in conjunction with rules that are fully enumerated as well as with discovery. Potential difficulties with each of these methods are highlighted in several theoretical examples, including one from the literature. Preliminary results and tentative similarities between these hybrids and Sutton's Adaptive Heuristic Critic (AHC) are suggested.
SIMILAR VOLUMES
Randomly connected Boolean networks have been used as mathematical models of neural, genetic, and immune systems. A key quantity of such networks is the number of basins of attraction in the state space. The number of basins of attraction changes as a function of the size of the network, its connect