Application of a niched Pareto genetic algorithm for selecting features for nuclear transients classification
✍ Scribed by P. Baraldi; N. Pedroni; E. Zio
- Publisher: John Wiley and Sons
- Year: 2009
- Tongue: English
- Weight: 606 KB
- Volume: 24
- Category: Article
- ISSN: 0884-8173
✦ Synopsis
Feature selection for transient classification is the problem of choosing, among the many monitored parameters (i.e., the features), those to be used for efficiently recognizing developing transient patterns. It is a critical issue for the application of "on condition" diagnostic techniques in complex systems such as nuclear power plants, where hundreds of parameters are measured. Indeed, irrelevant and noisy features have been shown to unnecessarily increase the complexity of the classification problem and to degrade diagnostic performance. In this paper, the problem of selecting the features for efficient transient classification is tackled by means of multiobjective genetic algorithms. The approach identifies a family of feature subsets that are equivalently optimal in the Pareto sense. However, in large feature spaces the standard Pareto-based multiobjective genetic algorithm search may have difficulty converging to a representative Pareto front: the identified solutions may be unevenly distributed in the objective function space and thus fail to give a full picture of the potential Pareto-optimal solutions. To overcome this problem, a niched Pareto genetic algorithm is adopted in this work. The feature subsets examined during the search are evaluated with respect to two optimization objectives: the classification accuracy of a Fuzzy K-Nearest Neighbors classifier and the number of features in the subset. During the genetic search, the algorithm applies a controlled "niching pressure" to spread the population out in the search space, so that convergence is shared among different niches of the Pareto front, which is thus evenly covered. The method is tested on a diagnostic problem characterized by a very large number of process features available for the classification of simulated transients in the feedwater system of a boiling water reactor.
The dynamics of the transient signals are captured by wavelet decomposition, which increases the complexity of the search for the optimal feature subsets by tripling the number of features to be considered.
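The two objectives named in the synopsis (classification accuracy to be maximized, number of features to be minimized) can be illustrated with a minimal Python sketch of Pareto dominance and of a niche count of the kind used to apply niching pressure. This is not the authors' implementation: the function names, the triangular sharing function, the normalization of the feature count, the `sigma_share` radius, and the candidate values are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# One candidate feature subset, summarized by its two objective values:
# (classification accuracy, number of features). Values are hypothetical.
Objectives = Tuple[float, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one.

    Accuracy is maximized; the number of features is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Return the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def niche_count(p: Objectives, population: List[Objectives],
                sigma_share: float, max_features: int) -> float:
    """Sum of triangular sharing values sh(d) = max(0, 1 - d / sigma_share).

    Distances are Euclidean in objective space, with the feature count
    scaled by max_features (an assumed normalization). sigma_share must be
    positive. A larger count means p sits in a more crowded niche.
    """
    total = 0.0
    for q in population:
        d_acc = p[0] - q[0]
        d_feat = (p[1] - q[1]) / max_features
        d = (d_acc ** 2 + d_feat ** 2) ** 0.5
        total += max(0.0, 1.0 - d / sigma_share)
    return total


# Hypothetical (accuracy, number of features) pairs for five feature subsets.
candidates = [(0.90, 5), (0.85, 3), (0.90, 7), (0.80, 3), (0.95, 10)]
front = pareto_front(candidates)
print(front)
```

In a niched scheme, when two competing candidates are both nondominated, the one with the smaller niche count would typically be preferred, which pushes the population toward sparsely covered regions of the front.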
📜 SIMILAR VOLUMES
Genetic programming (GP) is used to classify tumours based on ¹H nuclear magnetic resonance (NMR) spectra of biopsy extracts. Analysis of such data would ideally give not only a classification result but also indicate which parts of the spectra are driving the classification (i.e. feature selection