Data classification based on tolerant rough set
Written by Daijin Kim
- Publisher: Elsevier Science
- Year: 2001
- Language: English
- File size: 202 KB
- Volume: 34
- Category: Article
- ISSN: 0031-3203
Synopsis
This paper proposes a new data classification method based on the tolerant rough set that extends the existing equivalent rough set. Similarity measure between two data is described by a distance function of all constituent attributes and they are defined to be tolerant when their similarity measure exceeds a similarity threshold value. The determination of optimal similarity threshold value is very important for the accurate classification. So, we determine it optimally by using the genetic algorithm (GA), where the goal of evolution is to balance two requirements such that (1) some tolerant objects are required to be included in the same class as many as possible and (2) some objects in the same class are required to be tolerable as much as possible. After finding the optimal similarity threshold value, a tolerant set of each object is obtained and the data set is grouped into the lower and upper approximation set depending on the coincidence of their classes. We propose a two-stage classification method that all data are classified by using the lower approximation at the first stage and then the non-classified data at the first stage are classified again by using the rough membership functions obtained from the upper approximation set. The validity of the proposed classification method is tested by applying it to the IRIS data classification and its classification performance and processing time are compared with those of other classification methods such as BPNN, OFUNN, and FCM.
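The pipeline described in the synopsis can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions, because the synopsis does not give them: the similarity measure (here 1 minus the mean range-normalised absolute difference), the hand-picked threshold `TAU` (the paper tunes this value with a GA, which is omitted here), the toy data, and the stage-2 rule, which uses the standard rough membership (the share of a sample's tolerant neighbours that belong to each class). All function names are hypothetical.

```python
from collections import Counter

def attribute_ranges(X):
    """Per-attribute value range, used to normalise distances (1.0 if constant)."""
    return [(max(col) - min(col)) or 1.0 for col in zip(*X)]

def similarity(x, y, ranges):
    """Assumed measure: 1 minus the mean range-normalised absolute difference."""
    return 1.0 - sum(abs(a - b) / r for a, b, r in zip(x, y, ranges)) / len(ranges)

def tolerant_indices(q, X, ranges, tau):
    """Indices of training objects whose similarity to q reaches the threshold."""
    return {i for i, x in enumerate(X) if similarity(q, x, ranges) >= tau}

def approximations(X, labels, ranges, tau):
    """Lower and upper approximation of every class under the tolerance relation."""
    tol = [tolerant_indices(x, X, ranges, tau) for x in X]
    lower, upper = {}, {}
    for c in sorted(set(labels)):
        members = {i for i, lab in enumerate(labels) if lab == c}
        # Lower: the whole tolerant set lies inside the class.
        lower[c] = {i for i, t in enumerate(tol) if t <= members}
        # Upper: the tolerant set overlaps the class.
        upper[c] = {i for i, t in enumerate(tol) if t & members}
    return lower, upper

def classify(q, X, labels, ranges, tau, lower):
    """Return (label, stage); stage is 1, 2, or 0 when q has no tolerant neighbour."""
    T = tolerant_indices(q, X, ranges, tau)
    if not T:
        return None, 0
    # Stage 1: every tolerant neighbour is in the lower approximation of one class.
    for c, low in lower.items():
        if T <= low:
            return c, 1
    # Stage 2 (assumed rule): pick the class with the largest rough membership.
    counts = Counter(labels[i] for i in T)
    best = max(sorted(counts), key=lambda c: counts[c] / len(T))
    return best, 2

# Toy data (hypothetical, not the IRIS set used in the paper).
X = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.2),
     (1.0, 1.0), (0.9, 1.0), (1.0, 0.9), (0.4, 0.3)]
labels = ["a", "a", "a", "b", "b", "b", "b"]
TAU = 0.75  # fixed by hand here; the paper searches for it with a GA

ranges = attribute_ranges(X)
lower, upper = approximations(X, labels, ranges, TAU)
print(classify((0.95, 0.95), X, labels, ranges, TAU, lower))  # ('b', 1)
print(classify((0.3, 0.25), X, labels, ranges, TAU, lower))   # ('a', 2)
```

In this toy set, object 2 (class "a") and object 6 (class "b") are tolerant of each other, so both fall outside the lower approximations and appear only in the upper ones. A query near that boundary is therefore resolved at the second stage, while a query deep inside the "b" cluster is settled at the first.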
SIMILAR VOLUMES