
Data classification based on tolerant rough set

✍ Scribed by Daijin Kim


Publisher: Elsevier Science
Year: 2001
Tongue: English
Weight: 202 KB
Volume: 34
Category: Article
ISSN: 0031-3203


✦ Synopsis


This paper proposes a new data classification method based on the tolerant rough set, which extends the existing equivalent rough set. The similarity measure between two data is described by a distance function of all constituent attributes, and two data are defined to be tolerant when their similarity measure exceeds a similarity threshold value. Determining the optimal similarity threshold value is very important for accurate classification, so we determine it using a genetic algorithm (GA), where the goal of evolution is to balance two requirements: (1) as many tolerant objects as possible should be included in the same class, and (2) objects in the same class should be tolerant of one another as much as possible. After finding the optimal similarity threshold value, a tolerant set is obtained for each object, and the data set is grouped into lower and upper approximation sets depending on the coincidence of their classes. We propose a two-stage classification method: in the first stage, all data are classified using the lower approximation; in the second stage, the data left unclassified by the first stage are classified using the rough membership functions obtained from the upper approximation set. The validity of the proposed classification method is tested by applying it to the IRIS data classification problem, and its classification performance and processing time are compared with those of other classification methods such as BPNN, OFUNN, and FCM.
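The pipeline described in the synopsis can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several pieces are assumptions made only for illustration: the similarity measure (1 minus a scaled Euclidean distance), the use of `>=` against the threshold, a hand-picked threshold in place of the GA search, the rough membership taken as the fraction of tolerant neighbours in each class, and all function names and the toy data.

```python
import math
from collections import Counter

def similarity(x, y, scale):
    """Assumed measure: 1 minus the Euclidean distance divided by `scale`."""
    return 1.0 - math.dist(x, y) / scale

def tolerant_indices(x, data, threshold, scale):
    """Indices of the objects in `data` that are tolerant with `x`."""
    return [j for j, y in enumerate(data) if similarity(x, y, scale) >= threshold]

def approximations(data, labels, threshold, scale):
    """Lower and upper approximation (as index sets) for every class."""
    lower = {c: set() for c in set(labels)}
    upper = {c: set() for c in set(labels)}
    for i, x in enumerate(data):
        classes = {labels[j] for j in tolerant_indices(x, data, threshold, scale)}
        for c in classes:
            upper[c].add(i)          # tolerant set overlaps class c
        if classes == {labels[i]}:
            lower[labels[i]].add(i)  # tolerant set lies entirely in one class
    return lower, upper

def classify(x, data, labels, threshold, scale):
    """Return (label, stage); stage 0 means no tolerant object was found."""
    lower, _ = approximations(data, labels, threshold, scale)
    neighbours = tolerant_indices(x, data, threshold, scale)
    if not neighbours:
        return None, 0
    classes = {labels[j] for j in neighbours}
    # Stage 1: every tolerant neighbour sits in the lower approximation of one class.
    if len(classes) == 1 and all(j in lower[labels[j]] for j in neighbours):
        return classes.pop(), 1
    # Stage 2: assumed rough membership = share of tolerant neighbours per class.
    counts = Counter(labels[j] for j in neighbours)
    return counts.most_common(1)[0][0], 2

# Hypothetical toy data: two clusters plus two boundary objects.
DATA = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (2.5, 2.5), (3.5, 3.5)]
LABELS = ["a", "a", "a", "b", "b", "b", "a", "b"]
THRESHOLD, SCALE = 0.75, 10.0  # fixed by hand here; the paper tunes the threshold with a GA

print(classify((0.5, 0.5), DATA, LABELS, THRESHOLD, SCALE))  # ('a', 1)
print(classify((4, 4), DATA, LABELS, THRESHOLD, SCALE))      # ('b', 2)
```

In this toy setup the two boundary objects are tolerant with each other but carry different classes, so they fall outside both lower approximations and only appear in the upper ones; a query near them is therefore resolved in the second stage.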


📜 SIMILAR VOLUMES


Nonhierarchical document clustering base
✍ Tu Bao Ho; Ngoc Binh Nguyen 📂 Article 📅 2002 🏛 John Wiley and Sons 🌐 English ⚖ 118 KB

Document clustering, the grouping of documents into several clusters, has been recognized as a means for improving efficiency and effectiveness of information retrieval and text mining. With the growing importance of electronic media for storing and exchanging large textual databases, document clust