
Exploiting parallelism in a structural scientific discovery system to improve scalability

✍ Scribed by Galal, Gehad M.; Cook, Diane J.; Holder, Lawrence B.


Publisher
John Wiley and Sons
Year
1999
Tongue
English
Volume
50
Category
Article
ISSN
0002-8231


✦ Synopsis


The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns. Knowledge discovery and data mining approaches hold the potential to automate the interpretation process, but these approaches frequently utilize computationally expensive algorithms. In particular, scientific discovery systems focus on the utilization of richer data representation, sometimes without regard for scalability. This research investigates approaches for scaling a particular knowledge discovery in databases (KDD) system, SUBDUE, using parallel and distributed resources. SUBDUE has been used to discover interesting and repetitive concepts in graph-based databases from a variety of domains, but requires a substantial amount of processing time. Experiments that demonstrate scalability of parallel versions of the SUBDUE system are performed using CAD circuit databases and artificially-generated databases, and potential achievements and obstacles are discussed.