Semisupervised learning using feature selection based on maximum density subgraphs

✍ Scribed by Yoshiyuki Nakatani; Kuangyi Zhu; Kuniaki Uehara


Book ID: 104591302
Publisher: John Wiley and Sons
Year: 2007
Language: English
File size: 511 KB
Volume: 38
Category: Article
ISSN: 0882-1666


✦ Synopsis


Abstract

In machine learning tasks on large‐scale datasets, the labeled data essential to classification are not always sufficient, which degrades learning accuracy. Meanwhile, unlabeled data are always abundant. Hence, semisupervised learning, which uses both unlabeled and labeled data to improve learning accuracy, is currently of great interest. In this paper, we use a graph to represent the underlying distribution of both labeled and unlabeled data and split it with a multiway cut to classify the unlabeled data. Additionally, we propose a graph‐based feature selection algorithm to improve the learning accuracy of our graph‐based semisupervised learning algorithm. In our algorithm, we first propose an evaluation criterion for attribute relevance based on graph density. Then, we extract the relevant attribute subset by finding the clique on the graph in which each vertex stands for an attribute and each edge stands for the relevance of a feature pair. © 2007 Wiley Periodicals, Inc. Syst Comp Jpn, 38(9): 32–43, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20757
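
The abstract does not give the relevance measure or the extraction procedure, so the following is only a rough sketch of the general idea of picking a dense group of attributes from a weighted attribute graph. It assumes density is defined as total edge weight divided by number of vertices and uses the well-known greedy peeling heuristic as a stand-in, which is not necessarily the authors' method. The function name `densest_subgraph`, the attribute names, and the edge weights are all hypothetical.

```python
def densest_subgraph(weights):
    """Greedy peeling heuristic for a dense subgraph (illustrative stand-in).

    weights: dict mapping an undirected edge (u, v) to a non-negative
             relevance weight. Vertices play the role of attributes.
    Returns (best_vertex_set, best_density), where density is assumed to be
    total edge weight / number of vertices.
    """
    # Build a symmetric adjacency map from the edge list.
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    remaining = set(adj)
    if not remaining:
        return set(), 0.0

    total = sum(weights.values())                  # edge weight inside `remaining`
    degree = {u: sum(adj[u].values()) for u in adj}  # weighted degree inside `remaining`
    best_set = set(remaining)
    best_density = total / len(remaining)

    # Repeatedly drop the vertex with the smallest weighted degree and
    # remember the densest intermediate vertex set seen along the way.
    while len(remaining) > 1:
        u = min(sorted(remaining), key=lambda x: degree[x])
        remaining.remove(u)
        total -= degree[u]
        for v, w in adj[u].items():
            if v in remaining:
                degree[v] -= w
        density = total / len(remaining)
        if density > best_density:
            best_density = density
            best_set = set(remaining)

    return best_set, best_density


# Hypothetical pairwise relevance scores between four attributes.
relevance = {
    ("a", "b"): 0.9,
    ("a", "c"): 0.8,
    ("b", "c"): 0.85,
    ("c", "d"): 0.1,
}
selected, density = densest_subgraph(relevance)
print(sorted(selected), round(density, 3))
```

In this toy input the weakly connected attribute `d` is peeled off first, leaving the three mutually relevant attributes as the selected subset. A real pipeline would compute the edge weights from data with whatever relevance criterion the paper defines.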