𝔖 Bobbio Scriptorium
✦   LIBER   ✦

[ACM Press the 17th ACM SIGKDD international conference - San Diego, California, USA (2011.08.21-2011.08.24)] Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '11 - A game theoretic framework for heterogenous information network clustering

โœ Scribed by Alqadah, Faris; Bhatnagar, Raj


Book ID: 118232105
Publisher: ACM Press
Year: 2011
Tongue: English
Weight: 629 KB
Volume: 0
Category: Article
ISBN: 1450308139


✦ Synopsis


Heterogeneous information networks are pervasive in applications ranging from bioinformatics to e-commerce. As a result, unsupervised learning and clustering methods for such networks have recently gained significant attention. Nodes in a heterogeneous information network are regarded as objects drawn from distinct domains, such as 'authors' and 'papers'. In many cases, feature sets characterizing the objects are not available; hence, clustering of the objects depends solely on the links and relationships among objects. Although several previous studies have addressed information network clustering, shortcomings remain. First, the definition of what constitutes an information network cluster varies drastically from study to study. Second, previous algorithms have generally focused on non-overlapping clusters, and many are also limited to specific network topologies.

In this paper we introduce a game-theoretic framework (GHIN) for defining and mining clusters in heterogeneous information networks. The clustering problem is modeled as a game in which each domain represents a player, and clusters are defined as the Nash equilibrium points of the game. Adopting the abstraction of Nash equilibrium points as clusters allows for flexible definition of the reward functions that characterize clusters, without any modification to the underlying algorithm. We prove that well-established definitions of clusters in 2-domain information networks, such as formal concepts, maximal bi-cliques, and noisy binary tiles, can always be represented as Nash equilibrium points. Moreover, experimental results employing a variety of reward functions and several real-world information networks illustrate that the GHIN framework produces more accurate and informative clusters than the recently proposed NetClus and the state-of-the-art MDC algorithms.
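
To make the idea of clusters as equilibrium points concrete, the following is a minimal illustrative sketch. It is not the GHIN algorithm from the paper. It assumes a 2-domain network of 'authors' and 'papers' and a hypothetical reward under which each player earns the size of its chosen set when every chosen author is linked to every chosen paper, and -1 otherwise. Under that assumed reward, each player's best response is the largest fully linked set, and a pair of sets from which neither player wants to deviate is a maximal bi-clique (a formal concept). All function names and the toy data are invented for illustration.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (author, paper) link

# Toy 2-domain network; the data is made up for illustration only.
AUTHORS: FrozenSet[str] = frozenset({"ann", "bob", "cat"})
PAPERS: FrozenSet[str] = frozenset({"p1", "p2", "p3"})
EDGES: Set[Edge] = {
    ("ann", "p1"), ("ann", "p2"),
    ("bob", "p1"), ("bob", "p2"),
    ("cat", "p3"),
}


def best_response_authors(papers: Iterable[str], edges: Set[Edge]) -> FrozenSet[str]:
    """Largest author set in which every author is linked to every chosen paper.

    Under the assumed reward (set size if fully linked, -1 otherwise),
    this is the author player's unique best response.
    """
    chosen = list(papers)
    return frozenset(a for a in AUTHORS if all((a, p) in edges for p in chosen))


def best_response_papers(authors: Iterable[str], edges: Set[Edge]) -> FrozenSet[str]:
    """Largest paper set in which every paper is linked to every chosen author."""
    chosen = list(authors)
    return frozenset(p for p in PAPERS if all((a, p) in edges for a in chosen))


def find_equilibrium(seed_authors: Iterable[str], edges: Set[Edge]):
    """Alternate best responses until neither player changes its set."""
    authors = frozenset(seed_authors)
    while True:
        papers = best_response_papers(authors, edges)
        new_authors = best_response_authors(papers, edges)
        if new_authors == authors:
            # Neither player can improve by deviating: an equilibrium
            # of the assumed game, i.e. a maximal bi-clique.
            return authors, papers
        authors = new_authors


cluster_from_ann = find_equilibrium({"ann"}, EDGES)
cluster_from_cat = find_equilibrium({"cat"}, EDGES)
```

Starting from the seed {'ann'}, the alternation settles on the authors {'ann', 'bob'} with the papers {'p1', 'p2'}; starting from {'cat'}, it settles on {'cat'} with {'p3'}. Different seeds can reach different equilibria, and a node may belong to more than one of them, so overlapping clusters are possible. Swapping in a different reward changes which pairs of sets count as equilibria while the alternation loop stays the same.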


📜 SIMILAR VOLUMES