[ACM Press the 9th ACM symposium - Munich, Germany (2009.09.16-2009.09.18)] Proceedings of the 9th ACM symposium on Document engineering - DocEng '09 - HCX
Scribed by Kutty, Sangeetha; Nayak, Richi; Li, Yuefeng
- Book ID
- 121845497
- Publisher
- ACM Press
- Year
- 2009
- Weight
- 458 KB
- Category
- Article
- ISBN
- 1605585750
Synopsis
This paper proposes a novel Hybrid Clustering approach for XML documents (HCX) that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The empirical analysis reveals that the proposed method is scalable and accurate.
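The two-stage idea in the synopsis (structure first, then content restricted by that structure) can be illustrated with a small Python sketch. This is not the HCX algorithm itself: as a simplifying assumption, frequent root-to-node tag paths stand in for frequent subtrees, and a plain cosine over term counts stands in for the content similarity measure. The function names, the `min_support` parameter, and the sample documents are all hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(elem, prefix=""):
    """Yield (path, text) for every element; path is the tag path from the root."""
    path = f"{prefix}/{elem.tag}"
    yield path, (elem.text or "").strip()
    for child in elem:
        yield from tag_paths(child, path)


def frequent_paths(docs, min_support):
    """Return tag paths that occur in at least min_support documents.

    Assumption: paths are a much simpler stand-in for frequent subtrees.
    """
    support = Counter()
    for root in docs:
        # A set ensures each path counts once per document.
        support.update({path for path, _ in tag_paths(root)})
    return {path for path, count in support.items() if count >= min_support}


def constrained_terms(root, paths):
    """Count only the terms whose enclosing element lies on a frequent path."""
    terms = Counter()
    for path, text in tag_paths(root):
        if path in paths:
            terms.update(text.lower().split())
    return terms


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample documents, for illustration only.
xml_docs = [
    "<paper><title>xml clustering</title><note>draft one</note></paper>",
    "<paper><title>xml clustering methods</title></paper>",
    "<paper><title>image retrieval</title><memo>xml</memo></paper>",
]
docs = [ET.fromstring(s) for s in xml_docs]
paths = frequent_paths(docs, min_support=2)
vectors = [constrained_terms(d, paths) for d in docs]

print(sorted(paths))
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

In this sketch, text under the infrequent `note` and `memo` elements is ignored, so only content under the shared structure contributes to the similarity between documents. A clustering step over these similarities would follow; it is omitted here because the synopsis does not describe it.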