Page segmentation using thinning of white areas
By Koichi Kise; Osamu Yanagida
- Book ID
- 101303023
- Publisher
- John Wiley and Sons
- Year
- 1998
- Language
- English
- File size
- 403 KB
- Volume
- 29
- Category
- Article
- ISSN
- 0882-1666
Synopsis
Page segmentation is the process of extracting components such as columns, figures, tables, and photographs from a document image. This article proposes a page segmentation technique that analyzes the white region (background) of the document image and remains stable regardless of component shape or image tilt. When a document contains non-rectangular and tilted components, the white region that forms their boundaries can take an arbitrary shape. The key questions are therefore how to represent white regions and how to process them.
The proposed method represents white regions by the thin lines obtained by thinning them. Under this representation, page segmentation is defined as extracting the loops that surround the components. The method extracts these loops by eliminating unnecessary thin lines, for example those that correspond to line spacing and character spacing. It aims to use features of black regions as well as white regions, and to handle several kinds of document layout.
The paper examines the effectiveness and limitations of the proposed method using experimental results on 80 sample images tilted by 0 to 45 degrees.
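The idea of thinning the white region and keeping only the loops can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes Zhang-Suen thinning as the thinning step, and it replaces the paper's feature-based elimination of unnecessary thin lines with a much cruder rule that deletes every skeleton pixel not needed to keep a loop closed. The function names (`neighbours`, `transitions`, `thin`, `keep_loops`) and the toy page are hypothetical, and pixels outside the image are assumed to be black.

```python
def neighbours(img, r, c):
    """8 neighbours of (r, c), clockwise from north; off-image counts as 0."""
    h, w = len(img), len(img[0])
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    return [img[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
            for dr, dc in offsets]


def transitions(n):
    """Number of 0 -> 1 steps when walking once around the 8 neighbours."""
    return sum(n[i] == 0 and n[(i + 1) % 8] == 1 for i in range(8))


def thin(img):
    """Zhang-Suen thinning of the 1-pixels (here: white pixels) of img."""
    img = [row[:] for row in img]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(len(img)):
                for c in range(len(img[0])):
                    if not img[r][c]:
                        continue
                    n = neighbours(img, r, c)  # N, NE, E, SE, S, SW, W, NW
                    if step == 0:
                        side = n[0] * n[2] * n[4] == 0 and n[2] * n[4] * n[6] == 0
                    else:
                        side = n[0] * n[2] * n[6] == 0 and n[0] * n[4] * n[6] == 0
                    if 2 <= sum(n) <= 6 and transitions(n) == 1 and side:
                        to_clear.append((r, c))
            # Pixels flagged in one sub-iteration are cleared together.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


def keep_loops(skel):
    """Delete, one pixel at a time, every pixel whose removal keeps loops closed.

    Assumption: this stands in for the paper's elimination of unnecessary
    thin lines; open branches shrink away and only closed loops remain.
    """
    img = [row[:] for row in skel]
    changed = True
    while changed:
        changed = False
        for r in range(len(img)):
            for c in range(len(img[0])):
                if img[r][c]:
                    n = neighbours(img, r, c)
                    b = sum(n)
                    if b == 0 or (transitions(n) == 1 and b <= 6):
                        img[r][c] = 0
                        changed = True
    return img


# Toy page: 1 = white (background), 0 = black (two rectangular components).
page = [[1] * 15 for _ in range(11)]
for r in range(3, 8):
    for c in list(range(3, 7)) + list(range(9, 12)):
        page[r][c] = 0

loops = keep_loops(thin(page))
for row in loops:
    print("".join("#" if v else "." for v in row))
```

The pruning rule here ignores the size and position of black regions, so on a real page it would also keep loops around individual characters or words. Separating those from component boundaries is where the features described in the synopsis would come in.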