The efficient query and extraction of web data is often difficult, because web data does not conform to any data organization standard. In addition, the development of web search technology is still at a relatively early stage. Search engines provide only primitive data query capabilities, and require a d
Web data retrieval and extraction
✍ Scribed by Zoé Lacroix
- Publisher: Elsevier Science
- Year: 2003
- Language: English
- File size: 431 KB
- Volume: 44
- Category: Article
- ISSN: 0169-023X
✦ Synopsis
We present the Object-Web Mediator for querying integrated Web data sources. It is composed of a retrieval component, based on an intermediate object view mechanism and search views, and an XML engine. Search views map the source capabilities to attributes defined on object classes, and parsers process retrieved documents and cache them in XML format. The XML engine queries the cached documents, extracts data, and returns the extracted data for evaluation. The originality of this approach lies in a generic view mechanism for accessing data sources with limited data access and complex capabilities, and in an XML engine that supports data extraction and reorganization. This approach has been developed and demonstrated as part of a multi-database system supporting queries, via uniform Object Protocol Model interfaces, against public Web data sources of interest to biologists.
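The flow described in the synopsis (search view, parser, XML cache, XML engine) can be illustrated with a minimal Python sketch. It is not the Object-Web Mediator's actual implementation: `SearchView`, `fake_source`, `parse_to_xml`, `extract`, the `kw` parameter, and the sample protein records are all hypothetical names and data invented for illustration, and the standard-library `xml.etree.ElementTree` stands in for the XML engine.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


def fake_source(params: Dict[str, str]) -> str:
    """Hypothetical Web source that can only be searched by the 'kw' parameter."""
    records = [("P1", "kinase", "human"), ("P2", "kinase", "mouse"),
               ("P3", "protease", "human")]
    hits = [r for r in records if r[1] == params.get("kw")]
    return "\n".join("|".join(r) for r in hits)


def parse_to_xml(raw: str) -> ET.Element:
    """Parser: turn the retrieved pipe-delimited document into an XML tree."""
    root = ET.Element("results")
    for line in raw.splitlines():
        pid, family, organism = line.split("|")
        entry = ET.SubElement(root, "protein", id=pid)
        ET.SubElement(entry, "family").text = family
        ET.SubElement(entry, "organism").text = organism
    return root


@dataclass
class SearchView:
    """Maps object-class attributes to the parameters a source accepts."""
    attr_to_param: Dict[str, str]
    fetch: Callable[[Dict[str, str]], str]
    parse: Callable[[str], ET.Element]

    def retrieve(self, conditions: Dict[str, str],
                 cache: Dict[Tuple, ET.Element]) -> ET.Element:
        # Reject conditions the source cannot evaluate (limited capabilities).
        unsupported = set(conditions) - set(self.attr_to_param)
        if unsupported:
            raise ValueError(f"source cannot search on: {sorted(unsupported)}")
        params = {self.attr_to_param[a]: v for a, v in conditions.items()}
        key = tuple(sorted(params.items()))
        if key not in cache:  # retrieve and parse once, then reuse the XML
            cache[key] = self.parse(self.fetch(params))
        return cache[key]


def extract(doc: ET.Element, path: str) -> List[str]:
    """Stand-in for the XML engine: query a cached document and extract text."""
    return [el.text for el in doc.findall(path)]


if __name__ == "__main__":
    view = SearchView({"family": "kw"}, fake_source, parse_to_xml)
    cache: Dict[Tuple, ET.Element] = {}
    doc = view.retrieve({"family": "kinase"}, cache)
    print(extract(doc, "./protein/organism"))  # ['human', 'mouse']
```

In this sketch, keeping the attribute-to-parameter mapping separate from the extraction step mirrors the split between the retrieval component and the XML engine: a condition the source cannot evaluate is rejected up front, while anything retrievable is cached as XML and can be queried repeatedly without contacting the source again.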
📜 SIMILAR VOLUMES
Understanding what kinds of Web pages are the most useful for Web search engine users is a critical task in Web information retrieval (IR). Most previous works used hyperlink analysis algorithms to solve this problem. However, little research has been focused on query‐independent Web da
Electronically available data on the Web is exploding at an ever increasing pace. Much of this data is unstructured, which makes searching hard and traditional database querying impossible. Many Web documents, however, contain an abundance of recognizable constants that together describe the essence