𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Web data retrieval and extraction

✍ Scribed by Zoé Lacroix


Publisher
Elsevier Science
Year
2003
Tongue
English
Weight
431 KB
Volume
44
Category
Article
ISSN
0169-023X

✦ Synopsis


We present the Object-Web Mediator for querying integrated Web data sources. It is composed of a retrieval component, based on an intermediate object view mechanism and search views, and an XML engine. The retrieval component relies on search views, which map the source capabilities to attributes defined at object classes, and on parsers, which process retrieved documents and cache them in XML format. The XML engine queries the cached documents, extracts data, and returns the extracted data for evaluation. The originality of this approach lies in a generic view mechanism for accessing data sources with limited data access and complex capabilities, and in an XML engine that supports data extraction and reorganization. The approach has been developed and demonstrated as part of a multi-database system that supports queries via uniform Object Protocol Model interfaces against public Web data sources of interest to biologists.
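
As a rough illustration of the pipeline the synopsis describes (search view, parser, XML cache, XML engine), here is a minimal Python sketch. It is not the authors' implementation: the `Gene` class, the parameter names `q` and `org`, the helper functions, and the sample record are all hypothetical, invented only to show how the pieces might fit together.

```python
import xml.etree.ElementTree as ET

# Hypothetical search view: maps attributes of an object class to the
# query parameters that a (made-up) Web source is able to search on.
SEARCH_VIEW = {
    "class": "Gene",
    "attributes": {"name": "q", "organism": "org"},
}


def build_request(view, conditions):
    """Translate attribute conditions into source parameters.

    Attributes the source cannot search on are rejected, which is one
    simple way to model a source with limited access capabilities.
    """
    params = {}
    for attr, value in conditions.items():
        if attr not in view["attributes"]:
            raise KeyError(f"source cannot search on {attr!r}")
        params[view["attributes"][attr]] = value
    return params


def parse_to_xml(raw_text):
    """Parser step: turn a retrieved 'Key: value' document into XML."""
    root = ET.Element("document")
    for line in raw_text.strip().splitlines():
        key, _, value = line.partition(":")
        ET.SubElement(root, key.strip().lower()).text = value.strip()
    return root


def extract(cached_doc, path):
    """XML engine step: query a cached document and return text values."""
    return [element.text for element in cached_doc.findall(path)]


# Invented sample data standing in for a document fetched from the source.
params = build_request(SEARCH_VIEW, {"name": "BRCA1", "organism": "human"})
raw = "Name: BRCA1\nOrganism: human\nLength: 12345"

# Cache the parsed document, keyed by the request that produced it.
cache_key = tuple(sorted(params.items()))
cache = {cache_key: parse_to_xml(raw)}

result = extract(cache[cache_key], "length")
```

In this sketch the cache is an in-memory dictionary and the "source" is a literal string; a real mediator would fetch documents over the network and would need a far richer description of source capabilities.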


📜 SIMILAR VOLUMES


A smart web query method for semantic re
✍ Roger H.L. Chiang; Cecil Eng Huang Chua; Veda C. Storey 📂 Article 📅 2001 🏛 Elsevier Science 🌐 English ⚖ 712 KB

The efficient query and extraction of web data is often difficult, because web data does not conform to any data organization standard. In addition, the development of web search technology is still at a relatively early stage. Search engines provide only primitive data query capabilities, and require a d

Web mining for Web image retrieval
✍ Zheng Chen; Liu Wenyin; Feng Zhang; Mingjing Li; Hongjiang Zhang 📂 Article 📅 2001 🏛 John Wiley and Sons 🌐 English ⚖ 422 KB

Data cleansing for Web information retri
✍ Yiqun Liu; Min Zhang; Rongwei Cen; Liyun Ru; Shaoping Ma 📂 Article 📅 2007 🏛 John Wiley and Sons 🌐 English ⚖ 614 KB

Understanding what kinds of Web pages are the most useful for Web search engine users is a critical task in Web information retrieval (IR). Most previous works used hyperlink analysis algorithms to solve this problem. However, little research has been focused on query‐independent Web da

Conceptual-model-based data extraction f
✍ D.W. Embley; D.M. Campbell; Y.S. Jiang; S.W. Liddle; D.W. Lonsdale; Y.-K. Ng; R. 📂 Article 📅 1999 🏛 Elsevier Science 🌐 English ⚖ 858 KB

Electronically available data on the Web is exploding at an ever increasing pace. Much of this data is unstructured, which makes searching hard and traditional database querying impossible. Many Web documents, however, contain an abundance of recognizable constants that together describe the essence

Energy information and data retrieval
✍ Ali Bülent Çambel; Marino S. Peña-Taveras; Carl F. Oldsen 📂 Article 📅 1985 🏛 Elsevier Science 🌐 English ⚖ 591 KB