
Automated indexing of the hazardous substances data bank

✍ Scribed by Carlo Nuss; Hua Florence Chang; Dorothy Moore; George C. Fonger


Publisher
Wiley (John Wiley & Sons)
Year
2005
Tongue
English
Weight
307 KB
Volume
40
Category
Article
ISSN
0044-7870


✦ Synopsis


Abstract

The Hazardous Substances Data Bank (HSDB), a factual data file produced and maintained by the Specialized Information Services (SIS) Division of the National Library of Medicine (NLM), contains over 4600 records on potentially hazardous chemicals. To improve information retrieval from HSDB, SIS has undertaken the development of an automated indexing protocol in collaboration with NLM's Indexing Initiative group. The Indexing Initiative investigates methods whereby automated indexing may partially or completely substitute for human indexing. Three main methodologies are applied: the MetaMap Indexing method, which maps text to concepts in the Unified Medical Language System (UMLS) Metathesaurus; the Trigram Phrase Matching method, which uses character trigrams to match text to Metathesaurus concepts; and a variant of the PubMed Related Citations method to find MeSH terms related to input text. The UMLS concepts generated by the first two methods are mapped to MeSH main headings through the Restrict‐to‐MeSH algorithm. The resulting MeSH terms are then clustered into a ranked list of recommended indexing terms. The purpose of the poster is to present our experience in applying these automated indexing methodologies to a large data file with highly structured records, a variety of text and data formats, and complex technical and biomedical terminology.
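The character-trigram idea mentioned in the abstract can be illustrated with a small sketch. This is not the Indexing Initiative's implementation: the abstract does not give the scoring function, so a Dice coefficient over trigram counts is assumed here, and the function names and the three-term vocabulary are hypothetical stand-ins for Metathesaurus concepts.

```python
from collections import Counter


def char_trigrams(text: str) -> Counter:
    """Count overlapping 3-character substrings of the normalized text."""
    # Lowercase and collapse whitespace so matching ignores case and spacing.
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def dice_similarity(a: Counter, b: Counter) -> float:
    """Dice coefficient over trigram multisets (assumed scoring, not from the source)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum((a & b).values())  # multiset intersection keeps the minimum counts
    return 2.0 * shared / total


def rank_terms(text: str, vocabulary: list[str]) -> list[tuple[str, float]]:
    """Rank vocabulary terms by trigram similarity to the input text, best first."""
    text_grams = char_trigrams(text)
    scored = [(term, dice_similarity(text_grams, char_trigrams(term)))
              for term in vocabulary]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical mini-vocabulary; the method described above matches against
# UMLS Metathesaurus concepts instead.
VOCABULARY = [
    "Hazardous Substances",
    "Information Storage and Retrieval",
    "Crystallography",
]
ranked = rank_terms("hazardous substance records", VOCABULARY)
```

Because the comparison works on character fragments rather than whole words, a term can still score highly when the input differs from it in inflection or word order, which is the usual motivation for trigram matching on technical terminology.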


📜 SIMILAR VOLUMES


A Survey on the Automatic Indexing of Vi
✍ R. Brunelli; O. Mich; C.M. Modena 📂 Article 📅 1999 🏛 Elsevier Science 🌐 English ⚖ 403 KB

Today a considerable amount of video data in multimedia databases requires sophisticated indices for its effective use. Manual indexing is the most effective method to do this, but it is also the slowest and the most expensive. Automated methods have then to be developed. This paper surveys several

Macromolecular recognition in the Protei
✍ Janin, Joël; Rodier, Francis; Chakrabarti, Pinak; Bahadur, Ranjit P. 📂 Article 📅 2006 🏛 International Union of Crystallography 🌐 English ⚖ 659 KB