An example of information management in biology: Qualitative data economizing theory applied to the Human Genome Project databases

✍ Scribed by Iraj Daizadeh


Publisher: John Wiley and Sons
Year: 2005
Language: English
File size: 75 KB
Volume: 57
Category: Article
ISSN: 1532-2882

✦ Synopsis


Abstract

Ironically, although much work has been done on elucidating algorithms for enabling scientists to efficiently retrieve relevant information from the glut of data derived from the efforts of the Human Genome Project and other similar projects, little has been done to optimize the levels of data economy across databases. One technique to qualify the degree of data economization is the one constructed by Boisot. Boisot's Information Space (I‐Space) takes into account the degree to which data are written (codification), the degree to which the data can be understood (abstraction), and the degree to which the data are effectively communicated to an audience (diffusion). A data system is said to be more data economical if it is relatively high in these dimensions. Application of the approach to entries in two popular, publicly available biological data repositories, the Protein Data Bank (PDB) and GenBank, leads to the recommendation that PDB increase its level of abstraction by establishing a larger set of detailed keywords, its diffusion by constructing hyperlinks to other databases, and its codification by constructing additional subsections. With these recommendations in place, PDB would achieve the greater data economies currently enjoyed by GenBank. A discussion of the limitations of the approach is presented.