𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Dynamic visualization of statistical learning in the context of high-dimensional textual data

✍ Scribed by Michael Greenacre; Trevor Hastie


Publisher: Elsevier Science
Year: 2010
Tongue: English
Weight: 561 KB
Volume: 8
Category: Article
ISSN: 1570-8268


✦ Synopsis


Our ability to record increasingly larger and more complex sets of data is accompanied by a decline in our capacity to interpret and understand these data in the fullest sense. Multivariate analysis partially assists us in our quest by reducing the dimensionality in optimal ways, but our view is stuck in two dimensions because of the planar nature of the graphical medium, be it the printed page or the computer screen. We are developing protocols and tools to add motion to scientific graphics so that high-dimensional data can be visualized dynamically. Using the freely available R language and modern methods of statistical learning and data mining, we construct animation sequences that take the viewer on a dynamic journey through the data. The idea is illustrated using a large data set of all the abstracts of the journal Vaccine in the years 2003-2006, according to their word frequencies and citation counts.
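As a rough illustration of the idea in the synopsis, the sketch below builds a sequence of two-dimensional views of a high-dimensional word-frequency matrix, where each view is one frame of a possible animation. It is not the authors' method: the article works in R, whereas this sketch uses Python with NumPy, a plain principal-axis projection, and a small synthetic count matrix. The function names `principal_axes` and `rotation_frames` are hypothetical and exist only for this example.

```python
import numpy as np


def principal_axes(counts, k=3):
    """Center a document-term matrix and return it with its top-k principal axes."""
    X = counts - counts.mean(axis=0)
    # Rows of vt are orthonormal directions in term space, ordered by variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X, vt[:k].T  # axes has shape (n_terms, k)


def rotation_frames(X, axes, n_frames=10):
    """Return 2-D coordinates for each frame as the vertical axis turns from PC2 to PC3."""
    frames = []
    for t in np.linspace(0.0, np.pi / 2, n_frames):
        # A unit vector in the PC2-PC3 plane; it stays orthogonal to PC1.
        second = np.cos(t) * axes[:, 1] + np.sin(t) * axes[:, 2]
        basis = np.column_stack([axes[:, 0], second])
        frames.append(X @ basis)
    return frames


# Synthetic stand-in for word counts (20 documents, 8 terms); not real data.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=2.0, size=(20, 8)).astype(float)

X, axes = principal_axes(counts)
frames = rotation_frames(X, axes, n_frames=10)
```

Each element of `frames` is an array with one row per document and two columns, so plotting the frames in order would show the point cloud turning smoothly from the (PC1, PC2) view to the (PC1, PC3) view.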


📜 SIMILAR VOLUMES


Pitfalls of merging GWAS data: lessons l
✍ Rebecca L. Zuvich; Loren L. Armstrong; Suzette J. Bielinski; Yuki Bradford; Chri 📂 Article 📅 2011 🏛 John Wiley and Sons 🌐 English ⚖ 485 KB

## Abstract Genome‐wide association studies (GWAS) are a useful approach in the study of the genetic components of complex phenotypes. Aside from large cohorts, GWAS have generally been limited to the study of one or a few diseases or traits. The emergence of biobanks linked to electronic medical r

Qmd-plot: A graphical utility for rapid
✍ Sam Kalat; Geoff Mann; Jan Hermans 📂 Article 📅 2001 🏛 John Wiley and Sons 🌐 English ⚖ 219 KB

## Abstract Qmd‐plot is a utility to obtain rapid information about past or on‐going simulations, or real‐time data collections, in the form of graphs of recorded variables (__x__,__y__,…), as __x__–__y__ plots or as a function of simulated or real time. Time series records in the data file must be