[ACM Press 2011 International Conference for High Performance Computing, Networking, Storage and Analysis - Seattle, Washington (2011.11.12-2011.11.18)] Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11 - Fast implementation of DGEMM on Fermi GPU

โœ Scribed by Tan, Guangming; Li, Linchuan; Triechle, Sean; Phillips, Everett; Bao, Yungang; Sun, Ninghui


Book ID: 121735988
Publisher: ACM Press
Year: 2011
Weight: 341 KB
Category: Article
ISBN: 145030771X



📜 SIMILAR VOLUMES


[ACM Press 2011 International Conference
โœ Tan, Guangming; Li, Linchuan; Triechle, Sean; Phillips, Everett; Bao, Yungang; S ๐Ÿ“‚ Article ๐Ÿ“… 2011 ๐Ÿ› ACM Press ๐ŸŒ English โš– 341 KB

In this paper we present a thorough experience on tuning double-precision matrix-matrix multiplication (DGEMM) on the Fermi GPU architecture. We choose an optimal algorithm with blocking in both shared memory and registers to satisfy the constraints of the Fermi memory hierarchy. Our optimization s

[ACM Press 2011 International Conference
โœ Bautista-Gomez, Leonardo; Tsuboi, Seiji; Komatitsch, Dimitri; Cappello, Franck; ๐Ÿ“‚ Article ๐Ÿ“… 2011 ๐Ÿ› ACM Press ๐ŸŒ English โš– 523 KB

Large scientific applications deployed on current petascale systems expend a significant amount of their execution time dumping checkpoint files to remote storage. New fault tolerant techniques will be critical to efficiently exploit post-petascale systems. In this work, we propose a low-overhead hi

[ACM Press 2011 International Conference