
A C++11 implementation of arbitrary-rank tensors for high-performance computing

✍ Scribed by Aragón, Alejandro M.


Book ID: 121706649
Publisher: Elsevier Science
Year: 2014
Tongue: English
Weight: 808 KB
Volume: 185
Category: Article
ISSN: 0010-4655



📜 SIMILAR VOLUMES


In search of a program generator to impl
✍ Albert Cohen; Sébastien Donadio; Maria-Jesus Garzaran; Christoph Herrmann; Oleg 📂 Article 📅 2006 🏛 Elsevier Science 🌐 English ⚖ 510 KB

The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the performance gap has been widening over time, due to the tremendous complexity increase in microprocess

[ACM Press 2011 International Conference
✍ Tan, Guangming; Li, Linchuan; Triechle, Sean; Phillips, Everett; Bao, Yungang; S 📂 Article 📅 2011 🏛 ACM Press 🌐 English ⚖ 341 KB

In this paper we present a thorough experience on tuning double-precision matrix-matrix multiplication (DGEMM) on the Fermi GPU architecture. We choose an optimal algorithm with blocking in both shared memory and registers to satisfy the constraints of the Fermi memory hierarchy. Our optimization s