Personal tools
You are here: Home Publications Stencil Computation Optimization and Autotuning on State-of-the-Art Multicore Architectures
Document Actions

Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David Patterson, John Shalf, and Katherine Yelick (2008)

Stencil Computation Optimization and Autotuning on State-of-the-Art Multicore Architectures

Supercomputing 2008 (SC08).

Understanding the most efficient design and utilization of emerging multicore systems is one of the most challenging questions faced by the mainstream and scientific computing industries in several decades. Our work explores multicore stencil (nearest neighbor) computations — a class of algorithms at the heart of many structured grid codes, including PDE solvers. We develop a number of effective optimization strategies, and build an auto-tuning environment that searches over our optimizations and their parameters to minimize runtime, while maximizing performance portability. To evaluate the effectiveness of these strategies we explore the broadest set of multicore architectures in the current HPC literature, including the Intel Clovertown, AMD Barcelona, Sun Victoria Falls, IBM QS22 PowerXCell 8i, and NVIDIA GTX280. Overall, our auto-tuning optimization methodology results in the fastest multicore stencil performance to date. Finally, we present several key insights into the architectural tradeoffs of emerging multicore designs and their implications on scientific algorithm development.

by Jennifer Harris last modified 2009-04-20 19:29
« April 2018 »
Su Mo Tu We Th Fr Sa
1234567
891011121314
15161718192021
22232425262728
2930
 

Powered by Plone

CScADS Collaborators include:

Rice University ANL UCB UTK WISC