Algorithms for Computational Science and Engineering in the Multicore Era

Sverker Holmgren
Division of Scientific Computing
Department of Information Technology
Uppsala University


Abstract:

Efficient solutions of CSE problems require a match between the algorithm and the underlying architecture. In this talk, we start by giving an giving an overview of modern multicore processor architectures and the impact on program and algorithm performance. We also briefly discuss the impact on programming models, programming tools and libraries.

Multicore processors feature low intra-chip communication cost and smaller per-thread caches compared to single-core CPUs. From an algorithmic point of view this means that data locality issues become more important than communication overheads. This may result in that a re-evaluation of many existing parallel algorithms is required. As an illustration of this, we show results for a parallel implementations of a multigrid method using a temporally blocked, naturally ordered, Gauss-Siedel smoother. Compared with the standard parallel smoother, based on the two-color red-black algorithm, we improve the data locality often as much as ten times, while our use of a fine-grained locking scheme keeps the parallel efficiency high. This results in significant performance gains on both traditional shared memory and new multicore systems.