Dimitar Lukarski
Division of Scientific Computing
Department of Information Technology
Uppsala University
Modern CPU and GPU technologies offer great potential for fast
scientific computing. Only by exploiting fine-grained parallelism can we
approach the peak performance of these systems. Typically, this is a
hardware-specific task that requires a different programming model or
language for each platform. This leads to hardware-specific code and
problem-specific solutions. In particular, this affects the design of
software for sparse matrix computations.
PARALUTION is a C++ library for sparse iterative methods whose main
focus is the utilization of multi/many-core platforms and accelerators. In this
talk, we describe the main design decisions and the chosen abstraction model
of the library. Based on these, we are able to build portable preconditioned
iterative methods that run on GPU and CPU systems without any
code modifications.
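To illustrate the general idea of such an abstraction, the following is a minimal sketch and not PARALUTION's actual API. All names (CsrMatrix, Backend, HostBackend, conjugate_gradient) are hypothetical, and vectors are kept in host memory for simplicity. The solver is written only against an abstract kernel interface, so a second backend could in principle be swapped in without changing the solver code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is NOT the PARALUTION API.
using Vec = std::vector<double>;

// Square sparse matrix in CSR (compressed sparse row) storage.
struct CsrMatrix {
    std::size_t n;                     // number of rows
    std::vector<std::size_t> row_ptr;  // size n + 1
    std::vector<std::size_t> col;      // column index of each nonzero
    std::vector<double> val;           // value of each nonzero
};

// Hardware-specific kernels are hidden behind this interface.
struct Backend {
    virtual ~Backend() = default;
    virtual void spmv(const CsrMatrix& A, const Vec& x, Vec& y) const = 0;  // y = A x
    virtual double dot(const Vec& x, const Vec& y) const = 0;               // x . y
    virtual void axpy(double a, const Vec& x, Vec& y) const = 0;            // y += a x
    virtual void xpay(const Vec& x, double a, Vec& y) const = 0;            // y = x + a y
};

// Serial host implementation. An accelerator backend would implement
// the same four kernels with its own programming model.
struct HostBackend : Backend {
    void spmv(const CsrMatrix& A, const Vec& x, Vec& y) const override {
        for (std::size_t i = 0; i < A.n; ++i) {
            double sum = 0.0;
            for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
                sum += A.val[k] * x[A.col[k]];
            y[i] = sum;
        }
    }
    double dot(const Vec& x, const Vec& y) const override {
        double s = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
        return s;
    }
    void axpy(double a, const Vec& x, Vec& y) const override {
        for (std::size_t i = 0; i < x.size(); ++i) y[i] += a * x[i];
    }
    void xpay(const Vec& x, double a, Vec& y) const override {
        for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] + a * y[i];
    }
};

// Unpreconditioned conjugate gradient for symmetric positive definite A,
// written only in terms of Backend kernels. Returns the iteration count.
int conjugate_gradient(const Backend& be, const CsrMatrix& A, const Vec& b,
                       Vec& x, double tol, int max_iter) {
    Vec r(b), Ap(A.n, 0.0);
    be.spmv(A, x, Ap);
    be.axpy(-1.0, Ap, r);  // r = b - A x
    Vec p(r);
    double rr = be.dot(r, r);
    int it = 0;
    while (it < max_iter && std::sqrt(rr) > tol) {
        be.spmv(A, p, Ap);
        const double alpha = rr / be.dot(p, Ap);
        be.axpy(alpha, p, x);    // x += alpha p
        be.axpy(-alpha, Ap, r);  // r -= alpha A p
        const double rr_new = be.dot(r, r);
        be.xpay(r, rr_new / rr, p);  // p = r + beta p
        rr = rr_new;
        ++it;
    }
    return it;
}
```

A caller would construct a HostBackend, assemble a CsrMatrix, and pass both to conjugate_gradient. In a real multi-backend design the vectors and matrices would also have to live in device memory, which this sketch deliberately leaves out.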
Furthermore, we detail the design and implementation of fine-grained
parallel preconditioners for multi/many-core devices. These cover ILU
(power(q)-pattern method), multi-elimination (I)LU, and additive and
approximate inverse schemes.
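One widely used way to expose fine-grained parallelism in factorization-type preconditioners is multi-coloring of the sparsity pattern: rows that share a color are not directly coupled, so they can be processed concurrently. The sketch below shows a plain greedy coloring under the assumption of a structurally symmetric CSR pattern. It is a generic illustration with a hypothetical function name, not the library's implementation of any of the schemes listed above.

```cpp
#include <cstddef>
#include <vector>

// Greedy multi-coloring of a CSR sparsity pattern (hypothetical helper).
// Assumes the pattern is structurally symmetric. Returns color[i] for each
// row i; two rows coupled by an off-diagonal entry never share a color.
std::vector<int> greedy_coloring(std::size_t n,
                                 const std::vector<std::size_t>& row_ptr,
                                 const std::vector<std::size_t>& col) {
    std::vector<int> color(n, -1);
    // mark[c] == i means color c is already taken by a neighbor of row i.
    // The sentinel n never equals a valid row index.
    std::vector<std::size_t> mark(n, n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            const std::size_t j = col[k];
            if (j != i && color[j] >= 0)
                mark[static_cast<std::size_t>(color[j])] = i;
        }
        // Pick the smallest color not used by any colored neighbor.
        std::size_t c = 0;
        while (mark[c] == i) ++c;
        color[i] = static_cast<int>(c);
    }
    return color;
}
```

For the tridiagonal pattern of a 1D Laplacian this yields two alternating colors, so each color class can be updated in a single parallel sweep.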
We conclude with a few code snippets and performance results. Finally, we
present the first users of PARALUTION, our road map, and our collaborators.
Here is the web page for the library.