Explicit control of data for efficiency in OpenMP

Jarmo Rantakokko
UPPMAX and Division of Scientific Computing
Department of Information Technology
Uppsala University


Abstract:

The key to high parallel efficiency in OpenMP is explicit control of data partitioning and of data-to-thread locality. In OpenMP, "communication" is handled implicitly and hidden from the programmer, yet it strongly affects performance. This is particularly evident on NUMA systems, such as the Sun Fire 15K system Ngorongoro, where remote memory accesses can be costly. The communication pattern depends on the data partitioning, which in turn is an implicit consequence of the work-sharing directives. In this talk we discuss explicit data partitioning techniques in OpenMP and show results from some recent research projects, with performance improvements of up to six times over standard OpenMP implementations.
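
As an illustration of what explicit data partitioning can look like, the following is a minimal sketch in C, not the implementation used in the projects mentioned above. Instead of letting a work-sharing directive decide which thread handles which iterations, each thread computes its own contiguous index block and uses the same block both when initializing and when processing the data. The names block_range, block_partition and init_and_sum are hypothetical, and the locality benefit assumes that the operating system places each memory page near the thread that first touches it, which is an assumption about the platform and not part of OpenMP.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Half-open index range [lo, hi) owned by one thread (hypothetical type). */
typedef struct { size_t lo, hi; } block_range;

/* Hypothetical helper: split n elements into nthreads contiguous blocks
 * whose sizes differ by at most one element. */
static block_range block_partition(size_t n, int nthreads, int tid)
{
    assert(nthreads > 0 && tid >= 0 && tid < nthreads);
    size_t base = n / (size_t)nthreads;   /* minimum block size */
    size_t rem  = n % (size_t)nthreads;   /* first rem blocks get one extra */
    size_t t    = (size_t)tid;
    block_range r;
    r.lo = t * base + (t < rem ? t : rem);
    r.hi = r.lo + base + (t < rem ? 1 : 0);
    return r;
}

/* Initialize x and then sum 2*x[i], using the same explicit partition in
 * both phases. Each thread only ever touches its own block, so under a
 * first-touch page placement policy (assumed here) the data a thread
 * works on tends to reside in memory local to that thread. */
static double init_and_sum(double *x, size_t n)
{
    double total = 0.0;
    #pragma omp parallel reduction(+:total)
    {
#ifdef _OPENMP
        int nthreads = omp_get_num_threads();
        int tid      = omp_get_thread_num();
#else
        int nthreads = 1, tid = 0;        /* serial fallback without OpenMP */
#endif
        block_range r = block_partition(n, nthreads, tid);

        /* Phase 1: first touch of this thread's block. */
        for (size_t i = r.lo; i < r.hi; ++i)
            x[i] = 1.0;

        /* Phase 2: compute on exactly the same block. */
        for (size_t i = r.lo; i < r.hi; ++i)
            total += 2.0 * x[i];
    }
    return total;
}
```

Because the partition is computed from the thread number, it stays identical across parallel regions as long as the number of threads does not change. A work-sharing loop gives the same guarantee only if the same static schedule and loop bounds are used everywhere.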