Uppsala University, Department of Information Technology, Scientific Computing
http://www.it.uu.se/

Exam, Programming of Parallel Computers, 2008-06-02


1. Why don't we get perfect speedup when running a parallel program with OpenMP on a distributed shared memory computer?
Give at least five reasons (sources of parallel overhead) and explain each of them.

2. For each of the parallel overheads above (problem 1), describe or discuss techniques by which that overhead can be reduced.
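
For illustration, a minimal sketch of one such technique (the array and variable names are only illustrative): per-iteration synchronization, a common source of overhead, can be removed by giving each thread a private partial result via OpenMP's reduction clause.

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000

    /* Two ways to sum an array in parallel: a critical section
       serializes every update (high synchronization overhead),
       while a reduction clause gives each thread a private partial
       sum that is combined only once at the end. */
    int main(void) {
        static double a[N];
        for (int i = 0; i < N; i++) a[i] = 1.0 / N;

        double slow = 0.0;
        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            #pragma omp critical        /* a lock on every iteration */
            slow += a[i];
        }

        double fast = 0.0;
        #pragma omp parallel for reduction(+:fast)
        for (int i = 0; i < N; i++)
            fast += a[i];               /* private copy, merged once */

        printf("slow = %f, fast = %f\n", slow, fast);
        return 0;
    }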

3. Gram-Schmidt algorithm ... (yada, yada, yada)

4. In the course we have been discussing different metrics used for evaluating parallel algorithms and programs.
Describe and explain these different metrics.
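
For reference, the metrics usually meant here, assuming the standard textbook definitions:

    T(1)  = runtime of the (best) serial program
    T(p)  = runtime on p processors
    S(p)  = T(1) / T(p)                        (speedup)
    E(p)  = S(p) / p                           (parallel efficiency)
    Amdahl's law:      S(p) <= 1 / (f + (1 - f)/p),  f = serial fraction
    Gustafson's law:   S(p) = p - f(p - 1)           (scaled speedup)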

5. What is collective communication in MPI, and what is characteristic of a collective communication call?
Give at least five examples of collective communication operations and explain their effect.
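
A minimal sketch exercising five such operations (MPI_Bcast, MPI_Scatter, MPI_Gather, MPI_Reduce, MPI_Allreduce); the data values are only illustrative. Characteristically, every process in the communicator must make the same call.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int n = 0;
        if (rank == 0) n = 42;
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* root -> all */

        int *chunks = NULL;
        if (rank == 0) {                 /* one element per process */
            chunks = malloc(size * sizeof(int));
            for (int i = 0; i < size; i++) chunks[i] = i;
        }
        int mine;
        MPI_Scatter(chunks, 1, MPI_INT,  /* root deals out the pieces */
                    &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);

        mine *= 2;                       /* some local work */

        MPI_Gather(&mine, 1, MPI_INT,    /* root collects the pieces */
                   chunks, 1, MPI_INT, 0, MPI_COMM_WORLD);

        int sum, gsum;
        MPI_Reduce(&mine, &sum, 1, MPI_INT, MPI_SUM,
                   0, MPI_COMM_WORLD);   /* combined result at root only */
        MPI_Allreduce(&mine, &gsum, 1, MPI_INT, MPI_SUM,
                      MPI_COMM_WORLD);   /* combined result everywhere */

        if (rank == 0) {
            printf("sum = %d, gsum = %d\n", sum, gsum);
            free(chunks);
        }
        MPI_Finalize();
        return 0;
    }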

6. Assume that we have a set of heterogeneous multi-core nodes, with a different number of cores in different nodes but with identical cores across all nodes. The nodes are connected by some kind of interconnect. This gives us a heterogeneous distributed-memory (local address space) parallel computer. We can use MPI to communicate between the nodes and OpenMP to parallelize over the cores within a node. Assume that we want to do parallel matrix-matrix multiplication on this machine when we have three nodes with 6 cores, 2 cores, and 4 cores, respectively. The matrices are 1200x1200. Construct an efficient parallel algorithm for this problem using all cores and all nodes.
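
One possible answer in outline, as a hedged sketch rather than the official solution: assign row blocks of A and C proportionally to the nodes' core counts (6:2:4 gives 600, 200, and 400 of the 1200 rows), replicate B on every node, and let OpenMP share each node's block among its cores. The sketch assumes one MPI rank per node, launched with OMP_NUM_THREADS set to 6, 2, and 4 on the respective nodes; all variable names are illustrative.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define N 1200

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* expected: 3 ranks, one per node */

        int cores[3] = {6, 2, 4};              /* node layout from the problem */
        int total = 6 + 2 + 4;

        /* Row counts per node, proportional to core count: 600, 200, 400. */
        int cnt[3], dsp[3], acc = 0;
        for (int i = 0; i < 3; i++) {
            int rows = N * cores[i] / total;
            cnt[i] = rows * N;                 /* counts in elements, not rows */
            dsp[i] = acc * N;
            acc += rows;
        }
        int myrows = cnt[rank] / N;

        double *A = NULL, *C = NULL;
        double *B    = malloc((size_t)N * N * sizeof(double));
        double *Aloc = malloc((size_t)cnt[rank] * sizeof(double));
        double *Cloc = malloc((size_t)cnt[rank] * sizeof(double));
        if (rank == 0) {
            A = malloc((size_t)N * N * sizeof(double));
            C = malloc((size_t)N * N * sizeof(double));
            for (int i = 0; i < N * N; i++) { A[i] = 1.0; B[i] = 1.0; }
        }

        /* MPI layer: uneven row blocks of A, full copy of B everywhere. */
        MPI_Scatterv(A, cnt, dsp, MPI_DOUBLE,
                     Aloc, cnt[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        /* OpenMP layer: the node's cores share its row block. */
        #pragma omp parallel for
        for (int i = 0; i < myrows; i++)
            for (int j = 0; j < N; j++) {
                double s = 0.0;
                for (int k = 0; k < N; k++)
                    s += Aloc[i * N + k] * B[k * N + j];
                Cloc[i * N + j] = s;
            }

        MPI_Gatherv(Cloc, cnt[rank], MPI_DOUBLE,
                    C, cnt, dsp, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (rank == 0) printf("C[0][0] = %f (expect %d)\n", C[0], N);
        MPI_Finalize();
        return 0;
    }

The uneven counts in MPI_Scatterv/MPI_Gatherv are what match the work to the heterogeneous nodes; with N = 1200, replicating B costs about 11 MB per node, which is cheap compared with redistributing it during the computation.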