Some Hints

Solving the problem

The most important part of devising an algorithm to solve the problem is working out which pixels belong to the same object. In the worst case every pixel is a separate object, so if we traverse the array that stores the picture and update every entry to hold a unique integer value, we know we have labelled the maximum possible number of independent objects.

Now we need to update these values to account for pixels that border other pixels. The idea is to propagate the numbers stored in the array so that all the pixels belonging to the same object end up storing the same value. We can then count the occurrences of each value in the final representation to determine how many objects there are and the size of each of them. This allows us to report the number of independent objects and the number of pixels in the largest object.

What are Temporal and Storage complexity?

They are estimates of the amount of time the program will take to execute and the amount of memory it will use. These complexities are analytical quantities, computed from algebraic representations of the different aspects of the program's time or storage requirements.

For instance, you might look at the following code

If we assume that the computation in the code above takes K time units to complete, then the completion time for the total program segment can be represented by the following equation.

Temporal Complexity = N * M * K

Since K is a constant it is not the dominant factor; the size of the problem is determined by the loop bounds. We can therefore approximate the temporal performance of this program segment as O(N*M), and if N and M are of comparable size we can treat the overall complexity as approximately O(N^2).

What data should I collect?

You should collect speedup data for the sequential algorithm and the parallel algorithm applied to the most complex image on machines of 2 to 32 processors. Increase the machine size in steps of 2 processors; that is, collect execution times for machines with 2, 4, 6, 8, 10, ..., 32 processors.

To compute a speedup value you divide the sequential time by the parallel time. Ideally the speedup for P processes running on a machine with P CPUs should be close to P; if this is not the case, your algorithm is not very efficient. Perhaps you have too much synchronisation?

To show that your program works for all the images show that a single execution of the parallel algorithm works correctly on each of the less complex images. Choose a number of processors that divides the workload in each of these images evenly.

How can I collect my performance data automatically?

You should probably use a shell script to generate a log of the stats files for a range of program executions on increasing machine sizes. This sample routine, written as a csh script, shows how to achieve this.
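For comparison, a minimal POSIX sh sketch of the same kind of driver loop. The executable name ./picture and its -p processor-count flag are assumptions; substitute your own program and option:

```shell
#!/bin/sh
# Run the parallel program once per machine size (2, 4, ..., 32
# processors) and append each run's timing output to one log file.
LOG=results.log
: > "$LOG"    # start a fresh log
for P in 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32
do
    echo "=== $P processors ===" >> "$LOG"
    ./picture -p "$P" >> "$LOG" 2>&1 \
        || echo "run with $P processors failed" >> "$LOG"
done
```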