High Performance Computing

A simple model for heat diffusion in 2-dimensional space splits the space into n_x times n_y small boxes. In each time step, the temperature h(i,j) of the box with coordinates i and j is updated to

h(i,j) + c * ( (h(i-1,j) + h(i+1,j) + h(i,j-1) + h(i,j+1))/4 - h(i,j) )

where c is a diffusion constant. We consider the case of a bounded, box-shaped region whose boundary has constant temperature 0.
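The update rule and the zero boundary can be sketched on the CPU as a reference for checking GPU results. This is a minimal sketch, not the required OpenCL implementation: the names `cell_or_zero` and `diffusion_step`, the row-major layout, and the use of two separate buffers are assumptions made for illustration.

```c
#include <stddef.h>

/* Read h(i,j); anything outside the grid is the boundary at temperature 0.
 * ASSUMPTION: row-major storage, i is the row index, j the column index. */
static float cell_or_zero(const float *h, int width, int height, int i, int j)
{
    if (i < 0 || i >= height || j < 0 || j >= width)
        return 0.0f;
    return h[(size_t)i * (size_t)width + (size_t)j];
}

/* One diffusion step: reads from `in`, writes to `out`.
 * The two buffers must not overlap, since every new value depends on
 * the old values of its four neighbors. */
void diffusion_step(const float *in, float *out, int width, int height, float c)
{
    for (int i = 0; i < height; ++i) {
        for (int j = 0; j < width; ++j) {
            float center = cell_or_zero(in, width, height, i, j);
            float neighbors = cell_or_zero(in, width, height, i - 1, j)
                            + cell_or_zero(in, width, height, i + 1, j)
                            + cell_or_zero(in, width, height, i, j - 1)
                            + cell_or_zero(in, width, height, i, j + 1);
            out[(size_t)i * (size_t)width + (size_t)j] =
                center + c * (neighbors / 4.0f - center);
        }
    }
}
```

Swapping the roles of the two buffers after each call gives repeated steps without copying. In an OpenCL version, each work-item would presumably evaluate the same expression for one box.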

Implement in C, using the GPU via OpenCL, a program called `diffusion` that

- Reads array size and initial values from a text file called "init",
- Executes a given number of steps of heat diffusion with a given diffusion constant,
- Outputs the average of temperatures, say X, as `average: X`, and
- Outputs the average absolute difference of each temperature to the average of all temperatures, say Y, as `average absolute difference: Y`.

Your program should accept command line arguments

```
./diffusion -n20 -d0.02
```

to compute 20 iterations with diffusion constant 0.02. You may only use OpenCL parallelism.
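One way to read arguments of this shape is to check the two-character prefix and convert the remainder with `strtol` and `strtof`. The sketch below is an illustration only; the function name `parse_args`, its return convention, and the decision to reject unknown or missing arguments are assumptions, not requirements of the assignment.

```c
#include <stdlib.h>
#include <string.h>

/* Parse arguments of the form -n<steps> and -d<constant>.
 * Returns 0 on success, -1 if an argument is malformed, unknown,
 * or if either of the two arguments is missing (ASSUMED policy). */
int parse_args(int argc, char *argv[], long *n_steps, float *diff_const)
{
    int have_n = 0, have_d = 0;

    for (int k = 1; k < argc; ++k) {
        const char *arg = argv[k];
        char *end = NULL;

        if (strncmp(arg, "-n", 2) == 0) {
            *n_steps = strtol(arg + 2, &end, 10);
            /* Reject empty values, trailing garbage, and negative counts. */
            if (end == arg + 2 || *end != '\0' || *n_steps < 0)
                return -1;
            have_n = 1;
        } else if (strncmp(arg, "-d", 2) == 0) {
            *diff_const = strtof(arg + 2, &end);
            if (end == arg + 2 || *end != '\0')
                return -1;
            have_d = 1;
        } else {
            return -1;
        }
    }
    return (have_n && have_d) ? 0 : -1;
}
```

With the invocation above, this would set the step count to 20 and the diffusion constant to approximately 0.02.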

The first row of the input file contains two positive integers, which determine the width and the height. Each line after the first one contains three entries: two integer values that denote valid coordinates, and an initial value that parses as a floating point number. For example,

```
3 3
1 1 1e6
```

yields the setup in the following example.
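Reading this format can be sketched with `fscanf`, whose `%f` conversion accepts values such as `1e6`. The function name `read_init` is hypothetical, and the text does not say which coordinate is the row and which is the column, so the order used here is an assumption. Boxes that are not listed are taken to start at 0, as in the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the grid from an already opened stream.
 * Returns a zero-initialized array of width*height floats in row-major
 * order (the caller frees it), or NULL on a malformed header,
 * out-of-range coordinates, or allocation failure. */
float *read_init(FILE *f, int *width, int *height)
{
    if (fscanf(f, "%d %d", width, height) != 2 || *width <= 0 || *height <= 0)
        return NULL;

    float *grid = calloc((size_t)*width * (size_t)*height, sizeof *grid);
    if (grid == NULL)
        return NULL;

    int x, y;
    float value;
    /* ASSUMPTION: the first coordinate is the column (x),
     * the second is the row (y). */
    while (fscanf(f, "%d %d %f", &x, &y, &value) == 3) {
        if (x < 0 || x >= *width || y < 0 || y >= *height) {
            free(grid);
            return NULL;
        }
        grid[(size_t)y * (size_t)*width + (size_t)x] = value;
    }
    return grid;
}
```

A caller would open the file with `fopen("init", "r")`, pass the stream to this function, and close it afterwards.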

Since rounding errors in GPU calculations depend heavily on implementation details, a tolerance of 20%, while generous, is not unreasonable.
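For self-checking, such a tolerance could be applied as a relative error against a reference value. The text does not define how the tolerance is measured, so both the interpretation and the helper name `within_tolerance` below are assumptions.

```c
/* Return 1 if `actual` deviates from `expected` by at most
 * rel_tol * |expected|, and 0 otherwise.
 * ASSUMPTION: the tolerance is relative to the reference value. */
int within_tolerance(double expected, double actual, double rel_tol)
{
    double diff = actual - expected;
    if (diff < 0.0)
        diff = -diff;

    double scale = expected < 0.0 ? -expected : expected;
    return diff <= rel_tol * scale;
}
```

For example, `within_tolerance(183031.55, 183032.984375, 0.2)` accepts a result that differs from the reference only by rounding.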

As an example, we consider the case of 3 times 3 with initial values as below.

|   |   |   |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1,000,000 | 0 |
| 0 | 0 | 0 |

Computing with diffusion constant 1/30, the next two iterations are

|   |   |   |
|---|---|---|
| 0 | 8333 | 0 |
| 8333 | 9667e2 | 8333 |
| 0 | 8333 | 0 |

and

|   |   |   |
|---|---|---|
| 138.9 | 1611e1 | 138.9 |
| 1611e1 | 9347e2 | 1611e1 |
| 138.9 | 1611e1 | 138.9 |

In particular, the average temperature is 111080 (as opposed to the original average 111111).

The absolute difference of each temperature and the average is

|   |   |   |
|---|---|---|
| 1109e2 | 9497e1 | 1109e2 |
| 9497e1 | 8236e2 | 9497e1 |
| 1109e2 | 9497e1 | 1109e2 |

whose average is 183031.55. After five further iterations this will decrease to 151816.97.
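The two statistics in this example can be computed in two passes: one for the average, one for the absolute deviations from it. The sketch below runs on the host and accumulates in double precision; the name `compute_stats` and the choice of accumulator type are assumptions, and a GPU reduction in single precision may give slightly different digits.

```c
#include <stddef.h>

/* Compute the average of `count` temperatures and the average absolute
 * difference of each temperature to that average.
 * `count` must be greater than 0. */
void compute_stats(const float *h, size_t count,
                   double *average, double *avg_abs_diff)
{
    /* First pass: plain average. */
    double sum = 0.0;
    for (size_t k = 0; k < count; ++k)
        sum += (double)h[k];
    *average = sum / (double)count;

    /* Second pass: average of |h[k] - average|. */
    double abs_sum = 0.0;
    for (size_t k = 0; k < count; ++k) {
        double diff = (double)h[k] - *average;
        abs_sum += diff < 0.0 ? -diff : diff;
    }
    *avg_abs_diff = abs_sum / (double)count;
}
```

Applied to the nine temperatures after two iterations, this yields values close to the 111080 and 183031.55 quoted in the example.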

For instance, invoking the program `diffusion` in the presence of a suitable init file, we expect the output

```
./diffusion -n2 -d0.03333
average: 111080.257812
average absolute difference: 183032.984375
```
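The six decimal places in this sample are consistent with the default `%f` conversion of `printf`. A minimal formatting sketch follows; the helper name `format_results` is hypothetical, and the assumption is that no other precision is required.

```c
#include <stdio.h>

/* Write both result lines into `buf` using the default %f precision
 * (six decimal places). Returns the value of snprintf: the number of
 * characters the full text needs, excluding the terminating null. */
int format_results(char *buf, size_t size, double average, double avg_abs_diff)
{
    return snprintf(buf, size,
                    "average: %f\naverage absolute difference: %f\n",
                    average, avg_abs_diff);
}
```

Passing the buffer to `fputs(buf, stdout)` then prints the two lines; calling `printf` directly with the same format string is equivalent.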

If you take this course as a student at Chalmers/GU, you can find the timing goals in the Assignment Timing Goals.