After passing the assignments on the development environment, optimization, OpenMP, and threads, you receive a passing grade, i.e., 3 or G. You can sign up for the exam to get a higher grade. Regardless of how you perform in the exam, you are guaranteed a grade of 3 or G.
The exam is a combined take-home/oral exam: you individually prepare a detailed analysis and benchmark of one aspect of one of the assignments on OpenMP, threads, OpenCL, or MPI. You also sign up for one time slot on Canvas.
Exam slots will be made available towards the middle of the course, after the Chalmers deadline for exam sign-up on 08 October 2023.
The oral exam will take place online. A time slot is 10 minutes long: 8 minutes are reserved for the exam and 2 minutes are available for you to settle in.
You are expected to give a five-minute presentation of your prepared benchmarks and conclusions (see below for details). Please prepare slides (usually no more than 4 or 5) to support your presentation. Slides should not show code for its own sake; distill its essence instead.
Following the presentation, I might ask further questions for up to three minutes. These questions may be connected to any material presented in the lectures and are not limited to the assignment that you decided to present on.
The presentation is graded according to a fixed grading scheme.
In preparation for the oral exam, choose one assignment and one topic specific to that assignment. The following list suggests topics that you may pick, but you are not limited to these.
| Assignment | Topic / Aspect |
|---|---|
| openmp | efficient reading and parsing of the input file |
| openmp | efficient computation of the distances |
| openmp | efficient use of memory and cache |
| openmp | SIMD instructions and/or intrinsics |
| threads | efficient evaluation of the formula for Newton iteration |
| threads | efficient writing to the files |
| threads | efficient assignment of computation to computation threads |
| threads | bottlenecks for a large number of threads or lines, or for high-degree polynomials |
| opencl | efficient data transfer between host and GPU |
| opencl | impact of branch divergence |
| opencl | reduce algorithms on the GPU |
| opencl | efficiency balance between host and GPU computation |
| mpi | reduce algorithms in MPI |
| mpi | efficient communication patterns |
For your topic or aspect, answer the following questions:
What is a naive approach to the topic? Implement and benchmark it. You may modify the code that you handed in.
Does your handed-in assignment go beyond a naive solution? If so, what does it do differently?
Benchmark the given aspect of your handed-in solution.
What theory presented in the course plays into the topic?
What approach does your understanding of the theory suggest could be fastest?
Implement at least one variant that goes beyond the naive approach. Benchmark it, too.
Provide interpretations for your benchmarks. Did your ideas work out? If not, what might be the reason?
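One possible shape for such a benchmark is sketched below in C. It is only an illustration, not part of any assignment: the two variants (`sum_naive` and `sum_rowwise`), the helper names `wall_seconds`, `bench`, and `run_benchmark`, and the repetition count are all hypothetical. The sketch times a naive and an improved variant of the same computation on the same input, repeats each measurement several times, keeps the best time, and checks that both variants return the same result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define REPEATS 10 /* hypothetical number of timed repetitions per variant */

/* Wall-clock time in seconds, read from a monotonic clock. */
static double wall_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Naive variant: walks a row-major matrix column by column (strided access). */
static double sum_naive(const double *a, size_t n) {
    double s = 0.0;
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < n; ++i)
            s += a[i * n + j];
    return s;
}

/* Improved variant: walks the same matrix row by row (contiguous access). */
static double sum_rowwise(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            s += a[i * n + j];
    return s;
}

/* Runs f REPEATS times, stores its last result, and returns the best time. */
static double bench(double (*f)(const double *, size_t),
                    const double *a, size_t n, double *result) {
    double best = 1e300;
    for (int r = 0; r < REPEATS; ++r) {
        double t0 = wall_seconds();
        *result = f(a, n);
        double t1 = wall_seconds();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Benchmarks both variants on an n x n matrix; returns 1 if allocation fails. */
int run_benchmark(size_t n) {
    double *a = malloc(n * n * sizeof *a);
    if (a == NULL)
        return 1;
    /* Small integer values, so the sum is exact in any summation order. */
    for (size_t k = 0; k < n * n; ++k)
        a[k] = (double)(k % 7);

    double r_naive, r_row;
    double t_naive = bench(sum_naive, a, n, &r_naive);
    double t_row = bench(sum_rowwise, a, n, &r_row);

    /* A faster variant is only meaningful if it computes the same result. */
    assert(r_naive == r_row);
    printf("naive:    %.6f s\nrow-wise: %.6f s\n", t_naive, t_row);

    free(a);
    return 0;
}
```

Calling `run_benchmark(2048)` from a `main` function prints one best-of-`REPEATS` time per variant. It is worth checking the reported times for plausibility, since an optimizing compiler may remove or hoist work whose result it can prove unchanged, and worth varying the problem size, since the effect of a change often depends on whether the data fits in cache.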