The goal is to parallelize a two-dimensional wave simulation. You are given a stub project that already compiles for Cuda and OpenCL and performs some of the initialization and computation on the GPGPU. The tasks to be done are:
- Move the entire Runge-Kutta integration to the GPGPU
- Choose an appropriate grid and block size
- Optimize arithmetic operations
- Avoid unnecessary memory copying
- Choose appropriate data structures
- Use coalesced memory access
Ideally, you should be able to run the simulation faster than the given x86-based simulation, but this is not a strict requirement for passing. The goal here is to become familiar with GPGPU programming concepts and demonstrate an understanding of them - not to produce the fastest possible program. Optimization is not forbidden, though: you may optimize as much as you like.
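As an illustration of the grid and block size and coalesced access tasks above, here is a minimal host-side sketch in plain C++. The function names `blocks_for` and `flat_index` are hypothetical and do not come from the stub code; the sketch only shows the arithmetic, assuming the simulation grid is stored row-major in one flat array.

```cpp
#include <cassert>
#include <cstddef>

// Number of blocks needed to cover n cells with blocks of block_size threads.
// Ceiling division, so the edge cells are still covered when n is not a
// multiple of block_size (the kernel then needs a bounds check).
std::size_t blocks_for(std::size_t n, std::size_t block_size) {
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;
}

// Row-major flat index: consecutive x values map to consecutive addresses.
// If neighboring threads handle neighboring x values, their loads and stores
// touch adjacent memory, which is the pattern coalescing relies on.
std::size_t flat_index(std::size_t x, std::size_t y, std::size_t width) {
    assert(x < width);
    return y * width + x;
}
```

In a Cuda kernel, x would typically be computed as `blockIdx.x * blockDim.x + threadIdx.x` and y likewise from the `.y` components, so that threads with consecutive `threadIdx.x` read consecutive elements. Which block size works best depends on the device and has to be found by measurement.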
The code is C++ with Cuda and OpenCL kernels provided. The code is available as a .tar.gz archive:
Use the latest version and see the ChangeLog for a list of changes.
The code consists of two user interface frontends: the command line binary wave-cli and the graphical UI binary wave-sdl, which uses the SDL library. The latter is not required, and running it on miranda can be slow because there is no console access.
The backend for the user interface consists of an abstract C++ interface class called Wave() and three concrete implementation classes, one of which is WaveOpenCL(). The CPU implementation is a fully functional Runge-Kutta (RK4) integration of the wave equation in plain CPU C code. The two GPGPU implementations perform the initialization and the first step of the RK4 integration on the GPGPU and the remaining steps on the CPU. Below is a graph of the stub code structure:
The code uses Cuda SDK headers and libraries:
Compiling it without the SDK is hard and requires substantial changes to the code.
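For orientation, the following is a minimal CPU sketch of one classical RK4 step for the two-dimensional wave equation, written as the first-order system du/dt = v, dv/dt = c^2 * laplacian(u). It is not the stub's implementation: the `Field` type, the function names, the use of doubles, the 5-point stencil and the fixed boundary are all assumptions made for illustration. The four derivative evaluations are the stages that the assignment asks you to move to the GPGPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical state: displacement u and velocity v on a w*h grid, row-major.
struct Field {
    std::size_t w, h;
    std::vector<double> u, v;
    Field(std::size_t width, std::size_t height)
        : w(width), h(height), u(width * height, 0.0), v(width * height, 0.0) {}
};

// Time derivative: du/dt = v, dv/dt = c^2 * laplacian(u) (5-point stencil).
// Boundary cells keep a zero derivative (assumed fixed boundary).
Field derivative(const Field& s, double c, double dx) {
    Field d(s.w, s.h);
    for (std::size_t y = 1; y + 1 < s.h; ++y) {
        for (std::size_t x = 1; x + 1 < s.w; ++x) {
            const std::size_t i = y * s.w + x;
            const double lap = (s.u[i - 1] + s.u[i + 1] + s.u[i - s.w] + s.u[i + s.w]
                                - 4.0 * s.u[i]) / (dx * dx);
            d.u[i] = s.v[i];
            d.v[i] = c * c * lap;
        }
    }
    return d;
}

// Returns s + scale * d, element by element.
Field advanced(const Field& s, const Field& d, double scale) {
    assert(s.u.size() == d.u.size());
    Field r = s;
    for (std::size_t i = 0; i < r.u.size(); ++i) {
        r.u[i] += scale * d.u[i];
        r.v[i] += scale * d.v[i];
    }
    return r;
}

// One classical RK4 step of length dt.
Field rk4_step(const Field& s, double c, double dx, double dt) {
    const Field k1 = derivative(s, c, dx);
    const Field k2 = derivative(advanced(s, k1, dt / 2.0), c, dx);
    const Field k3 = derivative(advanced(s, k2, dt / 2.0), c, dx);
    const Field k4 = derivative(advanced(s, k3, dt), c, dx);
    Field out = s;
    for (std::size_t i = 0; i < out.u.size(); ++i) {
        out.u[i] += dt / 6.0 * (k1.u[i] + 2.0 * k2.u[i] + 2.0 * k3.u[i] + k4.u[i]);
        out.v[i] += dt / 6.0 * (k1.v[i] + 2.0 * k2.v[i] + 2.0 * k3.v[i] + k4.v[i]);
    }
    return out;
}
```

Every cell update inside `derivative`, `advanced` and the final combination is independent of the other cells, which is what makes each stage a natural candidate for one GPGPU thread per cell. This sketch copies whole fields between stages for readability; a GPGPU version would keep the intermediate buffers in device memory instead.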
Compiling and running
- Get a copy of the source code and unpack it, for example as follows. (Set V to the version number of the latest version.)
wget https://wiki.tkk.fi/download/attachments/40023730/wave-$V.tar.gz
tar zxvf wave-$V.tar.gz
cd wave-$V
- Edit the Makefile and set the paths CUDA_TOOLKIT_PATH and CUDA_SDK_PATH if you are running the code on your own system. Also, remove the GUI client from the build if you don't want to build it; otherwise the SDL development headers and libraries are required.
- To compile on miranda, say
use cuda
make
- The Makefile includes two test rules with which you can check that everything is working. To produce a simple test with output to the console, issue:
To get the output as a Gnuplot-produced PNG file, say
Invoking the command line binary wave-cli with no arguments produces a help message. The command line client runs the wave simulation for a given number of steps, causing an impulse at step 0, resetting it at step 1 and measuring the wave amplitude at the given position for N steps. The client prints the output either in ASCII format or in a format suitable for Gnuplot. You can configure which of the backends you want to run.
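The measurement sequence described above can be sketched as a small driver loop. The `Simulator` struct and `run_probe` function below are hypothetical stand-ins invented for this illustration; the stub's actual Wave interface and wave-cli code will look different.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical minimal backend interface, not the stub's Wave class.
struct Simulator {
    std::function<void(double)> set_impulse;  // set the source amplitude
    std::function<void()> step;               // advance one time step
    std::function<double()> probe;            // amplitude at the measured position
};

// Applies an impulse at step 0, resets it at step 1, and records the
// amplitude at the probe position after each of the `steps` time steps.
std::vector<double> run_probe(Simulator& sim, std::size_t steps) {
    std::vector<double> samples;
    samples.reserve(steps);
    for (std::size_t n = 0; n < steps; ++n) {
        if (n == 0) sim.set_impulse(1.0);  // impulse at step 0
        if (n == 1) sim.set_impulse(0.0);  // reset at step 1
        sim.step();
        samples.push_back(sim.probe());
    }
    return samples;
}
```

Note that reading the probe value after every step forces a device-to-host transfer per step in a GPGPU backend; only the single probed value needs to be copied, not the whole grid.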
The GUI client binary is named wave-sdl. It takes one mandatory argument indicating whether you want to run the simulation in x86, Cuda or OpenCL. Giving no arguments produces a help message. Clicking the mouse in the window causes an impulse and subsequent waves. The GUI client measures the FPS rate and prints it to stdout.
- The project deadline is Thursday, April 22nd, 2010 at 23:59.
- Submit your code in a tar.gz archive named SID.tar.gz, where SID is your student id, by emailing it to me (email@example.com) before the deadline.
- All relevant source code must be submitted and I must be able to compile it on miranda. Do not submit any binaries.
- Include a readme file (either in plain text or as a PDF if you need more formatted output) containing the following:
- Your name, student id, email address
- What you have done, preferably as a step-by-step description of how the various modifications you made improved the execution time and overall performance.
- Feedback on the exercise: time spent, whether you learned anything from it (optional)
- To pass, the code must perform all wave equation calculations on the GPGPU and
the code must be compilable and executable in
- The project is graded on a scale of 0-5. The overall course grade is the arithmetic mean of the project grade and the presentation grade. To get the maximum grade, you have to demonstrate an understanding of GPGPU programming concepts and peculiarities. You don't, however, have to produce the best possible optimization, though doing so will probably earn you a better grade.
- You are free to change any of the code given in the stub archive but to get a maximum grade it is sufficient to edit only the backend code implementing the simulation in either Cuda or OpenCL. Remember to document all changes you made in the readme file!
- If you want, you can provide your own, better integration algorithm as an alternative implementation. You can even use a different programming language, but if you are considering either of these, contact me (firstname.lastname@example.org) beforehand for approval.
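For the step-by-step performance description requested in the readme, it helps to time the simulation the same way before and after each modification. The helper below is a minimal sketch using only the C++ standard library; the name `mean_ms` is hypothetical and nothing like it is assumed to exist in the stub.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `repeats` times and returns the mean wall-clock time per run
// in milliseconds. `work` is any callable taking no arguments.
template <typename F>
double mean_ms(F&& work, int repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / repeats;
}
```

Kernel launches are asynchronous, so when timing a GPGPU backend this way the callable has to wait for the device to finish (for example with `cudaDeviceSynchronize()` in Cuda) before returning; otherwise only the launch overhead is measured.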
You can ask questions and request clarifications on the IRC channel or by emailing me at
email@example.com. Contact me also if you need some additional software installed on