My raytracer so far is very simple. It uses no acceleration structures and the scene is static. Earlier versions ran on the CPU, but now I use CUDA to tap the enormous floating-point processing power of my GPU, which has shortened my rendering times a lot.
The raytracer loads a mesh from a 3DS Max file, puts a ‘room’ around it and renders the scene using only primary rays. Here is an example:

Raytracer output
The scene above consists of 1034 triangles and renders in 0.693 seconds on my GeForce 8600 GT. My CPU needs around 10 seconds to render the same scene at the same level of detail, so the GPU is roughly 14 times faster.
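To give an idea of the work done per primary ray when there is no acceleration structure, here is a minimal host-side C++ sketch, not my actual CUDA kernel. It tests one ray against every triangle using the Möller-Trumbore intersection test and keeps the nearest hit. All names (`Vec3`, `Ray`, `Triangle`, `intersect`, `closestHit`) are made up for this illustration, and the choice of Möller-Trumbore is an assumption; any ray-triangle test would fit the same loop.

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin, dir; };
struct Triangle { Vec3 v0, v1, v2; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moller-Trumbore test: returns true and writes the hit distance to t
// when the ray hits the triangle in front of its origin.
bool intersect(const Ray& ray, const Triangle& tri, float& t) {
    const float eps = 1e-7f;
    Vec3 e1 = sub(tri.v1, tri.v0);
    Vec3 e2 = sub(tri.v2, tri.v0);
    Vec3 p = cross(ray.dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray is parallel to the triangle
    float inv = 1.0f / det;
    Vec3 s = sub(ray.origin, tri.v0);
    float u = dot(s, p) * inv;               // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(ray.dir, q) * inv;         // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * inv;
    return t > eps;                          // ignore hits behind the origin
}

// Brute force: test every triangle and return the index of the nearest
// hit, or -1 if the ray misses everything. tHit receives the distance.
int closestHit(const Ray& ray, const std::vector<Triangle>& tris, float& tHit) {
    int best = -1;
    tHit = std::numeric_limits<float>::infinity();
    for (int i = 0; i < static_cast<int>(tris.size()); ++i) {
        float t;
        if (intersect(ray, tris[i], t) && t < tHit) {
            tHit = t;
            best = i;
        }
    }
    return best;
}
```

Every pixel pays for one such loop over all triangles, which is why the cost grows linearly with the size of the mesh.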
Right now I’m in the process of implementing a BVH (bounding volume hierarchy) acceleration structure to make rendering faster. Basically, it puts a bounding box around the whole mesh and then divides the mesh into two parts, each with its own smaller bounding box. This process is repeated until some stopping condition is met (e.g. a maximum depth is reached) or each bounding box contains a single triangle. This speeds up rendering because rays are tested against the bounding boxes first: if a ray does not intersect a box, none of the triangles inside that box have to be checked.
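The idea of the hierarchy of bounding boxes described above can be sketched in plain C++ as follows. This is only an illustration under my own assumptions, not the implementation I'm working on: it splits along the longest axis of the box at the median triangle centroid, uses a recursive pointer-based tree (a GPU version would more likely use a flat array), and the names `Box`, `Node`, `build`, `hitsBox` and `candidates` are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Vec3 { float v[3]; };
struct Triangle { Vec3 p[3]; };

// Axis-aligned bounding box, starting out empty (inverted).
struct Box {
    float lo[3] = { 1e30f,  1e30f,  1e30f};
    float hi[3] = {-1e30f, -1e30f, -1e30f};
    void grow(const Vec3& q) {
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::min(lo[a], q.v[a]);
            hi[a] = std::max(hi[a], q.v[a]);
        }
    }
};

struct Node {
    Box box;
    std::unique_ptr<Node> left, right;  // both null for a leaf
    std::vector<int> tris;              // triangle indices, leaves only
};

static float centroid(const Triangle& t, int axis) {
    return (t.p[0].v[axis] + t.p[1].v[axis] + t.p[2].v[axis]) / 3.0f;
}

// Recursively split the triangle set in two until a box holds a single
// triangle or the maximum depth is reached.
std::unique_ptr<Node> build(const std::vector<Triangle>& tris,
                            std::vector<int> ids, int depth, int maxDepth) {
    assert(!ids.empty());
    auto node = std::make_unique<Node>();
    for (int i : ids)
        for (const Vec3& q : tris[i].p) node->box.grow(q);
    if (ids.size() == 1 || depth >= maxDepth) {
        node->tris = std::move(ids);
        return node;
    }
    int axis = 0;  // pick the longest axis of the box
    for (int a = 1; a < 3; ++a)
        if (node->box.hi[a] - node->box.lo[a] >
            node->box.hi[axis] - node->box.lo[axis]) axis = a;
    auto mid = ids.begin() + ids.size() / 2;  // median split by centroid
    std::nth_element(ids.begin(), mid, ids.end(), [&](int a, int b) {
        return centroid(tris[a], axis) < centroid(tris[b], axis);
    });
    node->left = build(tris, std::vector<int>(ids.begin(), mid), depth + 1, maxDepth);
    node->right = build(tris, std::vector<int>(mid, ids.end()), depth + 1, maxDepth);
    return node;
}

// Slab test: does the ray o + t*d (t >= 0) pass through the box?
bool hitsBox(const Box& b, const Vec3& o, const Vec3& d) {
    float tmin = 0.0f, tmax = 1e30f;
    for (int a = 0; a < 3; ++a) {
        if (d.v[a] == 0.0f) {  // ray parallel to this pair of planes
            if (o.v[a] < b.lo[a] || o.v[a] > b.hi[a]) return false;
            continue;
        }
        float t0 = (b.lo[a] - o.v[a]) / d.v[a];
        float t1 = (b.hi[a] - o.v[a]) / d.v[a];
        if (t0 > t1) std::swap(t0, t1);
        tmin = std::max(tmin, t0);
        tmax = std::min(tmax, t1);
    }
    return tmin <= tmax;
}

// Collect the triangles whose leaf box the ray touches; whole subtrees
// are skipped as soon as the ray misses their bounding box.
void candidates(const Node& n, const Vec3& o, const Vec3& d, std::vector<int>& out) {
    if (!hitsBox(n.box, o, d)) return;
    if (!n.left) {
        out.insert(out.end(), n.tris.begin(), n.tris.end());
        return;
    }
    candidates(*n.left, o, d, out);
    candidates(*n.right, o, d, out);
}
```

Only the triangles returned by `candidates` still need an exact ray-triangle test, so for a well-balanced tree a ray touches far fewer triangles than the full mesh.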