Unless you will be doing a lot of operations on the same data, you are usually better off using SSE or the FPU for your floating-point math. The bandwidth cost of transferring the data to the GPU and the results back exceeds any gain from having the GPU do the calculations unless you are doing at least 20 operations on each item. You also need very large quantities of data to process. The shaders on a GPU usually run much slower than the CPU, so you need to give them enough work to use more than one or two of them. Just because you have 192 shaders and you give the GPU 192 pieces of data does not mean that each shader will get one piece; more likely, one shader will get it all and the others will sit idle. You need to give an 8800 GTX something like 192-million-member arrays to get it to fully engage all its shaders. At least with RapidMind, you can submit multiple jobs without waiting for the previous ones to finish, and get all the shaders to engage that way.
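To make the transfer-versus-compute trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is only a toy model: the function names are made up for this post, and the throughput and bus figures are placeholder assumptions, not measurements of any real CPU, GPU or bus. It also ignores launch overhead and idle shaders, so treat the result as a rough lower bound at best.

```python
# Toy cost model: all names and numbers here are illustrative assumptions.

def cpu_time(n_items, ops_per_item, cpu_ops_per_sec):
    """Seconds to do all the math on the CPU (no transfer needed)."""
    return n_items * ops_per_item / cpu_ops_per_sec


def gpu_time(n_items, ops_per_item, gpu_ops_per_sec,
             bytes_per_item, bus_bytes_per_sec):
    """Seconds to copy the data to the GPU, compute, and copy results back."""
    transfer = 2 * n_items * bytes_per_item / bus_bytes_per_sec  # there and back
    compute = n_items * ops_per_item / gpu_ops_per_sec
    return transfer + compute


def break_even_ops(cpu_ops_per_sec, gpu_ops_per_sec,
                   bytes_per_item, bus_bytes_per_sec):
    """Operations per item at which the GPU path stops being slower.

    Solves ops/cpu = 2*bytes/bus + ops/gpu for ops.
    """
    if gpu_ops_per_sec <= cpu_ops_per_sec:
        return float("inf")  # the GPU never catches up in this model
    transfer_per_item = 2 * bytes_per_item / bus_bytes_per_sec
    return transfer_per_item / (1 / cpu_ops_per_sec - 1 / gpu_ops_per_sec)


# Placeholder figures, chosen only to show the shape of the calculation.
CPU_RATE = 10e9    # assumed CPU throughput, operations per second
GPU_RATE = 100e9   # assumed GPU throughput, operations per second
ITEM_SIZE = 4      # one 32-bit float per item
BUS_RATE = 4e9     # assumed usable bus bandwidth, bytes per second

if __name__ == "__main__":
    ops = break_even_ops(CPU_RATE, GPU_RATE, ITEM_SIZE, BUS_RATE)
    print("break-even operations per item:", ops)
```

Plug in numbers measured on your own hardware before trusting it. The point is only that the transfer term does not shrink as you do more math per item, so the GPU wins only once the per-item work is large enough to amortize it.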
I think if the course requirement is to not use the graphics API, then the instructor probably wants you to do the graphics math in your program, so you learn what's involved. Two good books for this are 'Tricks of the Game Programming Gurus' and 'More Tricks of the Game Programming Gurus'. Both are out of print, but you can find plenty of copies on Amazon. They were both written back in the days when a video card didn't do anything except display the image you sent it.
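As an example of the kind of graphics math you end up writing yourself, here is a minimal sketch of a perspective projection from a 3D point to 2D screen coordinates. It is not taken from either book; the function name, the focal-length parameter and the camera convention (camera at the origin looking down +z, screen y growing downward) are my own assumptions for illustration.

```python
def project(point, focal_length, width, height):
    """Project a 3D camera-space point onto a width x height screen.

    Assumes the camera sits at the origin looking down +z.
    Returns (screen_x, screen_y), or None if the point is at or behind
    the camera and cannot be projected.
    """
    x, y, z = point
    if z <= 0:
        return None
    # Perspective divide: things farther away (larger z) land closer to the center.
    screen_x = width / 2 + focal_length * x / z
    # Screen y grows downward, so flip the sign of the world y term.
    screen_y = height / 2 - focal_length * y / z
    return (screen_x, screen_y)


if __name__ == "__main__":
    # A point straight ahead of the camera lands in the middle of the screen.
    print(project((0, 0, 5), 256, 640, 480))
```

From there you would build up rotation, clipping and line or triangle drawing into your own frame buffer, which is presumably what the instructor is after.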