Thread: Using the GPU to do some processing

  1. #1
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485

    Using the GPU to do some processing

    Hello,

    I am wondering if I can use the GPU to do some of the processing for the small game I am making. The game is made without using any API, so I don't have access to the fast features the graphics card offers (through OpenGL/DirectX), but I was wondering if I could write some of my code so that the GPU processes it instead of the CPU. Would there be any gains from doing so? And can the GPU be programmed in C++?

    Regards,

  2. #2
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    take a look at CUDA (for NVIDIA GPUs, I don't know if an equivalent for ATI exists).

  3. #3
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Unless you want to do some crazy parallel floating point arithmetic... it probably isn't worth the hassle to use the GPU, though.

  4. #4
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    That's one of the things I am trying to figure out.

    In some cases I have to draw the sprite pixel by pixel using this:
    Code:
    // Per-pixel alpha blend into a 24-bit BGR framebuffer (index 0 = blue):
    // dst = dst + alpha * (src - dst) / 256 for each colour channel.
    screenDataPnt[0] = screenDataPnt[0] + ((alpha * (blue  - screenDataPnt[0])) >> 8);
    screenDataPnt[1] = screenDataPnt[1] + ((alpha * (green - screenDataPnt[1])) >> 8);
    screenDataPnt[2] = screenDataPnt[2] + ((alpha * (red   - screenDataPnt[2])) >> 8);
    It works fine, but when I have a lot of stuff on the screen things start to slow down. Say I have 40 sprites that are 128 x 128; that is around 650,000 pixels, each needing three of those calculations, every frame. So I was hoping that I could move that code over to the graphics card.
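    For context, the loop around those three lines looks roughly like this (just a simplified sketch; the real names and the pitch handling in my code are a bit different):
    Code:
    // Simplified sketch of the whole alpha blit (names are made up).
    // screenData: 24-bit BGR framebuffer, screenPitch bytes per row.
    // sprite: assumed to be stored as BGRA, so each pixel carries its own alpha.
    void alphaBlit(unsigned char* screenData, int screenPitch,
                   const unsigned char* sprite, int w, int h, int destX, int destY)
    {
        for (int y = 0; y < h; ++y)
        {
            unsigned char* screenDataPnt = screenData + (destY + y) * screenPitch + destX * 3;
            const unsigned char* spritePnt = sprite + y * w * 4;
            for (int x = 0; x < w; ++x)
            {
                int blue = spritePnt[0], green = spritePnt[1], red = spritePnt[2], alpha = spritePnt[3];
                screenDataPnt[0] = screenDataPnt[0] + ((alpha * (blue  - screenDataPnt[0])) >> 8);
                screenDataPnt[1] = screenDataPnt[1] + ((alpha * (green - screenDataPnt[1])) >> 8);
                screenDataPnt[2] = screenDataPnt[2] + ((alpha * (red   - screenDataPnt[2])) >> 8);
                screenDataPnt += 3;
                spritePnt += 4;
            }
        }
    }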

    Not sure if it even makes sense; I'm completely blank on this subject.

  5. #5
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    What API are you using to actually display the pixels? You have to be using some API to make the pixels appear on the screen.

    Make sure it is really the calculations that are slowing down the program, though. Most of the time it's actually the library calls that are the bottleneck, in which case you would gain nothing by optimizing this part.

  6. #6
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    I use something that was given to me by the university. It is based on DirectX (all the nice things have been taken out, so I am left with a pointer to the screen memory and functions for loading files).

    The drawing function I posted (alpha blitting) is something like 10 times slower than the other blitting function I have.

  7. #7
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    The drawing function I posted (alpha blitting) is something like 10 times slower than the other blitting function I have.
    I am assuming that is the result of profiling and not a guesstimate. Why not just use "the other blitting function" instead?

  8. #8
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    The other (faster) blitting function does not support alpha. It uses memcpy() to copy one line at a time, instead of pixel by pixel.
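    Roughly, the fast path is just one memcpy per sprite row, something like this (simplified sketch with made-up names):
    Code:
    #include <cstring>  // for std::memcpy

    // Sketch of the non-alpha blit: copy whole rows with memcpy.
    // Assumes the sprite rows use the same 24-bit BGR layout as the screen.
    void fastBlit(unsigned char* screenData, int screenPitch,
                  const unsigned char* spriteData, int spriteWidth, int spriteHeight,
                  int destX, int destY)
    {
        for (int y = 0; y < spriteHeight; ++y)
        {
            std::memcpy(screenData + (destY + y) * screenPitch + destX * 3,
                        spriteData + y * spriteWidth * 3,
                        spriteWidth * 3);
        }
    }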

  9. #9
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Have you tried just taking out the alpha part of your current blitting function? If that improves the performance, then I guess we can be pretty sure that is the bottleneck. In that case, I guess it would make sense to use CUDA (or CTM, the ATI version, as I just found out), since that is what GPUs do best (massively parallel arithmetic). In that case, though, why not just use DX or OGL instead? CUDA is designed for things like biological computing that have nothing to do with graphics but still need parallel arithmetic; graphical programs can more easily (and portably) use the actual graphics APIs (DX or OGL).
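    Just to give you an idea of what that would look like, your blend as a CUDA kernel would be roughly along these lines (an untested sketch with made-up names; you would still have to cudaMemcpy the sprite and screen buffers to the card and the result back):
    Code:
    // Untested sketch: the alpha blend as a CUDA kernel, one thread per sprite pixel.
    // screen is 24-bit BGR with screenPitch bytes per row, sprite is BGRA; both are
    // assumed to already live in device memory. A possible launch for a 128x128 sprite:
    //   alphaBlit<<<dim3(128 / 16, 128 / 16), dim3(16, 16)>>>(devScreen, screenPitch,
    //                                                         destX, destY, devSprite, 128, 128);
    __global__ void alphaBlit(unsigned char* screen, int screenPitch, int destX, int destY,
                              const unsigned char* sprite, int w, int h)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= w || y >= h)
            return;

        unsigned char* dst = screen + (destY + y) * screenPitch + (destX + x) * 3;
        const unsigned char* src = sprite + (y * w + x) * 4;
        int alpha = src[3];
        dst[0] = dst[0] + ((alpha * (src[0] - dst[0])) >> 8);
        dst[1] = dst[1] + ((alpha * (src[1] - dst[1])) >> 8);
        dst[2] = dst[2] + ((alpha * (src[2] - dst[2])) >> 8);
    }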

  10. #10
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Note that using CUDA or CTM hinders your program's portability greatly. CUDA requires a GeForce 8 series GPU, and I am pretty sure CTM requires a fairly recent ATI card as well. AFAIK no portable library of that sort exists yet.

  11. #11
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Also, if your screen buffer is stored in main memory, using the GPU just to do the computing means that you will have to copy a whole screen's worth of bytes between system RAM and VRAM every frame. In that case, bandwidth may become an issue. Using a graphics library will almost always make sure that everything stays in VRAM.
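    To put rough numbers on it: a 1024x768 screen at 32 bits per pixel is about 3 MB, so at 60 frames per second that is roughly 180 MB/s in each direction over the bus before the GPU has done any useful work.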

  12. #12
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    If you're not given access to hardware blitters, what makes you think you're given access to general purpose computation on the GPU?
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  13. #13
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    The GPU is insanely fast at number crunching. So if you have heavy calculations, the GPU is usually the better place to do them. The main problem, I guess, is the communication between the two, since the data needs to be transferred back if it's not going to be rendered.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  14. #14
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    Quote Originally Posted by cyberfish View Post
    Have you tried just taking out the alpha part of your current blitting function? If that improves the performance, then I guess we can be pretty sure that is the bottleneck. In that case, I guess it would make sense to use CUDA (or CTM, the ATI version, as I just found out), since that is what GPUs do best (massively parallel arithmetic). In that case, though, why not just use DX or OGL instead? CUDA is designed for things like biological computing that have nothing to do with graphics but still need parallel arithmetic; graphical programs can more easily (and portably) use the actual graphics APIs (DX or OGL).
    I wish I could use DX or OpenGL, but I have to use this API for a class next year, so I might as well learn it now. Taking out the alpha channel calculation increases the speed several times, so it is a bottleneck.

    Note that using CUDA or CTM hinders your program's portability greatly. CUDA requires a GeForce 8 series GPU, and I am pretty sure CTM requires a fairly recent ATI card as well. AFAIK no portable library of that sort exists yet.
    As long as I get it to run on my computer I am happy. It's my first game, so it is probably going to suck too much for other people to play it.

    If you're not given access to hardware blitters, what makes you think you're given access to general purpose computation on the GPU?
    I thought that hardware blitters were accessed through the graphics API (which I can't use), while the general GPU could be accessed by anything?

    The GPU is insanely fast at number crunching. So if you have heavy calculations, the GPU is usually the better place to do them. The main problem, I guess, is the communication between the two, since the data needs to be transferred back if it's not going to be rendered.
    That's what I have read as well, and it is why I want to see if I can use the GPU to do some of my calculations to speed up the blitting.


    Thanks for all of your replies.

  15. #15
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    while the general GPU could be accessed by anything?
    The general GPU is accessed through a specialized API not unlike the graphics API. Basically, using the GPU this way would probably amount to sneaking past the course requirements, not really fulfilling them.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law
