Thread: Using the GPU to do some processing

  1. #1
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485

    Using the GPU to do some processing

    Hello,

    I am wondering if I can use the GPU to do some of the processing for the small game I am making. The game is made without using any API, so I don't have access to the fast features the graphics card offers (through OpenGL/DirectX), but I was wondering if I could write some of my code so that the GPU processes it instead of the CPU. Would there be any gains from doing so? And can the GPU be programmed in C++?

    Regards,

  2. #2
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    take a look at CUDA (for NVIDIA GPUs, I don't know if an equivalent for ATI exists).

  3. #3
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Unless you want to do some crazy parallel floating point arithmetic... it probably isn't worth the hassle to use the GPU, though.

  4. #4
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    That's one of the things I am trying to figure out.

    In some cases I have to draw the sprite pixel by pixel using this:
    Code:
    // Per-pixel alpha blend into a 24-bit BGR framebuffer (index 0 = blue):
    // dst = dst + alpha * (src - dst) / 256 for each colour channel.
    screenDataPnt[0] = screenDataPnt[0] + ((alpha * (blue  - screenDataPnt[0])) >> 8);
    screenDataPnt[1] = screenDataPnt[1] + ((alpha * (green - screenDataPnt[1])) >> 8);
    screenDataPnt[2] = screenDataPnt[2] + ((alpha * (red   - screenDataPnt[2])) >> 8);
    It works fine, but when I have a lot of stuff on the screen things start to slow down. Say I have 40 sprites that are 128 x 128; that is around 650,000 pixels, each needing three of those calculations, every frame. So I was hoping that I could move that code over to the graphics card.
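    For context, the loop around those three lines looks roughly like this (just a simplified sketch; the real names and the pitch handling in my code are a bit different):
    Code:
    // Simplified sketch of the whole alpha blit (names are made up).
    // screenData: 24-bit BGR framebuffer, screenPitch bytes per row.
    // sprite: assumed to be stored as BGRA, so each pixel carries its own alpha.
    void alphaBlit(unsigned char* screenData, int screenPitch,
                   const unsigned char* sprite, int w, int h, int destX, int destY)
    {
        for (int y = 0; y < h; ++y)
        {
            unsigned char* screenDataPnt = screenData + (destY + y) * screenPitch + destX * 3;
            const unsigned char* spritePnt = sprite + y * w * 4;
            for (int x = 0; x < w; ++x)
            {
                int blue = spritePnt[0], green = spritePnt[1], red = spritePnt[2], alpha = spritePnt[3];
                screenDataPnt[0] = screenDataPnt[0] + ((alpha * (blue  - screenDataPnt[0])) >> 8);
                screenDataPnt[1] = screenDataPnt[1] + ((alpha * (green - screenDataPnt[1])) >> 8);
                screenDataPnt[2] = screenDataPnt[2] + ((alpha * (red   - screenDataPnt[2])) >> 8);
                screenDataPnt += 3;
                spritePnt += 4;
            }
        }
    }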

    Not sure if it even makes sense; I'm completely blank on this subject.

  5. #5
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    What API are you using to actually display the pixels? You have to be using some API to make the pixels appear on the screen.

    Make sure it is really the calculations that are slowing down the program, though. Most of the time it's actually the library calls that are the bottleneck, in which case you would gain nothing by optimizing this part.

  6. #6
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    I use something that was given to me by the university. It is based on DirectX (all the nice things have been taken out, so I am left with a pointer to the screen memory and functions for loading files).

    The drawing function I posted (alpha blitting) is something like 10 times slower than the other blitting function I have.

  7. #7
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    The drawing function I posted (alpha blitting) is something like 10 times slower than the other blitting function I have.
    I am assuming that is the result of profiling and not a guesstimate. Why not just use "the other blitting function" instead?

  8. #8
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    The other (faster) blitting function does not support alpha. It uses memcpy() to copy one line at a time, instead of pixel by pixel.
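    Roughly, the fast path is just one memcpy per sprite row, something like this (simplified sketch with made-up names):
    Code:
    #include <cstring>  // for std::memcpy

    // Sketch of the non-alpha blit: copy whole rows with memcpy.
    // Assumes the sprite rows use the same 24-bit BGR layout as the screen.
    void fastBlit(unsigned char* screenData, int screenPitch,
                  const unsigned char* spriteData, int spriteWidth, int spriteHeight,
                  int destX, int destY)
    {
        for (int y = 0; y < spriteHeight; ++y)
        {
            std::memcpy(screenData + (destY + y) * screenPitch + destX * 3,
                        spriteData + y * spriteWidth * 3,
                        spriteWidth * 3);
        }
    }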

  9. #9
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Have you tried just taking out the alpha part of your current blitting function? If that improves the performance, then I guess we can be pretty sure that is the bottleneck. In that case, I guess it would make sense to use CUDA (or CTM, the ATI version, as I just found out), since that is what GPUs do best (massively parallel arithmetic). In that case, though, why not just use DX or OGL instead? CUDA is designed for things like biological computing that have nothing to do with graphics but still need parallel arithmetic; graphical programs can more easily (and portably) use the actual graphics APIs (DX or OGL).
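    Just to give you an idea of what that would look like, your blend as a CUDA kernel would be roughly along these lines (an untested sketch with made-up names; you would still have to cudaMemcpy the sprite and screen buffers to the card and the result back):
    Code:
    // Untested sketch: the alpha blend as a CUDA kernel, one thread per sprite pixel.
    // screen is 24-bit BGR with screenPitch bytes per row, sprite is BGRA; both are
    // assumed to already live in device memory. A possible launch for a 128x128 sprite:
    //   alphaBlit<<<dim3(128 / 16, 128 / 16), dim3(16, 16)>>>(devScreen, screenPitch,
    //                                                         destX, destY, devSprite, 128, 128);
    __global__ void alphaBlit(unsigned char* screen, int screenPitch, int destX, int destY,
                              const unsigned char* sprite, int w, int h)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= w || y >= h)
            return;

        unsigned char* dst = screen + (destY + y) * screenPitch + (destX + x) * 3;
        const unsigned char* src = sprite + (y * w + x) * 4;
        int alpha = src[3];
        dst[0] = dst[0] + ((alpha * (src[0] - dst[0])) >> 8);
        dst[1] = dst[1] + ((alpha * (src[1] - dst[1])) >> 8);
        dst[2] = dst[2] + ((alpha * (src[2] - dst[2])) >> 8);
    }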

  10. #10
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Note that using CUDA or CTM hinders your program's portability greatly. CUDA requires a GeForce 8 series GPU, and I am pretty sure CTM requires a fairly recent ATI card as well. AFAIK no portable library of that sort exists yet.

  11. #11
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Also, if your screen buffer is stored in main memory, using the GPU just to do the computing means that you will have to copy a whole screen's worth of bytes between system RAM and VRAM every frame. In that case, bandwidth may become an issue. Using a graphics library will almost always make sure that everything stays in VRAM.
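    To put rough numbers on it: a 1024x768 screen at 32 bits per pixel is about 3 MB, so at 60 frames per second that is roughly 180 MB/s in each direction over the bus before the GPU has done any useful work.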

  12. #12
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    If you're not given access to hardware blitters, what makes you think you're given access to general purpose computation on the GPU?
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  13. #13
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    The GPU is insanely fast at number crunching. So if you have heavy calculations, the GPU is usually the better place to do them. The main problem, I guess, is the communication between the two, since the data needs to be transferred back if it's not going to be rendered.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  14. #14
    Registered User
    Join Date
    Oct 2006
    Location
    UK/Norway
    Posts
    485
    Quote Originally Posted by cyberfish View Post
    Have you tried just taking out the alpha part of your current blitting function? If that improves the performance, then I guess we can be pretty sure that is the bottleneck. In that case, I guess it would make sense to use CUDA (or CTM, the ATI version, as I just found out), since that is what GPUs do best (massively parallel arithmetic). In that case, though, why not just use DX or OGL instead? CUDA is designed for things like biological computing that have nothing to do with graphics but still need parallel arithmetic; graphical programs can more easily (and portably) use the actual graphics APIs (DX or OGL).
    I wish I could use DX or OpenGL, but I have to use this API for a class next year, so I might as well learn it now. Taking out the alpha channel calculation increases the speed several times, so it is a bottleneck.

    Note that using CUDA or CTM hinders your program's portability greatly. CUDA requires a GeForce 8 series GPU, and I am pretty sure CTM requires a fairly recent ATI card as well. AFAIK no portable library of that sort exists yet.
    As long as I get it to run on my computer I am happy. It's my first game, so it is probably going to suck too much for other people to play it.

    If you're not given access to hardware blitters, what makes you think you're given access to general purpose computation on the GPU?
    I thought that hardware blitters were accessed through the graphics API (which I can't use), while the general GPU could be accessed by anything?

    The GPU is insanely fast at number crunching. So if you have heavy calculations, the GPU is usually the better place to do them. The main problem, I guess, is the communication between the two, since the data needs to be transferred back if it's not going to be rendered.
    That's what I have read as well, and it is why I want to see if I can use the GPU to do some of my calculations to speed up the blitting.


    Thanks for all of your replies.

  15. #15
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    while the general GPU could be accessed by anything?
    The general GPU is accessed through a specialized API not unlike the graphics API. Basically, using the GPU this way would probably amount to sneaking past the course requirements, not really fulfilling them.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law
