For your example, Reclaimer, you can see that by going from your 615K per frame model to something larger:
1) pixmap requirements are calculated as follows: Pixmap
height times pixmap width times bits per pixel (bpp) or
320 x 240 x 8 = 614400 (or 615K per frame)
2) Addressing needs to be "aligned". Your rowbytes must
by aligned on 4-byte boundaries in order to allow the
processor to work in its native 32-bit favored word-
size. Memory accesses are faster because the processor
isn't having to waste cycle time masking and shifting bits
it isn't interested in.
3) Make sure all pixmaps are of the same depth. Usually
for most FPS (first person shooter's) this is 8 bits, which
allows a palette of 256 colors.
4) assembly is ideal for handling the actual drawing onto
the screen in drastically cut cycle times. Particularly for
operations such as per pixel calculations. For example,
on a 25Mhz processor, you might go through a
multiplication algorithm 7.2 million times in 3 seconds.
Converting this to an assembly macro can eliminate
incredible amounts of "stack time".