I have an app on an ARM-based processor that displays a bitmap graphic, so it copies chunks of data to the framebuffer memory for the LCD.
However, if the image is moved by an odd number of pixels, so that the memory being copied to the framebuffer is no longer aligned to a 4-byte boundary, memcpy performance drops by roughly a factor of four (from 45 to 11 fps).
I have been doing some reading on this, and it seems you can write your own optimised memcpy function to deal with it (http://www.embedded.com/showArticle....cleID=19205567).
Several variations of memcpy can be written, each tuned for a particular range of chunk sizes (e.g. 8 bytes or less, 32 bytes or less, 128 bytes or less, more than 128 bytes).
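
To make the idea concrete, here is a minimal sketch of the kind of routine I mean. It is my own illustration, not code from the article. The name copy_to_aligned_dst is hypothetical, and the sketch assumes a little-endian CPU and buffers that live in word-aligned storage, so that reading the whole aligned word around the first source byte is safe. It byte-copies until the destination is 4-byte aligned, then reads only aligned source words and merges neighbouring words with shifts, so every load and store in the main loop is aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a library function.
 * Assumptions: little-endian CPU; src and dst do not overlap; both
 * buffers sit inside word-aligned storage (as a framebuffer usually
 * does), so the aligned word containing the first source byte may be
 * read. The word accesses type-pun the buffers, so with GCC consider
 * building this file with -fno-strict-aliasing. */
void *copy_to_aligned_dst(void *dst, const void *src, size_t n)
{
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;

    /* Head: copy single bytes until the destination is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    unsigned off = (unsigned)((uintptr_t)s & 3u);

    if (off == 0) {
        /* Source and destination are both aligned: plain word copy. */
        uint32_t *dw = (uint32_t *)d;
        const uint32_t *sw = (const uint32_t *)s;
        while (n >= 4) {
            *dw++ = *sw++;
            n -= 4;
        }
        d = (uint8_t *)dw;
        s = (const uint8_t *)sw;
    } else if (n >= 8) {
        /* Source is misaligned by off bytes (1..3). Read aligned words
         * and build each output word from two neighbouring inputs. */
        uint32_t *dw = (uint32_t *)d;
        const uint32_t *sw = (const uint32_t *)(s - off);
        unsigned rs = off * 8u;   /* bits to drop from the low word  */
        unsigned ls = 32u - rs;   /* bits to take from the high word */
        uint32_t lo = *sw++;

        /* n >= 8 keeps the read of the next aligned word inside the
         * requested source range. */
        while (n >= 8) {
            uint32_t hi = *sw++;
            *dw++ = (lo >> rs) | (hi << ls);
            lo = hi;
            s += 4;
            n -= 4;
        }
        d = (uint8_t *)dw;
    }

    /* Tail: whatever is left is copied byte by byte. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Called as copy_to_aligned_dst(fb_row, image_row + x_offset, row_bytes), where all three names are placeholders, it should behave like memcpy for non-overlapping buffers. Whether it actually beats the library memcpy on a given ARM core is something that would have to be measured.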
But surely someone has already written functions like these to handle copies that are not aligned to a 4-byte boundary? Does anyone know where I can find them?