malloc allocates memory in multiples of 1 byte. If you request 32 bytes, you will get 32 contiguous bytes of memory. You need not worry about where in memory those 32 bytes came from.
Some platforms are more efficient when data sits on certain address boundaries (some require it to function at all). This is called alignment, and your compiler and malloc take care of it invisibly for you: malloc returns memory that is suitably aligned for any standard type. 99% of the time you don't need to worry about alignment; those 32 bytes you allocated are going to be 32 bytes somewhere in memory and ptr holds the address of the first byte. If your code depends on a particular alignment for just about any reason, it becomes less portable because you are catering to one particular platform. Your code might work properly on the platform you compiled it on but have undefined behavior on a different platform.
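To see the alignment of a malloc'd block for yourself, here is a minimal sketch. The helper names is_aligned and malloc_is_max_aligned are made up for this illustration, and the check assumes that converting a pointer to uintptr_t yields its numeric address, which holds on common platforms but is implementation-defined.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`
   bytes, 0 otherwise. Assumes the pointer-to-integer conversion
   yields the numeric address (implementation-defined). */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical helper: allocates 32 bytes and checks the block against
   the strictest fundamental alignment, alignof(max_align_t).
   Returns 1 if aligned, 0 if not, -1 if malloc failed. */
int malloc_is_max_aligned(void)
{
    void *ptr = malloc(32);
    if (ptr == NULL)
        return -1;
    int ok = is_aligned(ptr, alignof(max_align_t));
    free(ptr);
    return ok;
}
```

On a conforming implementation malloc_is_max_aligned() should return 1. Treat this as a diagnostic only; normal code should not need to inspect addresses like this.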
As an example, consider this very simplified model of computer memory.
Code:
0        1        2        3       |4        5        6        7       |8...
01101001 10010110 11101001 00000001|00010010 10000101 00010110 11111110|0...
Eight bytes are shown (8 bits per byte). If our platform accesses memory on 4-byte boundaries, then each memory request grabs either bytes 0 through 3 or bytes 4 through 7. If the data we need lies entirely within one of those 4-byte chunks, we're good. Consider, however, what happens if your program placed a 32-bit int starting at byte 2 (through byte 5). The platform would have to fetch bytes 0 through 3 in one request, bytes 4 through 7 in a second request, and then spend more time putting the two relevant halves together to use as a single 32-bit value.
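If you ever do need to read a 32-bit value from an arbitrary byte offset like this, one portable approach is to copy the bytes out with memcpy instead of casting the address to an int pointer. This is only a sketch; load_u32 is a made-up name, and the value you get back depends on the byte order of your platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 32-bit value starting at any byte
   offset in a buffer. memcpy has no alignment requirement on its
   source, so bytes + offset does not have to be a multiple of 4.
   The caller must make sure offset + 4 bytes lie inside the buffer. */
uint32_t load_u32(const unsigned char *bytes, size_t offset)
{
    uint32_t value;
    assert(bytes != NULL);
    memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```

With the eight bytes from the diagram in an unsigned char array, load_u32(mem, 2) reads bytes 2 through 5. Casting mem + 2 to uint32_t * and dereferencing it would be the non-portable version that can misbehave on platforms with strict alignment rules.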