Thread: Byte Ordering Question

  1. #1
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196

    Byte Ordering Question

    Hey,

    Was hoping some of you guys who know more about bit twiddling could shed some light. I just ran into this problem with a piece of code I'm working on, and I've seen it somewhere else too.

    Basically it has to do with the byte ordering in a binary buffer vs the typing of a variable used to hold it.

    To give you an example, if I have a buffer (say of indefinite length), and a pointer ptr pointing to a byte in the buffer (say, C0), such that if I open the buffer in a binary viewer it reads like this:
    Code:
    C0 DD FE 1F
    Such that this is true:
    Code:
    /*ptr is uint8_t*/
    *ptr == 0xC0
    Then I do this:
    Code:
    uint16_t var;
    var = *(ptr+1);
    I would expect the result to be:
    Code:
    DD FE /*56830*/
    Though if I print that out with:
    Code:
    printf("%u\n", var);
    It'll print:
    Code:
    65245 /*(FE DD)*/
    Now obviously it's byte swapped, but what is causing that? I'm assuming if I just stream that out to a file byte by byte it'll be fine, so it's something with the 16 bit data type (I've also seen this issue with a 32 bit data type, where all 4 bytes are in reverse order).

    Is there any way to 'fix' it except bit shifts & masks?

    Thanks!!

  2. #2
    Registered User
    Join Date
    Mar 2012
    Location
    the c - side
    Posts
    373
    CPUs store multi-byte values in one of two main byte orders - big-endian and little-endian.

    What you describe is typical of x86/x64, and thus of most Windows and Linux machines, which use little-endian format.

    Either store your data in the buffer in little-endian format instead of the big-endian layout you have now, or convert the data in the buffer to little-endian by writing your own function.

    The other alternative is to use network conversion functions like ntohs() and ntohl().

  3. #3
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,662
    Endianness - Wikipedia, the free encyclopedia

    > Is there any way to 'fix' it except bit shifts & masks?
    Use a big endian architecture. Otherwise, byte swapping is your only choice.

    > var = *(ptr+1);
    If ptr is a pointer to a uint8_t, I don't see how you managed to read a 16 bit number by dereferencing the pointer.

    Further, even if ptr is a uint16_t*, you can't just point at arbitrary buffer locations and dereference the pointer.
    Data structure alignment - Wikipedia, the free encyclopedia
    It might work for you today, but on another machine it would just get you a bus error exception.

    The only safe thing to do (ptr is uint8_t*):
    memcpy( &var, ptr+1, sizeof(var) );
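    A sketch of one more alternative (my addition, not part of the original reply): if the stream's byte order is fixed by the spec, you can assemble the value from the individual bytes, which sidesteps both the alignment and the endianness questions at once:

```c
#include <stdint.h>

/* Build a 16-bit value from two big-endian stream bytes.
 * Works identically on any host, aligned or not; no swap ever needed. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```

    With the buffer from post #1, get_be16(ptr + 1) yields 0xDDFE (56830) on any architecture.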
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196
    Quote Originally Posted by Salem View Post
    Endianness - Wikipedia, the free encyclopedia

    > Is there any way to 'fix' it except bit shifts & masks?
    Use a big endian architecture. Otherwise, byte swapping is your only choice.

    > var = *(ptr+1);
    If ptr is a pointer to a uint8_t, I don't see how you managed to read a 16 bit number by dereferencing the pointer.

    Further, even if ptr is a uint16_t*, you can't just point at arbitrary buffer locations and dereference the pointer.
    Data structure alignment - Wikipedia, the free encyclopedia
    It might work for you today, but on another machine it would just get you a bus error exception.

    The only safe thing to do (ptr is uint8_t*):
    memcpy( &var, ptr+1, sizeof(var) );
    Maybe I should elaborate: this is pointing to a value in a bitstream buffer, so I need a method of capturing bits. If I need, say, 13 bits, I'm using this method to grab the next 16, then masking off what I don't need.

    As I said, it was a buffer of indefinite length - that was only a sample I came up with. I probably should have put an ellipsis after it.

    Yeah, I'm not a big fan of that conversion, either. Why is memcpy safe? If that byte is the last byte of a buffer, and var is of size 2 (16 bits, in this particular case), then what's the difference between using an equals sign to copy the 16 bits and using memcpy? I don't see how one could raise a memory exception but the other can't. memcpy isn't endian safe, is it?

    Unless I use endian-safe socket calls, it sounds like byte swapping is the only reasonable method (unless memcpy does some byte magic I don't know about).

  5. #5
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196
    Quote Originally Posted by gemera View Post
    CPUs store multi-byte values in one of two main byte orders - big-endian and little-endian.

    What you describe is typical of x86/x64, and thus of most Windows and Linux machines, which use little-endian format.

    Either store your data in the buffer in little-endian format instead of the big-endian layout you have now, or convert the data in the buffer to little-endian by writing your own function.

    The other alternative is to use network conversion functions like ntohs() and ntohl().
    Can't change the buffer format, the world has standards it runs on >.<. As I said, this isn't the first time this has happened - why does most of the 'sane' world (I use the term loosely, haha) use big-endian buffers even though little-endian x86/x64 archs are by far the most common?

    I'd like to avoid the network conversions if possible :-\.

  6. #6
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,662
    memcpy copies memory one byte(*) at a time, so it is immune to alignment issues.

    If you have a uint16_t pointer (say), and it is pointing at an odd address, then there are NO guarantees as to what happens when you try to dereference that pointer.
    I've seen all these happen.
    - it just works
    - it works with an expensive trap into the OS to "do the right thing"
    - the lsb is silently truncated to zero and it reads the wrong data for you.
    - the OS traps the alignment exception and kills your program dead.


    (*) many implementations detect compatible alignments and copy words or longs whenever possible.

    Writing memcpy() is trivial, as is writing your own versions of ntohs() for example.

    > If that byte is the last byte of a buffer, and var is of size 2 (16 bits, in this particular case), then what's the difference of using an equal sign to copy the 16 bits or the memcpy?
    If it is the last byte, then both are wrong anyway.
    Both would be accessing a memory location which is undefined.

    You're not going to solve this with simple 1-line casting magic. You need to step back and design a proper stream access function which takes your stream and returns the next 'n' bits as you require.
    Code:
    struct streambuf {
      uint8_t *buff;       // the buffer
      size_t buffsize;     // number of bytes in buff
      size_t byteoffset;   // initially 0
      uint8_t bitoffset;   // initially 0
    };
    int readbits ( struct streambuf *sb, int nbits, uint32_t *result );
    So if you're at the last byte, and you try and read say 13 bits, it returns some kind of error diagnostic.
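    One possible body for that interface (a sketch only, reading bits MSB-first within each byte - real stream specs may number bits differently, and a production version would read more than one bit per loop):

```c
#include <stddef.h>
#include <stdint.h>

struct streambuf {
  uint8_t *buff;       // the buffer
  size_t buffsize;     // number of bytes in buff
  size_t byteoffset;   // initially 0
  uint8_t bitoffset;   // initially 0 (bits already consumed in current byte)
};

/* Returns 0 on success, -1 if fewer than nbits remain in the stream. */
int readbits(struct streambuf *sb, int nbits, uint32_t *result)
{
    size_t avail = (sb->buffsize - sb->byteoffset) * 8 - sb->bitoffset;
    uint32_t out = 0;

    if (nbits < 0 || nbits > 32 || (size_t)nbits > avail)
        return -1;  /* e.g. asking for 13 bits at the last byte */

    while (nbits-- > 0) {
        /* take the next unread bit, MSB-first */
        out = (out << 1) | ((sb->buff[sb->byteoffset] >> (7 - sb->bitoffset)) & 1u);
        if (++sb->bitoffset == 8) {
            sb->bitoffset = 0;
            sb->byteoffset++;
        }
    }
    *result = out;
    return 0;
}
```

    Because the bounds check happens before any memory access, the "last byte of the buffer" case from post #4 becomes an error return instead of undefined behaviour.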

  7. #7
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196
    Quote Originally Posted by Salem View Post
    memcpy copies memory one byte(*) at a time, so it is immune to alignment issues.

    If you have a uint16_t pointer (say), and it is pointing at an odd address, then there are NO guarantees as to what happens when you try to dereference that pointer.
    I've seen all these happen.
    - it just works
    - it works with an expensive trap into the OS to "do the right thing"
    - the lsb is silently truncated to zero and it reads the wrong data for you.
    - the OS traps the alignment exception and kills your program dead.


    (*) many implementations detect compatible alignments and copy words or longs whenever possible.

    Writing memcpy() is trivial, as is writing your own versions of ntohs() for example.

    > If that byte is the last byte of a buffer, and var is of size 2 (16 bits, in this particular case), then what's the difference of using an equal sign to copy the 16 bits or the memcpy?
    If it is the last byte, then both are wrong anyway.
    Both would be accessing a memory location which is undefined.

    You're not going to solve this with simple 1-line casting magic. You need to step back and design a proper stream access function which takes your stream and returns the next 'n' bits as you require.
    Code:
    struct streambuf {
      uint8_t *buff;       // the buffer
      size_t buffsize;     // number of bytes in buff
      size_t byteoffset;   // initially 0
      uint8_t bitoffset;   // initially 0
    };
    int readbits ( struct streambuf *sb, int nbits, uint32_t *result );
    So if you're at the last byte, and you try and read say 13 bits, it returns some kind of error diagnostic.
    I didn't mean that an equals sign is 'right' - I meant: will memcpy (or similar) do anything that an equals sign won't when dealing with 16 bits? The buffer fullness is verified beforehand, so I know it's not biting off more than it can deal with. I'm not seeing how a memcpy is any different from an equals sign copying 16 bits. Besides, memcpy has to be defined anyway, and in the implementation it will likely copy 8 bits at a time using a uint8_t pointer or similar, so I feel like I'm abstracting for the sake of abstracting.

    I get what you mean - it's not proper in terms of alignment, and I'd like to avoid unsafe calls since, as you point out, many things can happen. Though at the end of the day, there's really no 'failsafe' way to do it. I mean, even if I write this 'readbits' function you speak of, I still have to deal with it being unsafe in there... In the end that function doesn't do anything toward solving the problem, it just changes the manner in which the buffer is accessed.

    It sounds like a memcpy followed by a byte-swap is the best way to do it. That may work for x86 but if the endianness is swapped it won't - it doesn't sound like there's anything I can do about that.

  8. #8
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Syndacate View Post
    As I said, this isn't the first time this happened, why the hell does most of the 'sane' world (I use the term loosely, haha) all use big endian buffers though x86/x64 little endian archs are by far the most common?
    Endianness is not some law of nature. It is an implementation choice. There are actually distinct advantages and disadvantages to both big-endian and little-endian architectures, and the vendors who produced different endian machines were targeting different markets .... and in competition, so not exactly consulting each other on implementation decisions.

    Quote Originally Posted by Syndacate View Post
    I'd like to avoid the network conversions if possible :-\.
    Then don't run on a network, or design your software so it only runs correctly on a big-endian or a little-endian machine. If someone with a different machine objects, tell them they're insane, to junk their hardware, buy your preferred hardware. That will guarantee your future success .... or something.

    In the real world, you will aim to write your code so it works equally well regardless of endianness.

    BTW: big and little endianness are not the only possibilities. They are the most common ones. Look up middle endian. Some machines are also bi-endian (meaning they can be configured one way or the other, not that they magically ensure a program that assumes a particular endianness will work).
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  9. #9
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196
    Quote Originally Posted by grumpy View Post
    Endianness is not some law of nature. It is an implementation choice. There are actually distinct advantages and disadvantages to both big-endian and little-endian architectures, and the vendors who produced different endian machines were targeting different markets .... and in competition, so not exactly consulting each other on implementation decisions.
    Standards, not architecture. Stream specs, to be specific. Don't worry, though, that was more of a rant than anything else.

    Quote Originally Posted by grumpy View Post
    Then don't run on a network, or design your software so it only runs correctly on a big-endian or a little-endian machine. If someone with a different machine objects, tell them they're insane, to junk their hardware, buy your preferred hardware. That will guarantee your future success .... or something.

    In the real world, you will aim to write your code so it works equally well regardless of endianness.

    BTW: big and little endianness are not the only possibilities. They are the most common ones. Look up middle endian. Some machines are also bi-endian (meaning they can be configured one way or the other, not that they magically ensure a program that assumes a particular endianness will work).
    In the 'real world' I write code that couldn't get much more architecture specific, but this isn't that, and I want this to be a bit more portable.

    I'll have to look into those other byte orderings - never heard of them.

  10. #10
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by Syndacate View Post
    I didn't mean it as an equal sign is 'right' - I meant that will memcpy (or similar) do anything that an equal sign won't do when dealing with 16 bits?
    The memcpy works as multiple single-byte assignment statements on contiguous spots in memory. Single-byte assignments in C often turn into single-byte MOV (or similar) machine instructions that do not suffer alignment issues. A 16-bit assignment in C often turns into a 16-bit MOV (or similar) instruction, which does have the potential for alignment issues if the source or destination addresses are not properly aligned. Smart compilers might be able to mitigate this in a few circumstances, but certainly not every circumstance, and it's not something I would want to rely on, since such behavior is not part of any standard and thus subject to the whim of compiler writers.
    Quote Originally Posted by Syndacate View Post
    I get what you mean, it's not proper in terms of alignment, and I'd like to avoids unsafe calls, as you point out, many things can happen, though at the end of the day, there's really no 'failsafe' way to do it. I mean even if I write this 'readbits' function you speak of, I still have to deal with it being unsafe in there... In the end that function doesn't do anything regarding the solving of the problem, just the matter of which it's accessed.
    If written properly, with bounds checking and memcpy, then the readbits function as Salem intended it (as far as I can tell) should be pretty fail safe. It ensures there was enough data in the stream/buffer to actually read the required number of bits, to avoid an out of bounds error on the memcpy. And using memcpy fixes the alignment issues. If you have your "ntohX" or "htonX" functions (your own, or system-provided), that properly do the byte swapping to the correct order, then you have nice generic code.
    Quote Originally Posted by Syndacate View Post
    It sounds like a memcpy followed by a byte-swap is the best way to do it. That may work for x86 but if the endianness is swapped it won't - it doesn't sound like there's anything I can do about that.
    Yes, memcpy + byte-swap (if required) is the best way to do it -- with the bounds checking in the readbits (or whatever you want to call it) function. All you really need to make this portable is ntohX and htonX functions that do the right swapping based on the host architecture's endianness (which is not hard to determine programmatically). Note that on a big-endian architecture, which already matches the network (or whatever your stream is) byte order, your htonX and ntohX functions will do no actual byte swapping; they simply return the value as-is. The code that calls ntohX or htonX shouldn't know or care what those functions actually do (or don't do) to get the bytes in the right order.
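    For illustration (my sketch, not from the post), the run-time endianness check plus a hand-rolled 16-bit conversion might look like:

```c
#include <stdint.h>
#include <string.h>

/* Does the first byte in memory hold the low-order byte? */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* A hand-rolled ntohs-alike driven by the check above: swap only
 * when the host doesn't already match big-endian stream order. */
static uint16_t be16_to_host(uint16_t v)
{
    return host_is_little_endian() ? (uint16_t)((v << 8) | (v >> 8)) : v;
}
```

    In practice a real implementation would resolve the check once (or at compile time), but the calling code stays identical either way.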

    I feel like you're digging your heels in unnecessarily here -- the solution is fairly simple and very portable. Why do you really want to "avoid the network conversions if possible"? What's the downside to performing them? You've probably spent more time whining/ranting about it in this thread than it would have taken to implement a solution that can handle virtually any endianness.

  11. #11
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196
    Quote Originally Posted by anduril462 View Post
    The memcpy works as multiple single-byte assignment statements on contiguous spots in memory. Single-byte assignments in C often turn into single-byte MOV (or similar) machine instructions that do not suffer alignment issues. A 16-bit assignment in C often turns into a 16-bit MOV (or similar) instruction, which does have the potential for alignment issues if the source or destination addresses are not properly aligned. Smart compilers might be able to mitigate this in a few circumstances, but certainly not every circumstance, and it's not something I would want to rely on, since such behavior is not part of any standard and thus subject to the whim of compiler writers.

    If written properly, with bounds checking and memcpy, then the readbits function as Salem intended it (as far as I can tell) should be pretty fail safe. It ensures there was enough data in the stream/buffer to actually read the required number of bits, to avoid an out of bounds error on the memcpy. And using memcpy fixes the alignment issues. If you have your "ntohX" or "htonX" functions (your own, or system-provided), that properly do the byte swapping to the correct order, then you have nice generic code.

    Yes, memcpy + byte-swap (if required) is the best way to do it -- with the bounds checking in the readbits (or whatever you want to call it) function. All you really need to make this portable is ntohX and htonX functions that do the right swapping based on the host architecture's endianness (which is not hard to determine programmatically). Note that on a big-endian architecture, which already matches the network (or whatever your stream is) byte order, your htonX and ntohX functions will do no actual byte swapping; they simply return the value as-is. The code that calls ntohX or htonX shouldn't know or care what those functions actually do (or don't do) to get the bytes in the right order.

    I feel like you're digging your heels in unnecessarily here -- the solution is fairly simple and very portable. Why do you really want to "avoid the network conversions if possible"? What's the downside to performing them? You've probably spent more time whining/ranting about it in this thread than it would have taken to implement a solution that can handle virtually any endianness.
    The reason I'd like to avoid network conversions is simply that I'd rather not have dependencies on socket libraries. Though as you pointed out, I can just write my own. I figured, more so, that there's a way to do this "right" in just C/cstdlib/etc. that would allow the code to be portable and avoid potential issues with alignment and so on, though it seems the only way to get around the byte ordering without socket dependencies is to bake the functionality in - which is what I did. Being a higher level language, I would have thought that when doing it correctly the language could intrinsically get around the issue, though it appears I'm mistaken. That's pretty much what it comes down to, I guess.

    In the end I used memcpy + swaps if needed. I didn't know that a greater than 1 byte MOV operation had alignment issue potential. How do you know if it's safe to use one or not?

  12. #12
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Syndacate View Post
    Standards, not architecture. Stream specs, to be specific. Don't worry, though, that was more of a rant than anything else.
    You never heard of standards-based architecture?

    Anyway, it's all well and good to encourage adoption of standards. But that doesn't work well when implementation choices were made - because they had to be - before a standard had been created, and different vendors made different choices. Implementation decisions concerning endianness predated the relevant standards. In fact, the standards were written with various freedoms (undefined behaviour, unspecified behaviour, the list goes on) because they were created AFTER implementation of the first technologies.


    Quote Originally Posted by Syndacate View Post
    In the 'real world' I write code that couldn't get much more architecture specific, but this isn't that, and I want this to be a bit more portable.
    Then write your code so it doesn't rely on byte order. One standard way is to use the functions that convert to/from network format (htons(), etc) when passing binary data around. Writing code so it doesn't rely on byte order also limits some of your options related to bit fiddling. Do I/O using formatted text (which doesn't rely on byte order in a protocol) rather than binary formats.

  13. #13
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by Syndacate View Post
    The reason I'd like to avoid network conversions is simply that I'd rather not have dependencies on socket libraries. Though as you pointed out, I can just write my own.
    Yes, but socket libraries are basically built-in to most common OSes, and even many not-so-common OSes. All you would need is some #ifdef statements:
    Code:
    #if defined(_WIN32)
    #include <winsock2.h>
    #elif defined(__linux__)  // maybe include other *nix OSes here
    #include <arpa/inet.h>
    ...  // for every platform/OS you wish to support
    #else
    #include "your_hand_rolled_byte_swapping.h"  // use #error "Unsupported platform" or something if you don't want to support platforms that don't have ntohX/htonX built-in
    #endif
    This is better because you can use the built-in versions that are typically far more reliable, better tested and often better optimized, than whatever you write yourself. You still have to write it yourself if you want to support a platform that doesn't have it built in, but those aren't too common.
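    As for what that fallback header might contain (a sketch; `your_hand_rolled_byte_swapping.h` is just the hypothetical name from the #ifdef example above), there's a version that needs no endianness test at all, because it reads the value's bytes positionally:

```c
#include <stdint.h>

/* ntohl-alike that needs no endianness check: the bytes of 'net' are
 * reinterpreted positionally as big-endian, whatever the host order. */
static uint32_t my_ntohl(uint32_t net)
{
    const uint8_t *b = (const uint8_t *)&net;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

    On a big-endian host the shifts reassemble the same value; on a little-endian host they perform the swap - either way the caller gets host order out of network order.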
    Quote Originally Posted by Syndacate View Post
    I figured, more so, that there's a way to do this "right" in just C/cstdlib/etc. that would allow the code to be portable and avoid potential issues with alignment and so on, though it seems the only way to get around the byte ordering without socket dependencies is to bake the functionality in - which is what I did. Being a higher level language, I would have thought that when doing it correctly the language could intrinsically get around the issue, though it appears I'm mistaken. That's pretty much what it comes down to, I guess.
    I think you misunderstand "higher level language". That has nothing to do with introspection into the underlying architecture and alignment requirements; it refers to the fact that the language is closer to human thought/logic (loops to repeat, conditional branches, etc.) than machine logic (individual assembly instructions). It can't intrinsically get around the alignment issues because the language was designed to be architecture independent, and not all architectures handle alignment the same way. An architecture isn't even required to have alignment issues; some arches allow "unaligned" data access at no penalty. Thus it would be wrong for the C language to require any alignment-specific behavior; it merely "allows for the possibility". Due to the nature of the language -- it allows things like pointer arithmetic and loose-ish typing and type casting -- it's virtually impossible for the compiler to determine object locations, and thus alignment, at compile time (which it would need in order to generate instructions to work around unaligned data). Furthermore, techniques like address space layout randomization (for security) make this even more difficult for compilers. As Salem mentioned in post #6, the closest thing to a generic workaround is probably some sort of OS/kernel trap, not any C feature.
    Quote Originally Posted by Syndacate View Post
    In the end I used memcpy + swaps if needed. I didn't know that a greater than 1 byte MOV operation had alignment issue potential. How do you know if it's safe to use one or not?
    You know by studying the C standard and your architecture, OS and implementation (compiler+libraries) documentation.

    I hope that clears it up, I was having a hard time with the wording of that second paragraph.

  14. #14
    Registered User
    Join Date
    Aug 2010
    Location
    Rochester, NY
    Posts
    196
    Quote Originally Posted by grumpy View Post
    You never heard of standards-based architecture?

    Anyway, it's all well and good to encourage adoption of standards. But that doesn't work well when implementation choices were made - because they had to be - before a standard had been created, and different vendors made different choices. Implementation decisions concerning endianness predated the relevant standards. In fact, the standards were written with various freedoms (undefined behaviour, unspecified behaviour, the list goes on) because they were created AFTER implementation of the first technologies.
    I see. Yeah, they have to be made eventually I guess, and whatever they choose won't please everybody!

    Quote Originally Posted by grumpy View Post
    Then write your code so it doesn't rely on byte order. One standard way is to use the functions that convert to/from network format (htons(), etc) when passing binary data around. Writing code so it doesn't rely on byte order also limits some of your options related to bit fiddling. Do I/O using formatted text (which doesn't rely on byte order in a protocol) rather than binary formats.
    It sounds like writing code that is truly byte-order independent is rather difficult, especially once many bit operations get involved.

    Quote Originally Posted by anduril462 View Post
    Yes, but socket libraries are basically built-in to most common OSes, and even many not-so-common OSes. All you would need is some #ifdef statements:
    Code:
    #if defined(_WIN32)
    #include <winsock2.h>
    #elif defined(__linux__)  // maybe include other *nix OSes here
    #include <arpa/inet.h>
    ...  // for every platform/OS you wish to support
    #else
    #include "your_hand_rolled_byte_swapping.h"  // use #error "Unsupported platform" or something if you don't want to support platforms that don't have ntohX/htonX built-in
    #endif
    This is better because you can use the built-in versions that are typically far more reliable, better tested and often better optimized, than whatever you write yourself. You still have to write it yourself if you want to support a platform that doesn't have it built in, but those aren't too common.
    Yeah, that's a good point, I can always see what's around before using my own...and yeah, they'll definitely be better supported. I'll give that a shot, thanks.

    Quote Originally Posted by anduril462 View Post
    I think you misunderstand "higher level language". That has nothing to do with introspection into the underlying architecture and alignment requirements; it refers to the fact that the language is closer to human thought/logic (loops to repeat, conditional branches, etc.) than machine logic (individual assembly instructions). It can't intrinsically get around the alignment issues because the language was designed to be architecture independent, and not all architectures handle alignment the same way. An architecture isn't even required to have alignment issues; some arches allow "unaligned" data access at no penalty. Thus it would be wrong for the C language to require any alignment-specific behavior; it merely "allows for the possibility". Due to the nature of the language -- it allows things like pointer arithmetic and loose-ish typing and type casting -- it's virtually impossible for the compiler to determine object locations, and thus alignment, at compile time (which it would need in order to generate instructions to work around unaligned data). Furthermore, techniques like address space layout randomization (for security) make this even more difficult for compilers. As Salem mentioned in post #6, the closest thing to a generic workaround is probably some sort of OS/kernel trap, not any C feature.
    Ah, I see what you mean. Yeah, it definitely becomes fuzzy when you're comparing stuff that's taken for granted as "just works" versus what's actually required in the C spec and how to write code that's true to the spec.

    Quote Originally Posted by anduril462 View Post
    You know by studying the C standard and your architecture, OS and implementation (compiler+libraries) documentation.

    I hope that clears it up, I was having a hard time with the wording of that second paragraph.
    Yeah, that cleared it up fine - thanks. I'll read up in the spec when I have a chance what assignment operators actually do guarantee, so I can base future decisions on that.

  15. #15
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by Syndacate View Post
    value in a bitstream buffer
    Normally bit streams are stored in big-endian format, so the code to get the next byte of the stream into an unsigned 32-bit int variable (using var for the variable and ptr for the pointer to an unsigned char) is:

    Code:
        var = (var << 8) | (unsigned int)(*ptr);
        ptr++;
    However, there are some bit streams (at least one, dclz) that are in little-endian format; for those the code is:

    Code:
        var = (var >> 8) | (((unsigned int)(*ptr))<<24);
        ptr++;
