hexadecimal to ASCII-decimal little indian-big indian

**TonyMontana21** · 06-26-2013

Greetings. What I am trying to do is take a name.wav file as input and print its header. I am having trouble finding a way to get 4 or 2 bytes from the file in hexadecimal form, swap the byte positions if the value is little indian, and then convert it and store it in a struct. An attempt I made that strangely worked is the following. I don't yet understand how it got the right values, since some elements were in little indian form. The only problem is that audioFormat and numChannels have the wrong values, while others such as blockAlign, which are also 2 bytes and in little indian form, got the right values.

Here is my code:

Code:

//byte = unsigned char
//word = unsigned short int
//dword = unsigned int
typedef struct wavHeader
{   //RIFF CHUNK
    byte chunckID[4];
    dword chunckSize;
    byte format[4];
    //FMT SUB-CHUNK
    byte subchunk1ID[4];
    word subchunk1Size;
    word audioFormat;
    word numChannels;
    dword sampleRate;
    dword byteRate;
    word blockAlign;
    word bitsPerSample;
    //DATA SUB-CHUNK
    byte subchunk2ID[4];
    dword subchunk2Size;
}header;

int checkFileName(char *filename);
void printHeader(header *wav_header);

void list(char **array)
{
    header wavHeader;
    FILE *pFile;
    printf("%s%s\n", "filename: ", array[2]);
    if(checkFileName(array[2]) == 0)
    {
        printf("wrong file name\n");
        exit(1);
    }
    pFile = fopen (array[2] ,"r");
    if( pFile != NULL)
    {
        fread(&wavHeader, sizeof(header), 1, pFile);
        fclose(pFile);
        printHeader(&wavHeader);
    }
    else
    {
        printf("file doesn't exist\n");
        exit(1);
    }
}

**Elkvis** · 06-26-2013

one thing to note is that the word is not indian. it's endian. it refers to where the least significant byte goes, whether at the little end (lowest memory address) or the big end (highest memory address).
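To make the distinction concrete, here is a minimal sketch that inspects how the running machine stores a 32-bit value. The helper name `is_little_endian` is made up for illustration; it is not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this machine stores the least
 * significant byte at the lowest address (little endian), 0 otherwise. */
int is_little_endian(void)
{
    const uint32_t value = 0x01020304U;
    unsigned char bytes[sizeof value];

    /* Copy the object representation so the individual bytes can be examined. */
    memcpy(bytes, &value, sizeof value);

    /* Little endian stores 04 03 02 01; big endian stores 01 02 03 04. */
    return bytes[0] == 0x04;
}
```

On a typical x86 PC this should report little endian, but the point of the check is that the answer depends on the machine, not on the file.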

**dmh2000** · 06-26-2013

what is your platform? if you are on a PC, windows or linux, little endian is the native byte ordering so as long as your structure layout maps exactly to the format of the file header, you don't need to swap bytes.

edit: as for the numchannels and audioformat being wrong, there are a lot of variations of content in .wav files and the values you are seeing probably represent one of the extended formats.

**Nominal Animal** · 06-26-2013

See here for details on the WAV file format.

If the identifier is RIFF, multi-byte values are in little-endian byte order, except for chunk IDs and the format field in the RIFF chunk.

If the identifier is RIFX, multi-byte values are in big-endian byte order.

On any byte order, as long as ints are at least 32-bit:

Code:

#include <stdint.h>

/* For RIFF data (data in little-endian byte order) */
uint16_t u16 = (uint16_t)(data[offset + 0] + 256U * data[offset + 1]);
int16_t  s16 =  (int16_t)(data[offset + 0] + 256U * data[offset + 1]);
uint32_t u32 = (uint32_t)(data[offset + 0] + 256U * data[offset + 1] + 65536U * data[offset + 2] + 16777216U * data[offset + 3]);
int32_t  i32 =  (int32_t)(data[offset + 0] + 256U * data[offset + 1] + 65536U * data[offset + 2] + 16777216U * data[offset + 3]);

/* For RIFX data (data in big-endian byte order) */
uint16_t u16 = (uint16_t)(data[offset + 1] + 256U * data[offset + 0]);
int16_t  s16 =  (int16_t)(data[offset + 1] + 256U * data[offset + 0]);
uint32_t u32 = (uint32_t)(data[offset + 3] + 256U * data[offset + 2] + 65536U * data[offset + 1] + 16777216U * data[offset + 0]);
int32_t  i32 =  (int32_t)(data[offset + 3] + 256U * data[offset + 2] + 65536U * data[offset + 1] + 16777216U * data[offset + 0]);

The above is not the fastest way to do it, but it is fast enough. Let your compiler worry about optimizing it. The above works correctly no matter what the byte order is on the architecture that runs the code.

You can rewrite the above into accessor functions (used e.g. channels = get_u16(wavdata + 22);) with very little effort. I think that makes WAV parsing/generating code much more readable too.
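As a rough sketch of what such accessors might look like for RIFF (little-endian) data: the name `get_u16` follows the usage example above, while `get_u32` and the parameter name `p` are assumptions made here for illustration.

```c
#include <stdint.h>

/* Hypothetical accessor: assemble a little-endian 16-bit value from two
 * raw bytes. p may point anywhere in the buffer; no alignment is needed. */
uint16_t get_u16(const unsigned char *p)
{
    return (uint16_t)((unsigned int)p[0] | ((unsigned int)p[1] << 8));
}

/* Hypothetical accessor: assemble a little-endian 32-bit value from four
 * raw bytes. Each byte is widened before shifting to avoid overflow. */
uint32_t get_u32(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Assuming `wavdata` holds the start of a file with the canonical 44-byte PCM header, `get_u16(wavdata + 22)` would give the channel count and `get_u32(wavdata + 24)` the sample rate. RIFX versions would simply reverse the byte indices.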

You don't want to just use casting, because on many architectures, larger integer types need specific alignment, or unaligned accesses are extremely slow.

**TonyMontana21** · 06-27-2013

Originally Posted by dmh2000

what is your platform? if you are on a PC, windows or linux, little endian is the native byte ordering so as long as your structure layout maps exactly to the format of the file header, you don't need to swap bytes.

edit: as for the numchannels and audioformat being wrong, there are a lot of variations of content in .wav files and the values you are seeing probably represent one of the extended formats.

Thank you all for your replies. The only thing I changed was to make subchunk1Size a dword instead of a word. I use Ubuntu Linux 12.04, and I worry whether this approach will also work on other PCs.

**dmh2000** · 06-27-2013

for portability, instead of using the homebrew word, dword etc. types, use the standard types in stdint.h, like uint32_t, uint16_t etc., because, for example, in 64-bit programs long and int might be the same size or different sizes.
64-bit computing - Wikipedia, the free encyclopedia
the other issue you would run into is structure padding, which depends on the platform and compiler. luckily, in this case the wav header is laid out so that it probably would not be padded. but you can't assume a struct definition is laid out a particular way without checking, and gnu and visual C both support "#pragma pack", which lets you specify that the structure be packed without padding.
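One way to do the "checking" part without any pragmas is to compare the struct's size and field offsets against the canonical 44-byte PCM header. This is only a sketch: the type name `wav_hdr` and the function `wav_header_layout_ok` are invented for this example, and the layout assumes a plain PCM file whose fmt chunk directly follows the RIFF header and has a 32-bit size field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header struct using the fixed-width types from stdint.h. */
typedef struct {
    uint8_t  chunkID[4];
    uint32_t chunkSize;
    uint8_t  format[4];
    uint8_t  subchunk1ID[4];
    uint32_t subchunk1Size;   /* 32 bits wide in the file format */
    uint16_t audioFormat;
    uint16_t numChannels;
    uint32_t sampleRate;
    uint32_t byteRate;
    uint16_t blockAlign;
    uint16_t bitsPerSample;
    uint8_t  subchunk2ID[4];
    uint32_t subchunk2Size;
} wav_hdr;

/* Returns 1 if the compiler laid the struct out exactly like the
 * canonical 44-byte header (no padding inserted), 0 otherwise. */
int wav_header_layout_ok(void)
{
    return sizeof(wav_hdr) == 44
        && offsetof(wav_hdr, chunkSize)     == 4
        && offsetof(wav_hdr, subchunk1Size) == 16
        && offsetof(wav_hdr, audioFormat)   == 20
        && offsetof(wav_hdr, numChannels)   == 22
        && offsetof(wav_hdr, sampleRate)    == 24
        && offsetof(wav_hdr, byteRate)      == 28
        && offsetof(wav_hdr, blockAlign)    == 32
        && offsetof(wav_hdr, bitsPerSample) == 34
        && offsetof(wav_hdr, subchunk2Size) == 40;
}
```

If the check fails on some platform, reading the header with a single fread() into the struct is not safe there, and reading the fields one by one is the fallback.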

**TonyMontana21** · 06-28-2013

I see, but it's actually an assignment, so I have to use word, byte, dword and the struct. Also, we haven't learned anything like #pragma, so I can't use it :/ (although I tried, but it didn't seem to work). Right now I am trying to find a way to replace fread(&wavHeader, sizeof(header), 1, pFile);. I want to read 4 bytes at a time (or 2 bytes, depending on the header field), put those bytes inside something, maybe a temporary array of unsigned int, call a little endian function to rearrange the bytes if the value is little endian, and then transform it to ASCII-decimal and store it in the right field of the struct. I believe that if I succeed in doing this part I will have enough information to solve the whole exercise, but I am confused and stuck. Thank you for your time.

**Nominal Animal** · 06-28-2013

Well, you could use something like

Code:

unsigned long fget_be32(FILE *const in)
{
    int byte[4];

    byte[0] = getc(in);
    byte[1] = getc(in);
    byte[2] = getc(in);
    byte[3] = getc(in);

    /* Premature end of input, or read error?
     * Only need to check the last getc()'d one!
     * After EOF, you get EOFs, until you seek or rewind
     * or clear the error flag on the stream. */
    if (byte[3] == EOF)
        return 0UL;

    return 16777216UL * (unsigned long)byte[0]
         + 65536UL * (unsigned long)byte[1]
         + 256UL * (unsigned long)byte[2]
         + (unsigned long)byte[3];
}

to read the needed bytes (chars) from the stream, check that you didn't get EOF (it's enough to test the last one), and return the logical value the bytes form.

For WAV files, you'll also need to implement fget_le32() and fget_le16() for RIFF WAV files, and fget_be16 for RIFX WAV files. There are no single-byte fields that I remember, but if you do happen to need it, just use fgetc().
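A possible sketch of the little-endian counterparts, written in the same style as fget_be32() and with the same convention of returning 0 on premature end of input or a read error (so a genuine zero cannot be told apart from a failure without also checking feof() or ferror()):

```c
#include <stdio.h>

/* Read four bytes from the stream and combine them as a little-endian
 * 32-bit value: the first byte read is the least significant. */
unsigned long fget_le32(FILE *const in)
{
    int byte[4];

    byte[0] = getc(in);
    byte[1] = getc(in);
    byte[2] = getc(in);
    byte[3] = getc(in);

    /* Once EOF is hit, later getc() calls also return EOF. */
    if (byte[3] == EOF)
        return 0UL;

    return (unsigned long)byte[0]
         + 256UL * (unsigned long)byte[1]
         + 65536UL * (unsigned long)byte[2]
         + 16777216UL * (unsigned long)byte[3];
}

/* Read two bytes from the stream and combine them as a little-endian
 * 16-bit value. */
unsigned int fget_le16(FILE *const in)
{
    int byte[2];

    byte[0] = getc(in);
    byte[1] = getc(in);

    if (byte[1] == EOF)
        return 0U;

    return (unsigned int)byte[0] + 256U * (unsigned int)byte[1];
}
```

The results can be assigned straight to the struct fields one at a time, in file order, which sidesteps both byte order and struct padding concerns.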

I used unsigned long as the result type, because if your instructor insists on using dword, they are either an idiot who should not be teaching programming, or work for Microsoft and the name of the course is actually "C programming in the Microsoft Windows environment".

Do not expect the contents of such a course to be C; it will be Microsoft C. Much of what you learn is limited to the Windows environment, and you'll likely have to un-learn many things before you can write acceptable C for other environments.

C code, especially C99 and POSIX.1-2001 is common on other architectures -- things like microcontrollers, embedded systems, Linux, and FreeBSD -- but not that common on Windows. Newer POSIX.1-2008 is very widely supported, and makes many tasks even easier (just check out getline for example -- no more line length limits).
Of course, Microsoft does not support C99, nor POSIX.1.

This means that the course you're on limits you to a small fraction of the domain where C skills are useful. Instead of opening up the entire C programmer work market, you're being limited to the small corner that is Microsoft C.

Such courses are a sad waste of resources and effort in my opinion, unless you're heavily supported by Microsoft or a Microsoft-dependent company. If you're paying for your tuition... well, I wouldn't. Not for that course.
