Please please help me with a C++ problem!

**seemorgh** · 08-24-2009

Hi there

I have a problem with my code and am in a big rush to finish it and will be most grateful if someone could point me in the right direction.

The problem is suppose you want to compress an array of integers so that it uses less space which I'm showing in the code sample below.

I have several issues here. First, I don't know if I'm going in the right direction by converting int to char or short that use less space and result in the data being compressed. If I'm wrong here how else can I acheive compression? If I'm right, how can I store the resulting char and/or short and/or integer values togther in memory to use them later?

Second, How can I use bitwise operations instead of converting small int values to char/short to achieve compression?

I would really really appreciate if someone could show me some sample code instead of pointing me to numerous pages of algorithms on the web, as I'm very much short in time. I'll be most grateful.

Many thanks

The code:

Code:

void CompressData()
{
        int iIndex = 0;
        int iTempValue = 0;
        int iDeltaValue = 0;
        int iOriginalValue = 0;
        char charValue = 0;
        short shortValue = 0;
        const iBufferSize = 10;
 
        int dataBuffer[iBufferSize] = {67082, 67033, 67019, 67149, 67044, 67012, 66984, 66866, 66693, 65223};
 
        // Encode the data using delta encoding
        for (iIndex = 0; iIndex < iBufferSize; iIndex++)
        {
                iOriginalValue = dataBuffer[iIndex];
                                
                iDeltaValue = pDataBuffer[iIndex] - iTempValue;
 
                dataBuffer[iIndex] = iDeltaValue ;
 
                iTempValue = iOriginalValue;
 
                if (IsOneByteValue(iDeltaValue))
                {
                        charValue = (char) iDeltaValue;
                }
                else if (IsTwoByteValue(iDeltaValue))
                {
                        shortValue = (short) iDeltaValue;
                }
 
                // WHAT TO DO HERE? HOW CAN I SAVE ALL OF THE RESULTING CHAR OR SHORT OR INTEGER IN MEMORY SO THAT I CAN USE THEM LATER TO DECOMPRESS?
        }
 
}
 
bool IsOneByteValue(int iValue)
{
        if (iValue < 0)
        {
                if (iValue >= -128)
                        return true;
        }
        else
        {
                if (iValue <= 127)
                        return true;
        }
        
        return false;
}
 
 
bool IsTwoByteValue(int iValue)
{
        if (iValue < 0)
        {
                if (iValue >= -32768)
                        return true;
        }
        else
        {
                if (iValue <= 65535)
                        return true;
        }
        
        return false;
}

**dwks** · 08-24-2009

Another idea: you could use "run-length encoding". Wikipedia will be able to tell you about it. That's no good unless you array has runs of elements which are the same value, though.

The way you have it set up, you'd probably have to create some sort of data structure, with a few bits to indicate whether the next field was an absolute value, or a char offset from the previous value, or a short offset. To be honest, it's a *lot* more trouble than it's worth. You're using, what, 10*4 = 40 bytes of memory? Who cares?

If you have a lot of data that needs to be compressed, I'd use an external compression program for that (or you can make your own Huffman encoding or something, but be warned, it's far from easy). If you have a lot of data that won't all fit into memory, store most of it on disk and only load some of it at a time into memory. But compression schemes such as what you're trying to create probably won't be very useful . . . .

By the way, this

Code:

bool IsOneByteValue(int iValue)
{
        if (iValue < 0)
        {
                if (iValue >= -128)
                        return true;
        }
        else
        {
                if (iValue <= 127)
                        return true;
        }
        
        return false;
}

can be written much more concisely as

Code:

bool IsOneByteValue(int iValue)
{
        if (iValue >= -128 && iValue <= 127)
               return true;
        return false;
}

or even

Code:

bool IsOneByteValue(int iValue)
{
        return (iValue >= -128 && iValue <= 127);
}

**seemorgh** · 08-24-2009

Originally Posted by dwks

Another idea: you could use "run-length encoding". Wikipedia will be able to tell you about it. That's no good unless you array has runs of elements which are the same value, though.

The way you have it set up, you'd probably have to create some sort of data structure, with a few bits to indicate whether the next field was an absolute value, or a char offset from the previous value, or a short offset. To be honest, it's a *lot* more trouble than it's worth. You're using, what, 10*4 = 40 bytes of memory? Who cares?

If you have a lot of data that needs to be compressed, I'd use an external compression program for that (or you can make your own Huffman encoding or something, but be warned, it's far from easy). If you have a lot of data that won't all fit into memory, store most of it on disk and only load some of it at a time into memory. But compression schemes such as what you're trying to create probably won't be very useful . . . .

By the way, this

Code:

bool IsOneByteValue(int iValue)
{
        if (iValue < 0)
        {
                if (iValue >= -128)
                        return true;
        }
        else
        {
                if (iValue <= 127)
                        return true;
        }
        
        return false;
}

can be written much more concisely as

Code:

bool IsOneByteValue(int iValue)
{
        if (iValue >= -128 && iValue <= 127)
               return true;
        return false;
}

or even

Code:

bool IsOneByteValue(int iValue)
{
        return (iValue >= -128 && iValue <= 127);
}

Thanks for the comment. I probably did not explain enough. I do need to implement the delta encoding algorithm and no other ones as this is a test for me and a proof of concept. Also amount of original data is not important and one can assume a small amount of data, say, an array of type integet with 30 values in it.

The change you made in the code works ok as long as the byte value is a signed value between -128 and 127. But it won't work if the byte value is unsinged which could be between 0 and 255.

**Zlatko** · 08-24-2009

I do need to implement the delta encoding algorithm

That sounds like an interesting problem. So, your compressed array would start with the first value, and then only deltas to the subsequent values. From looking at your code, I assume that you intend to have a variable number of bytes for each delta so that a small delta is stored in 1 byte, and a larger one stored in 2 or three bytes. Is that correct?

If so, you'd need some indication about how many bytes the next delta consumes, so you'd need to come up with an encoding scheme and make an assumption about the largest delta possible. Say the largest delta was between -16384 and 16383. You could fit that into fifteen bits or 2 bytes with 1 bit left over for encoding the length of the delta.

As you read the most significant byte of the delta you'd check the most significant bit. If it is a 0, the delta fits in the single byte. If it is a 1, the delta takes up 2 bytes.

In this way, a 1 byte delta really has only 7 usable bits, and a 2 byte delta has 15. A function like IsOneByteValue would return true if the argument is >= -64 and <= +63.

With such a narrow range for a 1 byte delta, you might be justified in fixing your deltas to 2 bytes but if you can guarantee that the delta fits in 15 bits, the variable length encoding will give better compression.

I hope that gets you started in thinking about different ways to interpret the problem. Let me know how it goes.

Thread: Please please help me with a C++ problem!

Thread Tools

Search Thread

Display

Please please help me with a C++ problem!

Similar Threads

Need help understanding a problem

Memory problem with Borland C 3.1

Someone having same problem with Code Block?

A question related to strcmp

WS_POPUP, continuation of old problem