Hello,
I have a buffer and I want to compress the contents of the buffer. And also be able to decompress.
Can anyone point me in the right direction where I can find out more information on how to do this?
Many thanks,
Steve
Hello,
I have a buffer and I want to compress the contents of the buffer. And also be able to decompress.
Can anyone point me in the right direction where I can find out more information on how to do this?
Many thanks,
Steve
There are many different sorts of compressions:
- Huffman coding (use shorter bitpatterns for the most common datasets, longer for the rarer ones). You can either use a dynamic approach, where the bitpatterns and their translation is generated based on the current data, and a translation table stored in the compressed data. The other altiernative, which relies on statistical knowledge of the data, uses a fixed set of translation table, where the (statistically) most common occurances use the shortest bitpatterns.
- RLE - Run length encoding (AAAAAB turns into 5(A)B)
- Byte Pair Compression - Take a block of data, find an unused byte-value, encode the most common two bytes of data in the block as that unused byte-value. Repeat until no further compression is possible for this block.
Of course, if you want to make it simple, just write the data out to a file or pipe, and run that through for example gzip...
--
Mats
Compilers can produce warnings - make the compiler programmers happy: Use them!
Please don't PM me for help - and no, I don't do help over instant messengers.
Here's one compression algorithm. That's also a very good search term to plug into clusty or google.
cprogramming.com wrote an article about this subject: http://cprogramming.com/tutorial/com...y/huffman.html
Yes indeed it is a good search term. Huffman trees are widely used in many compression programs mainly because of the decompression speed. Arithmetic encoding is another obscure way to compress, but not recommended as a starting attempt.
Taking a look at wikipedia one can find the most common algorithms nicely explained. I impleented huffman using only wiki reference.
A typical example of ...cheap programming practices.Code:... goto johny_walker_red_label; johny_walker_blue_label: exit(-149$); johny_walker_red_label : exit( -22$);
zlib.
Rolling your own will take months, and it will do worse than zlib.
Waaay too little information I'm afraid. Here's a few things you need to provide information on:
- Buffer size
- Type, source or characteristics of the data (image, text, machine code, binary data files etc)
- Lossy or lossless compression
- What balance of speed vs compression ratio
My homepage
Advice: Take only as directed - If symptoms persist, please see your debugger
Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"