Thread: Huffman compression

  1. #1
    Registered /usr
    Join Date
    Aug 2001
    Newport, South Wales, UK

    Cool Huffman compression

    After reading through the "Self-referential structures" chapter in The C Programming Language, I modified the binary tree code in it to count the number of occurances of each byte inside a file. This works great, and now I want to advance it to use this information to perform Huffman compression on the input file.
    When looking at examples of Huffman algorithms, I notice that the byte frequencies are usually stored in an array of 256 items (equal to the possible values of a byte). Should I now "flatten" my frequency tree into an array and use that as a base for building a Huffman tree, or is there a way I have not considered that would allow me to utilise the tree in its current form?

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    The edge of the known universe
    Say you have something like this
    typedef struct _node {
      char ch;
      int frequency;
      struct _node *left;
      struct _node *right;
    } node;
    I'm guessing that 'ch' is the key at present, so you compare the current char with 'ch' at the current node to decide whether to go left or right.
    But this tree will be balanced according to the order of characters in the file (so if 'z' was the first character, then 'z' would be at the root node, and your tree will be off balance.

    It seems to me that you need to do two things
    1. you need to replace each frequency with the appropriate Huffman code (the most frequent letter having the shortest code).
    2. Ideally, the most frequent letter also needs to be at the root node, to minimise the search time.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    But seeing as Huffman trees are usually built recursively (parents are generated from children and take the sum of their frequencies), would I need to modify the node struct to include a pointer to a parent node?

    typedef struct _node {
    	char ch;
    	int frequency;
    	struct _node *parent;
    	struct _node * left;
    	struct _node *right;
    } node;

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. The Huffman Algorithm
    By Desolation in forum C++ Programming
    Replies: 3
    Last Post: 05-26-2006, 12:30 PM
  2. Huffman Compression Program
    By ChadJohnson in forum C++ Programming
    Replies: 13
    Last Post: 01-23-2005, 02:49 PM
  3. compression
    By X PaYnE X in forum C Programming
    Replies: 20
    Last Post: 01-11-2005, 05:14 PM
  4. efficient compression algorithm pls
    By gooddevil in forum C Programming
    Replies: 7
    Last Post: 05-08-2004, 03:33 AM
  5. IDEA: Huffman compression function
    By confuted in forum Contests Board
    Replies: 4
    Last Post: 07-01-2003, 07:05 PM