huffman code

This is a discussion on huffman code within the C Programming forums, part of the General Programming Boards category; hello, I wish to implement huffman code in C..I know the logic but stuck in implementing..i m finding it pretty ...

  1. #1
    Registered User
    Join Date
    Feb 2008
    Posts
    146

    huffman code

    hello,

    I wish to implement huffman code in C..I know the logic but stuck in implementing..i m finding it pretty difficult , I mean first sorting and counting frequency is ok but then building the tree is where I am totally stuck, and then again sorting?? Pls help if anyone can...

    Thanks,
    Edesign

  2. #2
    uint64_t...think positive xuftugulus's Avatar
    Join Date
    Feb 2008
    Location
    Pacem
    Posts
    355
    Well from my memory there are at least a dozen implementations of huffman out there.
    The tree is nothing more than a simple binary tree with leafs containing the symbols that
    are encoded, and the path to the symbol is the bitwise encoding.
    You can refer to wikipedia for the algorithm, or study a ready made implementation
    to see how actual details are handled.
    Code:
    ...
        goto johny_walker_red_label;
    johny_walker_blue_label: exit(-149$);
    johny_walker_red_label : exit( -22$);
    A typical example of ...cheap programming practices.

  3. #3
    Registered User
    Join Date
    Feb 2008
    Posts
    146
    Hi,

    You are right, googling it, I got a few codes, even in that code I am facing problem in understanding sorting part, especially after base nodes are defined...

    Code:
    // huffman.cpp : Defines the entry point for the console application.
    #include "stdafx.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>
    #include <assert.h>
    #include <math.h>
    #include <string.h>
    
    #define         NUMBYTES                          256
    #define         NUMBITS                            16
    #define         INPUTFILENAME             "input.huf"
    #define         OUTPUTFILENAME           "output.huf"
    #define         TEMPFILENAME            "huffman.tmp"
    
    typedef struct _HUFFNODE {
            char              byte;
            long              freq;
            struct _HUFFNODE *parent;
            struct _HUFFNODE *left;
            struct _HUFFNODE *right;
    } HUFFNODE;
    
    typedef struct _HUFFTAB {
            char              byte;
            char              numbits;
            long              freq;
            long              bits[NUMBITS];
    } HUFFTAB;
    
    FILE       *f;
    long       fsize, numnodes, numbasenodes;
    long       freq_tab[NUMBYTES];    // Frequency table
    HUFFNODE  *nodelist[NUMBYTES];
    HUFFNODE  *basenode[NUMBYTES];
    HUFFNODE  *templist[NUMBYTES];
    HUFFTAB    code_tab[NUMBYTES];
    
    /////////////////////////////////
    //  Return file size in bytes  //
    /////////////////////////////////
    long filesize(FILE *f)
    {
      long pos, fsize;
    
      pos = ftell(f);
      fseek(f, 0L, SEEK_END);
      fsize = ftell(f);
      fseek(f, pos, SEEK_SET);
    
      return fsize;
    }
    
    /////////////////////////////////////////////////////////
    //  Read input file to memory to speed up compression  //
    /////////////////////////////////////////////////////////
    char *huffman_readinputstream(char *inputstream)
    {
      char *stream;
    
      f = fopen(inputstream, "r+b");
      assert(f != NULL);
      fsize = filesize(f);
    
      stream = (char *)calloc(fsize, sizeof(char));
      assert(stream != NULL);
    
      fread(stream, fsize, 1, f);
      fclose(f);
    
      return stream;
    }
    
    
    ////////////////////////////
    //  Make frequency table  //
    ////////////////////////////
    void huffman_buildfreq(char *stream)
    {
      long  k;
    
      for (k = 0; k < NUMBYTES; k++)
        freq_tab[k] = 0;
    
      for (k = 0; k < fsize; k++)
        freq_tab[stream[k]]++;
    }
    
    /////////////////////////////////////////////////////////
    //  Sort node list table using straight insertion sort //
    /////////////////////////////////////////////////////////
    void huffman_sortnodelist(HUFFNODE *nodes[])
    {
      HUFFNODE  *node;
      long       i, j, k;
    
      for (i = 0; i < numnodes-1; i++) {
        k = i;
        node = nodes[i];
    
        for (j = i+1; j < numnodes; j++)
    	      if (nodes[j]->freq < node->freq) {     //Surprisingly breaks at this point in 2nd sort...
            k = j;
            node = nodes[j];
          }
    
        if (k != i) {
          node = nodes[i];
          nodes[i] = nodes[k];
          nodes[k] = node;
        }
      }
    }
    
    //////////////////////////
    //  Build huffman tree  //
    //////////////////////////
    void huffman_buildtree(char *stream)
    {
      HUFFNODE  *node;
      HUFFNODE  *node1, *node2;  // Two nodes with smallest frequency
      long       k, n;
    
      // Build frequency table
      huffman_buildfreq(stream);
    
      // Build base nodes
      numbasenodes = 0;
      for (k = 0; k < NUMBYTES; k++) {
        if (freq_tab[k] > 0) {
    
          node = (HUFFNODE *)calloc(1, sizeof(HUFFNODE));
          assert(node != NULL);
    
          node->byte = k;
          node->freq = freq_tab[k];
          node->parent = NULL;
          node->left = NULL;
          node->right = NULL;
    
          nodelist[numbasenodes] = node;
          basenode[numbasenodes] = node;
          templist[numbasenodes] = node;
    
          numbasenodes++;
        }
      }
    
      // Build huffman tree
      numnodes = numbasenodes;
      n = numnodes;
      while (--n > 0) {
    
        // Sort node list go get two nodes with minor frequency
        huffman_sortnodelist(nodelist);
    
        node1 = nodelist[0];
        node2 = nodelist[1];
    
        // Make new node and link nodes
        node = (HUFFNODE *)calloc(1, sizeof(HUFFNODE));
    
        node->byte = 'x';
        node->freq = node1->freq + node2->freq;
        node->left = node1;
        node->right = node2;
        node->parent = NULL;
    
        node1->parent = node;
        node2->parent = node;
    
        // Add new node and remove old two
        nodelist[0] = node;
        for (k = 1; k < n; k++)
          nodelist[k] = nodelist[k+1];
        nodelist[n] = NULL;
    
        templist[numnodes++] = node;
    
      }
    
      for (k = 0; k < numnodes; k++)
        nodelist[k] = templist[k];
    }
    ////////////////////
    //  Main program  //
    ////////////////////
    int main()
    {
      char   *stream;
      long    k;
    
      // Read input stream
      stream = huffman_readinputstream(INPUTFILENAME);
    
      // Build huffman tree
      huffman_buildtree(stream);
    
      return 0;
    }
    have just mentioned sort and tree build part

    Code:
    for (j = i+1; j < numnodes; j++)
    	      if (nodes[j]->freq < node->freq)
    after 1st sort, it always breaks at this point...I can't understand why..

  4. #4
    Registered User
    Join Date
    Jan 2008
    Posts
    287
    Who wrote this code?? :/

    First off, their insertion sort sucks. It basically guarantees O(n^2) comparisons. I would suggest replacing it with this:
    Code:
    void huffman_sortnodelist(HUFFNODE *nodes[])
    {
      HUFFNODE  *node;
      long       i, j;
    
      for (i = 1; i < numnodes; i++) {
        node = nodes[i];
    
        for (j = i-1; j >= 0 && nodes[j]->freq > node->freq; j--)
          nodes[j+1] = nodes[j];
    
        nodes[j+1] = node;
      }
    }
    I have no idea what templist is, what its used for, or why its even needed. You can probably get rid of it.

    Now here's the main problem:
    Code:
     // Add new node and remove old two
        nodelist[0] = node;
        for (k = 1; k < n; k++)
          nodelist[k] = nodelist[k+1];
        nodelist[n] = NULL;
    
        templist[numnodes++] = node;
    Notice, the comment even says that its adding a node and removing two. Explain to me how that should result in incrementing the numnodes count....? The sort routine needs to know how many nodes its dealing with, otherwise it will attempt to access NULL pointers!

    [edit] I just checked to see if you replied, and noticed that I had an off-by-one error in the sort. I fixed it. [/edit]
    Last edited by arpsmack; 03-22-2008 at 12:51 PM.

  5. #5
    Registered User
    Join Date
    Feb 2008
    Posts
    146
    Hello Sir,

    Thanks for help..
    I left it in between..thought I should start with simpler one...so started working on Run length coding..perhaps I can get back to huffman after I sharpen my poor programming skills...

    Thanks,
    Edesign

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Proposal: Code colouring
    By Perspective in forum A Brief History of Cprogramming.com
    Replies: 28
    Last Post: 05-14-2007, 07:23 AM
  2. Values changing without reason?
    By subtled in forum C Programming
    Replies: 2
    Last Post: 04-19-2007, 10:20 AM
  3. Updated sound engine code
    By VirtualAce in forum Game Programming
    Replies: 8
    Last Post: 11-18-2004, 11:38 AM
  4. Interface Question
    By smog890 in forum C Programming
    Replies: 11
    Last Post: 06-03-2002, 05:06 PM
  5. Replies: 4
    Last Post: 01-15-2002, 11:04 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21