Thread: Patricia trie - skipping redundant entries

  1. #1
    Registered User
    Join Date
    Sep 2010
    Posts
    12

    Patricia trie - skipping redundant entries

    A question about patricia trie (two questions, actually).
    I am playing with an example of trie shown here: TRIE, a C++ approach… - Shamail's Blog .
    1) I want to test loading a lot of numbers from a text file. How would I skip entries that already exist in a trie?
    2) How would I load these entries from a text file to begin with? I found some links to “ifstream file” but my trouble is I have no idea how to pass file contents into - code line by line.
    Here is an example of the text file.
    34323234434434434
    34534645654566664
    44564564564564566
    44564564564564566 <- duplicate entry to be skipped

    Much obliged.

  2. #2
    Bored Programmer
    Join Date
    Jul 2009
    Location
    Tomball, TX
    Posts
    428
    1. I don't know about tries, but I'd assume a comparison. If you are using the link above's EXACT source code you could probably change reportsearch to a boolean and then
    Code:
    getline(myfile, line); // getline(the file to read from, the string the line is saved as)   
    if(!t.reportsearch(line.c_str()))
    {
      t.insert(line.c_str());
    }
    >2. How to read from a file to begin with
    Input/Output with files

    and a post from yesterday on this board

    http://cboard.cprogramming.com/cplus...e-sav-how.html

    There are at least 3-4 examples in the second link. The first link is the one I used to originally figure it out myself. Again I've never used a trie and I did not attempt to compile my example, but I'm hoping it'll point you in the right direction, besides someone smarter usually shows up to correct me within a few hours ;o). Good luck.

    P.S. You still didn't post any code of how you've worked on it already. (Third Post I can recall that you aren't posting your own attempt at creating a working example, this isn't going to work in your favor if your habits continue as such)

  3. #3
    Registered User
    Join Date
    Sep 2010
    Posts
    12

    Added to file read block but it does not work

    Thanks for the reply, Lesshardtofind.

    I changed to trie code by adding a block that reads from a text file, but it errors out. Here is the new code:

    Code:
    /* trie.cc - short and sweet data storage API :-)
     * Implementation of TRIE data structure, a bare bone.
     * Fri Oct  9 22:49:25 IST 2009
     * Author: Shamail Tayyab [[email protected]]
     * Creative Commons BY-NC-ND.
     * Insertion: O(n). Searching: O(n).
     */
     
    #include <iostream>
    #include <string>
    #include <cstring>
    #include <cstdlib>
    #include <fstream>
    
    #define MAXDICTELEMENTS 227          /* Using a-z and space */
    #define MAXDEPTH 220                 /* Supporting 20 character depth of dictionary */
     
    using namespace::std;
     /* Start open file block */
       
         ifstream::pos_type size;
         char * memblock;
    
         int main () {
         ifstream file ("words.txt", ios::in|ios::ate);
         if (file.is_open())
         {
          size = file.tellg();
          memblock = new char [size];
          file.seekg (0, ios::beg);
          file.read (memblock, size);
          file.close();
    
          cout << "\n+++++++++++++++++++\nFile is in memory\n+++++++++++++++++++\n\n";
    
          delete[] memblock;
      }
      else cout << "Unable to open file";
      
      /* End open file block */
    
    
    /* Our dictionary node */
    struct Node {
       int wordEnd;   /* Represents that a word is ending here also */
       Node *next[MAXDICTELEMENTS];
    };
     
    /* Trie represented as class */
    class Trie {
       private:
          Node *start;                  /* Our start pointer */
          char dict[MAXDICTELEMENTS];   /* Our char dictionary or supported characters in TRIE */
          Node *createNode ();                /* Create and mallocs a node */
          int printThisNode ( Node * );       /* Debugging function */
          int isEmpty ( Node *, char );       /* Detects if we have an empty node at given depth */
          int getIndexForChar ( char );       /* Returns ith position from dict array */
       public:
          Trie ();                            /* Constructor function to initialize the TRIE */
          ~Trie ();                           /* FREEing up of memory. FIXME required */
          int insert ( char * );              /* Insert a word in TRIE */
          void reportSearch ( char *word );   /* Searches and reports on stderr the result in printed form */
          int search ( char *word );          /* API to search */
    };
     
    Trie::Trie () {
       start = (Node *)malloc ( sizeof ( Node ) );
       strcpy ( dict, "abcdefghijklmnopqrstuvwxyz " );    /* Add here for any other word list, also update MAXDICTELEMENTS*/
    }
     
    Trie::~Trie () {
       free (start);
    }
     
    /*
     * Array index for TRIE's next.
     */
    int Trie::getIndexForChar ( char c ) {
       for ( int i = 0 ; i < MAXDICTELEMENTS ; i++ ) {
          if ( dict[i] == c ) {
             return i;
          }
       }
       return ( -1 );
    }
     
    /*
     * For debugging, prints which all nodes are free in a given level of TRIE.
     */
    int Trie::printThisNode ( Node * treeNode ) {
       if ( treeNode == NULL ) {
          return EXIT_FAILURE;
       }
       for ( int i = 0 ; i < MAXDICTELEMENTS ; i++ ) {
          if ( treeNode -> next[i] == NULL ) {
             cout << "Empty(" << i << ")\t";
          }
          else {
             cout << "Filled(" << i << ")\t";
          }
          cout << endl;
       }
    }
     
    /*
     * If the TRIE has an empty node at this treeNode at cth position?
     */
    int Trie::isEmpty ( Node *treeNode, char ch ) {
       if ( treeNode -> next[getIndexForChar(ch)] == NULL )
          return (1);
       else
          return (0);
    }
     
    /*
     * Brings one node from the memory.
     */
    Node *Trie::createNode () {
       Node *created = (Node *) malloc ( sizeof ( Node ) );
       created -> wordEnd = 0;
       return (created);
    }
     
    /*
     * Insert a word into the TRIE.
     * Quick Algo: Traverse to the right depth, insert.. O(n).
     */
    int Trie::insert ( char *word ) {
       if ( strlen ( word ) > MAXDEPTH ) {
          cerr << "MAXDEPTH Exceeded! Cannot insert(" << word << ")" << endl;
          return EXIT_FAILURE;
       }
       cerr << "Adding word " << word << " to the dictionary..." << endl;
       Node *operatingOn = start;
       for ( int i = 0 ; i < strlen ( word ) ; i++ ) {
          if ( isEmpty ( operatingOn, word[i] ) ) {
             Node *childHere = createNode ();
             operatingOn -> next[getIndexForChar(word[i])] = childHere;
             operatingOn = childHere;
          }
       }
       operatingOn -> wordEnd = 1;
       return EXIT_SUCCESS;
    }
     
    /* Searches in a TRIE the given word.
     * Quick Algo: Iterate on chars, if the pointer is null anywhere, its not found. else accent into one level on this character.
     *             If even all pointers are fixed and last pointer has 0 in its wordEnd, still not found.
     *             Else found.
     */
    int Trie::search ( char *word ) {
       Node *iterator = start;
       if ( iterator == NULL )
          return (EXIT_FAILURE);
       if ( strlen ( word ) > MAXDEPTH ) {
          cerr << "MAXDEPTH Exceeded!" << endl;
          return (EXIT_FAILURE);
       }
       for ( int i = 0 ; i < strlen(word) ; i++ ) {
          if ( iterator -> next[getIndexForChar(word[i])] == NULL )
             return EXIT_FAILURE;
          iterator = iterator -> next[getIndexForChar(word[i])];
       }
       if ( iterator -> wordEnd == 0 ) {
          return EXIT_FAILURE;
       }
       else {
          return EXIT_SUCCESS;
       }
     
    }
     
    /*
     * Perform search and print result on STDERR.
     */
    void Trie::reportSearch ( char *word ) {
       if ( search ( word ) ) {
          cerr << "Not Found!" << endl;
       }
       else 
          cerr << "Found!" << endl;
    }
     
    /* 
     * Test case 
     */
    int main ( int argc, char *argv[] ) {
       Trie t;
       t.insert ( (char *)"545645645645645645645677777777777777776567567567567567567567567567567" );
       t.insert ( (char *)"45645645656r456456456" ); 
       t.insert ( (char *)"789789789789789789789" );
       t.insert ( (char *)"4564565633453455" );
       t.insert ( (char *)"45456456456" );
       t.insert ( (char *)"456456456456" );
       t.insert ( (char *)"5886456456456456456" );
       t.insert ( (char *)"45645645656r456489789789789" );
       t.insert ( (char *)"4564565883453455" );
       t.insert ( (char *)"45457878787878456" );
       t.insert ( (char *)"4578787878788786456" );
       t.insert ( (char *)"5456456456456456456456" );
       t.insert ( (char *)"45645645656r456456456" ); 
       t.insert ( (char *)"789789789789789789789" );
       t.insert ( (char *)"4564565633453455" );
       t.insert ( (char *)"45456456456" );
       t.insert ( (char *)"456456456456" );
       t.insert ( (char *)"5886456456456456456" );
       t.insert ( (char *)"45645645656r456489789789789" );
       t.insert ( (char *)"4564565883453455" );
       t.insert ( (char *)"45457878787878456" );
       t.insert ( (char *)"4578787878788786456" );
       t.insert ( (char *)"5456456456456456456456" );
       t.insert ( (char *)"45645645656r456456456" ); 
       t.insert ( (char *)"789789789789789789789" );
       t.insert ( (char *)"4564565633453455" );
       t.insert ( (char *)"45456456456" );
       t.insert ( (char *)"456456456456" );
       t.insert ( (char *)"5886456456456456456" );
       t.insert ( (char *)"45645645656r456489789789789" );
       t.insert ( (char *)"4564565883453455" );
       t.insert ( (char *)"45457878787878456" );
       t.insert ( (char *)"4578787878788786456" );
       t.insert ( (char *)"5456456456456456456456" );
       t.insert ( (char *)"45645645656r456456456" ); 
       t.insert ( (char *)"789789789789789789789" );
       t.insert ( (char *)"4564565633453455" );
       t.insert ( (char *)"45456456456" );
       t.insert ( (char *)"456456456456" );
       t.insert ( (char *)"5886456456456456456" );
       t.insert ( (char *)"45645645656r456489789789789" );
       t.insert ( (char *)"4564565883453455" );
       t.insert ( (char *)"45457878787878456" );
       t.insert ( (char *)"4578787878788786456" );
       t.insert ( (char *)"5456456456456456456456" );
       t.insert ( (char *)"45645645656r456456456" ); 
       t.insert ( (char *)"789789789789789789789" );
       t.insert ( (char *)"4564565633453455" );
       t.insert ( (char *)"45456456456" );
       t.insert ( (char *)"456456456456" );
       t.insert ( (char *)"5886456456456456456" );
       t.insert ( (char *)"45645645656r456489789789789" );
       t.insert ( (char *)"4564565883453455" );
       t.insert ( (char *)"45457878787878456" );
       t.insert ( (char *)"4578787878788786456" );
       
    
       system ("pause");
       return 0;
    }
    How do I change it so it actually reads words.txt and passes the lines from it instead of using the /* Test cases */ block?

    Thanks to all who replied.

  4. #4
    Bored Programmer
    Join Date
    Jul 2009
    Location
    Tomball, TX
    Posts
    428
    Hey one question... why isn't

    Code:
     /* Start open file block */
       
         ifstream::pos_type size;
         char * memblock;
    
         int main () {
         ifstream file ("words.txt", ios::in|ios::ate);
         if (file.is_open())
         {
          size = file.tellg();
          memblock = new char [size];
          file.seekg (0, ios::beg);
          file.read (memblock, size);
          file.close();
    
          cout << "\n+++++++++++++++++++\nFile is in memory\n+++++++++++++++++++\n\n";
    
          delete[] memblock;
      }
      else cout << "Unable to open file";
      
      /* End open file block */
    iside a function?

    I see you have a function as part of the code, but you also named it int main()... sometimes naming things according to what they do can help you understand your code alot more. Also the int main has a opening curly brace {, but no closing curly brace } before the declaration of struct node.

    Did you compile this code? And on the note of how to save the data from the file into your progrom.. the links I posted above already answer that question.
    Virtual reality hello world http://www.rodneybrothers.com/vr/vrh...rld/index.html in html and javascript.
    Viewable with dodocase, google cardboard, OR, and other compatible VR gear.

  5. #5
    Registered User
    Join Date
    Sep 2010
    Posts
    12
    How would I put it inside the function? Can you paste the correct code, please? Thank you so much for your help.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. pointer to pointer realloc problem
    By prakash0104 in forum C Programming
    Replies: 14
    Last Post: 04-06-2009, 08:53 PM
  2. Serializing/deserializing problem
    By vsla in forum C Programming
    Replies: 3
    Last Post: 04-21-2008, 03:55 PM
  3. Missing Entries in Hashtable
    By wuzzo87 in forum C Programming
    Replies: 3
    Last Post: 05-13-2007, 12:46 AM
  4. Parsing and Fwrite Help Please.
    By Freestyler in forum C Programming
    Replies: 11
    Last Post: 12-06-2005, 11:38 AM