Thread: indexing

  1. #1
    ckyuen
    Guest

    indexing

    i need to write a program phone takes one argument, a filename, from which it reads data. The data format is:

    surname tab firstname tab phonenumber
    The program reads the data into an array. Each record in the array has three fields. For fields that are variable in length, you should store the data in a space-efficient manner.

    You will construct a tree-structured index for the surname and phonenumber fields using the data structures specified below. The tree index provides an efficient means for searching for records matching the search key in that particular field. For example, the record Smith John 98591444 would be found when searching the surname field on the key Smith.

    The nodes in the tree store only the index in the data array of the item, not the actual record or any of the keys. For example, a record stored in position a[17] of data array a would be referred to in the index node simply as 17.

    i must use:

    a randomized BST for the surname index, and

    a binary search tree for the phonenumber index.

    i wanna do the BST for the phone no first? how could i change my code ? i know it's not working......cananyone give me some hints/code......i just have the concept, but i don;t know how to implement it
    thanks very much
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #define MAXITEM 2000
    #define maxN 10
    
    typedef int Key;
    typedef struct
    {
        char *sname;
        char *lname;
        char phone[maxN];
    } Item;
    
    Item NULLitem = {"","",-1};
    
    int eq(Key k1,Key k2){
        return k1 == k2;
    }
    int less(Key k1, Key k2){
        return k1 < k2;
    }
    Key key(Item item){
        return item.phone;
    }
    
    typedef struct STnode* link;
    struct STnode{
           Item item;
           link left;
           link right;
           int N;
    };
    static link head,z;
    
    void STinit()
    { head =(z = NEW(NULLitem,0,0,0));}
    
    link NEWnode(Item item, link left, link right){
         link x =malloc(sizeof *x);
         if (x==NULL){
            printf("error allocating memory.\n");
            exit(EXIT_FAILURE);
        }
        x->item  =item;
        x->left  =left;
        x->right =right;
        return x;
    }
    static link root;
    
    Item search(link x, Key k){
         if(x == NULL)
             return NULLitem;
         else if (eq(k, key(x->item))
              return x->item;
         else if (less(k,key(x->item))
              return search(x->left,k)
         else
              return search(x->right,k);
    }
    
    Item STsearch(Key k){
         return search(root,k);
    }
    
    link insert(link x,Item item){
         if (x==NULL)
            return NEWnode(item, NULL, NULL);
         else if (less(Key(item), key(x->item))
              x->left = insert(x->left,item);
         else
              x->right = insert(x->righ, itm);
         return x;
    }
    
    void STinsert(Item item ){
         head = insert(root, item);
    }
    
    
    
    void  usage( char *progname )
    /***************************/
    {
       fprintf( stderr,
                "Usage: %s \"source-file\"\n",
                progname );
    
       exit( EXIT_FAILURE );
    }
    
    int main (int argc, char *argv[])
    {
        FILE *stream;
        Item records[MAXITEM];
        int indices[MAXITEM];
        int i,count = 0, x , y,t,N=0;
        char temp_sname[100];
        char temp_lname[100];
        char query[10];
        char *v;
    
        if (argc != 2)
            usage (argv[0]);
        if( (stream = fopen( argv[1], "r" )) != NULL )
        {
              while( !feof( stream ) )
              {
                 if ( fscanf( stream, "%s\t%s\t%d", temp_sname, temp_lname, records[count].phone ) != 3 )
                     break;
    
                 if( (records[count].sname = malloc( strlen( temp_sname ) + 1 )) != NULL )
                 {
                     strcpy( records[count].sname, temp_sname );
                 }
    
                 if( (records[count].lname = malloc( strlen( temp_lname ) + 1 )) != NULL )
                 {
                     strcpy( records[count].lname, temp_lname );
                 }
    
                 i++;
    
                 if (count >= MAXITEM)
                 {
                  fprintf(stderr,"too many items\n");
                  exit (1);
                 }
               }
    
               STinit(maxN);
               for (i=0;i<N;i++) STinsert(records[i]);
               while (gets(query)!=NULL)
                     if(!null(v=STsearch(query)))
                            printf("%s\n", query);
                     else printf("(not found) %s\n",query);
    
             }
    
          else
              {
                 printf( "Problem opening the file\n" );
               }
    
    }

  2. #2
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    If all you want in the tree are indices then you'll want to change Item to int and use the random number generation loop that I suggested in your previous thread asking the same question. Finally, replace
    Code:
    treeInsert(records[r]);
    with your own:
    Code:
    STinsert(index);
    Note: Be wary of code you pinch from that book, sometimes it works, but most of the time it's bad code used as an example and not meant to be run.
    My best code is written with the delete key.

  3. #3
    ckyuen
    Guest
    i've read the code, but it's not for indexing a BST, right?
    only for the Randomized BST
    and if you donn't mind , can u give me some hints to do it with my program and how to change the code....i really have no idea how to write that. even i got the idea....thanks

  4. #4
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >i've read the code, but it's not for indexing a BST, right?
    As I understand it you want a binary search tree of indices that will be used to index an array of records.

    >only for the Randomized BST
    It's a good idea to use a randomized tree anyway. Reading records from a file is risky as there's a high probability that they will be sorted, this is very bad for a tree. A randomized tree will at least have a decent balance with sorted data.

    >can u give me some hints to do it with my program and how to change the code....
    Here is a quick modification that gets things running (as I understand what you want). I also removed unecessaries for testing the tree, so you'll have to add them back:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #define MAXITEM 2000
    #define MAXN    10
    #define FILLED  1
    
    typedef int Key;
    typedef struct
    {
        char *sname;
        char *lname;
        char phone[MAXN];
    } Item;
    
    typedef struct STnode* link;
    
    struct STnode{
        Key  item;
        link left;
        link right;
    };
    
    static link head;
    
    static link NEWnode(Key item){
        link x =malloc(sizeof *x);
        
        if (x==NULL){
            printf("error allocating memory.\n");
            exit(EXIT_FAILURE);
        }
        
        x->item  =item;
        x->left  =0;
        x->right =0;
        
        return x;
    }
    
    static link insert(link x, Key item){
        if (x==NULL)
            return NEWnode(item);
        else if (item < x->item)
            x->left = insert(x->left,item);
        else
            x->right = insert(x->right, item);
    
        return x;
    }
    
    static void STinsert(Key item){
        head = insert(head, item);
    }
    
    static void walk(link node)
    {
        if (node == NULL)
            return;
    
        walk(node->left);
        printf("%d\n", node->item);
        walk(node->right);
    }
    
    int main(void)
    {
        FILE *stream;
        Item records[MAXITEM];
        int  indices[MAXITEM] = {0};
        int  i;
        int  count = 0;
        char temp_sname[100];
        char temp_lname[100];
        
        stream = fopen("input.txt", "r");
    
        if(stream != NULL)
        {
            while(fscanf(stream, "%s\t%s\t%s", temp_sname, temp_lname, records[count].phone) == 3)
            {
                if((records[count].sname = malloc(strlen(temp_sname) + 1)) != NULL)
                    strcpy( records[count].sname, temp_sname );
                
                if((records[count].lname = malloc(strlen(temp_lname) + 1)) != NULL)
                    strcpy( records[count].lname, temp_lname );
                
                if (++count >= MAXITEM)
                    break;
            }
            
            i = 0;
    
            while (i < count)
            {
                int r = (int)(rand() / (double)RAND_MAX * count);
    
                if (indices[r] == FILLED)
                    continue;
    
                i++;
                indices[r] = FILLED;
    
                STinsert(r);
            }
    
            walk(head); /* Test that it worked */
        }
        else
            perror( "Problem opening the file\n" );
        
        return 0;
    }
    My best code is written with the delete key.

  5. #5
    ckyuen
    Guest
    thanks a lot,
    if u don;t mind could u explain the code in brief?
    i'm not sure the following part
    Code:
     int r = (int)(rand() / (double)RAND_MAX * count);
    
                if (indices[r] == FILLED)
                    continue;
    
                i++;
                indices[r] = FILLED;
    is that generate a random trees by putting a new node at the root of the tree;
    why do i have to skip the code if indices[r]==1?
    what's this code do? generate a randomzed bst for the record?
    what's missing...? search part?
    much obliged...

  6. #6
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >is that generate a random trees by putting a new node at the root of the tree;
    It inserts a random index into the tree. It creates a randomized tree by not inserting sorted values, thus keeping balance.

    >why do i have to skip the code if indices[r]==1?
    Think about it. If you've inserted say, index 5 into the tree, why would you want to insert it again if the random number generator gives you 5 again? That's why you save indices and skip the insertion if you get repeated numbers, so that the tree contains only one occurance of each index.

    >what's this code do? generate a randomzed bst for the record?
    It creates a balanced binary search tree of indices using random number generation. This is precisely what you've asked for several times.

    >what's missing...? search part?
    Yes, but searching is pretty much the same for every type of binary search tree, so there was no reason for me to keep it there cluttering the code and hiding what I wanted to show you.
    My best code is written with the delete key.

  7. #7
    ckyuen
    Guest
    thanks
    some more questions:
    1. is the code generate a Randomized BST indexing one record as a whole, rather that using BST for phone no,Randomized BST for surname,
    2.if i wanna count the no of key comparsions used in constructing each index, can i just simply use a global var, say comps=0
    then put it in the code:
    Code:
    static link insert(link x, Key item){
        if (x==NULL)
            return NEWnode(item);
        else if (item < x->item){
             comps++;
            x->left = insert(x->left,item);
            }
        else
            x->right = insert(x->right, item);
    
        return x;
    }

  8. #8
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    > 1. is the code generate a Randomized BST indexing one record as a whole,
    >rather that using BST for phone no,Randomized BST for surname,
    This is one tree, it holds numbers, each number corresponding to the array of records. For example, if in your array you have 5 records, the tree might look like this:
    Code:
        4
      3   5
    1   2
    You would have two trees, one for the phone number and one for the surname. Each would make a different comparison in the insertion function, but the tree itself would still hold just numbers. Then your search function would return the number and you use it to index the array:
    Code:
    int index = search(tree, somekey);
    
    /* Do something with records[index] */
    >if i wanna count the no of key comparsions used in constructing each index, can i just simply use a global var
    Yes.
    My best code is written with the delete key.

  9. #9
    ckyuen
    Guest
    >You would have two trees, one for the phone number and one for the surname. Each would make a different comparison in the insertion function, but the tree itself would still hold just numbers.

    i'm sorry, i don't quite understand.....is that mean i've already build the tree,all i need is to insert surname or phone no in the tree, and write a search function if the key is matching to the record, return the record that would be refered by the index.....

  10. #10
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >is that mean i've already build the tree,all i need is to insert surname or phone no in the tree
    I've been trying to explain this ever since your first post, surname and phone number do not go in the tree. The records are already in an array, you place indices to those records in the tree. You have two different trees with indices as the value of a node, the insertion function of one tree compares surnames and the insertion function of the other tree compares phone numbers. Likewise, the search functions for each tree compares the item that you want to search for.

    Though I'm beginning to think that you should just have two trees with copies of the records since you're having so much trouble with this concept.
    My best code is written with the delete key.

  11. #11
    ckyuen
    Guest
    thanks for your patient.
    so i've done placing indics for the records in the array, what i need to do is write a search function to compare the node and what i wanna search for,
    one more question, the requirement states ,the program needs to print to stdout the number of key comparisons used in constructing each index, and continues to take commands from stdin.
    i've posted aquestion about it, but i think i should print somthing like this
    bst: 1000 comparisons
    randomized bst: 75 comparisons

    if i just put it a global var......that's not gonna be working.......right?
    how can i implement it?thanks very much

  12. #12
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >how can i implement it?
    Since you're using a global variable, why not make it an array? Or use two variables, one for each tree?
    My best code is written with the delete key.

  13. #13
    ckyuen
    Guest
    is the searching function is something like this
    Code:
    static link search(link h, Key item){
        Key t = key(h->item);
        if (h==NULL) return NULLitem;
        if eq(item,t) return h->item;
        if less(item,t)
            return (h->left,item);
        else return (h->right, item);
    }
    thanks

  14. #14
    ckyuen
    Guest
    sorry . again
    [/code]
    #define eq(A,B) ((A) == (B))
    #define NULLitem {"","",-1}
    #define less(A,B) (A<B)
    ....

    static link search(link h, Key item){
    Key t = key(h->item);
    if (h==NULL) return NULLitem;
    if eq(item,t) return h->item;
    if less(item,t)
    return (h->left,item);
    else return (h->right, item);
    }

  15. #15
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >is the searching function is something like this
    Yes.
    My best code is written with the delete key.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Indexing 1
    By joshua in forum C Programming
    Replies: 1
    Last Post: 11-02-2005, 11:32 PM
  2. indexing overload concept idea
    By Trauts in forum C++ Programming
    Replies: 6
    Last Post: 05-18-2003, 10:00 PM
  3. resources for network media indexing application
    By goose in forum Windows Programming
    Replies: 2
    Last Post: 10-14-2002, 02:04 AM
  4. Replies: 4
    Last Post: 10-11-2002, 06:58 AM
  5. Array Indexing w/ strlen()
    By c++_n00b in forum C++ Programming
    Replies: 8
    Last Post: 06-04-2002, 11:18 AM