hashing & searching

This is a discussion on hashing & searching within the C++ Programming forums, part of the General Programming Boards category; I made some algorithms for sorting and searchimg. I've read that hashing can be efficient for searching (without comparing). I ...

  1. #1
    Registered User Micko's Avatar
    Join Date
    Nov 2003
    Posts
    712

    hashing & searching

    I made some algorithms for sorting and searchimg. I've read that hashing can be efficient for searching (without comparing). I don't know how to make hashtable, insert and search for element. I searched this board and found interesting stuff but pretty
    much complicated.
    I wrote simple function that implement binary search (just for demonstartion of how it's works). I decide to implement hash function key%capacity.My key is of course element in array. But I don't know can I search element through table and collision resolve using open adressing. I really don't
    understand this completely without code. So if you could give me some example code (with open adressing because it is simple) to fit in with these simple functions?
    Code:
    #include <iostream>
    using namespace std;
    const int capacity=5;//capacity of the hash table
    void bubble_sort(int array[],int length)
    {
    	int swapflag=1;
    	
    	for(int i=0;i<(length-1) && swapflag;i++)
    	{
    		swapflag=0;
    		for(int j=0;j<(length-(i+1));j++)
    			if(array[j]>array[j+1])
    			{
    				swap(array[j],array[j+1]);
    				swapflag=1;
    			}	
    	}
    
    }
    int* binary_search(int array[],int value,int length)
    {
    	int head=0,end=length-1,middle;
    	bubble_sort(array,length);
    	while(head<=end)
    	{
    		middle=(head+end)/2;
    		if(array[middle]==value)
    			return &array[middle];
    		else if(value>array[middle])
    			head=middle+1;
    		else
    			end=middle-1;
    	}
    	return 0;
    }
    //and with hashing...?
    int hash_function(int key)
    {
    	return key%capacity;
    }
    
    
    
    int main ( ) 
    {
    	int array[]={10,8,9,5,6,7,3,2,1,4};
    	int* x;
    	x=binary_search(array,5,10);
    	if(!x)
    	{
    		cout<<"element doesn't exist";
    		return 0;
    	}
    	cout<<*x;
    	return 0;
    }
    ;
    Thanks

  2. #2
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,669
    "Hash tables"s and "binary search" do not go together.

    gg

  3. #3
    Registered User Micko's Avatar
    Join Date
    Nov 2003
    Posts
    712
    Yes, I know, but I'm just wondring can I use hash table (and how) to search array like that in example?
    Binary search is for other purpose and I 'didn't intend to use it in hash table

  4. #4
    Registered User
    Join Date
    Mar 2002
    Posts
    1,595
    My understanding of the basic idea of hashing is that you order a group of data based on some algorhythm. Ascending order could be considered a very primitive hash, but usually it is something more complicated than that, like maybe taking a string, summing the ASCI value of each non-null char in the string to achieve a value and store the string at the index which is the sum of the ASCI values of the string. Then, rather than having to search the entire array to see if a given string is present you can do the calculation and look at just one (or maybe a few) elements. The hope is that doing the calculations and searching maybe a few elements is faster than doing a search using a different approach. The biggest problem is conflict resolution, which means what do you do if two entries end up with the same value based on the hashing algorhythm. You're free to deal with that problem in any way you wish, just like you're free to devise any hashing algorhythm you wish.

  5. #5
    Registered User Micko's Avatar
    Join Date
    Nov 2003
    Posts
    712
    Ok, I 've read osmewhere taht hashing is different approach in search algorithms. Binary search for example use key value to split sorted array for faster searching. On the other hand hashing offers avoiding comparisions. So I tried to implement some hashing technique to search my array just to compare these searches.
    Maybe hashing is not for searching at all. Is there any way to implement hashing with my example above or not?
    Last edited by Micko; 03-14-2004 at 12:29 PM.

  6. #6
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,796
    >On the other hand hashing offers avoiding comparisions
    A better explanation would be that binary search repeatedly divides the working range in half until the item is found or the search cannot continue. On the hand, hashing uses a transformation of the data being searched for to create an index in an unordered table.

    >Maybe hashing is not for searching at all.
    Of course it is. You just neglected to create a hash table.

    >Is there any way to implement hashing with my example above or not?
    There are many ways to implement hashing with your example. Here is one way:
    Code:
    #include <iostream>
    
    using namespace std;
    
    const int capacity = 11; // Hash table big enough for all possible elements
    int hash_table[capacity]; // Existence hash table 0==not present, 1==present
    
    void bubble_sort(int array[],int length)
    {
      int swapflag=1;
    
      for(int i=0;i<(length-1) && swapflag;i++)
      {
        swapflag=0;
        for(int j=0;j<(length-(i+1));j++)
          if(array[j]>array[j+1])
          {
            swap(array[j],array[j+1]);
            swapflag=1;
          }	
      }
    }
    
    int* binary_search(int array[],int value,int length)
    {
      int head=0,end=length-1,middle;
      bubble_sort(array,length);
      while(head<=end)
      {
        middle=(head+end)/2;
        if(array[middle]==value)
          return &array[middle];
        else if(value>array[middle])
          head=middle+1;
        else
          end=middle-1;
      }
      return 0;
    }
    
    int hash_function(int key)
    {
      return key;
    }
    
    void build_hash_table(int array[],int length)
    {
      for(int i=0;i<length;i++)
        hash_table[hash_function(array[i])]=1;
    }
    
    int main ( ) 
    {
      int array[]={10,8,9,5,6,7,3,2,1,4};
      int* x;
      bubble_sort(array,10);
      x=binary_search(array,5,10);
      if(!x)
        cout<<"element doesn't exist\n";
      else
        cout<<"binary search succeeded\n";
      build_hash_table(array,10);
      int y=hash_table[hash_function(5)];
      if(!y)
        cout<<"element doesn't exist\n";
      else
        cout<<"hash table search succeeded\n";
      return 0;
    }
    My best code is written with the delete key.

  7. #7
    Registered User Micko's Avatar
    Join Date
    Nov 2003
    Posts
    712
    Thanks for your advices!
    I tried this:
    Code:
    #include <iostream>
    
    using namespace std;
    
    const int capacity=10;
    int hash_table[capacity];
    void build_hash_table()
    {
    	memset(hash_table,-1,capacity*sizeof(int));
    }
    
    int hash_function(int key)
    {
      return key%capacity;
    }
    int rehash(int i)
    {
    	if(i<capacity-1)
    		return ++i;
    	else
    		return 0;
    
    }
    int Insert(int key)
    {
    	int index=hash_function(key);
    	cout<<"key is "<<index<<endl;
    	if(hash_table[index]==-1)//indicate free space 
    	{
    		hash_table[index]=1;
    		return 1;
    	}
    	else
    	{
    		cout<<"collision detected!"<<endl;
    		int new_index=rehash(index);
    		while(new_index!=index)
    		{
    			if(hash_table[new_index]==-1)
    			{
    				hash_table[new_index]=1;
    				return 1;
    			}
    				
    			new_index=rehash(new_index);
    		}
    		return 0;
    		
    	}
    }
    
    int main ( ) 
    {
      int array[]={10,8,9,1,7,7,3,2,1,4};
      
      build_hash_table();
      
     for(int i=0;i<capacity;i++)
    	 Insert(array[i]);
    
     for(i=0;i<capacity;i++)
    	 cout<<hash_table[i]<<endl;
    int y=hash_table[hash_function(6)];//how to test if element is there
    
      return 0;
    }
    Try to simulate collision and to use hash function key%capacity.
    That was starttin gpoint. In case of collision one should use linear probing.
    But all I can manage to do is to screw up a lot.
    If I have following conditions:
    use simple array such as integer array,
    use hash function key%capacity
    use linear probing to solve collision,
    What values should I put in hash table ( I use -1 to indicate free space and 1 occupied)?
    How can I search for element?
    Can you help me?
    I intend to use this simple array example as demonstration of hashing, but obviously it is much more harder then I thought
    Thanks all for help

  8. #8
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,796
    >In case of collision one should use linear probing.
    Actually, I find separate chaining to be a more generally useful collision resolution method. Linear probing is better if chaining takes up too much space, but that's not often the case.

    >What values should I put in hash table
    The most common hash tables store the values themselves where the index is the hashed value. It's an associative container where the value is both the stored object and the key after a transformation has been made.

    >How can I search for element?
    Start by getting the index. Then if the item doesn't match, perform a linear search or call the rehash function:
    Code:
    int index = hash ( search_key );
    while ( hash_table[index] != search_key && hash_table[index] == OCCUPIED )
      index = rehash ( index );
    >Can you help me?
    Yes.

    >but obviously it is much more harder then I thought
    Only until you wrap your mind around it. Then you'll find hash tables to be surprisingly simple and powerful.
    My best code is written with the delete key.

  9. #9
    Registered User Micko's Avatar
    Join Date
    Nov 2003
    Posts
    712
    I manage to made this:
    Code:
    #include <iostream>
    
    using namespace std;
    
    const int capacity=5;
    int hash_table[capacity];
    void build_hash_table()
    {
    	memset(hash_table,-1,capacity*sizeof(int));
    }
    
    int hash_function(int key)
    {
      return key%capacity;
    }
    int rehash(int i)
    {
    	if(i<capacity-1)
    		return ++i;
    	else
    		return 0;
    
    }
    int Insert(int key)
    {
    	int index=hash_function(key);
    	cout<<"key is "<<index<<endl;
    	if(hash_table[index]==-1)//indicate free space 
    	{
    		hash_table[index]=key;
    		return 1;
    	}
    	else
    	{
    		cout<<"collision detected!"<<endl;
    		int new_index=rehash(index);
    		while(new_index!=index)
    		{
    			if(hash_table[new_index]==-1)
    			{
    				hash_table[new_index]=key;
    				return 1;
    			}
    				
    			new_index=rehash(new_index);
    		}
    		return 0;
    		
    	}
    }
    int Find(int key)
    {
    	int index=hash_function(key);
    	if(hash_table[index]==key)
    		return index;
    	else
    	{
    		int new_index=rehash(index);
    		while(new_index !=index)
    		{
    			if(hash_table[new_index]==key)
    				return new_index;
    			new_index=rehash(new_index);
    		}
    		return -1;
    	}
    }
    int main ( ) 
    {
      int array[]={10,8,9,1,7,7,3,2,1,4};
      
      build_hash_table();
      
     for(int i=0;i<capacity;i++)
    	 Insert(array[i]);
    
     for(i=0;i<capacity;i++)
    	 cout<<hash_table[i]<<endl;
     int y=Find(5);
    if(y==-1)
    cout<<"Element not exists!";
      return 0;
    }
    Please, examine this code.
    I sthis good in principe (according to definitions and method of using hash table)?
    Maybe there are some features that should be changed?

  10. #10
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,796
    Aside from the table being way too small, I don't see a problem. For a maximally efficient hash table, a good rule of thumb is to keep it 1/2 to 2/3 full at any given time. So set the size to a suitable prime number (for correctness in as many possible cases) such as 19, and you can fit the whole of array in the table:
    Code:
    const int capacity=19;
    ...
    int main()
    {
      ...
      int size = 10;
    
      build_hash_table();
      for(int i=0;i<size;i++)
        Insert(array[i]);
      ...
    My best code is written with the delete key.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Creating Linked List Using Hashing and Stack
    By m0ntana in forum C Programming
    Replies: 2
    Last Post: 04-07-2007, 07:11 AM
  2. Logical errors with seach function
    By Taka in forum C Programming
    Replies: 4
    Last Post: 09-18-2006, 05:20 AM
  3. C++ std routines
    By siavoshkc in forum C++ Programming
    Replies: 33
    Last Post: 07-28-2006, 12:13 AM
  4. Visual C++ 2005 linking and file sizes
    By Rune Hunter in forum C++ Programming
    Replies: 2
    Last Post: 11-12-2005, 09:41 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21