-
hashing & searching
I made some algorithms for sorting and searchimg. I've read that hashing can be efficient for searching (without comparing). I don't know how to make hashtable, insert and search for element. I searched this board and found interesting stuff but pretty
much complicated.
I wrote simple function that implement binary search (just for demonstartion of how it's works). I decide to implement hash function key%capacity.My key is of course element in array. But I don't know can I search element through table and collision resolve using open adressing. I really don't
understand this completely without code. So if you could give me some example code (with open adressing because it is simple) to fit in with these simple functions?
Code:
#include <iostream>
using namespace std;
const int capacity=5;//capacity of the hash table
void bubble_sort(int array[],int length)
{
int swapflag=1;
for(int i=0;i<(length-1) && swapflag;i++)
{
swapflag=0;
for(int j=0;j<(length-(i+1));j++)
if(array[j]>array[j+1])
{
swap(array[j],array[j+1]);
swapflag=1;
}
}
}
int* binary_search(int array[],int value,int length)
{
int head=0,end=length-1,middle;
bubble_sort(array,length);
while(head<=end)
{
middle=(head+end)/2;
if(array[middle]==value)
return &array[middle];
else if(value>array[middle])
head=middle+1;
else
end=middle-1;
}
return 0;
}
//and with hashing...?
int hash_function(int key)
{
return key%capacity;
}
int main ( )
{
int array[]={10,8,9,5,6,7,3,2,1,4};
int* x;
x=binary_search(array,5,10);
if(!x)
{
cout<<"element doesn't exist";
return 0;
}
cout<<*x;
return 0;
}
;
Thanks
-
"Hash tables"s and "binary search" do not go together.
gg
-
Yes, I know, but I'm just wondring can I use hash table (and how) to search array like that in example?
Binary search is for other purpose and I 'didn't intend to use it in hash table
-
My understanding of the basic idea of hashing is that you order a group of data based on some algorhythm. Ascending order could be considered a very primitive hash, but usually it is something more complicated than that, like maybe taking a string, summing the ASCI value of each non-null char in the string to achieve a value and store the string at the index which is the sum of the ASCI values of the string. Then, rather than having to search the entire array to see if a given string is present you can do the calculation and look at just one (or maybe a few) elements. The hope is that doing the calculations and searching maybe a few elements is faster than doing a search using a different approach. The biggest problem is conflict resolution, which means what do you do if two entries end up with the same value based on the hashing algorhythm. You're free to deal with that problem in any way you wish, just like you're free to devise any hashing algorhythm you wish.
-
Ok, I 've read osmewhere taht hashing is different approach in search algorithms. Binary search for example use key value to split sorted array for faster searching. On the other hand hashing offers avoiding comparisions. So I tried to implement some hashing technique to search my array just to compare these searches.
Maybe hashing is not for searching at all. Is there any way to implement hashing with my example above or not?
-
>On the other hand hashing offers avoiding comparisions
A better explanation would be that binary search repeatedly divides the working range in half until the item is found or the search cannot continue. On the hand, hashing uses a transformation of the data being searched for to create an index in an unordered table.
>Maybe hashing is not for searching at all.
Of course it is. You just neglected to create a hash table. :)
>Is there any way to implement hashing with my example above or not?
There are many ways to implement hashing with your example. Here is one way:
Code:
#include <iostream>
using namespace std;
const int capacity = 11; // Hash table big enough for all possible elements
int hash_table[capacity]; // Existence hash table 0==not present, 1==present
void bubble_sort(int array[],int length)
{
int swapflag=1;
for(int i=0;i<(length-1) && swapflag;i++)
{
swapflag=0;
for(int j=0;j<(length-(i+1));j++)
if(array[j]>array[j+1])
{
swap(array[j],array[j+1]);
swapflag=1;
}
}
}
int* binary_search(int array[],int value,int length)
{
int head=0,end=length-1,middle;
bubble_sort(array,length);
while(head<=end)
{
middle=(head+end)/2;
if(array[middle]==value)
return &array[middle];
else if(value>array[middle])
head=middle+1;
else
end=middle-1;
}
return 0;
}
int hash_function(int key)
{
return key;
}
void build_hash_table(int array[],int length)
{
for(int i=0;i<length;i++)
hash_table[hash_function(array[i])]=1;
}
int main ( )
{
int array[]={10,8,9,5,6,7,3,2,1,4};
int* x;
bubble_sort(array,10);
x=binary_search(array,5,10);
if(!x)
cout<<"element doesn't exist\n";
else
cout<<"binary search succeeded\n";
build_hash_table(array,10);
int y=hash_table[hash_function(5)];
if(!y)
cout<<"element doesn't exist\n";
else
cout<<"hash table search succeeded\n";
return 0;
}
-
Thanks for your advices!
I tried this:
Code:
#include <iostream>
using namespace std;
const int capacity=10;
int hash_table[capacity];
void build_hash_table()
{
memset(hash_table,-1,capacity*sizeof(int));
}
int hash_function(int key)
{
return key%capacity;
}
int rehash(int i)
{
if(i<capacity-1)
return ++i;
else
return 0;
}
int Insert(int key)
{
int index=hash_function(key);
cout<<"key is "<<index<<endl;
if(hash_table[index]==-1)//indicate free space
{
hash_table[index]=1;
return 1;
}
else
{
cout<<"collision detected!"<<endl;
int new_index=rehash(index);
while(new_index!=index)
{
if(hash_table[new_index]==-1)
{
hash_table[new_index]=1;
return 1;
}
new_index=rehash(new_index);
}
return 0;
}
}
int main ( )
{
int array[]={10,8,9,1,7,7,3,2,1,4};
build_hash_table();
for(int i=0;i<capacity;i++)
Insert(array[i]);
for(i=0;i<capacity;i++)
cout<<hash_table[i]<<endl;
int y=hash_table[hash_function(6)];//how to test if element is there
return 0;
}
Try to simulate collision and to use hash function key%capacity.
That was starttin gpoint. In case of collision one should use linear probing.
But all I can manage to do is to screw up a lot.
If I have following conditions:
use simple array such as integer array,
use hash function key%capacity
use linear probing to solve collision,
What values should I put in hash table ( I use -1 to indicate free space and 1 occupied)?
How can I search for element?
Can you help me?
I intend to use this simple array example as demonstration of hashing, but obviously it is much more harder then I thought
Thanks all for help
-
>In case of collision one should use linear probing.
Actually, I find separate chaining to be a more generally useful collision resolution method. Linear probing is better if chaining takes up too much space, but that's not often the case.
>What values should I put in hash table
The most common hash tables store the values themselves where the index is the hashed value. It's an associative container where the value is both the stored object and the key after a transformation has been made.
>How can I search for element?
Start by getting the index. Then if the item doesn't match, perform a linear search or call the rehash function:
Code:
int index = hash ( search_key );
while ( hash_table[index] != search_key && hash_table[index] == OCCUPIED )
index = rehash ( index );
>Can you help me?
Yes. :)
>but obviously it is much more harder then I thought
Only until you wrap your mind around it. Then you'll find hash tables to be surprisingly simple and powerful.
-
I manage to made this:
Code:
#include <iostream>
using namespace std;
const int capacity=5;
int hash_table[capacity];
void build_hash_table()
{
memset(hash_table,-1,capacity*sizeof(int));
}
int hash_function(int key)
{
return key%capacity;
}
int rehash(int i)
{
if(i<capacity-1)
return ++i;
else
return 0;
}
int Insert(int key)
{
int index=hash_function(key);
cout<<"key is "<<index<<endl;
if(hash_table[index]==-1)//indicate free space
{
hash_table[index]=key;
return 1;
}
else
{
cout<<"collision detected!"<<endl;
int new_index=rehash(index);
while(new_index!=index)
{
if(hash_table[new_index]==-1)
{
hash_table[new_index]=key;
return 1;
}
new_index=rehash(new_index);
}
return 0;
}
}
int Find(int key)
{
int index=hash_function(key);
if(hash_table[index]==key)
return index;
else
{
int new_index=rehash(index);
while(new_index !=index)
{
if(hash_table[new_index]==key)
return new_index;
new_index=rehash(new_index);
}
return -1;
}
}
int main ( )
{
int array[]={10,8,9,1,7,7,3,2,1,4};
build_hash_table();
for(int i=0;i<capacity;i++)
Insert(array[i]);
for(i=0;i<capacity;i++)
cout<<hash_table[i]<<endl;
int y=Find(5);
if(y==-1)
cout<<"Element not exists!";
return 0;
}
Please, examine this code.
I sthis good in principe (according to definitions and method of using hash table)?
Maybe there are some features that should be changed?
-
Aside from the table being way too small, I don't see a problem. For a maximally efficient hash table, a good rule of thumb is to keep it 1/2 to 2/3 full at any given time. So set the size to a suitable prime number (for correctness in as many possible cases) such as 19, and you can fit the whole of array in the table:
Code:
const int capacity=19;
...
int main()
{
...
int size = 10;
build_hash_table();
for(int i=0;i<size;i++)
Insert(array[i]);
...