I need to create a hashing function in which I use a name (c string) as the key. Is there any way to do this without treating it as an int?
I need to create a hashing function in which I use a name (c string) as the key. Is there any way to do this without treating it as an int?
basically, i need to read a file that contains records. i want to use the NAME as my hashing key. however, i am new to hashing, and i am not sure if i can create a hash table with a string or if i first need to convert that string in to an int, and then perform my hash function? like i said, i am working with a binary file and this is new to me. please help out if you can?
>however, i am new to hashing, and i am not sure if i can create a hash table with a string or if i first need to convert that string in to an int
It's a good idea to convert the string (hash) to an int so that you can use that value for direct access to the table. That's basically what a hash table is:
Here's a good general hash function that converts a string to unsigned long:Code:int index = hash ( string ); index %= TABLE_SIZE; // Do something with table[index]
My appologies if the algorithm is off, I kind of wrote it from memory.Code:unsigned long elf ( const char *key ) { const char *it = key; unsigned long ret = 0, i; while ( *it != '\0' ) { ret = ( ret << 4 ) + *it; i = ret & 0xf0000000; if ( i != 0 ) ret ^= i >> 24; ret &= ~i; it++; } return ret; }
-Prelude
My best code is written with the delete key.
you gave me two sections of code...THANKS!
however, i am unsure where i put each different segment?
also, you stated this:
int index = hash ( string );
where you have string, should i put the name of variable that is a string or just keep it as string?
thanks, i just need a little more explanation.
>thanks, i just need a little more explanation.
I can do better, I'll give you a simple implementation to pore over:
-PreludeCode:#include <iostream> #include <cstring> // Hash using ELF format unsigned long elf ( const char *key ) { const char *it = key; unsigned long ret = 0, i; while ( *it != '\0' ) { ret = ( ret << 4 ) + *it; i = ret & 0xf0000000; if ( i != 0 ) ret ^= i >> 24; ret &= ~i; it++; } return ret; } int main() { char table[13][20]; // Small table as an example for ( int i = 0; i < 13; i++ ) table[i][0] = '\0'; // Initialize table for ( int i = 0; i < 5; i++ ) { char name[20]; std::cout<<"Enter a name: "; std::cin.getline ( name, 20 ); unsigned long index = elf ( name ) % 13; // First hash // If the index already has an item, rehash to the next index while ( table[index][0] != '\0' ) index = elf ( name ) % 13 + 1; strcpy ( table[index], name ); } for ( int i = 0; i < 13; i++ ) std::cout<< table[i] <<std::endl; }
My best code is written with the delete key.
awesome, you guys are helping a lot here!
BUT, you know me...I still have some trouble.
the last chunk of code is close to what I think I need. However, the table I need has to have exactly 3479 records. How would I change the code given to make it work? I tried, but I'm having some trouble with the strcpy ALSO, I am inputing my information from a binary file. so how would I account for that as well? something like myInput >> name >> address >> etc;
thanks guys and gals!
ALSO, I noticed that I seem to be getting collissions within my table with the code I have. How do I go about implementing linear probing to avoid collisions? Would it be placed within the code above or outside of it?
>However, the table I need has to have exactly 3479 records.
Just change the size of the first dimension for the table.
>but I'm having some trouble with the strcpy
Could you be more specific?
>I am inputing my information from a binary file. so how would I account for that as well?
It makes your job considerably easier, just read an appropriately sized block into the input array and then insert exactly as shown above.
>How do I go about implementing linear probing to avoid collisions?
This is your lucky day, the code I gave you implements linear probing for collision resolution.
-Prelude
My best code is written with the delete key.
Prelude,
alright, so it is my luck day! Nice!
you said that I just need to change my first dimension--however, do i need the second dimension at all???
so i need something like this:
char table[3479][13]; ??
secondly, with binary files you said:
>It makes your job considerably easier, just read an appropriately sized block into the input array and then insert exactly as shown above.
do i set up a while(!myInput.eof())
{
myInput >> name >> etc;
}
i'm not sure what you mean by block size?
forgive me, i'm new...but i learn quick!
>you said that I just need to change my first dimension--however, do i need the second dimension at all???
Yes, the first dimension is the number of elements in the table, the second dimension is the number of characters in an element. Assuming that the table is an array of strings.
>i'm not sure what you mean by block size?
Reading from a binary file is not much different from reading a text file. Here is the input program:
And the program that wrote the binary file:Code:#include <fstream> #include <iostream> int main() { std::ifstream inb ( "test.txt", std::ios::binary ); char block[13]; while ( inb.read ( block, sizeof block ) ) std::cout<< block <<std::endl; inb.close(); }
Piece of cake, no harder than text files. :)Code:#include <fstream> #include <iostream> int main() { std::ofstream outb ( "test.txt", std::ios::binary ); char block[13]; while ( std::cin.getline ( block, sizeof block ) ) { outb.write ( block, sizeof block ); for ( int i = 0; i < sizeof block; i++ ) block[i] = '\0'; // Clear the block } outb.close(); }
-Prelude
My best code is written with the delete key.
once again, results are taking place...just a few questions:
can I do both of those in the same program with no problems?
where should I place this code you just gave me with respect to the code from before dealing with the hashing? should it be inserted where you had the user inputing names?
with my binary file, i need to read 4 values (a struct), and use one value as the hash key. would i just have 4 read statements in a row?
I appreciate the help Prelude!
Oh my, so I thought I was on the right track--then I spoke with my peer advisor and he's what he had to say:
your program is not allowed to create huge arrays to hold the records and your program is not allowed to hold more than 2 or 3 baby names records at any one time. So, making the table array is the wrong thing to do. You must use hash files, not hash table arrays. The concept is the same, but the storage must be on disk. You do not initialize to zero; the disk must hold baby struct images. You must create an initialization baby record to write to the file; that may contain 0 in some fields.
Any help now? Was I going in the complete wrong direction with the code above? This is horrible.
>Any help now?
I think you should consult www.google.com with the search string "hash files". I could spell it out here, but I know there are plenty of web sites that do this already, thus saving me all of that typing.
-Prelude
My best code is written with the delete key.
Hey Prelude...
I'll see what I can figure out, but I did have one question:
how do I initialize a hash file??? with a table, it was easy. Also, do I still need to use an array of any sort or not with hash files?