I have an assignment to complete. It says I have to read a file that contains millions and millions of strings.
I have to read the file and build a structure to hold the strings. This system must be able to answer the question "is this new string present?"
I am also expected to break the list down into "buckets" of strings, so that the 'string to match' can choose the correct bucket to search in (quickly). Each bucket should contain no more than roughly total/hashMask strings (i.e. 3,000,000 / 0xFFF == 732 objects per bucket).
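To make the bucket idea concrete, here is a minimal sketch of how a string could be mapped to a bucket number. The names hashString and bucketIndex are made up for illustration, and the djb2-style hash is only an assumption, since the assignment does not say which hash function to use. It also assumes hashMask has the form 2^n - 1 (such as 0xFFF), so that a bitwise AND gives an index between 0 and hashMask.

```c
#include <stddef.h>
#include <stdint.h>

/* djb2-style string hash. This is an assumed choice; the assignment
   does not name a particular hash function. */
static uint32_t hashString(const char *s)
{
    uint32_t h = 5381u;
    while (*s != '\0') {
        h = h * 33u + (unsigned char)*s;
        s++;
    }
    return h;
}

/* Map a string to a bucket index in the range 0..hashMask.
   Assumes hashMask is of the form 2^n - 1, e.g. 0xFFF. */
static size_t bucketIndex(const char *s, uint32_t hashMask)
{
    return (size_t)(hashString(s) & hashMask);
}
```

The same string always gets the same index, so a lookup only has to search one bucket instead of the whole list.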
So far I have created a hash table structure and a function to read the file, but beyond that I am sort of clueless.
Below is my sample code.
Code:
#include <stdio.h>
#include <stdlib.h>

#define MAX_NAME 100

/* one node of a bucket's linked list */
typedef struct hashTable
{
    char key[MAX_NAME];
    struct hashTable *next;
} HashList;
/*
this function will read the file (assume one string per line)
and create the list of lists (list of buckets), adding one object per string.
*/
HashList *loadDataSet(char *filename, int hashMask)
{
    // buffer for one line read from the file
    char readString[MAX_NAME];
    FILE *fp;

    if ((fp = fopen(filename, "r")) == NULL)
    {
        printf("failed to open the file\n");
        exit(EXIT_FAILURE);
    }
    while (fgets(readString, sizeof readString, fp) != NULL)
    {
        // need to break the list down into "buckets" of strings so the 'string to match'
        // is able to choose the correct bucket to search in (quickly)
        // and that bucket should contain no more than total/hashMask strings
        // or so (ie 3,000,000 / 0xFFF == 732 objects per bucket).
    }
    fclose(fp);
    return NULL; // placeholder: the list of buckets is not built yet
}
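
For reference, one possible way to fill in the missing part is sketched below: an array of hashMask + 1 linked lists, where each string is prepended to the list chosen by its hash. All names here (StringTable, tableCreate, tableInsert, tableContains, tableLoad, tableFree, hashKey) are hypothetical and not part of the assignment, and the djb2-style hash is an assumed choice. The sketch also assumes hashMask is of the form 2^n - 1 and that every line is shorter than MAX_NAME characters.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 100

typedef struct Node {
    char key[MAX_NAME];
    struct Node *next;
} Node;

typedef struct {
    Node **buckets;     /* hashMask + 1 list heads */
    unsigned hashMask;  /* assumed to be 2^n - 1, e.g. 0xFFF */
} StringTable;

/* djb2-style hash (an assumption, not given by the assignment) */
static unsigned hashKey(const char *s)
{
    unsigned h = 5381u;
    while (*s != '\0')
        h = h * 33u + (unsigned char)*s++;
    return h;
}

/* Allocates an empty table; returns NULL on allocation failure. */
StringTable *tableCreate(unsigned hashMask)
{
    StringTable *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->buckets = calloc((size_t)hashMask + 1, sizeof *t->buckets);
    if (t->buckets == NULL) {
        free(t);
        return NULL;
    }
    t->hashMask = hashMask;
    return t;
}

/* Releases every node, the bucket array, and the table itself. */
void tableFree(StringTable *t)
{
    size_t i;
    if (t == NULL)
        return;
    for (i = 0; i <= t->hashMask; i++) {
        Node *n = t->buckets[i];
        while (n != NULL) {
            Node *next = n->next;
            free(n);
            n = next;
        }
    }
    free(t->buckets);
    free(t);
}

/* Prepends key to its bucket. Returns 0 on success, -1 on failure. */
int tableInsert(StringTable *t, const char *key)
{
    unsigned i = hashKey(key) & t->hashMask;
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    strncpy(n->key, key, MAX_NAME - 1);
    n->key[MAX_NAME - 1] = '\0';
    n->next = t->buckets[i];
    t->buckets[i] = n;
    return 0;
}

/* Answers "is this string present?": 1 if found, 0 otherwise. */
int tableContains(const StringTable *t, const char *key)
{
    const Node *n = t->buckets[hashKey(key) & t->hashMask];
    for (; n != NULL; n = n->next)
        if (strcmp(n->key, key) == 0)
            return 1;
    return 0;
}

/* Reads one string per line into a new table; returns NULL on failure. */
StringTable *tableLoad(const char *filename, unsigned hashMask)
{
    char line[MAX_NAME];
    StringTable *t;
    FILE *fp = fopen(filename, "r");
    if (fp == NULL)
        return NULL;
    t = tableCreate(hashMask);
    while (t != NULL && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip the line ending */
        if (tableInsert(t, line) != 0) {
            tableFree(t);
            t = NULL;
        }
    }
    fclose(fp);
    return t;
}
```

A caller would build the table once with tableLoad(filename, 0xFFF), ask tableContains(table, newString) for each query, and call tableFree(table) at the end. Only the one bucket selected by the hash is searched, which is where the "about 732 strings per bucket" figure comes from, provided the hash spreads the strings evenly.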