I would use a binary search tree, it's already sorted by definition and a snap to check if a new word already exists :-)
Code:
tree *add(tree *root, char *word)
{
int cmp;
if (root == 0) /* Doesn't exist, add it */
{
root = malloc(sizeof *root);
root->word = malloc(strlen(word)+1);
/* Error checking not implemented for simplicity */
root->frequency = 1;
root->left = root->right = 0;
strcpy(root->word,word);
}
else if((cmp = strcmp(word, root->word)) < 0)
{
root->left = add(root->left, word);
}
else if(cmp > 0)
{
root->right = add(root->right, word);
}
else /* Match */
{
root->frequency++;
}
return root;
}
With word frequencies you also don't have to worry about tree balancing unless the input file is a dictionary or something equally sorted, but that's not likely since you're counting occurances. :-)