On indexed searching:
This is a way to speed up the search for a matching word. You may not want or need it now, but it's good to know it's possible.
I call this an indexed search and it is faster than the normal binary search. It can be used as an "extra" with the binary search, to make it smarter and faster, or it can be used "full strength" with no binary search portion. The version I'll describe here, is the "extra" one.
Let's say you have 100,000 words, and you're able to put them all into one array. (This works even better with a file, but we'll use an array for the example.)
Normally a binary search would begin with midpoint (mid) = (0 + 100,000) / 2. That leaves us to start at index 50,000, and we may need as many as 16 more loops to find out if we have the word we're searching for or not. That's 16 possible string comparisons, which are much slower than a simple integer comparison.
Index searching to the rescue!
You'll have a small array of structs with two struct members: "low" and "high", both integers. Call the struct "go". The small array is made up by your program, when it first starts. As the words are loading into the word array, it counts how many words there are, and records the word number, every time the first letter of the word list, changes.
Now your little go array, has 'a' words starting at 0 index, and stopping at index 4558 (whatever).
When you go to search for "apple", your binary search won't start at 'k', because the index shows that go.low == 0, and that go.high == 4558. Now your binary search will be limited to indices 0 to 4558. Your search will begin then, at 2279.
With just a small number of comparing with one letter, you've removed over 95,000 words from the search space. With luck, and tweaking, you can set it up so you are fitting what you need into fast cache memory, and not causing page or cache memory flushing.
This is most impressive when the cost (in time), of comparisons is higher, like a slow disk or network access. It can also be optimized more, if you need it.
Note: you can quickly find the right index for go, by using the ASCII values of the letter. Say the first letter of the word is 'A', consider that when 'A' is lower cased, then:
go[firstLetter - 'a'] == index 0 (which is the index to that first letter, in the go array)
go['z' - 'a'] == index 26 (the index for "Zanzibar").