Earlier I posted a program as a solution to a K&R exercise that reads a file/text stream and prints a histogram of how often each word length occurs. I think I've worked out nearly all the bugs and this is what I now have:
Code:
/* Corresponding K&R section: 1.6 */
/* Prints vertical histogram of lengths of words in input*/

#include <stdio.h>

/* NOTE: given enough horizontal space, can scale to any two-digit maximum
   by modifying this alone... gets messy with three digits, but I could make
   the numbers on the x-axis display vertically down to make it infinitely
   scalable. */
#define MIN_WORD_LENGTH 1
#define MAX_WORD_LENGTH 20

int main(void)
{
    //Holds characters
    int c = 0;
    int lastchar = 0;
    //current number of contiguous non-whitespace chars
    int currentLength = 0;
    //holds number of occurrences of each length
    int wordLengthFrequencies[MAX_WORD_LENGTH] = {0};
    //highest number of occurrences encountered
    int maxFrequency = 0;

    /* ---------------- Collect length data --------------- */
    while((c = getchar()) != EOF)
    {
        //Are we currently inside a word?
        if(currentLength >= MIN_WORD_LENGTH)
        {
            //Are we encountering whitespace?
            if(c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\v')
            {
                /* We've reached the end of a word. Update array */
                wordLengthFrequencies[currentLength - 1]++;
                currentLength = 0; //Reset
            }
            /* No whitespace, so we're still inside a word.
               Update currentLength if it hasn't maxed out*/
            else if(currentLength < MAX_WORD_LENGTH)
                currentLength++;
        }
        /* Have we just encountered the start of a word? */
        else if(c != ' ' && c != '\t' && c != '\n' && c != '\r' && c != '\v')
            currentLength = 1;

        lastchar = c;
    }

    /* Handle case where file does not end on newline */
    if(lastchar != '\n' && currentLength != 0)
        wordLengthFrequencies[currentLength - 1]++;

    /* --------------- Print Histogram ------------- */
    printf("\n\n");
    int i = 0;

    /* Determine maximum frequency */
    for(i = MIN_WORD_LENGTH; i <= MAX_WORD_LENGTH; i++)
    {
        if(wordLengthFrequencies[i - 1] > maxFrequency)
            maxFrequency = wordLengthFrequencies[i - 1];
    }

    /* Start printing the graph starting from the maximum frequency */
    for(c = maxFrequency; c >= 1; c--)
    {
        /* Make sure graph will still be aligned even for 7-digit frequencies
           i.e. if we are operating on a large file */
        printf("%7d | ", c);
        for(i = 0; i < MAX_WORD_LENGTH; i++)
        {
            if(wordLengthFrequencies[i] >= c)
                printf("*  "); //fill in where appropriate
            else
                printf("   ");
        }
        printf("\n");
    }

    /* print the x-axis and legend */
    putchar('\t');
    for(i = MIN_WORD_LENGTH; i <= MAX_WORD_LENGTH; i++)
        printf("---");
    printf("\n\t");
    for(i = MIN_WORD_LENGTH; i <= MAX_WORD_LENGTH; i++)
        printf("%3d", i);
    putchar('+');
    printf("\n\nx-axis: word length\ny-axis: # of occurrences\n\n");

    return 0;
}

I noticed that my program failed to count the last word in the file if the file did not end on a newline and had no trailing whitespace after the last word. So I added this particular snippet to take care of that:

Code:
/* Handle case where file does not end on newline */
if(lastchar != '\n' && currentLength != 0)
    wordLengthFrequencies[currentLength - 1]++;

By my logic, if the last character read before EOF was not '\n', the file did not end on a newline. But this does not necessarily mean I should just go ahead and increment the array element corresponding to what's left in currentLength. It may very well be 0, because trailing whitespace after the last word would have reset currentLength when it was processed. What's worse, incrementing in that case would mean writing to index -1, which is out of bounds. Whatever the trailing whitespace is, currentLength will be 0 after it, so I check that currentLength is nonzero before incrementing. With that in place, I'm fairly certain my program is solid.
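To test that reasoning in isolation, here is a minimal sketch of the counting step pulled out into a function that works on a string instead of stdin. The function name count_word_lengths and its signature are hypothetical, made up for this illustration, and it swaps my hand-written whitespace comparisons for isspace() from <ctype.h>, which also treats '\f' as whitespace. It suggests the currentLength check may be enough on its own: currentLength can only be nonzero after the loop if the last character was not whitespace, so the lastchar test looks redundant.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_WORD_LENGTH 20

/* Hypothetical helper (not part of the program above): tallies the lengths
   of the words in a NUL-terminated string. freq must have MAX_WORD_LENGTH
   elements; words longer than that are counted in the last bucket. */
void count_word_lengths(const char *text, int freq[MAX_WORD_LENGTH])
{
    int currentLength = 0;  /* contiguous non-whitespace chars seen so far */
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        if (isspace((unsigned char)text[i])) {
            /* Whitespace ends a word only if we were inside one */
            if (currentLength > 0)
                freq[currentLength - 1]++;
            currentLength = 0;
        } else if (currentLength < MAX_WORD_LENGTH) {
            currentLength++;
        }
    }

    /* Input ended in the middle of a word: count it. A nonzero
       currentLength already implies the last char was not whitespace,
       so no separate lastchar check is needed here. */
    if (currentLength > 0)
        freq[currentLength - 1]++;
}
```

Calling it on "one three" (no trailing newline) and on "hi there" followed by spaces and a newline should give the same kind of tallies either way, and an empty string should leave the array untouched.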
I might be being anal about this for an exercise out of a programming book, but I guess I see little point in trying to learn C with exercises if I don't make damned sure my solutions are airtight, given how easy it is to proverbially "shoot myself in the foot." I was wondering if anyone else could see any flaws in my logic above or if there's something I overlooked.






