Hi everyone, this is my first post here.
I learned some Unix-based C back in college, worked as an IVR developer in clinical trials for 3 years, and have been working for the last 2 years as a Business Analyst. I'm picking C back up again to brush up on my technical skills and eventually get into some hobbyist embedded systems programming.
I'm currently studying K&R 2e and am working on Exercise 1-13:
"Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging."
I went with the vertical histogram and am looking for constructive feedback on the code below. The hardest part of my implementation was orienting the x-axis numbers vertically with correct spacing and place value, but I think I've finally got it.
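Stated in isolation, the place-value idea looks roughly like the sketch below. The helper names countDigits and digitFromLeft are made up for illustration and do not appear in my program, which inlines similar division and modulo arithmetic in its loops. The sketch assumes positive numbers only.

```c
/* Hypothetical helper: number of decimal digits in a positive integer. */
int countDigits(int number)
{
    int digits = 0;

    while (number != 0)
    {
        number /= 10;   /* integer division drops the last digit */
        digits++;
    }
    return digits;
}

/* Hypothetical helper: returns the digit of `number` found `position`
   places from the left (1 = most significant), or -1 when the number
   has fewer than `position` digits and the label cell should stay blank. */
int digitFromLeft(int number, int position)
{
    int divisions = countDigits(number) - position;

    if (divisions < 0)
    {
        return -1;
    }
    while (divisions > 0)
    {
        number /= 10;   /* discard digits to the right of the one wanted */
        divisions--;
    }
    return number % 10;
}
```

With helpers like these, each row of x-axis labels would call digitFromLeft(length, row) for every word length and print a blank whenever the result is -1.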
Note: I tried to use only features of the language that appear in Chapter 1. So even though I'm aware of math.h's pow function and of stdlib.h's exit() for error handling, I purposefully left them out to focus on stdio.h and get some practice with complex for loops.
Code:
#include <stdio.h>
#define MIN_WORD_LENGTH 1 /* No error checking built in for negative numbers
(have not learned stdlib yet to use exit() ...) */
#define MAX_WORD_LENGTH 25 /* Words this long or longer share the last bucket. The y-axis is padded to 4 digits, so a frequency above 9999 will misalign it */
#define OUTSIDE_WORD 0
#define INSIDE_WORD 1
int main(void)
{
    /* Initialization */
    int c = 0; /* int, not char: getchar() returns an int so that EOF can be told apart from every valid character */
    int state = OUTSIDE_WORD,
        i = 0,
        j = 0,
        wordLength = 0,
        maxWordLengthFrequency = 0,
        xAxisWidth = (MAX_WORD_LENGTH - MIN_WORD_LENGTH) + 1,
        xAxisDepth = 0,
        xAxisLayer = 0,               /* Horizontal layer of the x-axis currently being filled */
        xAxisLayerMaxDivisions = 0,   /* The maximum number of division operations that can be permitted
                                         for this layer of the x-axis */
        xAxisLayerMinPlaceValue = 0,  /* The minimum word length value for which a number can possibly
                                         be displayed on the current layer of the x-axis
                                         i.e. print a number here, or blank space? */
        xAxisCurrentNumber = 0,       /* The number for which a digit is currently being printed on the x-axis */
        xAxisDigit = 0;               /* Target of integer division and modulo operations to extract digits from
                                         the numbers being printed on the x-axis */
    int wordLengthFrequencies[MAX_WORD_LENGTH] = { 0 };
    /* Process Input and Record Length Frequencies */
    while ((c = getchar()) != EOF)
    {
        //if (c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\b')
        /* A "word" will be defined as a contiguous set of alphanumeric characters
           i.e. Words will terminate if we hit a non-alphanumeric character
           (punctuation / special characters will not count towards word lengths) */
        if ( ! ((c >= 'A' && c <= 'Z')
             || (c >= 'a' && c <= 'z')
             || (c >= '0' && c <= '9')))
        {
            if (state == INSIDE_WORD)
            {
                //Leaving a word, log its length and reset word length counter
                wordLengthFrequencies[wordLength - 1]++;
                wordLength = 0;
                state = OUTSIDE_WORD;
            }
        }
        else //Entering or continuing a word
        {
            //All words at or past the max length are counted as one category maxLength+
            if (wordLength < MAX_WORD_LENGTH)
            {
                wordLength++;
            }
            state = INSIDE_WORD;
        }
    }
    /* Input can end mid-word (no trailing newline), so log the final word as well */
    if (state == INSIDE_WORD)
    {
        wordLengthFrequencies[wordLength - 1]++;
    }
    /* Determine Height of Histogram
       (i.e. Max # of Occurrences) */
    for (i = MIN_WORD_LENGTH; i <= MAX_WORD_LENGTH; i++)
    {
        if (wordLengthFrequencies[i - 1] > maxWordLengthFrequency)
        {
            maxWordLengthFrequency = wordLengthFrequencies[i - 1];
        }
    }
    /* Find Necessary Depth of X-Axis Numbers
       (integer division truncates each place value until we
       have the number of digits in the max word length) */
    i = MAX_WORD_LENGTH;
    while (i != 0)
    {
        i /= 10;
        xAxisDepth++;
    }
    /* Print Histogram Bars and Y-Axis */
    printf("\nWord Length Frequency Histogram:\n\n"); //Histogram title
    for (i = maxWordLengthFrequency; i > 0; i--)
    {
        printf("%4d | ", i); //4-digit padding: a frequency above 9999 will misalign the y-axis
        for (j = MIN_WORD_LENGTH; j <= MAX_WORD_LENGTH; j++)
        {
            if (wordLengthFrequencies[j - 1] >= i)
            {
                printf("x "); //histogram bar element
            }
            else
            {
                printf(" "); //blank space
            }
        }
        printf("\n");
    }
    /* Print Histogram X-Axis Bar */
    printf(" --");
    for (i = MIN_WORD_LENGTH; i <= MAX_WORD_LENGTH; i++)
    {
        printf("---");
    }
    printf("\n");
    /* Print Histogram X-Axis Numbers
       (Most Significant Digit First) */
    //Iterate through each vertical layer of the X-axis numbers
    for (xAxisLayer = 1, xAxisLayerMaxDivisions = xAxisDepth;
         xAxisLayerMaxDivisions > 0;
         xAxisLayer++, xAxisLayerMaxDivisions--)
    {
        printf(" ");
        /* Place value will vary for digits occupying the same row
           if the larger numbers are of different magnitudes */
        for (xAxisLayerMinPlaceValue = 1, i = 1;
             i < xAxisLayer;
             xAxisLayerMinPlaceValue *= 10, i++);
        /* Iterate through each length measurement and print the digit
           appropriate for the x-axis layer */
        for (xAxisCurrentNumber = MIN_WORD_LENGTH;
             xAxisCurrentNumber <= MAX_WORD_LENGTH;
             xAxisCurrentNumber++)
        {
            /* Only print a digit if the number is actually large enough
               to reach down to this layer of the x-axis */
            if (xAxisCurrentNumber >= xAxisLayerMinPlaceValue)
            {
                xAxisDigit = xAxisCurrentNumber;
                for (i = xAxisLayerMaxDivisions;
                     i > 1 && xAxisDigit >= 10 && xAxisDigit >= xAxisLayerMinPlaceValue * 10;
                     i--)
                {
                    xAxisDigit /= 10; //Truncate out the unneeded digits by exploiting integer/integer division
                }
                xAxisDigit %= 10; //Modulo 10 snags the least significant digit after divisions are applied
                printf(" %d", xAxisDigit);
            }
            else
            {
                printf(" ");
            }
        }
        printf("\n");
    }
    /* Print Final + on Last Value */
    printf(" ");
    for (i = 1; i <= xAxisWidth; i++)
    {
        printf(" ");
    }
    printf("+\n");
    return 0;
}
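The input-processing step can also be exercised without stdin. Below is a minimal sketch of that idea, assuming hypothetical helpers isWordCharacter and countWordLengths that scan a string instead of calling getchar(). Neither name is part of the program above, and the sketch redefines MAX_WORD_LENGTH so that it compiles on its own. It uses the same definition of a word (a run of ASCII letters and digits) and also counts a word that runs up to the very end of the text.

```c
#define MAX_WORD_LENGTH 25

/* Hypothetical helper: nonzero when c is an ASCII letter or digit. */
int isWordCharacter(int c)
{
    return (c >= 'A' && c <= 'Z')
        || (c >= 'a' && c <= 'z')
        || (c >= '0' && c <= '9');
}

/* Hypothetical helper: tallies the word lengths found in a string.
   frequencies[n - 1] counts words of length n; words of
   MAX_WORD_LENGTH or more characters share the last slot.
   The caller is assumed to have zeroed the array beforehand. */
void countWordLengths(const char text[], int frequencies[])
{
    int i;
    int wordLength = 0;

    for (i = 0; text[i] != '\0'; i++)
    {
        if (isWordCharacter(text[i]))
        {
            if (wordLength < MAX_WORD_LENGTH)
            {
                wordLength++;
            }
        }
        else if (wordLength > 0)
        {
            /* Just left a word: log its length and reset the counter */
            frequencies[wordLength - 1]++;
            wordLength = 0;
        }
    }
    /* The text may end in the middle of a word, so log that word too */
    if (wordLength > 0)
    {
        frequencies[wordLength - 1]++;
    }
}
```

Here wordLength > 0 doubles as the "inside a word" flag, so the separate state variable is not needed in this sketch.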
The main thing I'm not crazy about in the program above is the dual initialization and increment expressions in my for loops. I think they might be dense and hard to follow for someone reading the program for the first time. I tried to mitigate this with descriptive variable names and comments, but I'd be really interested to hear a different perspective on how readable and well-documented you consider this.
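For comparison, here is a rough sketch of how the densest of those loops (the one that computes the minimum place value for a layer, with two variables in its header and an empty body) might read as a small function with a single loop counter. The name minPlaceValueForLayer is invented for this sketch and does not appear in the program above.

```c
/* Hypothetical helper: smallest number that has a digit on the given
   x-axis layer (layer 1 -> 1, layer 2 -> 10, layer 3 -> 100, ...).
   Assumes layer >= 1 and that the result fits in an int. */
int minPlaceValueForLayer(int layer)
{
    int placeValue = 1;
    int i;

    /* One counter, one job: multiply by 10 once per layer below the first */
    for (i = 1; i < layer; i++)
    {
        placeValue *= 10;
    }
    return placeValue;
}
```

Inside main, the two-variable loop header could then shrink to a single assignment such as xAxisLayerMinPlaceValue = minPlaceValueForLayer(xAxisLayer);. Functions are covered in section 1.7 of K&R, so this would still stay within Chapter 1.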