I have been trying to work out how to complete a project that reads text and reports the number of sentences, word counts, and so on, but I'm pretty sure I'm either leaving out a necessary part or not comparing the strings correctly. If someone could help me understand what I'm doing wrong and how to fix it, I would greatly appreciate it.

Here is the assignment:

This assignment requires you to use strings, loops, conditionals, and file redirection. So, the main goal of this assignment is for you to show that you have mastered the syntax of strings, string-related functions, and the general application of loops and conditionals, and to show the techniques used to perform some common text-analysis operations. Write a C program that will read a series of lines of input text (each input text line will not exceed 80 characters in length) and determine the number of each unique "word"s found in the input. While finding the words, several statistics about the words and text are to be accumulated.
A "word" is defined for this assignment as a series of non-white-space characters with leading and trailing punctuation characters removed. You are to consider punctuation characters to be all printing characters except letters, digits, and the space. C can automatically identify punctuation characters using the ispunct() function. Note that punctuation characters inside a word (such as teacher's) are considered part of the word, and sequences of numbers (like 1234) would also be considered valid words, based on the assignment's definition of a word.
The program should initially display on the screen the problem number, the course number, your name, your email address, and the name of the csp machine on which you ran your program.
The program should determine each word found in a series of input lines, and accumulate the following characteristics about the words:
  • The number of short-length words (words with a length less than 6).
  • The number of medium-length words (words with a length greater than 5 but less than 12).
  • The number of long-length words (words with a length greater than 11).
  • The number of capitalized words.
  • The number of common words ("the", "of", "a", "is", "that", "are"), without worrying about case. Treat the words case-insensitively. Accumulate one total for all of the common words listed.
  • The number of sentences (for this assignment, a sentence is indicated by a word that has a period, exclamation point, or question mark somewhere in its trailing punctuation).
After all of the words have been determined, display a summary report of each of the characteristics described above.

Here is what I have come up with so far:

Code:
// CSE 1030 Program Three - Alexander Hollis - email - csp_03



#include <stdio.h>
#include <string.h>
#define PUNC ispunct
#define LAST strlen(str)-1


int main() {
    
    char str[80+1];
    int sentences, sWords, MWords, LWords, length, common;
    sentences=sWords=MWords=LWords=length=common=0;
    int capital=0;
    char test[80+1];
    
    printf("\nCSE 1030 Program Three - Alexander Hollis - email - csp_03\n\n");
    
    while(scanf("%s",str) != EOF) {
    
        while(ispunct(LAST))
        {
            if(str[LAST] == '?' || str[LAST] == '.' || str[LAST] == '!')
            {
                sentences++;
            }
            str[LAST] = '\0';


            while(ispunct(str[0]))
            {
                strcat(str,str+1);
            }
            length = strlen(str) +1;


            if(length < 6)
            {
                if ( length < 4 )
                {
                    if(str == "the" || str == "The" || str == "a" || str == "A" || str == "of" || str == "Of" || str == "is" || str == "Is" || str== "that" || str=="That" || str == "are" || str=="Are")
                    {
                        common++;
                        sWords++;
                    }
                    else sWords++;
                }
            }
            if(length >=6 && length < 12 ) MWords++;


            if(length >= 12) LWords++;
            
            strcpy(test,str);


            test[0] = toupper(test[0]);
            
            if(strcmp(test,str)==0) capital++;
        }
    }


    printf("Summary of Results: \n\n");
    printf("%d short length words\n",sWords);
    printf("%d medium length words\n",MWords);
    printf("%d long length words\n",LWords);
    printf("%d capitalized words\n",capital);
    printf("%d common words\n",common);
    printf("%d sentences\n",sentences);
    
    


    return 0;


}

Here is the data.dat file that I redirect into the program on Unix as the input for scanf:

The teacher announced: "This is a simple example
showing several sentences of text." The students
where instantaneously amazed!


With surprise, the teacher's dog stood up and barked
"Ruff! Ruff!! Ruff!!!" The students were surprised
and burst into applause, much to the teacher's chagrin.


That is all, folks! [86 and out].

(The spaces are intentional for the input)


Here's what the output should look like:

Sample output:
%cat data.dat
The teacher announced: "This is a simple example
showing several sentences of text." The students
where instantaneously amazed!


With surprise, the teacher's dog stood up and barked
"Ruff! Ruff!! Ruff!!!" The students were surprised
and burst into applause, much to the teacher's chagrin.


That is all, folks! [86 and out].
%./a.out < data.dat

CSE 1030 Program Three - your name - your email address - csp_machine
Summary of Results:
32 short length words
17 medium length words
1 long length words
9 capitalized words
10 common words
8 sentences

Any feedback is greatly appreciated.

- Alex