# Thread: counting number of words in a string that has letters, spaces, and other punctuation

1. ## counting number of words in a string that has letters, spaces, and other punctuation

I am a beginner at C. I have been assigned a project in which the user enters text from the keyboard and the program counts the number of words.

Before, I have written programs that count the number of words by counting spaces; however, the text in this program includes both spaces and punctuation.

Basically, I am having trouble (I've spent the last 6 hours) trying to figure out how to write a loop: one that would notice the first letter or digit and, once it hits an element that is not a letter or digit, would increment a counting variable by 1.

So the loop would run and do something when it hits a letter or number, and stop at the first element that is not a letter or number.

This would continue until the end of the string.

Any insight would be greatly appreciated!!

2. A link to another thread discussing an issue like this would be helpful as well, to save anyone familiar with it any unnecessary time.

3. Originally Posted by strugglesWithC
I am a beginner at C. I have been assigned a project in which the user enters text from the keyboard and the program counts the number of words.

Before, I have written programs that count the number of words by counting spaces; however, the text in this program includes both spaces and punctuation.

Basically, I am having trouble (I've spent the last 6 hours) trying to figure out how to write a loop: one that would notice the first letter or digit and, once it hits an element that is not a letter or digit, would increment a counting variable by 1.

So the loop would run and do something when it hits a letter or number, and stop at the first element that is not a letter or number.

This would continue until the end of the string.

Any insight would be greatly appreciated!!
The first LETTER of the text, or the first letter after a space or punctuation mark, is where your program should increment the word count.

So basically, you have two states to contend with as you "walk" through the string, char by char:

1) your program is OUTSIDE a word. The word count is not incremented here,

or

2) your program is INSIDE a word. The word-counting variable is incremented first thing, on the word's first character, and not again until the program has left that word.

Aside from that algorithmic change, the only difference from your previous program is that you will add the punctuation marks to the same if() statement that you used before for spaces.

Code:
`if(ch==' ' || ch==',' || ch=='.' || ch=='\n')`
type of thing. Add in all the punctuation marks that you need, in the same way.

Use that kind of if statement, inside your while loop, and you're off to a great start.
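
To make the two states concrete, here is a minimal sketch of that loop, not a finished solution to the assignment. The names `is_delimiter`, `count_words_delims`, `inside_word`, and `word_count` are made up for illustration, and the delimiter list is only the four characters from the if() above, so extend it as needed.

```c
#include <stddef.h>

/* Returns 1 if ch is one of the delimiters from the if() statement above.
   Hypothetical helper: add more punctuation marks here as you need them. */
static int is_delimiter(char ch)
{
    return ch == ' ' || ch == ',' || ch == '.' || ch == '\n';
}

/* Walks a NUL-terminated string char by char and counts words
   by tracking whether we are currently OUTSIDE or INSIDE a word. */
int count_words_delims(const char *text)
{
    int inside_word = 0;   /* 0 = OUTSIDE a word, 1 = INSIDE a word */
    int word_count = 0;

    for (size_t i = 0; text[i] != '\0'; i++) {
        if (is_delimiter(text[i])) {
            inside_word = 0;          /* delimiter: we are outside a word */
        } else if (!inside_word) {
            inside_word = 1;          /* first char of a new word */
            word_count++;             /* count it exactly once */
        }
        /* else: still inside the same word, nothing to count */
    }
    return word_count;
}
```

Called as `count_words_delims("Hello, world.\n")`, this counts two words. Runs of several delimiters in a row do not matter, because the count only changes on the switch from OUTSIDE to INSIDE.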

4. OK, what if the characters that are not counted include more than just ".", ",", " ", and "\0"? Say, for instance, there are ?, :, ;, ', or any other character that is not a number or letter. Would it be appropriate to use isalnum? If so, I cannot figure out how to write a loop with isalnum that works for me.

Basically, for this program I need a loop that only counts from when it sees a letter or number to when it does not see one. Additionally, there could be multiple non-letter, non-number characters in a row.

Would this change anything?

5. If it helps, this is the wording of the project:

Write a program that processes a sequence of lines of text entered from the keyboard. The program should report a count of the number of words in those lines. Assume that each line is 80 characters or less, and ends with a new line (enter); and that words are delimited by one or more space characters. There may be punctuation (periods and commas, etc) which, other than being treated as word delimiters, should be ignored. There will be no hyphenated words crossing end of lines. Digits should be treated as letters, but you can assume that no other special characters will be part of the text strings. Your program should prompt the user to enter another line of text, or to stop. Good use of functions will help in solving this problem.

6. Originally Posted by strugglesWithC
OK, what if the characters that are not counted include more than just ".", ",", " ", and "\0"? Say, for instance, there are ?, :, ;, ', or any other character that is not a number or letter. Would it be appropriate to use isalnum? If so, I cannot figure out how to write a loop with isalnum that works for me.

Basically, for this program I need a loop that only counts from when it sees a letter or number to when it does not see one. Additionally, there could be multiple non-letter, non-number characters in a row.

Would this change anything?
If you can use any of the isSOMETHING() functions, then go ahead. If you need to use more than one if() statement, then go ahead:

Code:
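```c
/* Fragment only: ch is the current character, and inside_word and
   word_count are illustrative names. isalnum() needs #include <ctype.h>. */
if(isalnum((unsigned char)ch)) {    /* first test: letter or digit? */
    if(!inside_word) {              /* second test: just entering a word? */
        word_count++;
    }
}
```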
How many spaces or non-letters there are doesn't matter a bit. Remember this: there are only TWO states your program can be in:

1) Outside a word - in which case nothing gets counted, and you don't care how many chars are in this group, or how many are adjacent to each other.

or

2) Inside a word - where the first thing that happens is that the word count goes up by one. The program stays inside a word until a space, punctuation mark, or newline is encountered.

The EOF (end of file) does not change the word count, but it does end your program.
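
As a rough sketch of the same two states using isalnum(), under some assumptions: the function names `count_words_alnum` and `count_words_in_stream` are invented for this example, the 82-byte buffer relies on the assignment's promise of lines of 80 characters or less, and the prompt to continue or stop is left out. Every character that is not a letter or digit is treated as a delimiter, so an apostrophe or hyphen splits a word in two.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts words in a NUL-terminated string. A word is a run of
   letters and/or digits; everything else is a delimiter. */
int count_words_alnum(const char *text)
{
    int inside_word = 0;   /* 0 = outside a word, 1 = inside a word */
    int word_count = 0;

    for (; *text != '\0'; text++) {
        /* Cast to unsigned char: isalnum() is undefined for negative values. */
        if (isalnum((unsigned char)*text)) {
            if (!inside_word) {
                inside_word = 1;   /* entering a word: count it once */
                word_count++;
            }
        } else {
            inside_word = 0;       /* any other char puts us outside */
        }
    }
    return word_count;
}

/* Reads lines until fgets() returns NULL (end of file or a read error)
   and returns the total word count. Reaching EOF ends the loop without
   changing the count. */
long count_words_in_stream(FILE *in)
{
    char line[82];   /* assumed: 80 chars + '\n' + '\0', per the assignment */
    long total = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        total += count_words_alnum(line);
    }
    return total;
}
```

A caller could use `count_words_in_stream(stdin)` and print the result. If a line were longer than the buffer, fgets() would hand it over in pieces and a word cut at the boundary would be counted twice, which is why the 80-character assumption matters here.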

A samurai sword has two sides to it. So does this program's logic. A samurai cuts his opponent down by the physical manifestation of his intention - the sword is merely an extension of his self. In the same way, your intention to solve this problem, by using this blade of logic, will cut this problem straight through.

Breathe in; breathe out; be centered and feel your Ki welling up inside you. Now cut this problem with the sword of this logic.