problem on counting the total no of words in each paragraph


  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    5

    problem on counting the total no of words in each paragraph

    I can successfully count the words in the first paragraph of the input text file. However, I am stuck on counting the words in the other paragraphs.

    Code:
              ch=fgetc(infp);       
              while(ch!='\n')
              {
                if(ch==' ')
                count_pw++;
                ch=fgetc(infp);
              }
              printf("Number of words in paragraph 1: %d\n",count_pw+1);
    
              ch=fgetc(infp);    //this part is to count the total no of paragraphs
              while(ch!=EOF)
              {
               if(ch=='\n')
               count_p++;
               ch=fgetc(infp);
              }
              
              ch=fgetc(infp);   //this part really makes me feel puzzled
              while(ch!=EOF)
              {
                for(i=1;i<count_p+1;i++)
                for(j=0;j!='\n';j++)
                 if(ch==' ')
                 count4++;
                 ch=fgetc(infp);
              }
                printf("Number of words in paragraph %d: %d\n",i+2,count_pw+1);
    Moreover, I want to ask whether there is a way to detect the first character of a word (I count words by counting spaces), so that the total will not include the spaces at the start of each paragraph.

  2. #2
    Registered User
    Join Date
    May 2006
    Posts
    903
    I'm a C++ programmer but I think strtok() is what you want.

  3. #3
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Katy, Texas
    Posts
    2,309
    Your program counts the number of blanks, not the number of words. There is a difference. If you have 3 consecutive blanks, you will get a count of 3, and there might not be ANY words.

    As Desolation said, a simple method is to use strtok(), but that would require you to change your program to work with strings instead of characters.

    If you want to work with characters, then add a state flag: (not syntax checked)

    Code:
    int in_word = 0;   /* state flag - set to false initially */
    ch = fgetc(infp);
    while (ch != '\n' && ch != EOF) {   // also stop at EOF, in case the last line has no '\n'
        if (ch == ' ') {
            in_word = 0;        // turn state flag off
        } else if (!in_word) {  // if we weren't in a word...
            count_pw++;         // bump count of words - we just found one
            in_word = 1;        // turn state flag on
        }
        ch = fgetc(infp);       // get another char
    }
    Todd
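
    The same state-flag idea can be tried out on a plain string before any file handling is involved. The following is only a sketch: the function name count_words is made up for illustration, and it uses isspace() so that tabs and newlines also end a word, which goes slightly beyond the blank-only test above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count words in a NUL-terminated string.
 * A "word" here is any run of non-whitespace characters. */
int count_words(const char *s)
{
    int count = 0;
    int in_word = 0;               /* state flag: are we inside a word? */

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            in_word = 0;           /* whitespace ends the current word */
        } else if (!in_word) {
            in_word = 1;           /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

    For example, count_words("  hello   world ") gives 2: leading, trailing, and repeated blanks do not add to the count.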

  4. #4
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Katy, Texas
    Posts
    2,309
    As for your other question - ask yourself this. In the following code...

    Code:
           ch=fgetc(infp);    //this part is to count the total no of paragraphs
              while(ch!=EOF)
              {
               if(ch=='\n')
               count_p++;
               ch=fgetc(infp);
              }
              
              ch=fgetc(infp);   //this part really makes me feel puzzled
              while(ch!=EOF)
    When the first EOF is hit, what will it take to reset the EOF flag?

    Todd
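
    One possible answer, as a sketch: the standard rewind() function moves the stream back to the start and clears the end-of-file indicator, so a second loop can read the file again. The helper names count_newlines and count_twice are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Count '\n' characters from the current position to end of file. */
long count_newlines(FILE *fp)
{
    long n = 0;
    int ch;                        /* int, not char, so EOF can be represented */

    assert(fp != NULL);
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            n++;
    }
    return n;
}

/* Make two passes over the same stream.
 * Stores the first pass's count in *first and returns the second pass's count. */
long count_twice(FILE *fp, long *first)
{
    assert(first != NULL);
    *first = count_newlines(fp);   /* first pass leaves the stream at end of file */
    rewind(fp);                    /* back to the start; clears EOF and error indicators */
    return count_newlines(fp);     /* second pass sees the whole file again */
}
```

    Without the rewind() call, the second pass would return 0 immediately, which is what happens to the last loop in the original program.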

  5. #5
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,451
    Well the first step would be for you to define "what is a paragraph"?
    Say for example, one or more lines of text followed by a blank line.

    Having decided that, you can implement a function called 'readParagraph', which does exactly that, no more, no less.

    Taking the output of that function, you can then look at say producing
    - extractSentencesFromParagraph
    - extractWordsFromSentence
    - extractLettersFromWord
    Each basically being a number of calls inside the preceding function.

    Eg.
    Code:
    while ( (p=readParagraph(fp)) != NULL ) {
      printf("Paragraph=%s\n", p );
    }
    Until that works reliably on your input files, there isn't any point in worrying about the other functions.

    In parallel with that, you can then think about say
    Code:
    char par[] = "This is sentence one.  And this is sentence two.";
    while ( (s=extractSentencesFromParagraph(par)) != NULL ) {
      printf("Sentence=%s\n", s );
    }
    No messy file handling needed, you can develop and test the inner level functions without complicating things with say file access. Being entirely self-contained, you'll find the result a lot easier to post if you need to.
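
    As a rough sketch of what readParagraph could look like under the "lines of text followed by a blank line" definition: the version below is hypothetical and differs from the call shown above in that the caller supplies the buffer. It also assumes no input line is longer than 255 characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical readParagraph: copies the next paragraph (one or more
 * non-blank lines, ended by a blank line or end of file) into buf.
 * Returns buf, or NULL when there are no more paragraphs. */
char *readParagraph(FILE *fp, char *buf, size_t size)
{
    char line[256];
    size_t used = 0;

    assert(fp != NULL && buf != NULL && size > 0);
    buf[0] = '\0';

    while (fgets(line, sizeof line, fp) != NULL) {
        /* A line is blank if it contains nothing but whitespace. */
        int blank = (line[strspn(line, " \t\r\n")] == '\0');

        if (blank) {
            if (used > 0)
                break;             /* blank line after text: paragraph ends */
            continue;              /* skip blank lines before the paragraph */
        }

        size_t len = strlen(line);
        if (len > size - 1 - used)
            len = size - 1 - used; /* truncate rather than overflow buf */
        memcpy(buf + used, line, len);
        used += len;
        buf[used] = '\0';
    }
    return used > 0 ? buf : NULL;
}
```

    With this signature the driving loop becomes while (readParagraph(fp, buf, sizeof buf) != NULL), and each returned paragraph can be handed to a word-counting function.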
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
    I support http://www.ukip.org/ as the first necessary step to a free Europe.

  6. #6
    Registered User
    Join Date
    Dec 2007
    Posts
    5
    Thanks for all of your help. I can now solve my second problem, that is counting no of words but not counting spaces.

    Yet, for the solution of my first problem, I still don't really get it. Should I write a function that finds the number of paragraphs? Or what does readParagraph mean and use for in counting the words in each paragraph?

  7. #7
    Registered User ssharish2005's Avatar
    Join Date
    Sep 2005
    Location
    Cambridge, UK
    Posts
    1,682
    >that is counting no of words but not counting spaces.
    What you basically need is to read the file line by line until you reach EOF. To read a line from a file, use the fgets function. Once you have read a line, tokenize it with a space as your delimiter, and store the tokens in an array or a linked list.

    To tokenize the string, use strtok as mentioned before, but be careful: it keeps a static internal pointer into your string, and from the second call onward you pass NULL as the first argument to continue with the same string. Or perhaps you can use the sscanf function to tokenize the string. Search the forum to get some sample code.

    More or less, that would be the solution to your problem.

    ssharish
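
    A minimal sketch of the fgets plus strtok approach, assuming each line of the file is one paragraph and that lines fit in 300 characters. The function names count_tokens and count_words_per_line are made up for illustration, and the tokens are only counted here, not stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count whitespace-separated tokens in line.
 * Note: strtok() modifies line in place. */
int count_tokens(char *line)
{
    int count = 0;
    char *tok;

    assert(line != NULL);
    /* The delimiter argument is a string (a set of characters),
     * not a single char such as '\n'. */
    for (tok = strtok(line, " \t\r\n"); tok != NULL; tok = strtok(NULL, " \t\r\n"))
        count++;
    return count;
}

/* Read fp line by line and store each non-empty line's word count
 * in counts[]. Returns the number of counts stored (at most max). */
int count_words_per_line(FILE *fp, int counts[], int max)
{
    char line[300];
    int n = 0;

    assert(fp != NULL && counts != NULL);
    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        int words = count_tokens(line);
        if (words > 0)             /* skip empty lines */
            counts[n++] = words;
    }
    return n;
}
```

    The caller can then loop over counts[0] to counts[n-1] and print one "Number of words in paragraph" line per entry.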
    Last edited by ssharish2005; 12-29-2007 at 11:12 AM.

  8. #8
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,451
    > Or what does readParagraph mean and use for in counting the words in each paragraph?
    It really stems from your definition of a text file.

    A file is composed of one or more paragraphs.
    A paragraph is one or more sentences.
    A sentence is one or more words.
    A word is one or more characters.

  9. #9
    Registered User
    Join Date
    Dec 2007
    Posts
    5
    Quote Originally Posted by Salem View Post
    > Or what does readParagraph mean and use for in counting the words in each paragraph?
    It really stems from your definition of a text file.

    A file is composed of one or more paragraphs.
    A paragraph is one or more sentences.
    A sentence is one or more words.
    A word is one or more characters.
    I mean that a new paragraph starts each time Enter is pressed. After finding the total number of paragraphs, I tried to use strtok after reading about its use at http://www.cplusplus.com/reference/clibrary/cstring/strtok.html

    But now, there is some problem here. Would someone tell me how to solve it? I would be thankful for your help.
    Code:
              ch=fgetc(infp);
              while(ch!=EOF)
              {
               if(ch=='\n')
               count_p++;
               ch=fgetc(infp);
              }
              printf("Total number of paragraph(s):%d\n",count_p+1);
    
              do{
                 t1=strtok(all_word,'\n');
                 count_pw[20]=0;
                 while(fgets(all_word,300,infp)!=NULL)
                 {
                  for(i=0;i<count_p+1;i++)
                  {count_pw[i]++;
                   printf("Total number of words in paragraph %d: %d\n",count_p+1,count_pw[count_p+1]);
                   t1=strtok(NULL,'\n');             
                 }
                }while(fgets(all_word,300,infp)!=EOF);

  10. #10
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,451
    Separate the tasks of reading the file, and analysing the results.

    > t1=strtok(all_word,'\n');
    For example, you don't even use the result of this operation.

  11. #11
    Registered User
    Join Date
    Nov 2007
    Posts
    73
    First, as Salem said, define a paragraph.

    Once you have done that:
    >If you find two or more consecutive spaces, treat them as one. You can use isspace(); you should also consider tabs.
    >After that, you need to print the number of words at the point where you encounter a newline character or end of file.
    >Also take care to ignore spaces at the beginning of a paragraph.
    >By replacing spaces I mean that when you find 2 or more consecutive spaces, you read them as one.
    With this, I think you can count the number of words.
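
    The steps above could be sketched as a single pass over the file, character by character. This assumes the "one paragraph per line" definition from earlier in the thread; the function name count_paragraph_words is made up for illustration, and empty lines are not counted as paragraphs.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Read fp one character at a time and store the word count of each
 * non-empty line in counts[]. Returns the number of counts stored
 * (at most max). */
int count_paragraph_words(FILE *fp, int counts[], int max)
{
    int ch;                        /* int, not char, so EOF can be represented */
    int in_word = 0;               /* are we inside a word? */
    int words = 0;                 /* words seen so far in this paragraph */
    int n = 0;                     /* paragraphs recorded so far */

    assert(fp != NULL && counts != NULL);
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {          /* end of paragraph: record and reset */
            if (words > 0 && n < max)
                counts[n++] = words;
            words = 0;
            in_word = 0;
        } else if (isspace(ch)) {  /* blanks and tabs only end a word */
            in_word = 0;
        } else if (!in_word) {     /* first character of a new word */
            in_word = 1;
            words++;
        }
    }
    if (words > 0 && n < max)
        counts[n++] = words;       /* last paragraph with no trailing newline */
    return n;
}
```

    Because the flag is only set on the first non-space character, leading spaces and runs of spaces never add to the count, and no second pass over the file is needed.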

  12. #12
    Registered User
    Join Date
    Nov 2007
    Posts
    73
    printf("Number of words in paragraph %d: %d\n",i+2,count_pw+1);
    doesn't make sense...
    I think that should be inside the loop.
