Thread: Word frequency (printf problem)

  1. #1
    Registered User
    Join Date
    Nov 2005
    Posts
    4

    Hi, I'm relatively new to programming, so let me know if I don't make sense and I'll elaborate.

    I've pasted my main function below. It's supposed to take in a data file and count the frequency of each word, then print all the information out in another data file. So a sample output would be:

    Hello, 2
    John, 4
    I, 3 etc.

    Now the strange thing is, the program ONLY works (that is, the output is generated by the fprintf call just before the return) IF I include the printf call in the while loop (shown commented out in the code).

    Why should this affect the operation of the program... I mean all it does is just print out a line.
    Any advice is greatly appreciated. My hair's turned grey.

    PS I've attached the code, so you can try compiling it to see what I mean. The input file is called doc.txt... so just get a paragraph, not too long mind, and test it. The output is freq.txt. I've included as many comments as I could to make the program more readable, but it's still relatively messy. Sorry if that causes trouble.


    Code:
    int main (void)
    {
    FILE *infile, *outfile;
    char line[100];
    struct WORD w[500]; //assuming that there are no more than 500 different words. structure contains an int freq, and char word[30]
    extern int counter; // counts the number of words (done in a different function)
    int k=0;
    
    infile = fopen ("doc.txt", "r");
    outfile = fopen ("freq.txt", "w");
        
    while(fscanf(infile, "%c", &line[0]) == 1) 
      {
           k=1;
           do{
               fscanf (infile, "%c", &line[k]);
               k++;
             } while (line[k-1] != '\n');
    
           line[k-1] = '\0';
         //  printf("%s\n", line); /* fprintf function (below) only works */
                                   /* if this printf function is inserted */
           process (line, w); /* each line is taken in and processed. New words are added into w; and existing words increases frequency of the concerned element */
      }
    
    
    for (k=0; k<counter; k++) 
        fprintf(outfile, "%s %i\n", w[k].word, w[k].freq); /* This is the desired output... */
        
    
    return 0;
    }

  2. #2
    dwks, Frequently Quite Prolix
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Quote: Why should this affect the operation of the program... I mean all it does is just print out a line.
    It also flushes the output buffer. (With the '\n'.) Try using
    Code:
    fflush(stdout);
    and see if that works.
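
Some background on what flushing means: stdout is normally buffered, so text passed to printf may sit in memory until the buffer fills, the program exits normally, or a newline is written while stdout is attached to a terminal (where it is typically line-buffered). A minimal sketch of forcing output out right away follows; emit_now is a hypothetical helper name and not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original program): write a message
   to a stream and push it out of the C library's buffer immediately.
   Returns 0 on success, -1 on failure. */
int emit_now(FILE *out, const char *msg)
{
    if (fputs(msg, out) == EOF)
        return -1;
    /* fflush writes any buffered output to the underlying file. */
    return fflush(out) == 0 ? 0 : -1;
}
```

Calling emit_now(stdout, "reached the loop\n") before a suspect statement makes it more likely that the message is visible even if the program crashes right afterwards.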
    dwk


  3. #3
    Registered User
    Join Date
    Nov 2005
    Posts
    4
    I have no idea what flushing means, but I gave it a go anyway. Still didn't work... thanks for help anyway

  4. #4
    Registered User
    Join Date
    Mar 2005
    Location
    Mountaintop, Pa
    Posts
    1,058
    Suggestions:

    1. Get rid of all the external references to counter. It's already a global in the file.

    2. Clear out the WORD struct array prior to using it.
    Code:
     struct WORD w[500]= {0};
    3. Both fscanf calls should check their return value for an EOF condition.

    4. The inner fscanf should break out of its loop when it hits EOF.

    5. Initialize y in the process function to 1 so that the while loop is always entered the first time.

    Finally, only execute the process function if k > 1.
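
Points 3 and 4 and the final k > 1 check could be sketched roughly as below. This is only an illustration under assumptions: read_line is a hypothetical helper that does not exist in the original program, and it silently drops characters that do not fit in the buffer, which the original code does not attempt.

```c
#include <stdio.h>

/* Hypothetical helper, not from the original program: reads one line
   into buf (at most size-1 characters), stopping at '\n' or end of file.
   Returns the number of characters stored, or -1 if EOF was reached
   before any character could be read. */
int read_line(FILE *in, char *buf, size_t size)
{
    size_t k = 0;
    char c;
    int got_any = 0;

    if (size == 0)
        return -1;              /* no room even for the terminator */

    /* fscanf returns 1 only when a character was actually read,
       so the loop ends on EOF as well as on a newline. */
    while (fscanf(in, "%c", &c) == 1) {
        got_any = 1;
        if (c == '\n')
            break;              /* end of line */
        if (k + 1 < size)
            buf[k++] = c;       /* characters that do not fit are dropped */
    }
    buf[k] = '\0';
    return got_any ? (int)k : -1;
}
```

With a helper like this, the loop in main could keep calling read_line(infile, line, sizeof line) until it returns -1, and call process(line, w) only when the returned length is greater than 0, which plays the same role as the k > 1 check.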

    Bob
    Last edited by BobS0327; 11-10-2005 at 07:08 PM. Reason: Forgot to mention a point

  5. #5
    Registered User
    Join Date
    Nov 2005
    Posts
    4
    Thanks Bob.
    Really appreciate it (and I'm not saying it only 'cause everyone else says it).

    Geez. Thanks ha ha. I feel my hair growing back now. It was the y that wasn't initialized. You either read a lot of code, or you spent a little extra time on mine; either way, thanks!
    Last edited by Tired_; 11-10-2005 at 07:18 PM.

  6. #6
    Registered User
    Join Date
    Mar 2005
    Location
    Mountaintop, Pa
    Posts
    1,058
    Quote: It was the y that wasn't initialized.
    Unfortunately, it was a little more than the y variable not being initialized.

    I recommended that you check for EOF in the inner fscanf (line 05) because, without that check, once the end of the file is reached the fscanf fails and stores nothing. line[k] then holds whatever happened to be in memory, and the loop spins until a newline character is found in that memory, eventually running past the end of the line array.

    Let us assume we are on the last record of the input file. The inner loop has read that record, statement #08 has replaced its \n with \0, and the line has been forwarded to the process function. We then return to the outer loop, and its fscanf succeeds because we are NOT yet at EOF: it picks up the last newline in the text file and returns 1. Now the inner fscanf hits EOF on every call and reads nothing. Remember, the inner loop only checks the line variable for a newline character in order to break out; it is NOT checking for an EOF condition. The line variable still holds the character data from the previous record, sans the newline character that statement #08 removed. Thus the loop is out of control: it keeps advancing k through stale and then undefined memory and won't break until it happens to find a newline character there.

    Your code is listed below. Just follow the processing of the last record to understand the problem. I would also suggest opening your text file in a hex editor to see what I'm trying to explain. You will likely notice two CR LFs at the end of the file; the outer fscanf picks up that last newline.

    Code:
    01.  while(fscanf(infile, "%c", &line[0]) == 1) 
    02.    {
    03.         k=1;
    04.         do{
    05.            fscanf (infile, "%c", &line[k]);
    06.             k++;
    07.           } while (line[k-1] != '\n');
    08.         line[k-1] = '\0';
    09.         if(k >1)
    10.         process (line, w);
    11.    }
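
The key fact behind the explanation above is that a %c read attempted at end of file returns EOF and stores nothing, so the destination keeps its old value. A small sketch to check this, where read_past_end is a hypothetical helper written only for illustration:

```c
#include <stdio.h>

/* Hypothetical helper for illustration: drains the stream, then attempts
   one more read into *slot and returns fscanf's result. */
int read_past_end(FILE *in, char *slot)
{
    char c;

    while (fscanf(in, "%c", &c) == 1)
        ;                       /* consume every remaining character */

    /* At end of file fscanf returns EOF and stores nothing, so *slot
       keeps whatever value it had before the call. */
    return fscanf(in, "%c", slot);
}
```

If slot is set to some marker character before the call, it should still hold that marker afterwards, which is why a loop that only tests line[k-1] for '\n' has no way of noticing that the file has ended.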

  7. #7
    Registered User
    Join Date
    Nov 2005
    Posts
    4
    Okay, I'll be needing some time to digest. Thanks for help.
