Thread: search a test file for the number of occurrences of at least two words

  1. #1
    Registered User
    Join Date
    Jul 2007
    Posts
    24


    Hi everyone,
    I'm new to programming and I have a project to do, but I'm not sure where to start.
    This is my project:
    You are required to develop a program to search a test file for the number of occurrences of at least two words. Output from the program should be produced both on the display and held in a second text file.

    The input file, output file and the search words should be specified as command line arguments for the program,
    e.g. mysearchprog inputfile outputfile the but
    where mysearchprog is the name of the program, inputfile and outputfile are the names of the input and output files, and "the" and "but" are the search words.

    Hint :
    Your application should contain a number of functions.
    Remember that words in the text file can be terminated with spaces, commas, full stops, etc.

    I would really appreciate it if someone could help me with this. I need to complete the project in 10 days.

    Thanks guys

  2. #2
    Chinese pâté foxman's Avatar
    Join Date
    Jul 2007
    Location
    Canada
    Posts
    404
    I think you'll find this link interesting.

    And if you really don't know where to start, try to find a solution by writing it on a piece of paper in a natural language (or in pseudo code).

    And you could look at what the standard C library has to offer you for string searching and file reading/writing.
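
    As a rough sketch of that approach, the steps can be written as comments first and then filled in one function at a time. The step breakdown and the helper name args_ok below are illustrative assumptions, not something the assignment prescribes.

```c
#include <stdio.h>

/*
 * Plan in pseudo code (an assumed breakdown, adjust as needed):
 *   1. check the command line: program, input file, output file, 2+ words
 *   2. open the input file for reading and the output file for writing
 *   3. read the input one line at a time
 *   4. split each line into words and compare each with the search words
 *   5. print the counts to the display and to the output file
 */

/* Step 1: returns 1 if there are enough arguments, 0 otherwise.
   args_ok is a hypothetical helper name. */
int args_ok(int argc, char *argv[])
{
    /* argv[0] program, argv[1] input, argv[2] output, argv[3..] search words */
    if (argc < 5) {
        fprintf(stderr,
                "usage: %s inputfile outputfile word1 word2 [more words]\n",
                argc > 0 ? argv[0] : "mysearchprog");
        return 0;
    }
    return 1;
}
```

    Calling args_ok(argc, argv) at the top of main and returning early when it yields 0 keeps the rest of the program free of argument checks.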

  3. #3
    Registered User
    Join Date
    Jul 2007
    Posts
    24
    Thanks for the links, they were very helpful.
    I've come up with the code below.
    It takes some strings from the keyboard and gives the number of occurrences of each word length,
    but how can I get the input from a text file rather than the keyboard?
    I'd appreciate it if someone could help me with this.

    Code:
    #include <stdafx.h>
    #include <string.h>
    #define LEN_TOTAL 21
    int _tmain(int argc, _TCHAR* argv[])
    {
       int word_length_occurrence[ LEN_TOTAL ] = { 0 };
       char string[ 700 ] = { 0 },
            tempStr[ 300 ] = { 0 },
            *tokenPtr;
       size_t i, numLines,
              maxLength = 0;
    
       printf( "Number of lines: " );
       scanf( "%d", &numLines );
    
       gets( tempStr ); /* grabs the carriage return */
    
       for ( i = 0; i < numLines; i++ ) {
          fgets( tempStr, sizeof( tempStr ), stdin );
          strcat( string, tempStr );
       }
    
       tokenPtr = strtok( string, " \n" );
    
       while ( tokenPtr != NULL ) {
          maxLength = strlen( tokenPtr );
          word_length_occurrence[ maxLength ]++;
          tokenPtr = strtok( NULL, " \n" );
       }
    
       printf( "%8s%15s\n", "Word Length", "Occurrences" );
    
       for ( i = 1; i < LEN_TOTAL; i++ )
          printf( "%4d%15d%c", i, word_length_occurrence[ i ], '\n' );
    
       printf( "\n" );
       getchar();
       return 0;
    }

  4. #4
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    Quote Originally Posted by cyber_tech
    but how can I get the input from a text file rather than keyboard
    Redirect the input or open a file.
    Code:
       gets( tempStr ); /* grabs the carriage return */
    http://faq.cprogramming.com/cgi-bin/...&id=1043284351
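
    Both options can share one code path, since fgets reads from any FILE pointer. A minimal sketch, assuming a helper called open_input (a made-up name): it returns stdin when no path is given, so the shell can redirect input with something like "prog < input.txt", and otherwise opens the named file.

```c
#include <stdio.h>

/* Returns the stream to read from: the named file if a path is given,
   otherwise stdin (which the shell can redirect).
   open_input is a hypothetical helper name. */
FILE *open_input(const char *path)
{
    FILE *fp;

    if (path == NULL)
        return stdin;

    fp = fopen(path, "r");
    if (fp == NULL)
        perror(path);   /* report why the open failed */
    return fp;          /* NULL on failure, so the caller must check */
}
```

    The existing fgets( tempStr, sizeof( tempStr ), stdin ) call then only needs the returned pointer in place of stdin.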

  5. #5
    Registered User
    Join Date
    Jul 2007
    Posts
    24


    Hi, I need more help.
    I need to read from a file (input.txt), get the number of occurrences of each word, and produce the result in a separate text file as well as on the screen.
    Everything works, but it doesn't produce the text file (i.e. output.txt).
    Can someone help, please?
    Thanks

    Code:
    #include "stdafx.h"
    #include <stdio.h>
    #define LEN_TOTAL 21

    int _tmain(int argc, _TCHAR* argv[])
    {
       int word_length_occurrence[ LEN_TOTAL ] = { 0 };
       char string[700]={0},
            tempStr[300]={0},
            *tokenPtr;
       size_t i, numLines,
              maxLength = 0;

       FILE * pFile;
       pFile = fopen ("input.txt","r");
       if (pFile==NULL) perror ("Error opening file");
       else
       {
          while (!feof(pFile)) {
             fgets( tempStr, sizeof( tempStr ), pFile );
             strcat( string, tempStr );
          }
          tokenPtr = strtok( string, " \n" );

          while ( tokenPtr != NULL ) {
             maxLength = strlen( tokenPtr );
             word_length_occurrence[ maxLength ]++;
             tokenPtr = strtok( NULL, " .,\n" );
          }
          printf( "%8s%15s\n", "Word Length", "Occurrences" );

          for ( i = 1; i < LEN_TOTAL; i++ )
             printf( "%4d%15d%c", i, word_length_occurrence[ i ], '\n' );

          printf( "\n" );
          getchar();
          fclose (pFile);
       }

       FILE * pFile2;
       pFile2 = fopen ("output.txt","w+");
       if (pFile!=NULL) {
          fprintf( pFile2,"%8s%15s\n", "Word Length", "Occurrences" );

          for ( i = 1; i < LEN_TOTAL; i++ )
             fprintf( pFile2,"%4d%15d%c", i, word_length_occurrence[ i ], '\n' );

          fclose(pFile2);
       }

       return 0;
    }

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    So what happens?

    Does it succeed in opening/creating the file at all?

    Perhaps you should print an error message if it fails to open the output file?

    --
    Mats

  7. #7
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Code:
    while (!feof(pFile)) {
    		  fgets( tempStr, sizeof( tempStr ), pFile );
    		  strcat( string, tempStr );
    	  }
    If fgets fails, you still call strcat.

    See the FAQ on why not to use feof to control a loop.
    A better approach:
    Code:
    while (fgets( tempStr, sizeof( tempStr ), pFile ) != NULL)
    {
    	  strcat( string, tempStr );
    }
    And what about the \n character: do you need it in the resulting string as well?
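
    If the newline is not wanted, one common idiom is to cut the line at the first '\n' with strcspn. The sketch below combines that with the fgets-controlled loop; strip_newline and count_reads are hypothetical helper names, and the 300-byte buffer simply mirrors tempStr.

```c
#include <stdio.h>
#include <string.h>

/* Removes the trailing newline left by fgets, if there is one.
   strip_newline is a hypothetical helper name. */
void strip_newline(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}

/* Reads fp until fgets returns NULL (end of file or read error), so a
   stale buffer is never processed. Returns the number of successful
   fgets reads; a line longer than the buffer counts more than once. */
long count_reads(FILE *fp)
{
    char line[300];
    long n = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        strip_newline(line);
        /* line now holds text without the newline, ready to be tokenised */
        n++;
    }
    return n;
}
```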
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  8. #8
    Registered User
    Join Date
    Jul 2007
    Posts
    24
    No, it doesn't create the output file at all.

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    And of course, if the file is "large" (larger than 700 bytes - and I have seen more than one text file bigger than that) it would overflow the string buffer.
    Is there any reason why ALL the text needs to be in one string? It seems to me that parsing each line would be sufficient.

    In fact, tempStr is set to 300 bytes, so you only need two and a bit lines of that length to overflow the buffer.

    And the use of strtok's second parameter is inconsistent: the first call passes " \n" and the later calls pass " .,\n". If the piece of text is "Hello, World!", then the result would be wrong because the first word includes the comma (and the second word is wrong too, in this example, since the exclamation mark isn't considered a separator either). You may want to have ONE string that you use for both calls to strtok, and make sure it's got all sorts of separators. An interesting question is whether digits are "valid parts of words"; likewise for "-", which is sometimes used to tie words together: are those one long word or two separate words?
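
    A minimal sketch of the per-line idea, assuming a single shared separator string and a made-up helper name, count_words_in_line. The separator set is only a guess at what counts as punctuation; digits and "-" are left as word characters here.

```c
#include <string.h>

/* One separator list shared by every strtok call (an assumed set;
   whether digits or '-' belong here depends on how "word" is defined). */
static const char SEPARATORS[] = " \t\r\n.,;:!?\"()";

/* Adds 1 to counts[k] each time words[k] appears as a whole word in line.
   strtok modifies line in place, so pass a writable buffer.
   count_words_in_line is a hypothetical helper name. */
void count_words_in_line(char *line, char *words[], int counts[], int nwords)
{
    char *tok;
    int k;

    for (tok = strtok(line, SEPARATORS); tok != NULL;
         tok = strtok(NULL, SEPARATORS)) {
        for (k = 0; k < nwords; k++) {
            if (strcmp(tok, words[k]) == 0)   /* case-sensitive match */
                counts[k]++;
        }
    }
}
```

    Called once per fgets line, this needs only the one line buffer, so the 700-byte string and the strcat calls are no longer required. With the command line from the assignment, words would be argv + 3 and nwords would be argc - 3.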

    --
    Mats

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by cyber_tech
    No, it doesn't create the output file at all.
    So try to debug why that is - start by printing an error message when it doesn't open the file. You may want to look at what the "errno" is (such as using perror() or strerror()).

    See also the notes on "what else may be wrong in your code".
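
    For illustration, a check along those lines might look like the sketch below. open_output is a hypothetical helper name, and the sketch assumes fopen sets errno on failure, which POSIX specifies and most C libraries do, although the C standard itself does not require it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Opens path for writing and reports the reason if that fails.
   open_output is a hypothetical helper name. */
FILE *open_output(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    return fp;   /* NULL on failure, so the caller must check */
}
```

    perror(path) would print much the same message with less typing; strerror is useful when the text has to go somewhere other than stderr.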

    --
    Mats

  11. #11
    Registered User
    Join Date
    Jul 2007
    Posts
    24
    Quote Originally Posted by matsp
    So try to debug why that is - start by printing an error message when it doesn't open the file. You may want to look at what the "errno" is (such as using perror() or strerror()).



    --
    Mats
    I tried everything but it still doesn't create the file.
    I put in the error check and it doesn't give me the error message. When I debug it, it runs without creating the output file.

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by cyber_tech
    I tried everything but it still doesn't create the file.
    I put in the error check and it doesn't give me the error message. When I debug it, it runs without creating the output file.

    So you are saying that it doesn't create a file, but your check of whether pFile2 is NULL says that there is a file there? Very strange... I'd check that again - it's not like files are uncommon, and it would probably be fairly noticeable for a compiler/C library to have a broken "fopen" - even with "w+" as a parameter.

    --
    Mats

  13. #13
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Code:
        fclose (pFile);
    }
    FILE * pFile2;
    pFile2 = fopen ("output.txt","w+");
    if (pFile!=NULL) {
    Houston, I think we have a problem.
    Last edited by hk_mp5kpdw; 08-02-2007 at 08:50 AM.
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  14. #14
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Well spotted, hk. A typical drawback of having similar variable names. If they were called "pfout" and "pfin" it would have been much easier to spot, I would think - and you would know directly what each of them is for, too.
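
    A small sketch of how that naming could look in the output step, with the NULL test applied to the handle that was just opened. print_report and save_report are hypothetical helper names, and the report layout is an assumption; writing through one function also means the display and the file cannot drift apart.

```c
#include <stdio.h>

/* Writes the report to any stream, so the display and the file share
   one code path. print_report is a hypothetical helper name. */
void print_report(FILE *out, char *words[], const int counts[], int nwords)
{
    int k;

    for (k = 0; k < nwords; k++)
        fprintf(out, "%-15s%d\n", words[k], counts[k]);
}

/* Returns 0 on success, -1 if the output file could not be opened.
   save_report is a hypothetical helper name. */
int save_report(const char *path, char *words[], const int counts[], int nwords)
{
    FILE *pfout = fopen(path, "w");

    if (pfout == NULL) {          /* test the handle that was just opened */
        perror(path);
        return -1;
    }
    print_report(stdout, words, counts, nwords);   /* display */
    print_report(pfout, words, counts, nwords);    /* file */
    fclose(pfout);
    return 0;
}
```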

    --
    Mats
