Thread: Letter Frequency counting.

  1. #1
    Registered User
    Join Date
    May 2011
    Posts
    6

    Question Letter Frequency counting.

    Hi,

    I'm still fairly new at C and have been asked to build a program that will count a number of specified letters to be read from a text file. This is the second stage of the task, the first being to write a program that reads a text file, removes all the spaces and converts the message to upper case, saving it in a new output text file.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>
    
    void main()
    {
    
    	FILE  *fpin, *fpout, *fprec;                        // declares two file streams, in and out
    	int tl, f, i=0;
    	int nl=0;
    	char ch, let=0;                                    // ch is variable to be used in conversion
    	char input[80], output[80], record[80];					// declared arrays to store file paths
    	printf("Input input file name:  \n");
    	gets(input);                                // input file location
    	printf("Input output file name: \n");
    	gets(output);								// output file location
    	printf("Input record file name: \n");
    	gets(record);								// record file location
    		
    	
    	if((fpin = fopen(input,"r")) == NULL)		// confirms that file has been opened
    	{      
    		printf("Unable to open input file");
    		exit(0);
    	}
    
    	if((fpout = fopen(output,"w")) == NULL)		// confirms that file has been created
    	{    
    		printf("Unable to open output file\n");
    		exit(0);
    	}
    
    	tl=0;
    
    	while((ch=getc(fpin))!=EOF)                // read input file until the end of the file
    	{
    	    if((ch>='a' && ch<='z') | (ch>='A' && ch<='Z')) // ensures that only alphabetic characters are transferred
    		{   
    		putc(toupper(ch), fpout);               //function to change all the characters to upper case
    	    tl++;
    		}
    		
    	}
    
    	fclose(fpin);								// close input file
    	fclose(fpout);
    
    	fopen(output,"r");
    
    	if((fprec = fopen(record,"w")) == NULL)		// confirms that file has been created
    	{    
    		printf("Unable to open record file\n");
    		exit(0);
    	}
    
    	printf("Total number of letters in the file: %d\n", tl);	
    
    	printf("Which letter would you like the frequency of?\n");
    	scanf("%c", let);
    This is the area I am having difficulty with. My thinking is that if I can get ch into normal string format I will be able to use a for loop to go through each character and determine whether or not it is the character that the user has input.

    Code:
    	while((ch=getc(fpout))!=EOF)                // read input file until the end of the file
    	{
    		char c = tolower(ch);
    		if(let=c)
    			{
    				nl++;
    			}
    	}
    
    	f=(nl/tl)*100;
    		
    	printf("%d", f);
    
    	putc(f, fprec);
    
    	fclose(fpout);
    	fclose(fprec); 
    }
    Unfortunately, I either don't know how to do this or, after spending the last 6 hours banging my head on this thing, I have simply forgotten. If anyone can fill me in or see a more logical way of doing this, I'm open to all suggestions.

    Thanks for your time.

    James

  2. #2
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    Code:
    if(let=c)
    That's an assignment, not a comparison.
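
    To make the difference concrete, here is a minimal sketch. The helper names is_wanted_letter and is_wanted_letter_buggy are made up for illustration and are not part of the original program. Folding both sides to upper case is also an assumption about what the program should do.

```c
#include <ctype.h>

/* Hypothetical helper: 1 if ch is the letter the user asked for, else 0.
   '==' compares the two values. Both sides are upper-cased first so that
   'e' and 'E' count as the same letter (an assumption, not in the thread). */
int is_wanted_letter(int ch, int let)
{
    return toupper((unsigned char)ch) == toupper((unsigned char)let);
}

/* The same test written with a single '=': ch is copied into let, and the
   assigned value is what gets tested. Any non-zero character "matches". */
int is_wanted_letter_buggy(int ch, int let)
{
    let = ch;            /* what if(let=c) really does first   */
    return let != 0;     /* ...and then tests the copied value */
}
```

    With the buggy version nl would be incremented for every character read, which is why the count comes out wrong rather than the program failing outright.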

  3. #3
    Registered User
    Join Date
    Mar 2011
    Posts
    278
    Do you have warnings turned on, turned up and did you fix them? I'd say no...

    Code:
    	
            scanf("%c", &let);                       // <-- need ampersand
            while((ch=getc(fpout))!=EOF)                // read input file until the end of the file
            {
                    char c = tolower(ch);                   // why go through the trouble to convert data saved to upper, only to reconvert to lower?
                    if(let==c)                          // <-- '==' compares; a single '=' is assignment
                            nl++;
            }

  4. #4
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    If you're only looking for one specific character all you need to do in your old program is add a couple of lines...

    1) You need to get the user's character... that's easy...
    2) in your file loop, catch the character before you filter it with if (ch>='a' ... and increment a counter.
    3) to report, either write the count to the file or display it on the screen, e.g. "This file had 232 occurrences of the letter E."

    Should be no more than one or two extra lines...
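
    The three steps above could look roughly like this. It is only a sketch: count_letter is a hypothetical name, the case-insensitive match is an assumption, and the stream is expected to be one that was opened for reading.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper: reads `in` to the end, stores the number of
   alphabetic characters in *total, and returns how many characters
   matched `let`, ignoring case. */
long count_letter(FILE *in, int let, long *total)
{
    int ch;                          /* int, so EOF can be told apart from data */
    long hits = 0;
    int wanted = toupper((unsigned char)let);

    *total = 0;
    while ((ch = getc(in)) != EOF) {
        if (toupper(ch) == wanted)   /* the check happens before any filtering */
            hits++;
        if (isalpha(ch))
            (*total)++;
    }
    return hits;
}
```

    In the original program this would be called as nl = count_letter(fpin, let, &tl); after the user's letter has been read.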

  5. #5
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    In addition to what's already been said...

    #1.
    Code:
    void main()
    Most people here are sticklers about this one. It's always int main, not void main.


    #2
    Code:
    char ch, let=0;                                    // ch is variable to be used in conversion
    char input[80], output[80], record[80];					// declared arrays to store file paths
    
    ...
    
    while((ch=getc(fpin))!=EOF)                // read input file until the end of the file
    {
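    getc returns an int, not a char. This is especially important since you're comparing against EOF: EOF is a negative int outside the range of valid character values, so once the result is squeezed into a char the test can misfire. If char is unsigned the loop never sees EOF, and if char is signed a legitimate byte such as 0xFF can be mistaken for it.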



    #3
    Code:
    printf("Input input file name:  \n");
    gets(input);            // input file location
    printf("Input output file name: \n");
    gets(output);	   // output file location
    printf("Input record file name: \n");
    gets(record);
    gets is considered the bastard child of input as it has no means of preventing the copying of more data to the destination buffer than said buffer is able to hold. Prefer the use of fgets as it allows you to specify the maximum size of the destination buffer therefore preventing this from occurring.



    #4
    Code:
    fopen(output,"r");
    Don't you need to assign the return value of this function call to a FILE pointer variable?
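
    For points #3 and #4, a minimal sketch of what the replacement might look like. The helper names read_filename and open_checked are invented for illustration, and sending the error message to stderr is just one reasonable choice.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line (at most size-1 characters) from `in`
   into buf and strip the trailing newline that fgets keeps.
   Returns 1 on success, 0 on end of input or error. */
int read_filename(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Hypothetical helper: open a file and hand back the FILE pointer that
   fopen returns, reporting a failure instead of discarding the result. */
FILE *open_checked(const char *name, const char *mode)
{
    FILE *fp = fopen(name, mode);
    if (fp == NULL)
        fprintf(stderr, "Unable to open %s\n", name);
    return fp;
}
```

    Usage would be along the lines of read_filename(stdin, input, sizeof input) in place of gets(input), and fpout = open_checked(output, "r") in place of the bare fopen call. Note that unlike gets, fgets leaves the newline in the buffer, which is why it is stripped before the name is passed to fopen.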

  6. #6
    Registered User
    Join Date
    May 2011
    Posts
    6
    Thank you very much for the fast responses. About the void thing, from the moment my lecturer told us what void did it always seemed like a silly idea, but she insists on the main function being void. =\

    It's now time for some light revision, so I'll come back to this tomorrow when my brain is fresh. What's been said has made sense and been taken on board, thank you all again.

  7. #7
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by Rezi View Post
    Thank you very much for the fast responses. About the void thing, from the moment my lecturer told us what void did it always seemed like a silly idea, but she insists on the main function being void. =\
    And she is WRONG... Main returns a value to the operating system that is used in batch files and other processes for error detection. (Are you sure she's even a programmer?)

    The correct form is...
    Code:
    int main (void)
      {
    
        return 0;  // or return an error code
      }
    A Batch file example....
    Code:
    MyProg.exe MyFile /x
    IF ERRORLEVEL 3 GOTO ABORT
    MyOtherProg.exe MyFile /c
    GOTO END
    :ABORT
    ECHO An error has occurred
    :END
    Last edited by CommonTater; 05-05-2011 at 04:27 PM.

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    To back up hk_mp5kpdw and Tater, there is an overview of the right and wrong ways to declare main here. At the bottom of that page, it links to a nice article that explains the problems of void main in more detail. Show them to your teacher, just be respectful and diplomatic about it.

  9. #9
    Registered User
    Join Date
    May 2011
    Posts
    6
    Quote Originally Posted by CommonTater View Post
    2) in your file loop, catch the character before you filter it with if (ch>='a' ... and increment a counter.
    Still having a pretty rough time with this. I'm also using fgetc now, as this says that the pointer will be a character instead of an integer. Judging from what you've said this will be wrong, but I'm really not sure how to "catch" the letter without using an if statement. This is what I have:
    Code:
    	while((ch=fgetc(fpout))!=EOF)                
    	{	
    		if(ch>='a' && ch==&let)
    				{
    					nl++;
    				}
    	}
    nl is not incremented. I have tried every variation of that if statement that I can think of, but to no avail. It's done when I get that nl int to increment.

    Cheers.
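
    Two things stand out in that if statement. &let is the address of let, not its value: scanf needs the address so it can store into the variable, but a comparison needs the value. And since the output file holds only upper-case letters, ch>='a' is never true in ASCII, where 'A' to 'Z' all sort below 'a'. A minimal sketch, with read_letter and same_letter as made-up helper names and upper-casing the user's input as an assumption:

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper: read one non-whitespace character from `in`
   (normally stdin) and store it, upper-cased, in *let.
   Returns 1 on success, 0 if nothing could be read. */
int read_letter(FILE *in, char *let)
{
    char c;
    if (fscanf(in, " %c", &c) != 1)   /* fscanf needs the ADDRESS of c */
        return 0;
    *let = (char)toupper((unsigned char)c);
    return 1;
}

/* Hypothetical helper: the comparison uses the VALUE of let.
   ch is expected to come straight from fgetc/getc. */
int same_letter(int ch, char let)
{
    return ch == (unsigned char)let;
}
```

    The leading space in " %c" makes fscanf skip any whitespace, including a newline left over from earlier input, before it reads the letter.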

  10. #10
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Is there some reason you're trying to read characters from a file stream opened for output?

    Code:
    while((ch=getc(fpin))!=EOF)                // read input file until the end of the file
      {
    
         // check for the user's counted character here!
    
         if((ch>='a' && ch<='z') | (ch>='A' && ch<='Z')) // ensures that only alphabetic characters are transferred
      {

  11. #11
    Registered User
    Join Date
    May 2011
    Posts
    6
    Yes, I have to have this all in one program. So Input is the first file, I take the data from that, remove spaces and convert to caps, stick it in output. Then take the message in output, count the total letters, count the number of user specified letters and store the result of the equation: f=(number of letters/total number of letters)*100 into the record file. The next stage of the task is to convert it all into separate functions, which shouldn't be too hard at all.

    I left out:
    Code:
    if((fpout = fopen(output,"r")) == NULL)		
    	{
    		printf("Unable to open output file\n");
    		exit(0);
    	}
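
    One caution about the formula f=(nl/tl)*100 when nl and tl are both int: the division is done in integers, so nl/tl is 0 whenever nl is smaller than tl and f ends up 0. Also, putc(f, fprec) writes a single byte with that value, not the digits. A hedged sketch of one way round both; percent and write_record are hypothetical names and the record line format is purely an assumption:

```c
#include <stdio.h>

/* Hypothetical helper: percentage of `count` in `total` as a double.
   Starting the expression with 100.0 forces floating-point arithmetic. */
double percent(long count, long total)
{
    if (total == 0)
        return 0.0;            /* avoid dividing by zero on an empty file */
    return 100.0 * count / total;
}

/* Hypothetical helper: write a readable line to the record file.
   Returns the fprintf result (negative on error). */
int write_record(FILE *rec, char let, long count, long total)
{
    return fprintf(rec, "%c: %ld of %ld letters (%.2f%%)\n",
                   let, count, total, percent(count, total));
}
```

    Called as write_record(fprec, let, nl, tl), this would replace both the f=... line and the putc call.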

  12. #12
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    If your next plan is to break your program down into a few functions, you may want to rethink your approach...
    Rather than opening and closing files for each of the things you need to do, and duplicating a lot of code, you may get better mileage from the idea of opening each file once and calling functions to do specific tasks for you... ReadNextChar(), MakeUpper(), IsUserChar(), SaveChar() etc... (i.e. Think in smaller blobs)

    When I'm programming, it's not uncommon for me to work out 2 or perhaps 3 different ways of solving the same problem. Generally I go for the one that gives me A) the simplest code, B) the least repetition and/or C) nicely compartmentalized functions. If I have to choose between those options C generally wins.

    I know this is only a small program (and most likely homework) but these exercises are the foundation of good programming practices later on.
    Last edited by CommonTater; 05-06-2011 at 08:32 AM.

  13. #13
    Registered User
    Join Date
    Mar 2011
    Posts
    278
    C) nicely compartmentalized functions. If I have to choose between those options C generally wins.
    Absolutely. Once a few options have coalesced in my head, I invariably pick the one that seems to offer me the most testable (compartmentalized) code.

    For example, if you need to read a file, do a bunch of stuff and then write to a file, forget the "bunch of stuff" at first. Start by creating a simple "test" file in a text editor, then write code to open that file and print out what it contains to the screen. Once that works, write code to create a new file and write simple data to it. Once that works, save the source file somewhere safe, and move forward by modifying a copy of it (you may need the original when all hell breaks loose later)! Then in your code combine the two by making your code read one file and write its data to a second file. Then work on the "bunch of stuff".

  14. #14
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Further to what Mikey is saying...

    Also keep in mind that a function is intended to do one thing, not 30 things... In your case I would do the file operations in my main() code, calling functions to operate upon the file data... I would not write several monster functions that have to open the files, read the file, do something to the data, write the data out then close the files... This will lead to poorly compartmentalized code where if the output file gets mangled you won't know which function did the dirty... With all that in main()... if the file gets mangled you know to look in main(). I would instead get the data into a variable (or variables) in main() and call functions to operate on that data before writing it out...

    For example:
    Code:
    int IsUserChar(int ch, int userchar)
      { return ch == userchar; }
    
    void WriteChar(int ch, FILE *outfile)
      { fputc(toupper(ch),outfile); }
    
    // main
    int main (void)
      {  // set up variables
       
         // get filenames
         // get user character
    
        // open files
    
        // test the read itself: feof() is only true after a read has already failed
        while ((filechar = fgetc(infile)) != EOF)
          { if(IsUserChar(filechar,userchar))
               usercount++;
             if (! isspace(filechar))
               WriteChar(filechar, outfile); }
    
        // close files
        // display report
       return 0; }
    Ok, this is a trivial example that will never compile... but it goes to form, not substance... you will find that programs written in "small blobs" that are well compartmentalized are 10 times easier to debug than those using "mega functions" that try to do way too much. Just look how clean that loop in main is... You'll get the idea.
    Last edited by CommonTater; 05-06-2011 at 09:22 AM.
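
    For anyone who wants to try the idea, the sketch above can be made to compile roughly as follows. IsUserChar and WriteChar are taken from the post; ProcessFile is a hypothetical wrapper standing in for the loop in main(), and the variable types are assumptions.

```c
#include <stdio.h>
#include <ctype.h>

int IsUserChar(int ch, int userchar)
{
    return ch == userchar;
}

void WriteChar(int ch, FILE *outfile)
{
    fputc(toupper(ch), outfile);
}

/* Hypothetical driver: copies every non-space character to outfile in
   upper case and returns how many characters matched userchar exactly. */
long ProcessFile(FILE *infile, FILE *outfile, int userchar)
{
    int filechar;
    long usercount = 0;

    /* Test the result of fgetc itself: feof() only becomes true after a
       read has already failed, so a do/while on feof() would also
       process the EOF value once. */
    while ((filechar = fgetc(infile)) != EOF) {
        if (IsUserChar(filechar, userchar))
            usercount++;
        if (!isspace(filechar))
            WriteChar(filechar, outfile);
    }
    return usercount;
}
```

    main() then only has to get the names, open the files, call ProcessFile(infile, outfile, userchar), close the files and print the report. Note that IsUserChar as written is case-sensitive, so 'b' and 'B' are counted separately.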

  15. #15
    Registered User
    Join Date
    May 2011
    Posts
    6
    I definitely get what you're both talking about, it's logical to keep each function short and sweet, but I'm afraid this damn comparison between ch and let is still doing my head in.
