Thread: word count help

  1. #1
    Registered User
    Join Date
    Mar 2010
    Posts
    2

    word count help

    Hi all,
    I'm very new to programming and need some help with an assignment. I'm trying to count the occurrences of each word in a text file and output the results. This is what I've got so far; any help would be greatly appreciated.

    Code:
    #include "stdafx.h"
    #include <stdlib.h>
    #include <stdio.h>
    #include <conio.h>
    #include <dos.h>
    #include <windows.h>
    
    int processFile();
    
    int main()
    {
    	processFile();
    
    	return 0;
    }
    
    int processFile()
    {
    	char c , fname[25] , arr[1000][15]= {0} , oneword[15] ;
    	int count[1000] = {0} , x = 0 , y , i = 0 , p ;
    	FILE *file1;
    
    	
    
    	printf("\nenter file to process\n");
    	scanf("%s" , &fname);
    	file1 = fopen(fname , "r");
    	if (file1 == NULL)
    	{
    		printf("error file missing");
    		//menu();
    		return 1;
    	}
     while(!feof(file1))
    	{
    		c = fscanf(file1 , "%s" , &oneword);
    		for( i = 0 ; i < 1000 ; i++)   //arr[i] != '\0'
    		{
    			if(strcmp(arr[i] , oneword)==0)
    			{
    				count[i]++;
    				break;
    			}
    			
    		}
    				x++;
    				strcpy(arr[x] , oneword);
        }
    	for(p = 0 ; p < 1000 ; p++)
    	{
    		printf("%8s%11d\n" , arr[p] , count[p]);
    		Sleep(100);
    	}
    
    	fclose(file1);
    	system("pause");
    	return 0;
    }

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > for( i = 0 ; i < 1000 ; i++)
    Try checking up to x, not up to 1000.

    You might try replacing x with something more meaningful like maxWords

    > while(!feof(file1))
    See the FAQ for why this isn't what you think it is.
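
    For reference, a minimal sketch of the usual alternative: make the read itself the loop condition, so the body only runs after a successful conversion. feof() only reports true after a read has already hit end-of-file, so testing it first lets the body run once more on a failed read. The helper name count_tokens and the 15-byte buffer (taken from the posted oneword array) are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated tokens in an open stream.
   The loop condition is the read itself, so the body never sees a failed read. */
int count_tokens(FILE *fp)
{
    char word[15];
    int n = 0;

    /* %14s reads at most 14 characters, leaving room for the '\0'. */
    while (fscanf(fp, "%14s", word) == 1) {
        n++;
    }
    return n;
}
```

    Note that with a width limit, a token longer than 14 characters is read in several pieces, so it would be counted more than once.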

    > This is what I've got so far; any help would be greatly appreciated.
    Well the indentation could be better.

    Have you tested it?
    Does it work?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Nov 2008
    Location
    INDIA
    Posts
    64

    I have debugged your code in a Linux environment. You made a couple of simple mistakes:

    1. In your code you used a 100-second sleep. This is not necessary.
    2. You don't set the count to 1 on the first occurrence of a word.

    I have marked those parts with comments:

    Code:
    #include <stdlib.h>
    #include <string.h>
    #include <stdio.h>
    
    int processFile();
    
    int main()
    {
            processFile();
    
            return 0;
    }
    
    int processFile()
    {
    
            char c , fname[25] , arr[1000][15]= {0} , oneword[15] ;
            int count[1000] = {0} , x = 0 , y , i = 0 , p ;
            FILE *file1;
            int found=0;
    
    
            printf("\nEnter file to process : ");
            scanf("%s" , &fname);
            file1 = fopen(fname , "r");
            if (file1 == NULL)
            {
                    printf("error file missing");
                    return 1;
            }
            else
            {
                    while(!feof(file1))
                    {
                            found=0;
                            c = fscanf(file1 , "%s" , &oneword);
                            for( i = 0 ; i < 1000 ; i++)   //arr[i] != '\0'
                            {
                                    if(strcmp(arr[i],oneword)==0)
                                    {
                                            count[i]++;
                                            found=1; // Flag that the word has already been seen
                                            break;
                                    }
    
                            }
                            if(found==0) // Otherwise this is the first occurrence of the word
                            {
                                    strcpy(arr[x] , oneword);
                                    count[x]++; // Add the word to the array and set its count to 1
                                    x++;
                            }
    
                    }
                    for(p = 0 ; p < x ; p++)
                    {
                            printf("%8s%11d\n" , arr[p] , count[p]);
                    }
    
                    fclose(file1);
                    return 0;
            }
    }
    Cheers.

  4. #4
    Registered User
    Join Date
    Jan 2010
    Posts
    412
    Quote: Originally posted by karthigayan
    > 1. In your code you used a 100-second sleep. This is not necessary.
    He is coding this for Windows, where Sleep() takes milliseconds, so Sleep(100) pauses for a tenth of a second, not 100 seconds.
    Sleep(milliseconds) != sleep(seconds)
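
    If the same source has to build on both Windows and Linux, one option is a small wrapper that always takes milliseconds. This is only a sketch: sleep_ms is a made-up name, the non-Windows branch assumes a POSIX system with nanosleep(), and a strict -std=c99 build may need _POSIX_C_SOURCE defined for that declaration to be visible.

```c
#ifdef _WIN32
#include <windows.h>
#else
#include <stddef.h>
#include <time.h>
#endif

/* Hypothetical helper: pause for roughly `ms` milliseconds on either platform. */
void sleep_ms(unsigned int ms)
{
#ifdef _WIN32
    Sleep(ms);                          /* Win32 Sleep() takes milliseconds */
#else
    struct timespec ts;
    ts.tv_sec  = ms / 1000;             /* whole seconds */
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;  /* remainder in nanoseconds */
    nanosleep(&ts, NULL);               /* POSIX sleep() would take whole seconds */
#endif
}
```

    With this, sleep_ms(100) behaves like the original Sleep(100) on both systems.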

  5. #5
    Registered User
    Join Date
    Mar 2010
    Posts
    2
    Thanks, guys, for all your help, especially karthigayan.

    Code:
    for(i= 0 ; i < "infinity" ; i++)
    {
         printf("thankyou\n");
    }

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > I have debugged your code in linux environment . You have made some simple mistakes there ,
    Have you tried it with a file containing only one word?
    Did it count it twice?

    What about a file containing NO words?
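
    One way to check those two cases without hand-made files is a small experiment that counts how many times the while(!feof(...)) body runs. This is a sketch only: feof_loop_iterations is a hypothetical helper, and it uses tmpfile() as a stand-in for a real input file.

```c
#include <stdio.h>

/* Hypothetical helper: writes `text` to a temporary file, runs the thread's
   while(!feof(...)) pattern over it, and returns how many times the loop
   body executed (or -1 if the temporary file could not be created). */
int feof_loop_iterations(const char *text)
{
    FILE *fp = tmpfile();
    char word[15] = "";
    int iterations = 0;

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);

    while (!feof(fp)) {
        fscanf(fp, "%14s", word);   /* return value ignored, as in the posted code */
        iterations++;
    }
    fclose(fp);
    return iterations;
}
```

    For the one-word input "hello\n" this returns 2, because the second pass only discovers end-of-file inside fscanf, and for an empty input it returns 1, so the body runs once even though no word was read.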

  7. #7
    Registered User
    Join Date
    Nov 2008
    Location
    INDIA
    Posts
    64

    Revised code

    Thanks for your feedback, Salem.
    Following your feedback I modified the code. It now reports an empty file. Also, in the previous code the last word got one count too many because of the while loop's final iteration. I now work around that by copying '\0' into the oneword array before each read.

    Code:
    #include <stdlib.h>
    #include <string.h>
    #include <stdio.h>
    
    int processFile();
    
    int main()
    {
            processFile();
    
            return 0;
    }
    
    int processFile()
    {
    
            char c , fname[25] , arr[1000][15]= {0} , oneword[15] ;
            int count[1000] = {0} , x = 0 , y , i = 0 , p ;
            FILE *file1;
            int found=0;
    
    
            printf("\nEnter file to process : ");
            scanf("%s" , &fname);
            file1 = fopen(fname , "r");
            if (file1 == NULL)
            {
                    printf("error file missing");
                    return 1;
            }
            else
            {
                    while(!feof(file1))
                    {
                            strcpy(oneword,"\0");
                            found=0;
                            c = fscanf(file1 , "%s" , &oneword);
                            for( i = 0 ; i < 1000 ; i++)   //arr[i] != '\0'
                            {
                                    if(strcmp(arr[i],oneword)==0)
                                    {
                                            count[i]++;
                                            found=1; // Flag that the word has already been seen
                                            break;
                                    }
    
                            }
                            if(found==0) // Otherwise this is the first occurrence of the word
                            {
                                    strcpy(arr[x] , oneword);
                                    count[x]++; // Add the word to the array and set its count to 1
                                    x++;
                            }
    
                    }
                    if(x!=0)
                    {
                            for(p = 0 ; p < x ; p++)
                            {
                                    printf("%8s%11d\n" , arr[p] , count[p]);
                            }
                    }
                    else
                            printf("File is empty \n");
    
                    fclose(file1);
                    return 0;
            }
    }

  8. #8
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > scanf("%s" , &fname);
    This should be (excepting all the problems scanf has to begin with, like buffer overflow)
    scanf("%s" , fname);
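
    To make both points concrete: &fname has type char (*)[25], which does not match the char * that %s expects, and a bare %s puts no limit on how much is written. A field width is one common way to cap it. The sketch below uses sscanf on a string so it can be tried without typing input; read_name is a made-up helper and the 25-byte size is taken from the posted fname array.

```c
#include <stdio.h>

/* Hypothetical helper: copies the first whitespace-delimited token of `input`
   into a 25-byte buffer. Returns 1 on success, 0 if no token was found. */
int read_name(const char *input, char fname[25])
{
    /* fname is already a char *, so no & is needed.
       %24s stores at most 24 characters plus the terminating '\0'. */
    return sscanf(input, "%24s", fname) == 1;
}
```

    The same format string works with scanf("%24s", fname) for keyboard input; anything past 24 characters is simply left in the input stream.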

    > while(!feof(file1))
    Try
    while ( fscanf(file1 , "%s" , oneword) == 1 )
    and then remove the strcpy() that clears the string, which I don't think solves the problem anyway: I wouldn't depend on fscanf NOT touching the result string when input fails.

    > for( i = 0 ; i < 1000 ; i++)
    Perhaps
    for( i = 0 ; i < x ; i++)

    Or maybe even this, and eliminate the break statement
    for( i = 0 ; i < x && !found ; i++)
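
    Putting those suggestions together, the counting part might look roughly like the sketch below. It is not a drop-in replacement for the posted program: count_words, MAX_WORDS, WORD_LEN and used are names invented here (used plays the role of x), the sizes are copied from the posted arrays, and words beyond the capacity are silently dropped.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 1000   /* same capacity as the posted arrays */
#define WORD_LEN  15

/* Hypothetical helper: tallies each distinct word read from fp.
   Returns the number of distinct words stored (at most MAX_WORDS). */
int count_words(FILE *fp, char words[][WORD_LEN], int counts[])
{
    char oneword[WORD_LEN];
    int used = 0;                 /* number of slots filled so far */

    /* The read is the loop condition, so a failed read never reaches the body. */
    while (fscanf(fp, "%14s", oneword) == 1) {
        int i, found = 0;

        /* Search only the filled slots; stop as soon as a match is found. */
        for (i = 0; i < used && !found; i++) {
            if (strcmp(words[i], oneword) == 0) {
                counts[i]++;
                found = 1;
            }
        }
        /* First occurrence: store the word with a count of 1, if there is room. */
        if (!found && used < MAX_WORDS) {
            strcpy(words[used], oneword);
            counts[used] = 1;
            used++;
        }
    }
    return used;
}
```

    A caller would open the file, pass it in along with a char [MAX_WORDS][WORD_LEN] array and an int [MAX_WORDS] array, then print the first `used` entries; a return value of 0 means no words were read.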
