Thread: How to calculate an average of all bytes in a given file?

  1. #1
    Registered User
    Join Date
    Aug 2009
    Posts
    12

    How to calculate an average of all bytes in a given file?

    I can get the size of the file in bytes using the function ftell(), but I don't know what "an average of all bytes in the file" means. Thanks for any help! Here is the code:
    Code:
    unsigned char average(const char *filename){
    	unsigned char avg = 1;
    	FILE *fp;
    
    	if ( (fp = fopen(filename,"r")) == NULL){
           		printf("\nThe file did not open\n.");
           		return 255;
    	}
    
    	while(!feof(fp)){
            //TODO:
            ......
    	}
    
    	//avg=(unsigned char)(nlen/nround);
    	return avg;
    }

  2. #2
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Can you be more specific about what you mean by the average of all bytes in a file?

  3. #3
    Registered User
    Join Date
    Aug 2009
    Posts
    12
    Sorry, that's all I have. I was asked to implement the function shown in the first post, which calculates an average of all bytes in a given file. I don't know what "average" means here either.
    Quote Originally Posted by roaan View Post
    Can you be more specific about what you mean by the average of all bytes in a file?

  4. #4
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Quote Originally Posted by ljin View Post
    I can get the size of the file in bytes using the function ftell(), but I don't know what "an average of all bytes in the file" means. Thanks for any help! Here is the code:
    Code:
    unsigned char average(const char *filename){
    	unsigned char avg = 1;
    	FILE *fp;
    
    	if ( (fp = fopen(filename,"r")) == NULL){
           		printf("\nThe file did not open\n.");
           		return 255;
    	}
    
    	while(!feof(fp)){
            //TODO:
            ......
    	}
    
    	//avg=(unsigned char)(nlen/nround);
    	return avg;
    }
    How did you get this piece of code? It looks like homework to me, given the //TODO section.
    Anyhow, what were you told about nlen and nround? (I suppose they are what is implied by the average of all bytes.)

    First, tell us what you mean by nlen and nround.

  5. #5
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by ljin View Post
    I can get the size of the file in bytes using the function ftell(), but I don't know what "an average of all bytes in the file" means. Thanks for any help! Here is the code:
    Code:
    unsigned char average(const char *filename){
        unsigned char avg = 1;
        FILE *fp;
    
        if ( (fp = fopen(filename,"r")) == NULL){
                   printf("\nThe file did not open\n.");
                   return 255;
        }
    
        while(!feof(fp)){
            //TODO:
            ......
        }
    
        //avg=(unsigned char)(nlen/nround);
        return avg;
    }

    An average is simply all the values added together and then divided by the number of values. The problem is that adding a bunch of bytes together in a single byte will likely overflow, so you'll need either a larger data type or perhaps a clever mathematical technique to overcome that issue.

    >> unsigned char avg = 1;

    You should set that to 0.

    >> while(!feof(fp))

    You're checking the end-of-file condition too early. Use that check to break out of the loop *after* you have read the next value but *before* you attempt to use the value.
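
A minimal sketch of the read-then-check loop described above, assuming the stream is already open. The function name count_bytes is hypothetical; it only counts bytes, to keep the focus on where the end-of-file test goes.

```c
#include <stdio.h>

/* Count the bytes in an already-open stream. The read happens first;
   the end-of-file test is applied to the result of that read, so the
   loop body never sees a value that was not actually read. */
long count_bytes(FILE *fp)
{
    long count = 0;
    int c;  /* int, not char: fgetc returns EOF (a negative int) at the end */

    while ((c = fgetc(fp)) != EOF) {
        count++;
    }
    return count;
}
```

Because fgetc reports the end of the file through its return value, no separate feof() call is needed inside the loop.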

  6. #6
    Registered User
    Join Date
    Aug 2009
    Posts
    12
    I think you are right. Here is my code for this implementation. Any comments?
    Code:
    unsigned char average(const char *filename){
    	unsigned int BLOCK_SIZE=512;
    	unsigned int nlen=0, nround=0;
    	unsigned char avg = 0;
    	FILE *fp;
    	unsigned char tmp[512];
    
    	if ( (fp = fopen(filename,"r")) == NULL){
           		printf("\nThe file did not open\n.");
           		return 255;
    	}
    
    	while(!feof(fp)){
    		if(fread(tmp, 1, BLOCK_SIZE, fp)){
    			nlen+=BLOCK_SIZE;
    			nround++;
    		}else{
    			BLOCK_SIZE=BLOCK_SIZE/2;
    		}
    	}
    
    	avg=(unsigned char)(nlen/nround);
    	return avg;
    }
    Quote Originally Posted by Sebastiani View Post
    An average is simply all the values added together and then divided by the number of values. Problem is, adding a bunch of bytes together using a single byte will likely overflow, so you'll either need a larger data type or use perhaps a clever mathematical technique to overcome that issue.

    >> unsigned char avg = 1;

    You should set that to 0.

    >> while(!feof(fp))

    You're checking the end-of-file condition too early. Use that check to break out of the loop *after* you have read the next value but *before* you attempt to use the value.

  7. #7
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    First off, I still don't get what you mean by the average of all bytes in a file. Moreover, the code snippet you posted won't cut it: when the expression (nlen/nround) is converted to an unsigned char, the high-order bits are simply dropped, giving you a wrong answer, because a BLOCK_SIZE of 512 is too large to fit into an unsigned char.
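
The truncation described above can be illustrated with a small sketch. The helper name truncated_quotient is hypothetical; it mimics the cast from the posted code, assuming 8-bit bytes and that every fread call was counted as a full 512-byte block.

```c
/* Hypothetical helper mirroring avg=(unsigned char)(nlen/nround).
   The caller must pass a non-zero nround. */
unsigned char truncated_quotient(unsigned int nlen, unsigned int nround)
{
    /* Conversion to unsigned char keeps only the low 8 bits (value modulo 256). */
    return (unsigned char)(nlen / nround);
}
```

With two full blocks, nlen is 1024 and nround is 2, so the quotient is 512 and the cast leaves 0. Note also that this quotient is a block size, not an average of byte values, since the byte values in tmp are never added up.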

  8. #8
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Just read one unsigned char at a time. For each character read, add 1 to char_count and add the character's value to an unsigned int accumulator.

    After EOF, divide your accumulator by the char_count and you'll have your average.
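
The steps above could be sketched as follows. This is one possible version, not the assignment's required signature: the name average_of_bytes, the int return type, and the -1 error value are assumptions, and it uses a wider accumulator than the unsigned int suggested above so that large files are less of a concern.

```c
#include <stdio.h>

/* Returns the integer average of all bytes in the file, or -1 if the file
   cannot be opened or is empty. The int return type and the -1 sentinel
   are assumptions; the thread's version returns unsigned char and 255. */
int average_of_bytes(const char *filename)
{
    FILE *fp = fopen(filename, "rb");   /* binary mode: no newline translation */
    if (fp == NULL)
        return -1;

    unsigned long long sum = 0;     /* accumulator for the byte values */
    unsigned long long count = 0;   /* plays the role of char_count */
    int c;

    while ((c = fgetc(fp)) != EOF) {
        sum += (unsigned char)c;
        count++;
    }
    fclose(fp);

    if (count == 0)
        return -1;                  /* avoid dividing by zero on an empty file */
    return (int)(sum / count);
}
```

The result is truncated by integer division; a caller that wants a fractional average could divide as double instead.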
    Mainframe assembler programmer by trade. C coder when I can.

  9. #9
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    I'm still stumped on how to find the average of all the bytes in a file. I can't imagine how this is possible. Any clues?

  10. #10
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    If a file had 3 characters in it, A, B and C, the average would be B: (65+66+67)/3 = 66 = B
    Mainframe assembler programmer by trade. C coder when I can.

  11. #11
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Quote Originally Posted by Dino View Post
    If a file had 3 characters in it, A, B and C, the average would be B: (65+66+67)/3 = 66 = B
    Okay, that makes sense. Extending what you said: if the file had an integer, say 2, in addition to the three characters, what would the average be in that case?

    Would it be

    (50 + 65 + 66 + 67) / 7 (1 byte for each char and 4 for the int)?

    Is that correct?

  12. #12
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Well, it would be divided by 4 since there are 4 characters now, but yes.
    Mainframe assembler programmer by trade. C coder when I can.

  13. #13
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Quote Originally Posted by Dino View Post
    Well, it would be divided by 4 since there are 4 characters now, but yes.
    I doubt it would be divided by 4, because an average is the sum of all the values divided by the total number of elements. If one of those elements is an integer, the number of elements should be 7, because integers don't occupy 1 byte like characters; on a 32-bit platform like mine they occupy 4 bytes. So the average of all the bytes should be sum / 7.

  14. #14
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Oh, I understand how you got 7. No: with a 4-byte integer 2 you would be adding
    (00 + 00 + 00 + 02 + 65 + 66 + 67) / 7 (assuming big-endian for illustration purposes)

    If the '2' were the character 2, then its value is 50, but then you would divide by 4.
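
The two cases above can be checked with a small sketch that averages bytes held in memory. The function name buffer_average and the two array names are hypothetical, and the character values assume ASCII.

```c
#include <stddef.h>

/* Integer average of n bytes held in memory; returns 0 for an empty buffer
   (an arbitrary choice for this sketch). */
unsigned int buffer_average(const unsigned char *buf, size_t n)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return n ? (unsigned int)(sum / n) : 0;
}

/* The two hypothetical files discussed above, as byte arrays. */
static const unsigned char binary_two[] = { 0, 0, 0, 2, 'A', 'B', 'C' }; /* 4-byte big-endian int 2, then ABC */
static const unsigned char text_two[]   = { '2', 'A', 'B', 'C' };        /* character '2', then ABC */
```

For binary_two the sum is 200 over 7 bytes, which gives 28 after integer division; for text_two the sum is 248 over 4 bytes, which gives 62.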
    Mainframe assembler programmer by trade. C coder when I can.

  15. #15
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Quote Originally Posted by Dino View Post
    Oh, I understand how you got 7. No: with a 4-byte integer 2 you would be adding
    (00 + 00 + 00 + 02 + 65 + 66 + 67) / 7 (assuming big-endian for illustration purposes)

    If the '2' were the character 2, then its value is 50, but then you would divide by 4.
    Yes, I was taking the integer size into consideration. And yes, if it were the character '2', then the divisor would be 4. Thanks, now it makes sense to me how an average of all bytes is possible. Just one more doubt: as pointed out in one of the posts above, the maximum value a byte can hold is 255, so isn't there a possibility of running into overflow very soon, or would the sum be stored in some other variable?
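
On the overflow question, a rough bound can be worked out: the sum goes into a separate, wider variable, and the worst case is a file where every byte is 255. The sketch below assumes an unsigned int accumulator as suggested earlier in the thread; the function name is hypothetical.

```c
#include <limits.h>

/* Largest number of bytes whose sum is guaranteed to fit in an unsigned int,
   assuming the worst case where every byte has the value 255. */
unsigned int max_bytes_without_overflow(void)
{
    return UINT_MAX / 255u;
}
```

Where unsigned int is 32 bits wide this comes to 16843009 bytes, roughly 16.8 million, so a single unsigned char overflows almost immediately while an unsigned int handles files up to that size. A 64-bit accumulator pushes the limit far beyond practical file sizes.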
