# Thread: How to calculate an average of all bytes in a given file?

1. ## How to calculate an average of all bytes in a given file?

I can get the size of the file in bytes using the function ftell(), but I don't know what an average of all bytes in the file is. Thanks for any help! Here is a piece of code:
Code:
```
unsigned char average(const char *filename){
    unsigned char avg = 1;
    FILE *fp;

    if ( (fp = fopen(filename,"r")) == NULL){
        printf("\nThe file did not open\n.");
        return 255;
    }

    while(!feof(fp)){
        //TODO:
        ......
    }

    //avg=(unsigned char)(nlen/nround);
    return avg;
}
```

2. Can you be more specific about what you mean by the average of all bytes in a file?

3. Sorry, that's all I have. I was asked to implement the function shown in the first post, which calculates an average of all bytes in a given file. I don't know what "average" means here either.
Originally Posted by roaan
Can you be more specific about what you mean by the average of all bytes in a file?

4. How did you get this piece of code? With its //TODO section, it looks like homework to me.
Anyhow, what were you told about nlen and nround? I suppose they are what "average of all bytes" is meant to imply.

First, tell us what you mean by nlen and nround.

5. An average is simply all the values added together, divided by the number of values. The problem is that adding a bunch of bytes together in a single byte will likely overflow, so you'll need either a larger data type or perhaps a clever mathematical technique to get around that.

>> unsigned char avg = 1;

You should set that to 0.

>> while(!feof(fp))

You're checking the end-of-file condition too early. Use it to break out of the loop *after* you have read the next value but *before* you attempt to use that value.
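
To illustrate the loop shape described above, here is a minimal sketch (not the original poster's code). `count_bytes` is a hypothetical helper name; it assumes the stream is already open and only counts bytes, to keep the focus on where the end-of-file test goes.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes left in an already opened stream. */
static unsigned long count_bytes(FILE *fp)
{
    unsigned long count = 0;
    int c;  /* int rather than char, so EOF can be told apart from byte 255 */

    for (;;) {
        c = fgetc(fp);   /* 1. read the next value */
        if (c == EOF)    /* 2. leave the loop before using it if the read failed */
            break;
        count++;         /* 3. only now use the value */
    }
    return count;
}
```

The more compact `while ((c = fgetc(fp)) != EOF)` does the same thing. Either way the test happens on the result of the read, not before it.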

6. I think you are right. Here is my code for this implementation. Any comments?
Code:
```
unsigned char average(const char *filename){
    unsigned int BLOCK_SIZE=512;
    unsigned int nlen=0, nround=0;
    unsigned char avg = 0;
    FILE *fp;
    unsigned char tmp[512];

    if ( (fp = fopen(filename,"r")) == NULL){
        printf("\nThe file did not open\n.");
        return 255;
    }

    while(!feof(fp)){
        nlen+=BLOCK_SIZE;
        nround++;
    }else{
        BLOCK_SIZE=BLOCK_SIZE/2;
    }
    }

    avg=(unsigned char)(nlen/nround);
    return avg;
}
```

7. First off, I still don't get what you mean by the average of all bytes in a file. Moreover, the code snippet you posted won't cut it: when the expression (nlen/nround) is converted to an unsigned char, the high-order bits are simply dropped, giving you a wrong answer, because a BLOCK_SIZE of 512 is too large to fit into an unsigned char.
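
A small sketch of the truncation described above, assuming the usual 8-bit unsigned char. `to_byte` is a hypothetical name used only for this illustration.

```c
/* Converts the same way the cast in the posted code does: with an 8-bit
   unsigned char, only the low 8 bits survive (the value wraps modulo 256). */
static unsigned char to_byte(unsigned int value)
{
    return (unsigned char)value;
}
```

With this, `to_byte(512)` gives 0 and `to_byte(300)` gives 44, while a value that already fits, such as 66, comes through unchanged.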

8. Just read one unsigned char at a time. For each character read, add 1 to char_count and add the character's value to an unsigned int accumulator.

After EOF, divide your accumulator by char_count and you'll have your average.
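
The steps above could be sketched roughly as follows. This is one possible reading, not the assigned solution: the name `average_bytes`, the separate status return (255 is itself a valid average, so it is not used as an error code here), the "rb" open mode, and the `unsigned long long` accumulator in place of `unsigned int` are all assumptions of this sketch.

```c
#include <stdio.h>

/* Returns 0 on success and stores the integer average in *avg.
   Returns -1 if the file cannot be opened or contains no bytes. */
static int average_bytes(const char *filename, unsigned char *avg)
{
    FILE *fp = fopen(filename, "rb");   /* binary mode: no newline translation */
    unsigned long long sum = 0;         /* wide accumulator for the byte values */
    unsigned long long char_count = 0;  /* number of bytes read so far */
    int c;

    if (fp == NULL)
        return -1;

    while ((c = fgetc(fp)) != EOF) {    /* fgetc yields 0..255 or EOF */
        sum += (unsigned long long)c;
        char_count++;
    }
    fclose(fp);

    if (char_count == 0)
        return -1;                      /* empty file: avoid dividing by zero */

    *avg = (unsigned char)(sum / char_count);  /* quotient is always 0..255 */
    return 0;
}
```

The division truncates, so a file holding the bytes 65 and 66 would report 65.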

9. I am still stumped on how I can find the average of all the bytes in a file. I can't imagine how this is possible. Any clues?

10. If a file had 3 characters in it, A, B and C, the average would be B: (65+66+67)/3 = 66 = B.

11. Originally Posted by Dino
If a file had 3 characters in it, A, B and C, the average would be B: (65+66+67)/3 = 66 = B.
Okay, that makes sense. Extending what you said: if the file had an integer, say 2, in addition to the three characters, what would the average be in that case?

Would it be

(50 + 65 + 66 + 67) / 7 (1 for each char and 4 for the int)?

Is that correct?

12. Well, it would be divided by 4 since there are 4 characters now, but yes.

13. Originally Posted by Dino
Well, it would be divided by 4 since there are 4 characters now, but yes.
I doubt it would be divided by 4, because an average is the sum of all the values divided by the total number of elements. If one of those elements is an integer, the number of elements should be 7, because integers don't occupy 1 byte like characters do; on a 32-bit platform like mine they occupy 4 bytes. So the average of all the bytes should be sum / 7.

14. Oh, I understand how you got 7. No. You would be adding
(00 + 00 + 00 + 02 + 65 + 66 + 67) / 7 (assume big endian for illustration purposes)

If the '2' were the character 2, then that is 50, but then you would divide by 4.
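
Both cases can be checked with a short sketch. `average_buffer` and the two arrays are hypothetical names for illustration; the first array assumes a 4-byte, big-endian integer as in the example above.

```c
#include <stddef.h>

/* Hypothetical helper: integer average of n bytes held in memory (n > 0). */
static unsigned char average_buffer(const unsigned char *buf, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += buf[i];
    return (unsigned char)(sum / n);
}

/* A 4-byte big-endian integer 2 followed by 'A', 'B', 'C': 7 bytes. */
static const unsigned char with_int[] = { 0, 0, 0, 2, 65, 66, 67 };

/* The character '2' (ASCII 50) followed by 'A', 'B', 'C': 4 bytes. */
static const unsigned char with_char[] = { 50, 65, 66, 67 };
```

For `with_int` the sum is 200 and 200 / 7 truncates to 28; for `with_char` the sum is 248 and 248 / 4 is 62.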

15. Originally Posted by Dino
Oh, I understand how you got 7. No. You would be adding
(00 + 00 + 00 + 02 + 65 + 66 + 67) / 7 (assume big endian for illustration purposes)

If the '2' were the character 2, then that is 50, but then you would divide by 4.
Yes, I was taking the integer size into consideration. And yes, if it were the character '2', the divisor would be 4. Thanks, now it makes sense to me how an average of all bytes is possible. Just one more doubt: as pointed out in one of the posts above, the maximum value a byte can hold is 255, so isn't there a possibility of running into overflow very soon, or would the sum be stored in some other variable?
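
On the overflow question, a rough way to see how much headroom a wider sum variable gives is to divide its maximum value by 255, the largest value a byte can contribute. The sketch below does only that; `max_safe_bytes` is a hypothetical name, and the fixed-width types from `<stdint.h>` are an assumption of this example.

```c
#include <stdint.h>

/* Largest number of bytes that can always be summed without overflow when
   the accumulator holds values up to max_sum and each byte is at most 255. */
static uint64_t max_safe_bytes(uint64_t max_sum)
{
    return max_sum / 255u;
}
```

So a one-byte sum is only guaranteed safe for a single byte, a 32-bit sum for 16,843,009 bytes (about 16 MB of 0xFF bytes), and a 64-bit sum for far more bytes than any real file holds.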