# Thread: a histogram program

1. ## a histogram program

hey guys,
I really need help with a project of mine. I'll write out what the whole program is about. Any ideas sufficient to solve the problem will be appreciated. Thanks in advance.

“A histogram is an efficient visualization of the distributions. The construction of histograms is simple. The x-axis is the consecutive intervals; y-axis is the frequencies of data that falls into a small interval (number of data that has a value in this interval).”

In this project, you are going to implement a simple histogram in which the x-axis is the frequency of the values in the data that fall into a small interval and the y-axis is the consecutive intervals. The user enters 20 values. There are 10 intervals (10 frequency values). You are going to output the histogram of this data.

The code of the following functions is given; do not modify them:

int DrawHistogram(int frequency[]);

* ReadNumber function reads the data from the user and stores the values in the array numbers.
* DrawHistogram prints the character '-' as many times as the value of the frequency at each row. As you can see in the function, the index of the array increases at each row.

Write only the following functions (without using printf and scanf and do not forget to enter their parameters in function calls in the main() function!!!) :

void FindBounds(float numbers[],float *min,float *max);

int FindFrequency(float numbers[],int frequency[], float min,float max);

* FindBounds function takes three parameters. You are going to find the minimum and maximum value of the data, subtract 2 from the minimum value and add 2 to the maximum value (2 is a constant epsilon value that enlarges the interval), then assign the results to min and max respectively.

* FindFrequency function takes four parameters. The width of each interval is equal. You have 10 consecutive intervals between the min and max parameters. Find the width of the intervals (bin width). Find each frequency by counting the number of values greater than the lower bound of the corresponding interval and less than or equal to its upper bound. Fill the frequency array in increasing order of its indexes as the lower bound of each interval increases.
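The two descriptions above could be sketched roughly as follows. This is only one possible reading of the assignment, not the instructor's solution: the constants `DATA_COUNT`, `BIN_COUNT` and `EPSILON` are names made up for this sketch, and `FindFrequency` is written as `void` (as in the skeleton code later in the thread) even though the prototype above says `int`.

```c
#include <assert.h>

#define DATA_COUNT 20   /* assumed name: number of input values in the assignment */
#define BIN_COUNT  10   /* assumed name: number of intervals in the assignment */
#define EPSILON    2.0f /* assumed name: margin added on both sides of the range */

/* Finds the smallest and largest value, then widens the range by EPSILON. */
void FindBounds(float numbers[], float *min, float *max)
{
    int i;
    *min = numbers[0];
    *max = numbers[0];
    for (i = 1; i < DATA_COUNT; i++) {
        if (numbers[i] < *min) *min = numbers[i];
        if (numbers[i] > *max) *max = numbers[i];
    }
    *min -= EPSILON;
    *max += EPSILON;
}

/* Counts how many values fall into each of the BIN_COUNT intervals
   (lower, upper] of equal width between min and max. */
void FindFrequency(float numbers[], int frequency[], float min, float max)
{
    float width;
    int i, b;

    assert(max > min);                 /* a zero-width range cannot be binned */
    width = (max - min) / BIN_COUNT;   /* bin width */

    for (b = 0; b < BIN_COUNT; b++)
        frequency[b] = 0;

    for (i = 0; i < DATA_COUNT; i++) {
        for (b = 0; b < BIN_COUNT; b++) {
            float lower = min + b * width;
            float upper = min + (b + 1) * width;
            if (numbers[i] > lower && numbers[i] <= upper) {
                frequency[b]++;
                break;                 /* each value belongs to one bin only */
            }
        }
    }
}
```

With the 20 sample values given below, this sketch gives bounds of about 3.8914 and 15.1859, a bin width of about 1.12945, and the frequencies 0, 1, 1, 2, 4, 5, 4, 1, 2, 0. The eight non-empty rows agree with the sample output.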

An example for input output form:

Input:

10.9706

8.8090

9.7007

9.1305

9.8413

13.0703

8.7870

7.3053

10.9388

8.1929

10.0718

8.7449

11.0708

11.1058

9.5926

5.8914

10.2651

13.1859

12.0368

6.8392

Output:

Histogram:

x_axis:Frequency, y_axis:Values

-

-

--

----

-----

----

-

--

2. Well, the information that you gave here can't be more specific!!

3. The use of float variables to do this is terrible on the part of whoever designed this assignment. As for where to begin, I think you should write out the algorithm steps for how you plan to do this, in pseudo code or as a C framework (empty functions, comments, etc), and we'll start off from there. However, you need to provide us with something before we can comment on it or help you make it work.

4. Yeah folks, you are right. The instructor sent his example code recently. I hope it works.

Code:
#include <stdio.h>
#include <stdlib.h>  /* needed for system() */

/**
 * function which reads data from the user
 * @param numbers[] array where the data is read
 */
void ReadNumbers(float numbers[])  /* header was missing from the post; name assumed from the comment in main() */
{
    int i;
    for (i = 0; i < 20; i++)
        scanf("%f", &numbers[i]);
}

/**
 * function that finds the lower and upper bound for the histogram
 * @param numbers array that the data is stored
 * @param min minimum number in the data
 * @param max maximum number in the data
 */
void FindBounds(float numbers[], float *min, float *max)
{
}

/**
 * function that finds the frequency list for the histogram
 * @param numbers array that the data is stored
 * @param frequency the frequencies for each interval
 * @param min minimum number in the data
 * @param max maximum number in the data
 */
void FindFrequency(float numbers[], int frequency[], float min, float max)
{
}

/**
 * function that prints the histogram
 * @param frequency the frequencies for each interval
 */
void DrawHistogram(int frequency[])
{
    int i, j;
    printf("\nHistogram:\n");
    printf("x_axis:Frequency, y_axis:Values\n");
    for (i = 0; i < 10; i++)
    {
        // frequency at interval with index i is printed at row i
        for (j = 1; j <= frequency[i]; j++)
            printf("-");
        printf("\n");
    }
}

/**
 * Main function. Reads the data, calculates frequencies and prints the histogram.
 * @return 0
 */
int main()
{
    float numbers[20], min, max;
    int frequency[10] = {0};

    printf("*****Mini Histogram Tool*****\n");
    printf("Enter data:\n");

    // ReadNumbers is called for reading data
    ReadNumbers(numbers);  /* call was missing from the post; assumed */

    // FindBounds is called for finding the upper and lower bound for the histogram
    FindBounds(/*Enter parameters!!!*/);

    // FindFrequency is called for calculating the frequencies
    FindFrequency(/*Enter parameters!!!*/);

    // DrawHistogram for printing the histogram
    DrawHistogram(frequency);

    system("pause");
    return 0;
}

5. Originally Posted by claudiu

> The use of float variables to do this is terrible on the part of whoever designed this assignment.

Why? Is creating a histogram of floating point values an illegitimate thing to do?

6. Originally Posted by brewbuck

> Why? Is creating a histogram of floating point values an illegitimate thing to do?

Because floats have a terrible precision and cannot be used to represent certain numbers or hold true results of real number arithmetic operations.

This becomes significant in developing a histogram because a histogram deals with intervals between real numbers, and misrepresentation of these numbers using floats can lead to a wrong histogram.

An instructor should really know better.

7. Originally Posted by claudiu

> Because floats have a terrible precision and cannot be used to represent certain numbers or hold true results of real number arithmetic operations.
>
> This becomes significant in developing a histogram because a histogram deals with intervals between real numbers, and misrepresentation of these numbers using floats can lead to a wrong histogram.
>
> An instructor should really know better.

That's so insane I don't know how to respond.

8. Originally Posted by claudiu

> Because floats have a terrible precision and cannot be used to represent certain numbers or hold true results of real number arithmetic operations.
>
> This becomes significant in developing a histogram because a histogram deals with intervals between real numbers, and misrepresentation of these numbers using floats can lead to a wrong histogram.
>
> An instructor should really know better.

"misrepresentation" would be a problem if the histogram was supposed to have upwards of 16 million individual bars. With only a mere ten bars, "misrepresentation" is not going to be a problem!

The misleading effect of miscategorising a borderline value is no more significant than the misleading effect of having picked exactly ten bars to begin with. Pick 9 or 11 bars and you can get a very different shape, with different borderline cases. Whether a certain value is borderline or not does not outweigh the fact that a histogram is a gross approximation of the distribution to begin with. One could even say that there is no "right" histogram.

In this context we can thus confidently state that floats have copious quantities of precision!
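The "16 million" figure presumably refers to 2^24 = 16,777,216, the number of distinct significand values of an IEEE 754 single-precision float. That reading is an assumption, since the post does not spell it out. A minimal sketch of where that limit shows up, assuming `float` is IEEE binary32 and using a helper name, `float_holds_exactly`, invented for this example:

```c
/* Returns 1 if the integer n survives a round trip through float unchanged.
   float_holds_exactly is a hypothetical helper, not something from the thread.
   Assumes float is IEEE 754 binary32 (24-bit significand). */
int float_holds_exactly(long n)
{
    volatile float f = (float)n;  /* volatile forces rounding to float precision */
    return (long)f == n;
}
```

Under that assumption, 16777216 (2^24) round-trips exactly while 16777217 does not, so distinguishing neighbouring values only starts to fail at a resolution of about one part in 16 million, far finer than ten bars.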

9. Originally Posted by iMalc

> "misrepresentation" would be a problem if the histogram was supposed to have upwards of 16 million individual bars. With only a mere ten bars, "misrepresentation" is not going to be a problem!
>
> The misleading effect of miscategorising a borderline value is no more significant than the misleading effect of having picked exactly ten bars to begin with. Pick 9 or 11 bars and you can get a very different shape, with different borderline cases. Whether a certain value is borderline or not does not outweigh the fact that a histogram is a gross approximation of the distribution to begin with. One could even say that there is no "right" histogram.
>
> In this context we can thus confidently state that floats have copious quantities of precision!

So what you are saying is that because a histogram is just a gross interpretation of the real distribution it is perfectly OK to do a sloppy job in implementing it. That's just marvelous!

The REAL issue here is that you can pick values for the input array so that you will get different histograms while running that same code on the same input.

10. Originally Posted by claudiu

> So what you are saying is that because a histogram is just a gross interpretation of the real distribution it is perfectly OK to do a sloppy job in implementing it. That's just marvelous!

"Oh no, my approximation is approximate! Whatever will I do?"

The quantization error of an IEEE float is orders of magnitude smaller than the quantization error of a histogram. People can, and do, create histograms of floating point data all the time. I have several on my screen right now.

But hey, me and a few other DSP guys around here got quite a chuckle out of it. Maybe you can educate us experts -- if you were given a set of floating point values, and asked to plot them in histogram form, how would you go about it?

11. I believe there are more ways to skew statistics, including histograms, than you can shake a stick at. I have no problem accepting that the datatype used for the input could be one more way. How big an impact it would have would depend on the specifics of the input and how the program was set up. The larger the values, the smaller the skew resulting from a floating point datatype being used for the input.

Please correct me if I'm wrong.

@Makonikor:

As I understand it, you've begun your program, from the skeleton functions the instructor provided.

Please let us know how you're doing, and we'll TRY and not hijack your thread into a statistics discussion.

12. Originally Posted by claudiu

> So what you are saying is that because a histogram is just a gross interpretation of the real distribution it is perfectly OK to do a sloppy job in implementing it. That's just marvelous!
>
> The REAL issue here is that you can pick values for the input array so that you will get different histograms while running that same code on the same input.

It is true that a float variable cannot handle 10.9706 exactly. Do you actually believe that a double precision can handle it exactly? Do you believe that the difference in the errors (which is of magnitude 2^-19 ≈ 0.000002) is relevant to the problem in any way, shape, or form? If so, for goodness' sake, why?
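The size of that error can be checked directly. The sketch below assumes `float` is IEEE binary32 and `double` is binary64, and uses a helper name, `float_rounding_error`, invented for this example. For a value between 8 and 16, adjacent floats are 2^-20 apart, so the rounding error is at most 2^-21 ≈ 0.0000005, in the same ballpark as the figure above and tiny next to a bin width of roughly 1.13 for the sample data.

```c
/* Absolute error introduced by storing x in a float instead of a double.
   float_rounding_error is a hypothetical helper, not something from the thread.
   Assumes float is IEEE 754 binary32 and double is binary64. */
double float_rounding_error(double x)
{
    volatile float f = (float)x;   /* round to single precision */
    double d = (double)f - x;      /* widening back to double is exact */
    return d < 0 ? -d : d;
}
```

Calling `float_rounding_error(10.9706)` returns a small non-zero value below 2^-19, which is the point being made: the error exists, but it is many orders of magnitude smaller than a bin.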

13. Originally Posted by tabstop

> It is true that a float variable cannot handle 10.9706 exactly. Do you actually believe that a double precision can handle it exactly? Do you believe that the difference in the errors (which is of magnitude 2^-19 ≈ 0.000002) is relevant to the problem in any way, shape, or form? If so, for goodness' sake, why?

Moreover, do you think that the measurement process which produced that value in the first place is actually accurate to that level? You introduced a huge amount of noise just by taking the measurement.

14. Oh no, let's all get really defensive.

1) Yes, a double might not solve the problem entirely, but it would definitely work loads better than a float. You know this very well, yet you argue completely different points, such as:

"Oh my, but there are other histograms that work just fine for their purpose using floating point values". I am sure there are, but that has squat to do with the issue at hand.

2) Yes, that magnitude is relevant to the program, because nowhere in the description of the program does it say that I can only input numbers with at most 4 decimal digits.

15. Originally Posted by claudiu

> Oh no, let's all get really defensive.

If you say something stupid, I'm going to argue with you about it. End of story.
