# Thread: Standard Deviation in C

1. ## Standard Deviation in C

I'm trying to write a program that calculates the standard deviation of the numbers in an array.

I was looking at other posts on this site trying to figure it out and I came across this one where brewbuck posted saying

sqrt((sum_of_squares - square_of_sums / N) / (N-1))
How would I calculate the sum of squares and the square of sums?

2. Best to avoid looking at code, and get the math into your head, first. You need to know this:

Code:
```    The standard deviation is a measure of how spread out your data are. Computation of the standard deviation is a bit tedious. The steps are:

1.  Compute the mean for the data set.   (mean is the average)

2.  Compute the deviation by subtracting the mean from each value.

3.  Square each individual deviation.

4.  Add up the squared deviations.

5.  Divide by one less than the sample size.

6. Take the square root.

Suppose your data follows the classic bell-shaped curve pattern. One conceptual way to think about the standard
deviation is that it is a measure of how spread out the bell is.```
And now the code might begin to make sense, but it is pretty dense code. I'd suggest starting with a simpler, one-step-at-a-time approach.

3. For step 2 would I want to create another array just to store the deviation calculations?

4. Originally Posted by skmightymouse
For step 2 would I want to create another array just to store the deviation calculations?
"Parallel arrays" idea comes to mind as a natural way to go. Certainly not the ONLY way. Using a struct for the original array also seems appealing.

What strikes you as good?
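
If you do want to keep the deviations around, a second array indexed the same way as the first might look like the sketch below. The name `fill_deviations` and its parameters are made up for illustration, and it assumes the mean has already been computed.

```c
#include <stddef.h>

/* Stores values[i] - mean in deviations[i], so index i refers to the
   same item in both arrays. Both arrays must hold at least n elements. */
void fill_deviations(const double *values, double *deviations,
                     size_t n, double mean)
{
    for (size_t i = 0; i < n; i++)
        deviations[i] = values[i] - mean;
}
```

The original values stay untouched, which matters if they are needed again later.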

5. I'm pretty new to coding so I am currently scouring the internet to figure out what a parallel array is and how to use one. I don't think I want to do a struct of the original array though.

6. Originally Posted by skmightymouse
I'm pretty new to coding so I am currently scouring the internet to figure out what a parallel array is and how to use one. I don't think I want to do a struct of the original array though.
The critical aspect of parallel arrays is simply that there is a direct relationship between the two arrays' indices. Say you had a list of soups:
Code:
```Name       Size   Price
=======================
Chicken     10    2.10
Gumbo       14    2.05
Green Pea   12    1.75
Vegetable   10    1.88```
You could use an array of structs with members like char name[], int size and float price, but you could also use parallel arrays, keeping in mind that the name at soupNames[i] ALWAYS matches the correct size at soupSize[i] and price at soupPrice[i].

So all the soups data could be printed out using:
Code:
```for (i = 0; i < NumSoups; i++)
    printf("%12s  %2d  %f\n", soupNames[i], soupSize[i], soupPrice[i]);```
Using the parallel arrays of soupNames[][], soupSize[], and soupPrice[].
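
To make the comparison concrete, here is a rough sketch of the same soup data in both layouts. The array names follow the loop above; `NUM_SOUPS`, `struct soup`, `soups`, the two print functions and the 12-character name buffers are assumptions added for the example.

```c
#include <stdio.h>

#define NUM_SOUPS 4

/* Layout 1: parallel arrays. Index i refers to the same soup in all three. */
static const char  soupNames[NUM_SOUPS][12] = {"Chicken", "Gumbo", "Green Pea", "Vegetable"};
static const int   soupSize[NUM_SOUPS]      = {10, 14, 12, 10};
static const float soupPrice[NUM_SOUPS]     = {2.10f, 2.05f, 1.75f, 1.88f};

/* Layout 2: one array of structs. Each element carries its own fields. */
struct soup {
    char  name[12];
    int   size;
    float price;
};

static const struct soup soups[NUM_SOUPS] = {
    {"Chicken",   10, 2.10f},
    {"Gumbo",     14, 2.05f},
    {"Green Pea", 12, 1.75f},
    {"Vegetable", 10, 1.88f}
};

/* Prints one row per soup from the three parallel arrays. */
void print_soups_parallel(void)
{
    for (int i = 0; i < NUM_SOUPS; i++)
        printf("%-12s  %2d  %.2f\n", soupNames[i], soupSize[i], soupPrice[i]);
}

/* Prints the same rows from the array of structs. */
void print_soups_struct(void)
{
    for (int i = 0; i < NUM_SOUPS; i++)
        printf("%-12s  %2d  %.2f\n", soups[i].name, soups[i].size, soups[i].price);
}
```

Both functions print the same four rows. The difference is only in where the data lives: with parallel arrays you have to keep the indices in step yourself, while a struct keeps each soup's fields together.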

IMO the net isn't a good place to learn about things like parallel arrays. You're probably going to wander through the weeds, because both words are much more commonly used elsewhere.

I think that you were overcomplicating it, because I was able to write it like this without any parallel arrays or structs:

Code:
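```int i, n = numItems;
double sum = 0, variance;

/* Add up the squared deviations from the mean without overwriting array[]. */
for (i = 0; i < n; i++)
{
    double deviation = array[i] - average;
    sum += deviation * deviation;
}

/* Sample variance: divide by one less than the number of items. */
variance = sum / (n - 1);

printf("\nThe standard deviation is %.1f\n", sqrt(variance));```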
But I still appreciate the help.