1. ## Sorting Functions

Hi all,

Ive been working on sorting numbers in an array,
I can get the sorting to work somewhat but it will not loop through all of the numbers enter on the first try. This is an class project so im not looking for a full answer just some guidance.

Attached is the for loop I'm using with the array declaration.

Code:
```   #include <stdio.h>
#include <math.h>

int n;
int Data[50];
int i ;

int printArray(void) {         // prints all numbers in array
for (i=0;i<n;i++){
printf("%d, ", Data[i]);
}
printf ("\n");
return (0);
}

int getArray(void) {           // scanf numbers into array

printf("Enter a list of Data to compute.\n");
scanf("%d", &n);

for (i=0;i<n;i++){
printf("%d = ", i);
scanf("%d", &Data[i]);
}
return (0);
}

int sort_sm(void){         //Sorts the array from smallest to largest.
int swap = 0;
int temp = 0;
int i = 0;
int n = 5;

for(swap=1;swap<n-1;++swap)
for(i=0;i<n-swap;++i)
{
if(Data[i]>Data[i+1])
{
temp=Data[i];
Data[i]=Data[i+1];
Data[i+1]=temp;
}
}
printArray();                           //Print results

return(0);
}```

2. Several things:

1. Providing us a complete program (i.e. a main function for testing) would be nice next time. The less work we have to do, the better
2. Global variables are bad (link). Declare them in main and pass them as needed.
3. What happens if n > 50? You should do at least a basic check/validation of input.
4. A function should do one thing and do it well. sort_sm should not call printArray, it should only sort. Print the array after sorting from main.
5. Fix your prompts in getArray, they ask for the wrong info, so I broke your program the first time. Something like "Enter number of data points" and "Enter data point %d"
6. Keep your naming and indentation consistent. Either camelCase, or underscore_separated for the same type of identifier (e.g. function name). Consistency, order and readability are important, as it makes your code easier to read, maintain and fix, and harder for you to make mistakes.
7. Learn to use a debugger, so you can step through and examine the state of your program while it's running. That takes a while, so in the short term, try adding some debug output in your sort function to test.
8. Remove math.h. You're not using any functions from there, so you don't need to #include it.

Check line 39 in your post:
Code:
`int n = 5;`
This n shadows the global n, thus it screws up your loop.

3. Thank you for the reply

1. Attached is a fully functioning program with just some logic errors.
2. I will look into the link you posted, Thanks!

As for the rest I will work on cleaning up the program

Code:
```#include <stdio.h>

int n;
int Data[50];
int i ;

int printArray(void) {         // prints all numbers in array
for (i=0;i<n;i++){
printf("%d, ", Data[i]);
}
printf ("\n");
return (0);
}

int getArray(void) {           // scanf numbers into array

printf("Enter number of data points.\n");
scanf("%d", &n);

for (i=0;i<n;i++){
printf("%d = ", i);
scanf("%d", &Data[i]);
}
return (0);
}

int sort_sm(void){         //Sorts the array from smallest to largest.
int swap = 0;
int temp = 0;
int i = 0;

for(swap=1;swap<n-1;++swap)
for(i=0;i<n-swap;++i)
{
if(Data[i]>Data[i+1])
{
temp=Data[i];
Data[i]=Data[i+1];
Data[i+1]=temp;
}
}
printArray();                           //Print results

return(0);
}

int sort_lg(void){                         //Sorts the array from largest to smallest.
int swap = 0;
int large = 0;
int i = 0;
int n = 9;
int x = 0;

do {
swap = 0;                         //Reset swap bit.
for (i = 0; i < n; i++) {           //Loop through array

if (Data[i] < Data[i + 1]){     //If data is out of order, Move data
large = Data[i];
Data[i] = Data[i + 1];
Data[i + 1] = large;
swap = 1;                   //Data was moved.

}
}
x += 1;
i = 0;
} while (swap == 1 && x < 9);                                  //If data was moved, Do again.
printArray();                                                      //Print results

return(0);
}

int find_sm(void){                                               //Finds and outputs the smallest number in the array.
int i = 0;
int small = Data[0];

for (i = 0; i < n; i++) {                                      //Scan through the array.
if (Data[i] < small){                                      //If data 1 is smaller than data 2
small = Data[i];                                    // Data 1 is smaller
}
}
printf("%d is the smallest number in the array.\n", small);
return(0);
}

int find_lg(void){                                             //Finds and outputs the largest number in the array.
int i = 0;
int large = Data[0];

for (i = 0; i < n; i++) {                                   //Scan through the array.
if (Data[i] > large){                                      //If data 1 is greater than data 2
large = Data[i];                                     // Data 1 is larger
}
}
printf("%d is the largest number in the array.\n", large);
return(0);
}

int avg(void){                                                    //Takes the sum then outputs the average of the array.
double avg;
double sum = 0;

for (i=0;i<n;i++){
sum += Data[i];                                        //add the total sum
}
avg = sum / n;                                            //divide by number of data points
printf("The average of the data is %.2f.\n", avg);
return(0);
}

int main(void) {
int choice = 0;

getArray();
printArray();

do {

printf("\n\n");
printf("Choose from the following menu: \n\n");
printf("\t1 - Ascending order\n");
printf("\t2 - Descending order\n");
printf("\t3 - Smallest\n");
printf("\t4 - Largest\n");
printf("\t5 - Average\n");
printf("\t0 - Exit the program\n");
scanf("%d", &choice);
printf("\n");

switch(choice){
case 1:{
sort_sm();
break;}
case 2:
sort_lg();
break;
case 3:
find_sm();
break;
case 4:
find_lg();
break;
case 5:
avg();
break;
default:
if (choice != 0 && choice != 1) {
printf("Your choice (%d) is not valid. \n", choice);
}
break;
}
} while (choice != 0);

return (0);
}```

4. Please be more specific about the errors. Please provide the input you gave, the output you expected to see and the output you actually got.

5. What operating system and compiler are you using? Do you have access to a source level debugger, which would help quite a bit?

6. You still have not addressed the declaration of n inside sort_lg shadowing the global variable.
That would appear to be the cause of your problem, so if you don't know how to fix that now that it has been pointed out, ask more questions.

7. Always write a sort function like qsort(). If you're not familiar with qsort() then look it up, and get it working. Use a list of integers generated by rand() to test it.
Once you've got qsort() working, you can test your own function against qsort(). A good test is to put a counter++ (counter being a global variable) in the comparison function. This tells you how often the comparision function is being called, which is a very good proxy for efficiency.
You're unlikely to beat qsort() because it is extremely highly optimised.

8. I took the liberty of re-indenting your code, and enclosing your existing comments in /* */ because I added some more starting with //. I hope it will help you (I left the naming and the logic of your code intact).

[offtopic ? ]
Adding to what other people suggested in this thread, also consider the following:

Try to avoid global variables. Instead try to define local variables inside the functions that need them. If those functions call other functions that need access to any of those variables, then pass those variables as arguments (parameters) to the functions being called.

If a function being called does not need to return back to its caller the value of a passed argument (modified or not), then the caller just passes the variable to that function.

However, if a function being called modifies the value of a passed argument and needs to return the new value back to the caller through the same argument, then the caller passes the address of the variable to the function (which in turn treats it as a pointer inside its body).

An example of the above would be to define both your Data array and its user-defined length n locally in the main() function, and then pass them as arguments to any other functions being called by main() that need access to those variables.

Here is an example of the suggestions above, for your getArray() and printArray() functions...

Code:
```/* ---------------------------------------------------------- */
void flush_stdin( void )
{
int c;
while ( '\n' != (c=getchar()) && EOF != c )
; /* void */
}

/* ----------------------------------------------------------
*/
int getArray( int Data[50], int *n )
{
int i, temp=0;        /* temp is for validating scanf's return value */

/* keep asking for n until user inputs a valid value */
do {
puts("Enter number of data points.");
temp = scanf("%d", n);

/* on bad input flush stdin (does not work for input like: 12df) */
if ( 1 != temp ) {
flush_stdin();
continue;
}

} while ( *n < 1 || *n > 50 );

for (i=0; i < *n; i++) {
/* keep asking until user inputs a valid int */
for (;;) {
printf("%d = ", i);
fflush( stdout );
temp = scanf("%d", &Data[i]);

/* on bad input flush stdin (does not work for input like: 12df) */
if ( 1 != temp ) {
flush_stdin();
continue;
}
break;
}
}

return 0;
}

/* ----------------------------------------------------------
* Prints all numbers in array.
*/
int printArray( const int Data[50], int n )
{
int i;

/* sanity check */
if ( n < 1 || n > 50 ) {
fputs( "*** internal error!\n", stderr );
return 1;
}

for (i=0; i < n; i++) {
printf("%d, ", Data[i]);
}
putchar('\n');

return 0;
}
/* ---------------------------------------------------------- */
int main( void )
{
int Data[50];    // the array (WITH ROOM FOR UP TO 50 ELEMENTS)
int n;           // <--- user defined total count of array elements (YOU MUST VALIDATE IT ON INPUT)

getArray( Data, &n );
printArray( Data, n );
...

}```
Admittedly, the modified getArray() function is convoluted because it tries to reject non-integer values on input using the return value of scanf(). It also fails to flush the stdin when input is given of the form "123jklsj", but at least it works.

Generic interactive input via standard C functions is sadly quite complicated (here is an enlightening article, if you are up to it), but since this is not the main point in this thread, here's a simplified version of the modified getArray() function...

Code:
```int getArray( int Data[50], int *n )
{
int i;

/* keep asking for n until user inputs a valid value */
do {
puts("Enter number of data points.");
scanf("%d", n);
} while ( *n < 1 || *n > 50 );

for (i=0; i < *n; i++) {
printf("%d = ", i);
fflush( stdout );    /* play it safe on all platforms */
scanf("%d", &Data[i]);
}

return 0;
}```
One thing to remember here, imho, is that Data[50], n, and i are defined in their logically minimal scope, and they are passed as function arguments only if they need to be accessed in deeper scopes.

Well, I didn't express the last paragraph as nicely as I intended, perhaps a native English speaker can revise it using more precise wording.

The other thing to remember, is to always validate the user input before using it.

PS. Is there a tag equivalent for the [offtopic] or [spoiler] tags used in other boards?

[/offtopic ?]

9. Originally Posted by Malcolm McLean
...
You're unlikely to beat qsort() because it is extremely highly optimised.
Hm, I would say that due to its generic nature, the C library qsort() often suffers a performance penalty compared to a custom implementation, specialized for the exact desirable data set and types.

10. Originally Posted by migf1
Hm, I would say that due to its generic nature, the C library qsort() often suffers a performance penalty compared to a custom implementation, specialized for the exact desirable data set and types.
Actually it really depends on how qsort is written. That might sound obvious, but the function qsort does not even need to be quicksort. It can be any function that sorts in O(nlogn) time.

However there are fast implementations.

And you are unlikely to beat qsort, if only because you could do most of the work with memmove, which is well-written. Many implementations speed up memmove by copying word by word, not byte by byte... if you can take advantage of that, you get a significant performance gain.

11. Originally Posted by whiteflags
Actually it really depends on how qsort is written. That might sound obvious, but the function qsort does not even need to be quicksort. It can be any function that sorts in O(nlogn) time.
Yeap, this is true. I think it is even mentioned in the standard that it needs not be implemented as genuine quick-sort (but I may be mistaken, too lazy to look it up now).

However there are fast implementations.

And you are unlikely to beat qsort, if only because you could do most of the work with memmove, which is well-written. Many implementations speed up memmove by copying word by word, not byte by byte... if you can take advantage of that, you get a significant performance gain.
You got me a bit confused. Perhaps you meant to write "And you are likely to beat qsort" in there, or is it my poor English confusing me?

In any case, the generic qsort() also introduces repeated stack and casting overhead which, keeping all other things constant, may seriously hurt performance.

PS. Irrelevant here, but I think it is worth mentioning that quick-sort is horrible for mostly sorted data-sets (I pointed it out only for people that are just starting playing with sorting algs).

12. You got me a bit confused. Perhaps you meant to write "And you are likely to beat qsort" in there, or is it my poor English confusing me?
No. You are unlikely to beat qsort. Not likely. un- is a negative prefix we inherited from Germans.

In any case, the generic qsort() also introduces repeated stack and casting overhead which, keep all other things constant, may seriously hurt performance.
I haven't seen a qsort implementation that actually uses recursion.

PS. Irrelevant here, but I think it is worth mentioning that quick-sort is horrible for mostly sorted data-sets (I pointed out only for people that are just starting playing with sorting algs).
Not really irrelevant. Understanding what makes quicksort suck can help you. You do not have to write the purest quicksort ever, and all(?) vendors won't.

Look at this:
qsort.c

There are all sorts of clever things you can do to speed up qsort. This function actually falls back on insertion sort once the partitions are small enough. And the pointers zoom right past any sorted parts of the arrays once the split occurs. So this qsort does not suck for mostly sorted arrays, either.

13. Originally Posted by whiteflags
No. You are unlikely to beat qsort. Not likely. un- is a negative prefix we inherited from Germans.
Ah ok

I haven't seen a qsort implementation that actually uses recursion.
I was referring to the repeated calls of the generic comparison function.

Look at this:
qsort.c

There are all sorts of clever things you can do to speed up qsort. This function actually falls back on insertion sort once the partitions are small enough. And the pointers zoom right past any sorted parts of the arrays once the split occurs. So this qsort does not suck for mostly sorted arrays, either.
Yeap, I'm aware of this fact. However I recall for example many people complaining about the poor performance of glibc's qsort() a few years back (2005 or so I think). Many of them even provided alternative implementations (I just googled one: belazougui djamel - about glibc qsort ... this guy posted in 2007 though).

All in all, I agree that in the vast majority of cases it is not worth the effort and the time (which more or less means money ) to compete with the built-in implementation, especially nowadays and especially with trusted implementations.

Still, there are niche sectors (embedded for example) that may be even crucial to make sure that your algs are as efficient as possible in your specific environment.

14. I compared microsoft's quick sort (vector sort()) versus microsoft's merge sort (vector stable_sort()) to sort 1048576 pointers to 10 byte text records of random data (8 bytes data, cr, lf), and got these results:

64 bit mode, Windows XP64, Visual Studio 2005 compiler.
Intel i7 2600K 3.4ghz, DP67BG motherboard, 4GB ram.

Code:
```microsoft stable_sort      sortp.cpp      313 ms
microsoft sort             sortp.cpp      359 ms```
32 bit mode, Windows XP, Visual Studio 2005 compiler.
Intel i7 2600K 3.4ghz, DP67BG motherboard, 4GB ram.

Code:
```microsoft stable_sort      sortp.cpp      281 ms
microsoft sort             sortp.cpp      328 ms```
The situation was reversed when sorting 4,194,304 64 bit unsigned integers of psuedo random data:

64 bit mode, Windows XP64, Visual Studio 2005 compiler.
Intel i7 2600K 3.4ghz, DP67BG motherboard, 4GB ram

Code:
```microsoft sort             sortv.cpp     344 ms
microsoft stable_sort      sortv.cpp     375 ms```

15. Originally Posted by migf1
Ah ok
Still, there are niche sectors (embedded for example) that may be even crucial to make sure that your algs are as efficient as possible in your specific environment.
qsort has several problems with it. The main ones are that it doesn't take an arbitrary pointer to the comparison function, so if you want to compare on a user-generated expression, you have to use a global, it's not stable, so you can't keep the existing order for equal items, and it doesn't return an index, so if you have a struct employee you can sort employees by salary very easily, but if you have parallel arrays containing ids and salaries, it's difficult.
So there are lots of reasons you might write a different one. Mostly coding sorts is just a learning exercise, however.