Hi everyone ! It will be quite long so I will be direct.
I aim to execute a series of time tests on several sorting algorithms for some project to compare them.
For each sort (bubble sort, insertion sort etc...), I have to run the algorithm for different sizes among : 100, 500, 5000, 10000, 50000, 100000, 200000, 300000, 400000 & 500000 elements and for each size, I have to execute the code on 20 different arrays which should be filled randomly.
Let have the bubble sort algorithm, I post here the program which, taken an array in disorder, make all the correct operations to return it in ascending order :
Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 500000
typedef int ARRAY[MAX];
void swap(ARRAY A, int val1, int val2){
int value = A[val1];
A[val1] = A[val2];
A[val2] = value;
}
void bubble_sort(TABLEAU T, int n){ /*
You should note that the function is taking n in argument so that, taken an array of 500000 elements, it sorts only on the n first elements
*/
int i, j;
for(i = n; i > 0; i--)
for(j = 0; j < i; j++)
if (A[j] > A[j+1])
swap(A, j, j+1);
}
Eventualy we would have to display the array in question like this but it is not very important here :
Code:
void display(ARRAY A, int n){/*
Again it only displays the same n first demanded values of the array
that are actually sorted
*/
int i;
for(i = 0; i < n; i++)
printf("%d ", A[i]);
}
I should precise some additional things :
We evaluate a series of tests like this : for each among the 20 arrays to be run, we take the time of execution of the calling of the bubble sort function. We sum up all of these times then we divide by the number of the arrays treated (optimally 20). It gives for each size of array the average time of execution of a sort.
for each series of test of a same size of array, if the time to execute the series is beyond 5mn, I have to stop the series and go for the next size series. For example, with a series of 20 arrays of 50,000 elements, if the sorting of the third ones arrays makes time go over 5mn, I stop and the average time for this series will be : Time elapsed / 3, then I switch to the series of arrays of 100,000 elements.
So I wrote the lines & the main function below to automate the tests :
Code:
void(*pbubble_sort)(ARRAY, int);
int sizes[10] = {100, 500, 5000, 10000, 50000, 100000, 200000, 300000, 400000, 500000};
int main(void){
pbubble_sort = &buble_sort;
ARRAY A;
int i, j, k, nb_array;
float timer;
clock_t start, end;
for(i = 0; i < 10; i++){
timer = 0.0;
nb_array = 0;
for(j = 0; j < 20; j++){
srand(time(0)); /*
Initialize the random numbers generator
*/
for(k = 0; k < sizes[i]; k++)
A[k] = (unsigned int)rand();
start = clock();
(*pbubble_sort)(A, sizes[i]);
end = clock(); /*
Here I take the time of the calling of the function
*/
float t = (end - start) * 1.0 / CLOCKS_PER_SEC;
if (timer + t > 5.0*60.0){
printf("\nNon finished session\n"); /*
Here I indicate whether a series of test is finished or not compared with the time of 5 mns.
*/
break; /*then I go out of the loop*/
}
else {
timer += t;
nb_arrays += 1;
}
}
float average_time = timer / nb_arrays;
printf("\n Size %d : Average = %f\n", sizes[i], average_time);
}
}
It actually works pretty well, but as you know, the bubble sort is the slowest of all the sorting algorithms, so I must wait for nearly half an hour to end the program. From the size of 200,000 elements, as it longs more than 5mn to execute one sorting of one array, none of the 20 arrays can be sorted in time, so the average time is equal to infinite value, which is theorically correct compared with the limit of 5 mns which we have set.
But the problem is that even though the program checks for the 5mn overtaking, it can only check it AFTER the calling of the function. So it isn't useful at all when I begin the series of arrays of 500,000 elements where a simple calling for an array may take huge huge time (20mn for example).
So I would force the for loop inside which assume the calculus of the time to stop automatically after 5mn, even if the calling of the function bubble_sort is still in process and has no returned (and which probably will return 20 mns later) to avoid lose time for Nothing.
So, the total time of execution would cost me 5mn (time maximum for one loop) * 10 (different sizes) to the worst, assuming that it's not the case because when 100 or 500 elements, the time of execution is ridiculously low.
I heard about this problem can be solved with Threads but I don't know how it works. If someone could have helped me on this project, I will greatly thank hiM
I assume that the biggest task for the potential helpers would be to read this post to the very end
I hope I make myself clear and I thank everyone who could help me