Thread: Help..Heapp..Heapp..

  1. #1
    Registered User
    Join Date
    Jan 2011
    Location
    Mississauga
    Posts
    10

    Help..Heapp..Heapp..

    Hi All,

    I wrote a program to read a file then sort it using up and down heapsort.. Yet, it only shows me the result correctly if there are less than 8 values, i have been trying to debug and still can't find where is the logic error.. Would you tell me where's the logic error?

    Thanks..

    Code:
    #include<stdio.h>
    
    int array[300000],maxElm,i,j,k;
    
    void pullValues() {
    	printf("\nSorted Array\n");
            maxElm = j;
    	for(i = 1; i <= j; i++) {
    		printf("%d\n", array[i]);		
    	}
    }
    
    void upSort(int *array,int i) {
            int v = array[i];
    
            while((i > 1) && (array[i/2] < v))
            {
                    array[i] = array[i/2];
                    i = i / 2;
            }
            array[i] = v;
    }
    
    void downSort(int *array,int i,int maxElm) {
            int v = array[i];
            int j = i * 2;
            while(j <= maxElm)
            {
                    if((j < maxElm) && (array[j] < array[j+1])) {
    			j++;
    		}
                    if(array[j] < array[j/2]) {
    			break;
    		}
                    array[j/2] = array[j];
                    j = j * 2;
            }
            array[j/2] = v;
    }
    
    main() {
    	FILE *fp;
    
    	printf("\nPlease input the number of entries (k): ");
    	scanf("%d", &maxElm);
    
    	if((fp = fopen("test_dat.txt", "r")) == NULL) {
    		printf("Failed to open test_dat.txt\n");
    		return 0;
    	} else {
    		printf("Success to open test_dat.txt\n");
    		while(!feof(fp)) {
    			for(i = 1 ; i <= maxElm; i++) {
    				fscanf(fp, "%d", &array[i]);
                  			upSort(array,i);
            		}
    		}
    	}
    
            j=maxElm;
            for(i = 1; i <= j; i++) {
                    int temp;
                    temp=array[1];
                    array[1]=array[maxElm];
                    array[maxElm]=temp;
                    maxElm--;
                    downSort(array,1,maxElm);
            }
    	pullValues();
    }

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Silly question: do you have more than 8 numbers in test_dat.txt?

  3. #3
    Registered User
    Join Date
    Jan 2011
    Location
    Mississauga
    Posts
    10
    Actually, I have 200,000 numbers and each of them are 5 digits.. But I tried wit 10 numbers and it gives me weird results.. Any idea why it works with less than 8 inputs?

  4. #4
    Registered User
    Join Date
    Jan 2011
    Location
    Mississauga
    Posts
    10
    I forget to mention that the result is in ascending order, it gives me what I want with less than 8 inputs.. Thanks

  5. #5
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    The global variables and one-based arrays made me throw up in my mouth a bit, but the rest of the code was tolerable. Just a few notes before wewe get to your algorithm:
    1. main needs to be int main and you should return an integer (preferably zero) when done
    2. Arrays in C are zero-base, i.e. the first element is array[0], not array[1].
    3. Global variables are evil and make debugging more difficult. Make all of them, except your array (because it's so big) local to main and pass them as function parameters.
    4. Also, it would make more sense to leave maxElem the same and make j the value that changes (e.g. at the bottom of main). Less re-purposing of variables means you are less likely to have bugs.


    And for the $64,000 question:
    Quote Originally Posted by c3jcarmy View Post
    Code:
    void downSort(int *array,int i,int maxElm) {
            int v = array[i];
            int j = i * 2;
            while(j <= maxElm)
            {
                    if((j < maxElm) && (array[j] < array[j+1])) {
    			j++;
    		}
                    if(array[j] < array[j/2]) {
    			break;
    		}
                    array[j/2] = array[j];
                    j = j * 2;
                    // finish swapping here
            }
            array[j/2] = v;
    }
    Try moving the rest of that swap (red line) inside the while loop.

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Okay I figured out what you mean by "up and down heapsort".

    That's not the normal way of doing heapsort. You're build the heap the naieve O(n*log n) way whereas it is typically done in O(n). Of course the subsequent extraction of each value from the heap is O(n*log n), so it's still O(n*log n) either way. In practice is makes the process oh say twice as fast.

    However for 200000 values that are only six digits, there must clearly be some duplicates.
    There are certain algorithms that are tailored to this situation that are more efficient. For the best speed you should consider other algorithms such as Flashsort, Bucketsort, Countingsort, or Radixsort. These may easily be ten times or so faster again. Should you care about speed at all that is.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  7. #7
    Registered User
    Join Date
    Jan 2011
    Location
    Mississauga
    Posts
    10
    Thanks to anduril462... It works perfectly fine by now! lol..
    I use A[1] instead of A[0] since it is easier for me to debug how the heap will be.. But yeah, I will try to use based 0 instead of 1!
    About the global variables, I thought it is easier to initiate a global variable rather than initiate the variable in every functions.. Is it more efficient than local variables? Sorry for this silly question..

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Global variables might be a little more efficient in some regards, but not enough to justify their use everywhere. You are exhibiting "pre-optimization", which means you're trying to make your program faster before you know whether it's too slow. This usually leads to problems. Write your code as clearly and solidly as possible, then only optimize the parts that need it and only if they really need it.

    Here is a great article on why globals are bad.

  9. #9
    Registered User
    Join Date
    Jan 2011
    Location
    Mississauga
    Posts
    10
    Wow.. thanks! That answers all my questions.. lol
    I'll try other sorts also, thanks..

  10. #10
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Hint: My homepage as found in my sig below is mostly about a massive collection of sorting algorithms, though I no longer say this in the link title since I don't want beginners to go straight there and just copy the code.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  11. #11
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Nothing wrong with '1' based arrays, especially for heapsort. Its algorithm depends on array indexes being multiplied by 2 which would be ugly if one started with 0. I would declare a pointer int *pnt = &array[-1] and use that.

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    That is not kosher according to others who have quoted me the standard before. Some machine may store that array at address zero, and subtraction of 1 may not go below zero, yadda yadda and other such crap that means negative array index isn't technically allowed, even if you never access it.
    However it's not actually necessary. It's easy enough to do it zero based. See the code on my webpage.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    OK, iMalc. I didn't realize calculating negative address could be a problem. I guess it depends on whether the compiler generates code for signed vs. unsigned integer at that point, even if it's for an intermediate result.

    I don't wanna go to your webpage. There's no way to multiply the index of array[0] by 2 and get something meaningful. You'll have to put 'if' statements all over the place for that exception. No amount of webpage can make 0 x 2 = 1.

  14. #14
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Your left and right children are i * 2 + 1 and i * 2 + 2, no if statements needed.

  15. #15
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Good point. For some reason I always coded children are i * 2 and i * 2 + 1. I save an addition some of the time.
    Code:
       other way             my way
    
          [0]                  [1]
      [1]     [2]          [2]     [3]
    [3] [4] [5] [6]      [4] [5] [6] [7]
    Last edited by nonoob; 02-03-2011 at 01:49 PM.

Popular pages Recent additions subscribe to a feed