Thread: Having trouble sorting an array of strings using a generic quicksort

  1. #1
    Registered User
    Join Date
    May 2015
    Posts
    228

    Having trouble sorting an array of strings using a generic quicksort

    So I've been trying to implement a generic quick sort for educational purposes and so far my quick sort has been able able to sort any array of integers just fine but it does not work with an array of strings. My only guess is that maybe I'm not offsetting the correct amount of bytes since I'm dealing with void pointers

    Code

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    void quickSort(void *, size_t, size_t, int (*cmp)(const void *, const void *));
    void quickSort_r(void *, int, int, size_t, int (*cmp)(const void *, const void *));
    int qSortPartition(void *, int, int, size_t, int (*cmp)(const void *, const void *));
    int is_sorted(void *, size_t, size_t, int (*cmp)(const void *, const void *));
    int cmp(const void *, const void *);
    void Swap(void *, void *, size_t);
    
    int main()
    {
        unsigned int i;
        char *a[] = {"Luis", "Daniel", "Alfredo", "Salvador", "Bruno"};
    
        size_t numberOfElements = sizeof(a)/sizeof(a[0]);
    
        if(numberOfElements > 1)
        {
            quickSort(a, numberOfElements, sizeof(char *), cmp);
        }
    
        for(i = 0; i < numberOfElements; i++)
        {
            printf("%s ", a[i]);
        }
    
        return 0;
    }
    
    void quickSort(void *base, size_t nitems, size_t memSize, int (*cmp)(const void *, const void *))
    {
         quickSort_r(base, 0, nitems - 1, memSize, cmp);
    }
    
    void quickSort_r(void *base, int first, int last, size_t memSize, int (*cmp)(const void *, const void *))
    {
       if(first < last)
       {
           int pivot;
    
           if(is_sorted(base, last + 1, memSize, cmp))
           {
               return ;
           }
    
           pivot = qSortPartition(base, first, last, memSize, cmp);
    
           quickSort_r(base, first, pivot - 1, memSize, cmp);
           quickSort_r(base, pivot + 1, last, memSize, cmp);
       }
    }
    
    
    int qSortPartition(void *base, int first, int last, size_t memSize, int (*cmp)(const void *, const void *))
    {
        char *carray = (char *)base;
    
        int pivot, mid = (first + last) / 2;
    
        pivot = cmp(&carray[first * memSize], &carray[mid * memSize]) >= 1 ? first : mid;
    
        if(cmp(&carray[pivot * memSize], &carray[last * memSize]) >= 1)
        {
            pivot = last;
        }
    
        Swap(&carray[pivot * memSize], &carray[first * memSize], memSize);
        pivot = first;
    
        while(first < last)
        {
          if(cmp(&carray[first * memSize], &carray[last * memSize]) <= 0)
          {
              Swap(&carray[pivot * memSize], &carray[first * memSize], memSize);
              pivot++;
          }
    
          first++;
        }
    
        Swap(&carray[pivot * memSize], &carray[last * memSize], memSize);
    
        return pivot;
    }
    
    
    int is_sorted(void *base, size_t nitems, size_t memSize, int (*cmp)(const void *, const void *))
    {
      char *carray = (char *)base;
      unsigned int i;
    
      for(i = 0; i < nitems - 1; i++)
      {
         if(cmp(&carray[i * memSize], &carray[i + 1 * memSize]) >= 1)
         {
             return 0;
         }
      }
    
      return 1;
    }
    
    
    int cmp(const void *a, const void *b)
    {
        return (*(int *)a - *(int *)b);
    }
    
    void Swap(void *a, void *b, size_t memSize)
    {
        void *temp = malloc(memSize);
        memcpy(temp, a, memSize);
        memcpy(a, b, memSize);
        memcpy(b, temp, memSize);
        free(temp);
    }

  2. #2
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    You're still comparing ints in your compare function. And you're not actually doing that correctly, either. Consider if 'a' is INT_MIN and 'b' is positive.

    BTW, your swap function is very expensive: a malloc and a free and three memcpy's, for every single swap. That's insane. A bytewise copy would be better (with a single temp char).

  3. #3
    Registered User
    Join Date
    May 2015
    Posts
    228
    Alright I've changed my cmp Function to this.

    Code:
    int cmp(const void *a, const void *b)
    {
       const int *A = a, *B = b;
       return (*A > *B) - (*A < *B);
    }
    You said that I'm still comparing ints but what I've seen online is almost everyone converts them to an int(or to a const int in this example) and check whether a > b, a < b or a == b which is what the cmp function is suppose to do for any data type you compare.
    Last edited by deathslice; 09-04-2016 at 08:37 PM.

  4. #4
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    I don't see how you can sort strings without comparing the strings. For example, with the standard qsort function, your comparison function might be:
    Code:
    int cmp(const void *a, const void *b) {
        return strcmp(a, b);
    }

  5. #5
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Code:
    int cmp(const void *a, const void *b){
        return strcmp((*(const char **)a), (*(const char **)b));
    }

  6. #6
    Registered User
    Join Date
    May 2015
    Posts
    228
    Ok at least now I know it has to do with the way I'm offsetting the memory because I used strcmp and it still did not sort it(but it did with integers). At least know I my swap, cmp and quick sort functions work. I just need to make sure I'm offsetting correctly.

    For quick reference, what I'm doing is casting the void * to char * and storing it to a char *ptr. Then I simply do *(ptr + n * size of type)(n is already offset by 1. No need to do n-1)
    Last edited by deathslice; 09-04-2016 at 09:15 PM.

  7. #7
    Registered User
    Join Date
    May 2015
    Posts
    228
    Wouldn't it just be const char *? why are you casting it to a double ptr and deferencing it to a single ptr? why not just

    Code:
     return strcmp((const char *)a, (const char *)b);
    If you must know, your example caused my program to crash and the reason why it did probably had to with the double ptr casting.
    Last edited by deathslice; 09-04-2016 at 09:18 PM.

  8. #8
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Quote Originally Posted by deathslice View Post
    Wouldn't it just be const char *? why are you casting it to a double ptr and deferencing it to a single ptr?
    That's not how qsort() does it and I don't think any other re-implementation will be different, either.

    When you have a flat array of strings, it's really a 2d array of chars, so you will mostly be treating the data like that. Point into two different rows (each row being a string) and pass those into compare() or whatever it may be called.

  9. #9
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    For qsort, at least, the two ways you might write the compare function depend on whether it's an array of pointers, like yours is, or an array of strings. I wrote my example compare function for an array of strings. whiteflag's example is correct for an array of pointers to strings.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int cmpa(const void *a, const void *b) {
        return strcmp(*(const char**)a, *(const char **)b);
    }
    
    int cmpb(const void *a, const void *b) {
        return strcmp(a, b);
    }
    
    int main(void) {
        char *a[] = {"one", "two", "three", "four", "five"};
        char b[][100] = {"one", "two", "three", "four", "five"};
    
        int lena = sizeof a / sizeof *a;
        qsort(a, lena, sizeof *a, cmpa);
        for (int i = 0; i < lena; i++)
            printf("%s\n", a[i]);
    
        printf("----\n");
    
        int lenb = sizeof b / sizeof *b;
        qsort(b, lenb, sizeof *b, cmpb);
        for (int i = 0; i < lenb; i++)
            printf("%s\n", b[i]);
    
        return 0;
    }

  10. #10
    Registered User
    Join Date
    May 2015
    Posts
    228
    Here is an example of me calling cmp and then maybe you can tell me if I'm doing it right

    Code:
    if(cmp(&carray[i * memSize], &carray[i + 1 * memSize]) > 0)
    {
            return 0;
    }
    This is our original array of pointers

    Code:
    char *a[] = {"Luis", "Daniel", "Alfredo", "Salvador", "Bruno"};
    I'm following the prototype that the function qsort has so when I initially pass it to my qsort function, the only way that I can reference it now is through my void pointer by type casting it to a char * and storing the address in a char *(in this case carray). If I then want to access a specific element or row, I would do *(ptr + (n - 1) * size of type) or ptr[(n - 1) * size of type]. That is what I believe I'm doing here in this example.

    I tried debugging and see what would happen if compare Luis with Daniel and what it happened was that it would not enter the if statement even though Luis is lexicographically greater than Daniel. This leads me to believe that what I'm doing is not right.

    I also still don't know why I'm crashing when I change my cmp function to

    Code:
    return strcmp(*(constchar**)a, *(constchar**)b);
    Last edited by deathslice; 09-04-2016 at 09:45 PM.

  11. #11
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    Try printing stuff from your compare routine:
    Code:
    int cmp(const void *a, const void *b) {
        const char *aa = *(const char **)a, *bb = *(const char **)b;
        fprintf(stderr, "comparing: [%s] [%s]\n", aa, bb);
        int ret = strcmp(aa, bb);
        fprintf(stderr, "   result: %d\n", ret);
        return ret;
    }

  12. #12
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    I wonder if you didn't test it very well with integers. I can't get your code to sort this array using the comparison function you gave in post 3:
    Code:
        int a[] = {5, 3, 1, 7, 2};
    Here's a bytewise swap function. It should generally vastly outperform the malloc-based one:
    Code:
    void Swap(void *a, void *b, size_t memSize) {
        char *aa = a, *bb = b;
        do {
            char t = *aa;
            *aa++ = *bb;
            *bb++ = t;
        } while (--memSize > 0);
    }
    Last edited by algorism; 09-04-2016 at 10:42 PM.

  13. #13
    Registered User
    Join Date
    May 2015
    Posts
    228
    Alright I figured out my problem. The problem was that once my array was finally sorted before calling the final quick sort, the changes did not stick and so the output was not fully sorted. Plus once I took whiteflags advice, it finally worked. Can you please explain why it had to be like this

    Code:
    return strcmp(*(const char **)a, *(const char **)b);
    and why

    Code:
    return strcmp((const char *)a, (const char *)b);
    is wrong. Maybe a picture would help illustrate your point better(no pun intended ).

  14. #14
    Registered User
    Join Date
    May 2015
    Posts
    228
    That is weird. I used the comparison function in post 3 and it sorted it just fine.

  15. #15
    Registered User
    Join Date
    May 2015
    Posts
    228
    I'm assuming that what your byteswap function does is tmp holds one byte of information from aa, bb transfer one byte to aa, and then tmp gives one byte of information to bb. This cycle repeats until each byte has been swapped correct?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Sorting strings within an array
    By tictac232434 in forum C Programming
    Replies: 7
    Last Post: 11-03-2012, 02:07 PM
  2. Having trouble sorting an Array that has a zero in it.
    By Cgmgator in forum C Programming
    Replies: 17
    Last Post: 06-29-2011, 09:19 PM
  3. sorting an array of strings
    By porstart in forum C Programming
    Replies: 3
    Last Post: 02-22-2011, 10:43 PM
  4. Sorting strings in an array
    By JLan in forum C Programming
    Replies: 4
    Last Post: 11-29-2003, 05:10 PM
  5. Trouble with array sorting
    By dcj1978 in forum C++ Programming
    Replies: 3
    Last Post: 09-19-2003, 12:16 PM

Tags for this Thread