Thread: Sometimes "goto" is useful

  1. #1
    Registered User
    Join Date
    Apr 2013
    Posts
    1,636

    Sometimes "goto" is useful

    Here is an example of a 4 way (no minheap) bottom up merge sort, which is about 15% faster than a 2 way merge sort and about as fast (sometimes a bit faster) than quick sort, on a processor with 16 registers, such as X86 in 64 bit mode, where the compiler optimization uses 8 of the registers for pointers to the current and end of runs. Trying to remove the goto's in this case would duplicate code and make it more complex. To implement this in Java, which doesn't have goto (but does reserve it), I had to switch to a top down merge sort, which is a bit slower, somewhat defeating the purpose of a 4 way merge sort (for performance).

    One general reason for using goto is when there are multiple paths leading to common code, and this is an example of that.

    Code:
    int * BottomUpMergeSort(int a[], int b[], size_t n)
    {
    int *p0r;       // ptr to      run 0
    int *p0e;       // ptr to end  run 0
    int *p1r;       // ptr to      run 1
    int *p1e;       // ptr to end  run 1
    int *p2r;       // ptr to      run 2
    int *p2e;       // ptr to end  run 2
    int *p3r;       // ptr to      run 3
    int *p3e;       // ptr to end  run 3
    int *pax;       // ptr to set of runs in a
    int *pbx;       // ptr for merged output to b
    size_t rsz = 1; // run size
        if(n < 2)
            return a;
        if(n == 2){
            if(a[0] > a[1])std::swap(a[0],a[1]);
            return a;
        }
        if(n == 3){
            if(a[0] > a[2])std::swap(a[0],a[2]);
            if(a[0] > a[1])std::swap(a[0],a[1]);
            if(a[1] > a[2])std::swap(a[1],a[2]);
            return a;
        }
        while(rsz < n){
            pbx = &b[0];
            pax = &a[0];
            while(pax < &a[n]){
                p0e = rsz + (p0r = pax);
                if(p0e >= &a[n]){
                    p0e = &a[n];
                    goto cpy10;}
                p1e = rsz + (p1r = p0e);
                if(p1e >= &a[n]){
                    p1e = &a[n];
                    goto mrg201;}
                p2e = rsz + (p2r = p1e);
                if(p2e >= &a[n]){
                    p2e = &a[n];
                    goto mrg3012;}
                p3e = rsz + (p3r = p2e);
                if(p3e >= &a[n])
                    p3e = &a[n];
                // 4 way merge
                while(1){
                    if(*p0r <= *p1r){
                        if(*p2r <= *p3r){
                            if(*p0r <= *p2r){
    mrg40:                      *pbx++ = *p0r++;    // run 0 smallest
                                if(p0r < p0e)       // if not end run continue
                                    continue;
                                goto mrg3123;       // merge 1,2,3
                            } else {
    mrg42:                      *pbx++ = *p2r++;    // run 2 smallest
                                if(p2r < p2e)       // if not end run continue
                                    continue;
                                goto mrg3013;       // merge 0,1,3
                            }
                        } else {
                            if(*p0r <= *p3r){
                                goto mrg40;         // run 0 smallext
                            } else {
    mrg43:                      *pbx++ = *p3r++;    // run 3 smallest
                                if(p3r < p3e)       // if not end run continue
                                    continue;
                                goto mrg3012;       // merge 0,1,2
                            }
                        }
                    } else {
                        if(*p2r <= *p3r){
                            if(*p1r <= *p2r){
    mrg41:                      *pbx++ = *p1r++;    // run 1 smallest
                                if(p1r < p1e)       // if not end run continue
                                    continue;
                                goto mrg3023;       // merge 0,2,3
                            } else {
                                goto mrg42;         // run 2 smallest
                            }
                        } else {
                            if(*p1r <= *p3r){
                                goto mrg41;         // run 1 smallest
                            } else {
                                goto mrg43;         // run 3 smallest
                            }
                        }
                    }
                }
                // 3 way merge
    mrg3123:    p0r = p1r;
                p0e = p1e;
    mrg3023:    p1r = p2r;
                p1e = p2e;
    mrg3013:    p2r = p3r;
                p2e = p3e;
    mrg3012:    while(1){
                    if(*p0r <= *p1r){
                        if(*p0r <= *p2r){
                            *pbx++ = *p0r++;        // run 0 smallest
                            if(p0r < p0e)           // if not end run continue
                                continue;
                            goto mrg212;            // merge 1,2
                        } else {
    mrg32:                  *pbx++ = *p2r++;        // run 2 smallest
                            if(p2r < p2e)           // if not end run continue
                                continue;
                            goto mrg201;            // merge 0,1
                        }
                    } else {
                        if(*p1r <= *p2r){
                            *pbx++ = *p1r++;        // run 1 smallest
                            if(p1r < p1e)           // if not end run continue
                                continue;
                            goto mrg202;            // merge 0,2
                        } else {
                            goto mrg32;             // run 2 smallest
                        }
                    }
                }
                // 2 way merge
    mrg212:     p0r = p1r;
                p0e = p1e;
    mrg202:     p1r = p2r;
                p1e = p2e;
    mrg201:     while(1){
                    if(*p0r <= *p1r){
                        *pbx++ = *p0r++;            // run 0 smallest
                        if(p0r < p0e)               // if not end run continue
                            continue;
                        goto cpy11;
                    } else {
                        *pbx++ = *p1r++;            // run 1 smallest
                        if(p1r < p1e)               // if not end run continue
                            continue;
                        goto cpy10;
                    }
                }
                // 1 way copy
    cpy11:      p0r = p1r;
                p0e = p1e;
    cpy10:      while (1) {
                    *pbx++ = *p0r++;                // copy element
                    if (p0r < p0e)                  // if not end of run continue
                        continue;
                    break;
                }
                pax += rsz << 2;            // setup for next set of runs
            }
            std::swap(a, b);                // swap ptrs
            rsz <<= 2;                      // quadruple run size
        }
        return a;                           // return sorted array
    }
    Last edited by rcgldr; 1 Week Ago at 08:06 PM.

  2. #2
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,684
    Of course, an experienced programmer knows when it's wise to ignore blanket statements like "never use goto".

    The reason goto is shunned is probably because many assembly programmers back in the day turned to C but kept using goto instead of the language's higher-level statements.
    Devoted my life to programming...

  3. #3
    Registered User
    Join Date
    Feb 2019
    Posts
    359
    Quote Originally Posted by GReaper View Post
    The reason goto is shunned is probably because many assembly programmers back in the day turned to C but kept using goto instead of the language's higher-level statements.
    Historical fact: Dijkstra, in 1968, published an article in ACM titled Go To Statement considered harmful, a couple of years before C.

  4. #4
    Registered User
    Join Date
    Apr 2013
    Posts
    1,636
    Quote Originally Posted by flp1969 View Post
    Historical fact: Dijkstra, in 1968, published an article in ACM titled Go To Statement considered harmful, a couple of years before C.
    ... and in 1974, Donald Knuth wrote "Structured Programming with go to Statements" (you can do a web search to find copies of the original paper). The point of creating this thread is that I rarely see "multiple paths to common code" mentioned as a reason for using goto, other than for exiting out of a function to clean up code if some type of exception occurs in nested code. The 4 way merge sort is a more generic case of multiple paths to common code.

    In the 4 way merge sort example I've posted, the code uses nested if's to determine which run has the smallest leading element as opposed to using a minheap. There are two possible paths leading to each of the 4 cases. In addition, when nearing the end of an array, or when the end of a run is reached, there's a drop down to a 3 way merge, then 2 way merge, then a copy of the rest of the remaining run, all of which also involve two possible paths to reach.

  5. #5
    Registered User
    Join Date
    May 2019
    Posts
    142
    I thought I'd have a go at this.

    I know the "goto" subject can start religious level wars. I happen to be on the side of flp1969 (and Dijkstra) here.

    I present an example re-reading of the OP algorithm. My point isn't to argue so much as to compare, especially for those interested in the point.

    I decided not to take a 4 way algorithm from the web, though there are many examples, as that wouldn't really demonstrate what I wanted to point out.

    Instead, I read @rcgldr's algorithm as posted, deciphering what it says, not what the algorithm itself does. I implemented the same algorithm, but without goto. I made no alterations to the algorithm itself, only the implementation and some details about it.

    Where I use inline functions, I named them resembling the labels from which they were copied. Most of the code I present is copied, though a few alterations in detail. For example:

    Code:
       while(1){
                    if(*p0r <= *p1r){
                        *pbx++ = *p0r++;            // run 0 smallest
                        if(p0r < p0e)               // if not end run continue
                            continue;
                        goto cpy11;
    I re-state as this in a function:

    Code:
     while(1)
     {
        if(*p[0].r <= *p[1].r)
        {
            *pbx++ = *p[0].r++;            // run 0 smallest
            if(p[0].r >= p[0].e)           // if not end run continue
            {
             MergeCpy11( p );
             break;  
            }
    The test is inverted so as to remove the continue with no change in the result.

    The original algorithm is included with the name "BottomUpMergeSortOrg", whereas the alternative I submit for comparison is "BottomUpMergeSortAlt".

    Code:
    #include <cstdlib>
    #include <iostream>
    #include <ctime>
    
    
    #define ORG
    
    int * BottomUpMergeSortOrg(int a[], int b[], size_t n);
    int * BottomUpMergeSortAlt(int a[], int b[], size_t n);
    
    
    
    
    struct msptr
    {
     int *r;
     int *e;
    };
    
    
    
    
    void Merge3012( msptr *p, int *&pax, int *&pbx, size_t & rsz );
    void Merge201( msptr *p, int *& pax, int *&pbx, size_t &rsz );
    void MergeCpy10( msptr *p, int *&pax, int *&pbx, size_t &rsz );
    
    
    inline void Merge3013( msptr *p )
    {
     p[2] = p[3];
    }
    
    inline void Merge3023( msptr *p )
    {
     p[1] = p[2];
     p[2] = p[3];
    }
    
    inline void Merge3123( msptr *p )
    {
     p[0] = p[1];
     p[1] = p[2];
     p[2] = p[3];
    }
    
    inline void Merge202(  msptr *p  )
    {
     p[1] = p[2];
    }
    
    inline void MergeCpy11( msptr *p )
    {
     p[0] = p[1];
    } 
    
    inline void Merge212( msptr *p )
    {
     p[0] = p[1];
     p[1] = p[2];
    }
    
    inline bool Merge32( msptr *p, int *&pbx )
    {
     *pbx++ = *p[2].r++;                 // run 2 smallest
     if(p[2].r < p[2].e) return true;    // if not end run continue
     return false;
    }
    
    inline bool Merge40( msptr *p, int *&pbx)
    {
     *pbx++ = *p[0].r++;
     if ( p[0].r < p[0].e ) return true;
     Merge3123( p );
     return false;
    }
    
    inline bool Merge41( msptr *p, int *&pbx )
    {
     *pbx++ = *p[1].r++;
     if ( p[1].r < p[1].e ) return true;
     Merge3023( p );
     return false;
    }
    
    
    inline bool Merge42( msptr *p, int *&pbx )
    {
     *pbx++ = *p[2].r++; // run 2 smallest
     if ( p[2].r < p[2].e ) return true;
    
     Merge3013( p );
     return false;
    }
    
    inline bool Merge43( msptr *p, int *&pbx )
    {
     *pbx++ = *p[3].r++;
     if ( p[3].r < p[3].e ) return true;
     return false;
    }
    
    
    
    
    
    void Merge201( msptr *p, int *& pax, int *&pbx, size_t &rsz )
    {
     while(1)
     {
        if(*p[0].r <= *p[1].r)
        {
            *pbx++ = *p[0].r++;            // run 0 smallest
            if(p[0].r >= p[0].e)           // if not end run continue
            {
             MergeCpy11( p );
             break;  
            }
        } 
        else 
        {
            *pbx++ = *p[1].r++;            // run 1 smallest
            if(p[1].r >= p[1].e)           // if not end run continue
            {
             break; 
            }
        }
     }
    
     MergeCpy10( p, pax, pbx, rsz );
    }
    
    void MergeCpy10( msptr *p, int *& pax, int *&pbx, size_t &rsz )               
    {
     *pbx++ = *p[0].r++;         // copy element
     
     while (p[0].r < p[0].e) 
     {
      *pbx++ = *p[0].r++;        // copy element
     }
    
     pax += rsz << 2;            // setup for next set of runs
    }
    
    
    
    
    int * BottomUpMergeSortAlt(int a[], int b[], size_t n)
    {
     std::cout << "New...." << std::endl;
    
     msptr p[4];
     int *pax;       // ptr to set of runs in a
     int *pbx;       // ptr for merged output to b
     size_t rsz = 1;
    
     if(n < 2)
         return a;
    
     if(n == 2)
     {
         if(a[0] > a[1])std::swap(a[0],a[1]);
         return a;
     }
      
     if(n == 3)
     {
         if(a[0] > a[2])std::swap(a[0],a[2]);
         if(a[0] > a[1])std::swap(a[0],a[1]);
         if(a[1] > a[2])std::swap(a[1],a[2]);
         return a;
     }
     
     while(rsz < n)
     {
         pbx = &b[0];
         pax = &a[0];                     
     
         while(pax < &a[n])
         {
             p[0].e = rsz + (p[0].r = pax);
             if(p[0].e >= &a[n])
             {
                 p[0].e = &a[n];
                 MergeCpy10( p, pax, pbx, rsz );
                 Merge201( p, pax, pbx, rsz );
                 continue;
             }
             p[1].e = rsz + (p[1].r = p[0].e);
             if(p[1].e >= &a[n])
             {
                 p[1].e = &a[n];
                 Merge201( p, pax, pbx, rsz );
                 continue; 
             }
             p[2].e = rsz + (p[2].r = p[1].e);
             if(p[2].e >= &a[n])
             {
                 p[2].e = &a[n];
                 Merge3012( p, pax, pbx, rsz );
                 Merge201( p, pax, pbx, rsz );
                 continue;
             }
             p[3].e = rsz + (p[3].r = p[2].e);
             if(p[3].e >= &a[n])
             {
                 p[3].e = &a[n];
             }
     
             // 4 way merge
             while(1)
             {
                 if(*p[0].r <= *p[1].r )
                 {
                     if(*p[2].r <= *p[3].r)
                     {
                         if(*p[0].r <= *p[2].r)
                         {
                          if ( !Merge40( p, pbx ) ) break;
                         } 
                         else 
                         {
                          if ( !Merge42( p, pbx ) ) break;
                         }
                     } 
                     else 
                     {
                         if(*p[0].r <= *p[3].r)
                         {
                          if ( !Merge40( p, pbx ) ) break;
                         } 
                         else 
                         {
                          if ( !Merge43( p, pbx ) ) break;
                         }
                     }
                 } 
                 else 
                 {
                     if(*p[2].r <= *p[3].r)
                     {
                         if(*p[1].r <= *p[2].r)
                         {
                          if ( !Merge41( p, pbx ) ) break;
                         } 
                         else 
                         {
                          if ( !Merge42( p, pbx ) ) break;
                         }
                     }   
                     else 
                     {
                         if(*p[1].r <= *p[3].r)
                         {
                           if ( !Merge41( p, pbx ) ) break;
                         } 
                         else 
                         {
                           if ( !Merge43( p, pbx ) ) break;
                         }
                     }
                 }
             }
             // 3 way merge
     
             Merge3012( p, pax, pbx, rsz );       
             Merge201( p, pax, pbx, rsz );
         }
    
         std::swap(a, b);                // swap ptrs
         rsz <<= 2;                      // quadruple run size
     }
     
     return a;                           // return sorted array
    
    }
    
    
    void Merge3012( msptr *p, int *&pax, int *&pbx, size_t & rsz )
    {
     while(1)
     {
         if(*p[0].r <= *p[1].r)
         {
             if(*p[0].r <= *p[2].r)
             {
                 *pbx++ = *p[0].r++;        // run 0 smallest
                 if(p[0].r >= p[0].e)           // if not end run continue
                 { 
                  Merge212( p );
                  return;
                 }
             } 
             else 
             {
              if ( !Merge32( p, pbx ) ) return;
             }
         } 
         else 
         {
             if(*p[1].r <= *p[2].r)
             {
                 *pbx++ = *p[1].r++;        // run 1 smallest
                 if(p[1].r >= p[1].e)           // if not end run continue
                 {
                  Merge202( p );
                  return;
                 }
             } 
             else 
             {
              if ( !Merge32( p, pbx ) ) return;
             }
         }
     }
    }
    
    
    
    
    // original
    
    
    int * BottomUpMergeSortOrg(int a[], int b[], size_t n)
    {
    int *p0r;       // ptr to      run 0
    int *p0e;       // ptr to end  run 0
    int *p1r;       // ptr to      run 1
    int *p1e;       // ptr to end  run 1
    int *p2r;       // ptr to      run 2
    int *p2e;       // ptr to end  run 2
    int *p3r;       // ptr to      run 3
    int *p3e;       // ptr to end  run 3
    int *pax;       // ptr to set of runs in a
    int *pbx;       // ptr for merged output to b
    size_t rsz = 1; // run size
    
    std::cout << "Org...." << std::endl;
    
        if(n < 2)
            return a;
        if(n == 2){
            if(a[0] > a[1])std::swap(a[0],a[1]);
            return a;
        }
        if(n == 3){
            if(a[0] > a[2])std::swap(a[0],a[2]);
            if(a[0] > a[1])std::swap(a[0],a[1]);
            if(a[1] > a[2])std::swap(a[1],a[2]);
            return a;
        }
        while(rsz < n){
            pbx = &b[0];
            pax = &a[0];
            while(pax < &a[n]){
                p0e = rsz + (p0r = pax);
                if(p0e >= &a[n]){
                    p0e = &a[n];
                    goto cpy10;}
                p1e = rsz + (p1r = p0e);
                if(p1e >= &a[n]){
                    p1e = &a[n];
                    goto mrg201;}
                p2e = rsz + (p2r = p1e);
                if(p2e >= &a[n]){
                    p2e = &a[n];
                    goto mrg3012;}
                p3e = rsz + (p3r = p2e);
                if(p3e >= &a[n])
                    p3e = &a[n];
                // 4 way merge
                while(1){
                    if(*p0r <= *p1r){
                        if(*p2r <= *p3r){
                            if(*p0r <= *p2r){
    mrg40:                      *pbx++ = *p0r++;    // run 0 smallest
                                if(p0r < p0e)       // if not end run continue
                                    continue;
                                goto mrg3123;       // merge 1,2,3
                            } else {
    mrg42:                      *pbx++ = *p2r++;    // run 2 smallest
                                if(p2r < p2e)       // if not end run continue
                                    continue;
                                goto mrg3013;       // merge 0,1,3
                            }
                        } else {
                            if(*p0r <= *p3r){
                                goto mrg40;         // run 0 smallext
                            } else {
    mrg43:                      *pbx++ = *p3r++;    // run 3 smallest
                                if(p3r < p3e)       // if not end run continue
                                    continue;
                                goto mrg3012;       // merge 0,1,2
                            }
                        }
                    } else {
                        if(*p2r <= *p3r){
                            if(*p1r <= *p2r){
    mrg41:                      *pbx++ = *p1r++;    // run 1 smallest
                                if(p1r < p1e)       // if not end run continue
                                    continue;
                                goto mrg3023;       // merge 0,2,3
                            } else {
                                goto mrg42;         // run 2 smallest
                            }
                        } else {
                            if(*p1r <= *p3r){
                                goto mrg41;         // run 1 smallest
                            } else {
                                goto mrg43;         // run 3 smallest
                            }
                        }
                    }
                }
                // 3 way merge
    mrg3123:    p0r = p1r;
                p0e = p1e;
    mrg3023:    p1r = p2r;
                p1e = p2e;
    mrg3013:    p2r = p3r;
                p2e = p3e;
    mrg3012:    while(1){
                    if(*p0r <= *p1r){
                        if(*p0r <= *p2r){
                            *pbx++ = *p0r++;        // run 0 smallest
                            if(p0r < p0e)           // if not end run continue
                                continue;
                            goto mrg212;            // merge 1,2
                        } else {
    mrg32:                  *pbx++ = *p2r++;        // run 2 smallest
                            if(p2r < p2e)           // if not end run continue
                                continue;
                            goto mrg201;            // merge 0,1
                        }
                    } else {
                        if(*p1r <= *p2r){
                            *pbx++ = *p1r++;        // run 1 smallest
                            if(p1r < p1e)           // if not end run continue
                                continue;
                            goto mrg202;            // merge 0,2
                        } else {
                            goto mrg32;             // run 2 smallest
                        }
                    }
                }
                // 2 way merge
    mrg212:     p0r = p1r;
                p0e = p1e;
    mrg202:     p1r = p2r;
                p1e = p2e;
    mrg201:     while(1){
                    if(*p0r <= *p1r){
                        *pbx++ = *p0r++;            // run 0 smallest
                        if(p0r < p0e)               // if not end run continue
                            continue;
                        goto cpy11;
                    } else {
                        *pbx++ = *p1r++;            // run 1 smallest
                        if(p1r < p1e)               // if not end run continue
                            continue;
                        goto cpy10;
                    }
                }
                // 1 way copy
    cpy11:      p0r = p1r;
                p0e = p1e;
    cpy10:      while (1) {
                    *pbx++ = *p0r++;                // copy element
                    if (p0r < p0e)                  // if not end of run continue
                        continue;
                    break;
                }
                pax += rsz << 2;            // setup for next set of runs
            }
            std::swap(a, b);                // swap ptrs
            rsz <<= 2;                      // quadruple run size
        }
        return a;                           // return sorted array
    }
    
    
    
    
    
    
    
    int main()
    {                           
     int *source;
     int *dest;
    
     srand( (unsigned int) time( NULL ) );
    
     const int count = 15000000;
    
     double avg{ 0.0 };
    
     for( int n = 0; n < 70; ++n )
     {
     source = new int[ count ];
     dest   = new int[ count ];
    
     for( int n = 0; n < count; ++n )
       {
        source[ n ] = rand();
        dest[ n ] = 0;
       }
    
     time_t s = clock();
    
    #ifdef ORG
    
    int * result = BottomUpMergeSortOrg( source, dest, count );
    
    #else
    
    int * result = BottomUpMergeSortAlt( source, dest, count );
    
    #endif
    
     time_t e = clock();
    
     avg += ( e - s );
    
     std::cout << "Time: " << ( e - s ) << " Avg: " << avg / ( double( n ) + 1.0 ) << std::endl;
    
     int val = dest[ 0 ];
    
     for( int n = 1; n < count; ++n )
       {
        if ( val > source[ n ] ) { std::cout << "Not sorted" << std::endl; break; }
        val = source[ n ];
       }
    
     delete[] source;
     delete[] dest;
     }
    
     return 0;
    }
    It is, as the original, a flat file (no header). It is not an attempt at high quality C++, just a re-read/refactor eliminating the use of goto.

    Timing is the same on an optimized build under Visual Studio 2019 (64 bit console), but the debug build shows that the alternative version takes much more time. The optimizer makes good use of inlines and various cues, but it runs about the same in release with standard optimization settings.

    I believe our elders (of which I'm almost a part) back in the 70's through the end of that era when 1 Gbyte of RAM was unaffordable and optimizers were weak, would have seen a difference like that of the debug build. Our modern compilers make this issue evaporate, as demonstrated here.

    I noticed, and can't explain (I didn't check), the sort fails on both functions when volume is at 20 million, so I tested 15 million.

    A define at the top, "ORG", is set to use the original version, comment it out to try the alternative version.

    On my 4790K @ 4.2 Ghz, the 32 bit release build at 15 million averaged about 960 ms. The 64 bit release average about 780 ms, for both versions.
    Last edited by Niccolo; 1 Week Ago at 09:39 PM.

  6. #6
    Registered User
    Join Date
    May 2019
    Posts
    142
    @rcgldr,

    Quote Originally Posted by rcgldr View Post
    ... and in 1974, Donald Knuth wrote "Structured Programming with go to Statements"
    We wrote our two posts within minutes of each other, and I didn't notice this until later.

    I wonder if you read Knuth's paper, though. Here's a relevant section:

    Probably the worst mistake any one can make with respect to the subject of go to statements is to assume that “structured programming” is achieved by writing programs as we always have and then eliminating the go to‘s. Most go to‘s shouldn’t be there in the first place! What we really want is to conceive of our program in such a way that we rarely even think about go to statements, because the real need for them hardly ever arises. The language in which we express our ideas has a strong influence on our thought processes. Therefore, Dijkstra asks for more new language features — structures which encourage clear thinking — in order to avoid the go to‘s temptations toward complications.
    Now, it is quite true that Knuth spends time in that paper to discuss the use of goto. His view is that of the assembler, and in '74 C was considered a CPU independent assembler. It was initially fashioned to re-write UNIX for portability. UNIX had been written in assembler first, but translated into C when the language was created. Naturally, when translating assembler, the destination language would be required to accept constructs like those found in the source material, and especially when the language was purpose built for that process.

    Knuth was acknowledging, however, that improvements in language should suffice so as to allow us to think in terms without explicit jumps. C was new, UNIX was not yet released for the public, and Berkeley hadn't had the chance to take a crack at it.

    Every close bracket is an implied jump point. What follows the bracket is an implied join to code surrounding the bracketed block.

    What I attempted to show in posting a non-goto reinterpretation is that the resulting code is, arguably, a little more readable, with virtually no performance detriment. What code may appear to repeat is trivial, mostly copying parts of the pointer pair I chose to place in msptr.

    I can't say I'm convinced on the case that goto is necessary for speed, or that it helps structure code. I've not used it in nearly 40 years, and didn't have a motivation to.

  7. #7
    Registered User
    Join Date
    Apr 2013
    Posts
    1,636
    Quote Originally Posted by Niccolo View Post
    I thought I'd have a go at this.
    Thanks

    Quote Originally Posted by Niccolo View Post
    The sort fails on both functions when volume is at 20 million, so I tested 15 million.
    The code in main needs to check the array pointed to by the returned pointer in "result" (instead of dest or source). To end up with the sorted array in the original array (not the working array, the code could calculate the number of passes for a given count, and if the number of passes would be odd, do an in place insertion sort 4 elements at at time at the start, so that run size (rsz) starts at 4, but this code was being used to compare pure 4-way versus pure 2-way (also to show a minheap based 4 way merge sort was much slower). A hybrid insertion / merge sort is faster, with the first pass creating runs of either 16 or 64 to keep the merge sort pass count even.

    Quote Originally Posted by Niccolo View Post
    On my 4790K @ 4.2 Ghz, the 32 bit release build at 15 million averaged about 960 ms. The 64 bit release average about 780 ms, for both versions.
    The code is intended for a processor with 16 registers, so it should be a 64 bit release build, in order to get the 16 registers.

    On my system, 3770K @ 3.5 Ghz, VS2015, Win 7 Pro 64 bit, the 64 bit release build, with the changes noted below (VS rand only returns 15 bit values), it took about 1.17 seconds with original (leaving the merge201 after the copy?), 1.18 seconds with alternate, less than 1% difference. As you posted, compiler optimization is doing a good job dealing with the inline functions. The 4-way merge code is old code, based on stuff from the 1980's, when compilers weren't that code (had to include "register" in the local variable declarations to get the compiler to use registers).

    I'm wondering about this part of the code:

    Code:
             while(pax < &a[n])
             {
                 p[0].e = rsz + (p[0].r = pax);
                 if(p[0].e >= &a[n])
                 {
                     p[0].e = &a[n];
                     MergeCpy10( p, pax, pbx, rsz );
                     Merge201( p, pax, pbx, rsz );    // ???
                     continue;
                 }
    Seems like that call to Merge201 is unneeded. I took it out, and the code still works, but it's slightly slower, which could be an issue with where code is located, as the 3770K seems to be sensitive to where tight loops are located.

    I changed main to use a third array for a check. I copied the unordered source into an array named chk, then used std::sort on chk, and then compared chk with result to make sure elements weren't lost. With Visual Studio, rand() only returns 15 bit values, so I called it 4 times, masking and shifting the results to create pseudo random 32 bit integers.
    Last edited by rcgldr; 1 Week Ago at 12:50 AM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 2
    Last Post: 12-08-2014, 08:12 PM
  2. How can I use "goto" for group of multiple statements?
    By shansajid in forum C Programming
    Replies: 8
    Last Post: 01-10-2013, 10:15 AM
  3. "goto" question
    By Kylecito in forum C++ Programming
    Replies: 29
    Last Post: 02-15-2006, 05:46 AM
  4. "itoa"-"_itoa" , "inp"-"_inp", Why some functions have "
    By L.O.K. in forum Windows Programming
    Replies: 5
    Last Post: 12-08-2002, 08:25 AM
  5. "CWnd"-"HWnd","CBitmap"-"HBitmap"...., What is mean by "
    By L.O.K. in forum Windows Programming
    Replies: 2
    Last Post: 12-04-2002, 07:59 AM

Tags for this Thread