Thread: C99 optimization - How does compiler optimize finding pointers in stack?

  1. #1
    Registered User
    Join Date
    May 2022
    Posts
    4

    Talking C99 optimization - How does compiler optimize finding pointers in stack?

    Hello folks, first post.

    I'm fairly new to C so I'm not very familiar with C99 optimization. I thought I could optimize some code by creating an array so that the program wouldn't have to iterate through the stack to find pointers.

    Instead it could utilize the consecutive memory locations of the array. (un)fortunately, this is handled extremely well by the compiler. (every time I think I can outsmart the compiler I feel like I'm dissing the C devs, haha)

    I've included the code I used for testing. I came to the conclusion that either:
    1. The stack isn't iterated but uses a shortcut (very likely);
    2. Stack pointer lookup is blazing fast (wouldn't necessarily account for lower clock but maybe it can do multiple at once?);
    3. Because of the limited test environment, the structs aren't defragmented, so there is no gain from consecutive memory locations of the array.

    Can someone shed a light on what is happening?


    code file (better formatting)
    google drive


    Test results:

    Original:

    Test 1 of 100 - Size: 20, Cycles: 2227
    Test 2 of 100 - Size: 20, Cycles: 2354
    Test 3 of 100 - Size: 20, Cycles: 2291
    Test 4 of 100 - Size: 20, Cycles: 2353
    Test 5 of 100 - Size: 20, Cycles: 2291...

    Test 1 of 100 - Size: 1000, Cycles: 7970
    Test 2 of 100 - Size: 1000, Cycles: 8053
    Test 3 of 100 - Size: 1000, Cycles: 7960
    Test 4 of 100 - Size: 1000, Cycles: 7940
    Test 5 of 100 - Size: 1000, Cycles: 7904
    ...


    "Optimized"

    Test 1 of 100 - Size: 20, Cycles: 2780
    Test 2 of 100 - Size: 20, Cycles: 2904
    Test 3 of 100 - Size: 20, Cycles: 3000
    Test 4 of 100 - Size: 20, Cycles: 3006
    Test 5 of 100 - Size: 20, Cycles: 2905
    ...

    Test 1 of 100 - Size: 1000, Cycles: 33837
    Test 2 of 100 - Size: 1000, Cycles: 32723
    Test 3 of 100 - Size: 1000, Cycles: 33288
    Test 4 of 100 - Size: 1000, Cycles: 33116
    Test 5 of 100 - Size: 1000, Cycles: 32804
    ...


    Code:
    As you can see the optimized function calls an array init function. This is a linear array with every other index being x.
    arr[0] = x1; arr[1] = y1; arr[2] = x2; ...

    The original (faster code) just iterates the linked list and gets its x and y.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    
    #define MATRIX_W            10
    #define MATRIX_H            24
    #define ARRAY_SIZE          20
    
    
    typedef struct Shape {
        struct Block *tail;
    }   Shape;
    
    
    typedef struct Block {
    
        int x, y;
        struct Block *prev;
    
    }   Block;
    
    
    Block *link (Block *prev, int x, int y)
    {
        Block *new = calloc(1, sizeof(Block));
    new-> x = x;
    new-> y = y;
    new-> prev = prev;
        return new;
    }
    
    
    void *init_shape(int *values)
    {
        int i;
    Block *block = NULL;
    Shape *shape = calloc(1, sizeof(Shape));
    
        for (i=0; i<ARRAY_SIZE; i+=2)
        {
            block = link(block, values[i], values[i+1]);
    }
    
        shape->tail = block;
        return shape;
    }
    
    
    //  create int array
    int *init_coord_array(Shape *shape)
    {
        int x, y, i;
        int *coords;
    Block *block;
    
    i=0;
    coords = calloc(ARRAY_SIZE, sizeof(int));
    block = shape->tail;
    
        while (block != NULL)           // for block in shape
    {
            coords[i++] = block->x;     // add to array
    coords[i++] = block->y;
    block = block->prev;        // next block in shape
    }
    
        return coords;
    }
    
    
    //  populate VLA
    int *init_test_values()
    {
        int i,w,h,*values;
        
    i=0;
    w = rand() % MATRIX_W;      //  no srand for testing purposes
    h = rand() % MATRIX_H;
    values = calloc(ARRAY_SIZE, sizeof(int));
    
        while (i<ARRAY_SIZE)
        {
            values[i] = w;
    values[i+1] = h;
    i+=2;
    }
        return values;
    }
    
    
    int original_function(Shape *shape)
    {
        int x,y,match;
    
    Block *block = shape->tail;
    match = 0;
    
        for (y=MATRIX_H-1; y>=0; y--)
        {
            for (x = 0; x < MATRIX_W; x++)
            {
                while (block != NULL)
                {
                    if (block->x == x && block->y == y)
                    {
                        match++;
                        break;
    }
                    block = block->prev;
    }
            }
        }
        return match;
    }
    
    
    int optimized_function(Shape *shape)
    {
        int i,x,y,match,*coords;
    
    i = match = 0;
    coords = init_coord_array(shape); // create linear int array
    
    for (y=MATRIX_H-1; y>=0; y--)
        {
            for (x = 0; x < MATRIX_W; x++)
            {
                while (i < ARRAY_SIZE)
                {
                    if (coords[i] == x && coords[i + 1] == y)
                    {
                        match++;
                        break;
    }
                    i += 2; // goto next two values
    }
            }
        }
        free(coords);
        return match;
    }
    
    
    int main()
    {
        int *values = init_test_values();   // create not so random values (no srand)
    Shape *shape = init_shape(values);  // populate struct with generated values
    free(values);
    
        for (int i=0; i<100; i++)
        {
            clock_t start = clock();
            for (int j=0; j<10000000; j++)
            {
                original_function(shape);   // struct iteration, linked list
                //optimized_function(shape);    // utilize int array
    }
            printf("Test %d of 100 - Size: %d, Cycles: %d\n", i+1, ARRAY_SIZE, clock() - start);
    }
    }

    Last edited by WillyCe; 05-08-2022 at 05:24 AM. Reason: moved code file to more visible place

  2. #2
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    I don't understand what you mean by "iterate through the stack to find pointers", but obviously your "optimized" function will be slower since it's doing more work. The "original" function simply loops through the linked list. The "optimized" function not only loops through the linked list, but dynamically allocates extra space, fills it with the data, and then loops through that, too.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  3. #3
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    And, clock() doesn't measure cpu clock cycles.

  4. #4
    Registered User
    Join Date
    May 2022
    Posts
    4
    With iterate through the stack I mean that for every pointer (block->prev) it has to iterate the entire stack. The array also gets iterated but it's 1:1, every value is only used once and is always assigned to consecutive memory locations. So besides the additional work of making the array once, I assume it has less iterations than n times pointer lookups in the stack.

    Edit:
    Crap, I see I init and free the array inside the function....
    Last edited by WillyCe; 05-08-2022 at 09:59 AM.

  5. #5
    Registered User
    Join Date
    May 2022
    Posts
    4
    I see. I assumed it returned clock cycles since, according to IBM, you divide by CLOCKS_PER_SEC to calculate time in seconds. What is the best alternative for clocks? And, since I'm here, do you know another way of measuring seconds when calling to system()? Apparently that can reset clock().

  6. #6
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    I mean that for every pointer (block->prev) it has to iterate the entire stack.
    No it doesn't. It never has to "iterate the stack" for anything, ever. Why do you think it does that?
    A little inaccuracy saves tons of explanation. - H.H. Munro

  7. #7
    Registered User
    Join Date
    May 2022
    Posts
    4
    Man, I don't even know. Too caught up in my goose chase to doubt that undisputable fact. I guess I forgot memory =/= array.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Compiler optimization: volatile
    By Athul in forum C Programming
    Replies: 3
    Last Post: 12-20-2018, 03:44 AM
  2. compiler optimization and num accuracy?
    By serge in forum C++ Programming
    Replies: 11
    Last Post: 06-12-2014, 03:50 AM
  3. Testing Compiler Optimization
    By jamesleighe in forum C++ Programming
    Replies: 4
    Last Post: 03-23-2012, 11:32 AM
  4. Replies: 23
    Last Post: 04-05-2011, 03:40 PM
  5. Replies: 10
    Last Post: 07-17-2008, 11:21 AM

Tags for this Thread