Thread: Array and other "Special pointers"

  1. #1
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137

    Question Array and other "Special pointers"

    C continues to fascinate me because although it is a relatively "small language" in regards to number of constructs/syntax, there are always some edge cases I haven't yet needed to use such as "multidimensional arrays" which I've just needed to start using.

    Question: In this program:
    Code:
    int main(void)
    {
        int *heap_mem = malloc(25*sizeof(int));
        int (*a)[5][5]; // 25 spaces that a points to.
        // [] [] [] [] []
        // [] [] [] [] []
        // [] [] [] [] []
        // [] [] [] [] []
        // [] [] [] [] []
    
    
        *a[0][0] = 1337;
    
    
        printf("%d\n", *a[0][0]);
    
    
        return EXIT_SUCCESS;
    }
    If I remove the ( and ) from (*a) then this program creates a segmentation fault. Why? More importantly, from my C experience thus far, I've noticed different pointer formats. So for example on line 3 here int *heap_mem = malloc(25*sizeof(int)); is your standard char * var_name pointer type we're all familiar with.

    However, then there are function pointers declared like int (*func_ptr)(int,int); So I would consider a function ptr type to be a unique ptr declaration syntax.

    The int (*a)[5][5]; looks more like that type of syntax. However, I would expect int *a[5][5] to make sense but it does not work as intended. Additionally, if I remove the * and do (a)[5][5]; for the declaration and then remove the * from the assignment of 1337 and the * from the printf, this also works fine.

    Finally, from what I understand, int (*a)[5][5]; creates a pointer to an array of 25 ints or 25*sizeof(int) (which is probably 100 bytes in many cases). Where is this block? Is it a stack block? Because we don't call malloc here. Yet another point of confusion for me is that in any case, we're allocating the same amount of memory between the malloc call and the array declaration... Is there any way to use a cast to tell C I want to index this "non 2 dimensional heap memory" to be indexed as a 2 dimensional array?

    Are there any other "unique pointer" types and is this some universal syntax under which function pointers and array pointers both fit or are these just 2 "special cases" where () are used? Thanks.
    Last edited by Asymptotic; 06-07-2019 at 01:13 PM.
    If I was homeless and jobless, I would take my laptop to a wifi source and write C for fun all day. It's the same thing I enjoy now!

  2. #2
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    I did some further experimentation and wrote this as well:

    Code:
        int *three = malloc(9*sizeof(int));
        *(three + 0) = 0;
        *(three + 1) = 1;
        *(three + 2) = 2;
        *(three + 3) = 3;
        *(three + 4) = 4;
        *(three + 5) = 5;
        *(three + 6) = 6;
        *(three + 7) = 7;
        *(three + 8) = 8;
    
    
        // 0 1 2
        // 3 4 5
        // 6 7 8
    
    
        printf("%d\n", *(three + 3*2 + 2));
    This sorta accomplishes what I wanted in regards to indexing into an array in a "multidimensional way" using only the original malloc call style. However, can this be done via bracket notation?

  3. #3
    Registered User
    Join Date
    Apr 2019
    Posts
    808
    for a start don't pointers have to be of the type they are intended to point to? ie int *x can point to an int. if so why have you declared the array as type char and malloc'ed for the sizeof int?

    i have just read about declaring multi dimensional arrays dynamically this morning so i don't pretend to really understand it yet, but the way the author did it was to malloc an array of pointers, one per row (5 in your case), then use a[0] = malloc(num_elements * sizeof(int)) for the first row and likewise for each other row. in your case num_elements would be 5 to give you your 5x5 array.

    sorry i cant be of more help
    coop
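
    A minimal sketch of the row-pointer approach described above, assuming one malloc for the array of row pointers and one allocation per row. The names alloc_rows and free_rows are made up for illustration and are not taken from the book mentioned later in the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a rows x cols table as an array of row pointers, with one
   separate allocation per row. The rows are NOT guaranteed to be adjacent
   in memory. Returns NULL on failure, after freeing what was allocated. */
int **alloc_rows(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    int **table = malloc(rows * sizeof *table);
    if (table == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, sizeof *table[i]);   /* zeroed row */
        if (table[i] == NULL) {
            while (i > 0)
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Release every row, then the array of row pointers itself. */
void free_rows(int **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}
```

    With this layout table[3][2] works in bracket notation, but each access first loads the row pointer and then indexes into that row.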

  4. #4
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    Cooper that was a mistake. Fixed.

    Interesting. The way the author of the article you read did it actually may be noncontiguous memory. The way I'm looking at it is a multidimensional array is actually not multidimensional, it's a single dimensional array in that the net result is still only 1 big amount of contiguous memory but we refer to it as multidimensional for indexing/convenience sake. Whereas if I were to make an array of pointers which performed n number of calls to malloc, I guess this would technically not be the case because there would be no guarantee that the memory layout was contiguous.

  5. #5
    Registered User
    Join Date
    Apr 2019
    Posts
    808
    the example i gave uses memory from all over rather than one big lump. there was an example of having one big lump but it baffled me, and as the author pointed out it has two disadvantages. first, if you want to resize the array you need to do "some complicated things" (his words not mine) to preserve the data. the second disadvantage he mentioned was that malloc was more likely to fail getting one big dollop from one place compared to lots of smaller amounts. bear in mind the book was written in 2000 and quite often makes reference to things that are now obsolete, or uses the c89 standard for a lot of stuff as "no commercial compiler supports c99 yet". if you're interested the book is called c unleashed.

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Asymptotic
    If I remove the ( and ) from (*a) then this program creates a segmentation fault. Why?
    That changes the declaration to be an array of 5 arrays of 5 pointers to int. The array gives you storage for the 25 pointers themselves (25 * sizeof(int*) bytes), but none of them has been initialized, so you need them to point to something (however it is allocated) before you can use them. The segmentation fault is because you tried to use an int object that didn't exist.
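
    A minimal sketch of giving such pointers something to point to before dereferencing them. The helper bind_pointers and the parameter names are hypothetical, and the target here is an ordinary 5x5 int array rather than heap memory.

```c
#include <assert.h>
#include <stddef.h>

/* Point every element of a 5x5 array of int pointers at the matching
   element of a 5x5 array of ints, so that *ptrs[i][j] is safe to use. */
void bind_pointers(int *ptrs[5][5], int targets[5][5])
{
    assert(ptrs != NULL && targets != NULL);
    for (size_t i = 0; i < 5; i++)
        for (size_t j = 0; j < 5; j++)
            ptrs[i][j] = &targets[i][j];
}
```

    After the call, writing through *ptrs[0][0] changes targets[0][0]. Without it, each pointer holds an indeterminate value and dereferencing it is undefined behaviour.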

    Quote Originally Posted by Asymptotic
    Additionally, if I remove the * and do (a)[5][5]; for the declaration and then remove the * from the assignment of 1337 and the * from the printf, this also works fine.
    Then you're just declaring an array of 5 arrays of 5 ints, so of course you can immediately use the int object.

    Quote Originally Posted by Asymptotic
    Finally, from what I understand, int (*a)[5][5]; creates a pointer to an array of 25 ints or 25*sizeof(int)
    No, it declares a pointer to an array of 5 arrays of 5 ints. Of course, that means it would be pointing to an array of 25 ints in total, but type matters.

    Quote Originally Posted by Asymptotic
    Where is this block? Is it a stack block? Because we don't call malloc here.
    It doesn't exist due to that declaration. You merely declared a pointer. It still has to point to something, and in this case you presumably want to cause it to point to memory returned by malloc.

    Quote Originally Posted by Asymptotic
    Is there any way to use a cast to tell C I want to index this "non 2 dimensional heap memory" to be indexed as a 2 dimensional array?
    Yes. So, a is a pointer to an array of 5 arrays of 5 ints, and if malloc didn't return a null pointer, heap_mem is a pointer to a dynamically allocated array of 25 ints. As you observed, a multidimensional array is effectively, "a single dimensional array in that the net result is still only 1 big amount of contiguous memory but we refer to it as multidimensional for indexing/convenience sake". This "convenience" underlies the entire type system, e.g., an int might just be a contiguous bunch of 4 bytes that we can use together for certain operations for "convenience sake". Anyway; this means that we just need to get a to point to where heap_mem is pointing, e.g.,
    Code:
    a = (int(*)[5][5])heap_mem;
    In principle, that should work, but I've never had the need to cast to pointers to arrays before, so it might not. But here's why: you don't need heap_mem in the first place. You can just write:
    Code:
    int (*a)[5][5] = malloc(sizeof(*a));
    Or:
    Code:
    int source[5][5];
    int (*a)[5][5] = &source;
    But most commonly, a pointer to an array itself would be used to traverse an array of such arrays, e.g.,
    Code:
    int source[10][5][5];
    int (*a)[5][5] = source;
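
    As a sketch of that last idea applied to heap memory: a pointer to a row of 5 ints can index a malloc'ed block in plain bracket notation. The typedef row5 and the helpers alloc_grid5 and as_grid5 are made-up names, and the cast in as_grid5 assumes the block came from malloc or calloc and holds at least 25 ints.

```c
#include <assert.h>
#include <stdlib.h>

typedef int row5[5];   /* hypothetical name: one row of 5 ints */

/* Allocate 5 rows of 5 ints as ONE contiguous, zeroed block.
   The result points at the first row, so grid[i][j] works directly.
   Returns NULL if the allocation failed. */
row5 *alloc_grid5(void)
{
    row5 *grid = calloc(5, sizeof *grid);   /* 5 * sizeof(int[5]) bytes */
    return grid;
}

/* View an existing flat heap block of at least 25 ints as 5 rows of 5. */
row5 *as_grid5(int *flat)
{
    assert(flat != NULL);
    return (row5 *)flat;
}
```

    Here grid[2][3] and ((int *)grid)[2*5 + 3] name the same int, and a single free(grid) releases the whole block.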

  7. #7
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    Wow thanks laserlight. I greatly appreciate you answering those questions with examples!

    So what's interesting now is that this has made me begin to think that structs are actually "multidimensional arrays" of some sort... Or at least can be thought of that way. For example, a struct with 4 ints may take up 4 * 4 = 16 bytes and the struct base addr is that of the first int much like an array addr is that of the first element... I know in practice structs can also have padding... But so could multidimensional arrays as well. Plus, struct memory seems to be mostly contiguous as when we look at a struct in a disassembler, we see the struct base addr being used in combination with addition offsets to reach the remaining members. However, I know that once an array element size is defined it cannot really be redefined, whereas a struct can have multiple different types of different sizes in the same struct... But still I have a feeling this could all be done using multidimensional arrays in some way. Hmmm.

  8. #8
    Registered User
    Join Date
    Apr 2019
    Posts
    808
    what are you actually trying to achieve?
    coop

  9. #9
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    Quote Originally Posted by cooper1200
    what are you actually trying to achieve?
    coop
    I originally found myself needing to index into a multi dimensional array but didn’t know the size at compile-time. Then I realized I didn’t know what the hell I was doing so now's a good time to get a better understanding of all of this in general.

  10. #10
    Registered User
    Join Date
    Apr 2019
    Posts
    808
    why do you mind then if the memory isn't all in one big chunk? far easier to ask for small dollops here and there, then if you need more it can find another small dollop, whereas if you ask for 1kb then decide you need 2kb it's got to scratch around to find a spot with 2kb free

  11. #11
    Registered User
    Join Date
    May 2019
    Posts
    214
    for @cooper1200 and @Asymptotic,

    First, about contiguous allocation:

    In theory performance is best in a contiguous block because it is cache friendly. The cache doesn't function as well to aid performance when a lot of short rows or columns are separately allocated. This is because every new address for each new row or column must be requested by the CPU, which is a delay on the RAM system. When a single block is used, in theory, this means the cache is more able to hold more of the overall data at once. This is, of course, subject to the limit of cache size in a particular CPU, and cache systems generally can point to more than one location at a time (just not a large number of different locations).

    Further, in modern machines one generally does not find an issue with size unless the dimensions and total RAM are quite large. Back in 2000, $60 might buy 64 Mbytes of RAM. By 2008 $60 might fetch about 3 Gbytes of RAM, where today $60 can fetch perhaps 16 Gbytes or more.

    There is a fairly close relationship between arrays and images, where typical digital images may easily require a single allocation well above 10 or 20 Mbytes (uncompressed, ready for display or editing/processing). The natural layout of an image is that of rows of pixels, which is similar to a 2D array.

    There is another fundamental difference to allocating each row separately. It takes a little more space. One axis (usually rows, but it's up to the programmer) is an array of pointers. These hold the location of the second dimension of the array entries (rows in my example). This means to locate an entry in the structure one must first access the pointer at the nth element of the array of pointers (as a row for example), then using that address calculate the element on that row. In a single block of RAM, only the base pointer is required, and the rest is merely a calculation; a vector from the starting point to the entry (or cell). For a single block, therefore, only the base pointer need be loaded into a register (to point to a cell), then math performed upon it to arrive at the address of the target cell. For a structure allocating each row separately there must be the base pointer loaded, math performed upon it, which then loads the row's pointer and math performed upon that. It is a second reason such a layout is slower (beyond cache friendliness).

    One can look at any block of RAM as an array of N dimensions, as long as the dimensions aren't ragged (they're all the same). For example, a 4D structure could be declared as int x[5][5][5][5], or int x[5][6][7][8], but at least each of these dimensions are fixed. You don't have some rows longer than others, or some columns longer than others (that's a different kind of layout than what we're discussing).

    If so, a block of RAM can be addressed as an N dimension array by simple math. For 2D, for example, location = rownum * rowsize + column, for 3D location = pagenum * pagesize + rownum * rowsize + column, etc.
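
    The two formulas can be written out as small helpers. This is only a sketch: the names index2d and index3d are invented here, rowsize is taken to be the number of cells per row (cols), and pagesize is taken to be rows * cols.

```c
#include <assert.h>
#include <stddef.h>

/* Flat offset of (row, col) in a block laid out as rows of `cols` cells. */
size_t index2d(size_t row, size_t col, size_t cols)
{
    assert(col < cols);
    return row * cols + col;
}

/* Flat offset of (page, row, col); one page holds rows * cols cells. */
size_t index3d(size_t page, size_t row, size_t col, size_t rows, size_t cols)
{
    assert(row < rows && col < cols);
    return page * (rows * cols) + row * cols + col;
}
```

    For a 10 x 10 block, index2d(3, 5, 10) gives 35, which is the cell that [3][5] would reach in a real int[10][10].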

    Key to any array, however, is that not only the dimensions be non-ragged, but the cell size should be uniform. At most if the cell content might be varied, it should be either the largest possible, or a pointer to anything that should be accommodated. The latter, of course, is rather wasteful of allocations, but unfortunately rather common and sometimes unavoidable.

    If the cell sizes aren't uniform, or the rows/columns/pages (or whatever) are ragged, then what is being described is some other kind of container, not an array.

    I've been reminded that occasionally I speak of C++ in a C forum, but that's only natural to me as I don't really think in C any more, even though I come from a time before C++ existed. I say that because I was about to post about C++ containers, where that doesn't apply here.

    That said, structures (which also prompted my C++ instincts) are not arrays, despite the similarity of appearance when a structure may contain only a collection of members of the same type. That said, there are numerous examples of libraries using something like a vector3d structure (x, y & z ) as a 3 element array of the appropriate type. Sometimes they'll cast to do that. It is mere coincidence, though there are times when such coincidence is deliberately created for such a purpose.

    struct memory seems to be mostly contiguous
    Not just mostly, always.

    It is intentional that a structure is a single block of memory.

    You might think of an exception where members of the struct are pointers, but whatever those point to aren't members of the struct. The pointer is, and only the pointer.

    The struct is bound as a single unit in part so it can be an element in an array. An array of structures is a particularly important notion.

    You find this echoed in the data sent to a GPU for graphics. What you see in some of this data may be a collection of points and attributes. They are grouped as if they are structures (and usually can be referenced as such), but we often use "stride" to rip through these structures. We might see, for example, that the collection of points for each vertex (x, y & z) is coupled with a 'normal' vector (also an x, y & z). This would look like a structure of two vector3d structures (nested), but it also looks like an array of 6 floats. Packed in tightly for shipment to the GPU (in the OpenGL versions of 4 and lower, or DX of 11 and older), we may run through the list of vertices with a "stride" of 6, meaning these are groups of 6 at a time like a structure of two vector3ds. You can think of them in any such way you prefer, and often it is simpler to use a pointer to an appropriate structure to handle the "stride". However, when sending this to the GPU we may have to tell the GPU what the stride is, because it may have no notion of these structures.
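
    A small sketch of walking such an interleaved buffer on the CPU side with a stride of 6 floats. No graphics API calls are shown; the names FLOATS_PER_VERTEX, vertex_position and vertex_normal are hypothetical, and the layout (position first, then normal) is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { FLOATS_PER_VERTEX = 6 };   /* x, y, z position then x, y, z normal */

/* Pointer to the position (3 floats) of vertex i in an interleaved buffer. */
const float *vertex_position(const float *buf, size_t i)
{
    assert(buf != NULL);
    return buf + i * FLOATS_PER_VERTEX;
}

/* Pointer to the normal (3 floats) of vertex i: same stride, offset by 3. */
const float *vertex_normal(const float *buf, size_t i)
{
    assert(buf != NULL);
    return buf + i * FLOATS_PER_VERTEX + 3;
}
```

    The buffer is just a flat array of floats. The stride is what tells the reader where one vertex ends and the next begins.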

    Hence the multiple ways of looking at things.

    However, where a structure may have collections of various sizes and types, the notion of the structure resembling an array utterly breaks apart because the sizes aren't unified, the types differ and therefore how they're read and where they're located is relatively arbitrary. That's why structures are what they are; they're very flexible collections of information. Arrays are fixed.

  12. #12
    Registered User
    Join Date
    Apr 2019
    Posts
    808
    maybe im being thick but any 2d array needs two co-ordinates to access a single cell for example in a 10 x 10 array i would have to write [3][5] to access a specific location

  13. #13
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Quote Originally Posted by cooper1200
    maybe im being thick but any 2d array needs two co-ordinates to access a single cell for example in a 10 x 10 array i would have to write [3][5] to access a specific location
    Yep... but bidimensional arrays are shortcuts to unidimensional ones. a[3][5], in an array a defined as int a[10][10], is the same as *((int *)a + 10*3 + 5):

    Code:
    #include <stdio.h>
    
    // Show the array using bidimensional notation...
    void show_array( int a[][4] )
    {
      int i, j;
    
      for (i = 0; i < 3; i++)
      {
        fputs( "{ ", stdout );
        for (j = 0; j < 3; j++)
          printf( "%d, ", a[i][j] );
        printf( "%d }\n", a[i][3] );
      }
      putchar('\n');
    }
    
    int main( void )
    {
      static int a[3][4] = { {0} }; // filled with zeros...
      int *p;
      int i, j;
      
      // Fill with i*j using bidimensional notation.
      for ( i = 0; i < 3; i++)
        for (j = 0; j < 4; j++ )
          a[i][j] = i*j;
    
      show_array( a );
    
      // Fill with NEGATIVE i*j using a pointer and an offset.
      p = (int *)a;
      for ( i = 0; i < 3; i++)
        for ( j = 0; j < 4; j++)
          *(p + 4*i + j) = -i*j;  // Notice: pointer + ("columns"*i + j).
    
      show_array( a );
    }
    With 'incomplete' arrays you MUST declare them with only the first (leftmost) dimension blank, because the compiler needs the remaining dimensions to calculate the correct pointer (see the int a[][4] in the show_array argument).
    Last edited by flp1969; 06-09-2019 at 07:06 AM.

  14. #14
    Registered User
    Join Date
    May 2019
    Posts
    214
    @cooper1200

    maybe im being thick but any 2d array needs two co-ordinates to access a single cell for example in a 10 x 10 array i would have to write [3][5] to access a specific location

    @flp1969 has the goods here.

    When I posted this:

    a block of RAM can be addressed as an N dimension array by simple math. For 2D, for example, location = rownum * rowsize + column, for 3D location = pagenum * pagesize + rownum * rowsize + column, etc.
    I was trying to say what @flp1969 was saying

    It isn't just 2D but any dimension of arrays can be addressed as a single block of RAM using just math - it is actually what the compiler does under the hood when a multidimensional array is allocated in one block of memory.

    This means that given a block of memory, you could arbitrarily determine the size of the rows or columns at runtime, as long as the amount of RAM is sufficient to hold the result. Likewise, you could even decide on the number of dimensions of the array at runtime.
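
    One way to sketch this: keep the flat block and the current shape together, and compute every access from the shape. The struct grid and its helper functions are hypothetical names invented for this example, and the sketch does not guard against rows * cols overflowing.

```c
#include <assert.h>
#include <stdlib.h>

/* A flat block of ints plus the shape currently imposed on it. */
struct grid {
    int   *cells;
    size_t rows, cols;
};

/* Allocate rows * cols zeroed ints in one block; returns 0 on success. */
int grid_init(struct grid *g, size_t rows, size_t cols)
{
    assert(g != NULL && rows > 0 && cols > 0);
    g->cells = calloc(rows * cols, sizeof *g->cells);
    g->rows = rows;
    g->cols = cols;
    return g->cells ? 0 : -1;
}

/* Reinterpret the same memory with a new shape of equal total size. */
void grid_reshape(struct grid *g, size_t rows, size_t cols)
{
    assert(rows * cols == g->rows * g->cols);
    g->rows = rows;
    g->cols = cols;
}

/* Address of cell (r, c) under the current shape. */
int *grid_at(struct grid *g, size_t r, size_t c)
{
    assert(r < g->rows && c < g->cols);
    return &g->cells[r * g->cols + c];
}
```

    A block created as 3 x 4 can later be read as 4 x 3 or 2 x 6. The cell at flat offset 9 is (2, 1), (3, 0) and (1, 3) respectively.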

    We find code frequently expressing a 4X4 matrix (for linear algebra in 3D games for example) where that matrix is allocated as a simple 16 long array of floats. The matrix is a 4 x 4 square of cells, but the allocated structure is a single dimension array of 16 floats. They are the same thing under the hood, it merely means that each row is 4 floats wide, so while row 1 might be 0 to 3, row 2 is 4 to 7.

    Even more important, though, is that matrices may be considered in row/column or column/row order. We can flip that at will at runtime as we wish when we stop insisting on C's representation of a 2D array, and consider the array merely as a matter of math.
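
    A sketch of the two orderings for a 4 x 4 matrix held in 16 floats. The helper names mat4_row_major and mat4_col_major are made up for illustration and do not come from any particular math library.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of (row, col) in a 16-float block read in row-major order. */
size_t mat4_row_major(size_t row, size_t col)
{
    assert(row < 4 && col < 4);
    return row * 4 + col;
}

/* Offset of (row, col) in the same block read in column-major order. */
size_t mat4_col_major(size_t row, size_t col)
{
    assert(row < 4 && col < 4);
    return col * 4 + row;
}
```

    Reading the same 16 floats through the other function yields the transposed matrix, with no data being moved.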

    It is the same with uncompressed images (relative to the location of each pixel), or the pixels on a display (which is, in reality, one big block of RAM).

    Now, even better is the ability to pull a smaller square out of such a structure. Using math for addressing it becomes almost trivial to localize a 3 x 3 square 2D array that is inside a 16 x 16 square 2D array, the same way a window on your display is such a small rectangle within the larger rectangle. The smaller "window" has its own coordinate system using the same basic notions.
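
    A sketch of such a window, assuming the larger array is one row-major block of ints. The struct window and the functions window_make and window_at are hypothetical names; the caller is assumed to make sure the window also fits within the parent's rows, which this sketch does not check.

```c
#include <assert.h>
#include <stddef.h>

/* A rectangular view into a larger row-major block of ints. */
struct window {
    int   *origin;        /* address of the window's cell (0, 0) */
    size_t rows, cols;    /* size of the window */
    size_t parent_cols;   /* row length of the enclosing array (the stride) */
};

/* Describe a rows x cols window whose top-left cell is (top, left)
   in a parent block that has parent_cols cells per row. */
struct window window_make(int *parent, size_t parent_cols,
                          size_t top, size_t left,
                          size_t rows, size_t cols)
{
    assert(parent != NULL && left + cols <= parent_cols);
    struct window w = { parent + top * parent_cols + left,
                        rows, cols, parent_cols };
    return w;
}

/* Address of cell (r, c) in the window's own coordinate system. */
int *window_at(struct window w, size_t r, size_t c)
{
    assert(r < w.rows && c < w.cols);
    return w.origin + r * w.parent_cols + c;
}
```

    The window's rows are stepped with the parent's row length, not its own width, which is why a 3 x 3 window in a 16 x 16 block advances by 16 cells per row.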

  15. #15
    Registered User I C everything's Avatar
    Join Date
    Apr 2019
    Posts
    101
    Niccolo, so the old rule of a 256x256 texture (image), or any array in a contiguous 64k, still applies: it's kept in cache. Use aligned memory too.
    you tell me you can C,why dont you C your own bugs?
