Thread: Dynamic pointer array in C

  1. #1
    Registered User
    Join Date
    Apr 2005
    Posts
    8

    Dynamic pointer array in C

    Hi Guys,

    My compiler is LCC (a C compiler).

    I am currently using a large (global) array of pointers to load a text file (in a function) as follows:

    1. Using a 64k local UCHAR var to hold a line of the file with fgets.

    2. Malloc a pointer from the array to the proper size and copy the UCHAR var to the pointer.

    3. Continue this for each line of the file.

    4. Record the number of lines so I can use free() in a loop on program exit.

    The problem: The text file can vary from a few lines to a few thousand. I'd like to be able to set the size of the global pointer array to match the number of lines in the file to avoid using an unnecessarily large pointer array.

    The pointer array must be global, and the solution must be in C (no C++).

    Any ideas?

    Thanks, Mac
    Last edited by MacFromOK; 04-05-2005 at 11:03 PM.

  2. #2
    Registered User
    Join Date
    Apr 2005
    Posts
    8
    OK, after several more hours of searching (lol, and no sleep since my original post)...

    I seem to have found one possible solution (a double pointer) in an obscure post about matrix multipliers (yeah I know - barely related).

    The only problem I see with this approach is that I'll have to read the file twice - once to get the number of lines and malloc the number of elements, then again to load it and malloc each element to fit line length.

    Is there a way to add elements "on the fly" as the file is read? Or do I need another solution altogether?

    Any help is appreciated.

    Thanks, Mac

  3. #3
    Registered User
    Join Date
    Jan 2005
    Posts
    847
    For this a linked list would work nicely.
    Code:
    struct Line
    {
       char *Data;         /* text of this line */
       struct Line *Next;  /* next line, or NULL at the end of the list */
    };
    First declare a Line structure as the root; then, to add a new line, allocate a new Line structure and save its pointer in Next.
    If you want to be able to insert lines at any location, a doubly linked list with Previous and Next pointers makes that easier.
    When you allocate a Line structure, its Next pointer should first be initialized to NULL so you can loop through the lines until you reach the NULL pointer, which indicates the end of the list.
    There's a tutorial on linked lists here
    http://www.cprogramming.com/tutorial/lesson15.html
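
    A minimal sketch of the linked-list idea above, assuming a singly linked list with separate head and tail pointers. The helper names line_append, line_count and line_free_all are hypothetical and not part of any library; each node owns a malloc'ed copy of its text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One node per line of text; Next is NULL on the last node. */
struct Line {
    char *Data;
    struct Line *Next;
};

/* Hypothetical helper: append a copy of text to the list.
   *head and *tail must both be NULL for an empty list.
   Returns the new node, or NULL if allocation fails (list unchanged). */
struct Line *line_append(struct Line **head, struct Line **tail, const char *text)
{
    struct Line *node;

    assert(head != NULL && tail != NULL && text != NULL);

    node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;

    node->Data = malloc(strlen(text) + 1);   /* +1 for the terminating '\0' */
    if (node->Data == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->Data, text);
    node->Next = NULL;                        /* new node is the end of the list */

    if (*tail != NULL)
        (*tail)->Next = node;                 /* link after the old last node */
    else
        *head = node;                         /* first node in the list */
    *tail = node;
    return node;
}

/* Walk the list until the NULL Next pointer and count the nodes. */
size_t line_count(const struct Line *head)
{
    size_t n = 0;

    while (head != NULL) {
        n++;
        head = head->Next;
    }
    return n;
}

/* Free every node and the string it owns. */
void line_free_all(struct Line *head)
{
    while (head != NULL) {
        struct Line *next = head->Next;
        free(head->Data);
        free(head);
        head = next;
    }
}
```

    Keeping a tail pointer means each append does not have to walk the whole list; walking from the head on every append would also work, just more slowly on long files.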

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    http://cboard.cprogramming.com/showt...threadid=20302
    Shows how to extend an array of variable length strings, starting from a char **variable
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Apr 2005
    Posts
    8
    Quote Originally Posted by Salem
    http://cboard.cprogramming.com/showt...threadid=20302
    Shows how to extend an array of variable length strings, starting from a char **variable
    Ah this is perfect.

    I just realloc the primary pointer (number of elements) of the **variable as each line loads, and then malloc that element for line length. Thanks.

    BTW the project is fairly large and would take a lot of recoding to use a linked list on this.

    Thanks again, Mac
    Last edited by MacFromOK; 04-06-2005 at 05:03 PM.

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > I just realloc the primary pointer (number of elements) of the **variable as each line loads
    If you're doing this every line, then it's a waste of effort.
    Allocate them in batches of lines when you need to, and if you know you have many thousands of lines to cope with, then make the block factor fairly large.
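
    A rough sketch of the batch approach, assuming a global char ** array that grows by a fixed block of slots whenever it fills up. The names g_lines, g_count, g_capacity, LINE_BLOCK, lines_add and lines_free are made up for illustration, and the block factor of 64 is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define LINE_BLOCK 64          /* hypothetical block factor: slots added per realloc */

char **g_lines = NULL;         /* global array of line pointers */
size_t g_count = 0;            /* number of lines stored so far */
size_t g_capacity = 0;         /* number of slots currently allocated */

/* Store a copy of text in the global array.
   Returns 1 on success, 0 if memory runs out (existing lines stay valid). */
int lines_add(const char *text)
{
    char *copy;

    assert(text != NULL);

    if (g_count == g_capacity) {
        /* Array is full: grow by a whole block. The result goes into a
           temporary so the old array is not lost if realloc fails. */
        size_t new_cap = g_capacity + LINE_BLOCK;
        char **tmp = realloc(g_lines, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return 0;
        g_lines = tmp;
        g_capacity = new_cap;
    }

    copy = malloc(strlen(text) + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL)
        return 0;
    strcpy(copy, text);
    g_lines[g_count++] = copy;
    return 1;
}

/* Free every stored line, then the pointer array itself. */
void lines_free(void)
{
    size_t i;

    for (i = 0; i < g_count; i++)
        free(g_lines[i]);
    free(g_lines);
    g_lines = NULL;
    g_count = 0;
    g_capacity = 0;
}
```

    With this layout the file only has to be read once: call lines_add for each line that fgets returns, and call lines_free on program exit.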

  7. #7
    Registered User
    Join Date
    Apr 2005
    Posts
    8
    Quote Originally Posted by Salem
    > I just realloc the primary pointer (number of elements) of the **variable as each line loads
    If you're doing this every line, then it's a waste of effort.
    Allocate them in batches of lines when you need to, and if you know you have many 1000's of lines to cope with, then make the block factor fairly large.
    Actually most of the time there will prolly be fewer than 100 lines. But the potential is there for a few thousand lines on large apps, so I agree - it's prolly a good idea to use blocks of 10 or so at least.

    There seems to be a problem with realloc in the compiler version I'm using though. In the double pointer for example, realloc only works if I malloc a larger number than needed first - it then happily reallocs a smaller part of that space. Malloc'ing to a smaller size first has no effect.

    For some reason it apparently does not always move newly (larger) realloc-ed memory to unused space. This could explain some elusive memory problems in other areas of the code that I've been chasing as well.

    If this is the case (more testing should tell), guess I'll have to use malloc before subsequent reallocs (when possible), or just use malloc/free instead.

    Any opinion on the API memory functions? Are they better/worse or comparable to malloc/realloc?

    Thanks again, Mac

  8. #8
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    You'd have to post a specific example of your code which you think breaks.

    So long as you get the numbers and the sizes right, there shouldn't be a problem.

    > For some reason it apparently does not always move newly (larger) realloc-ed memory to unused space.
    There is no guarantee that the block will move when you increase the size (or decrease for that matter).
    The only thing guaranteed is that the amount of memory preserved will be the smaller of the old and new sizes.
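
    One way to stay within those guarantees, sketched here with a hypothetical wrapper named resize_buffer: always continue with the pointer realloc returns, since the block may or may not have moved, and keep the old pointer only when realloc fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: resize *buf to new_size bytes.
   On success, *buf is updated to whatever realloc returned (possibly the
   same address, possibly a new one) and 1 is returned. The first
   min(old size, new_size) bytes keep their contents.
   On failure, *buf is left untouched and still valid, and 0 is returned. */
int resize_buffer(unsigned char **buf, size_t new_size)
{
    unsigned char *tmp;

    assert(buf != NULL && new_size > 0);

    tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return 0;          /* old block is still allocated */
    *buf = tmp;            /* any other saved copy of the old pointer is now stale */
    return 1;
}
```

    Any pointer that was saved into the old block before the call (for example, a pointer to one element) has to be recomputed from the new base address afterwards.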

    > This could explain some elusive memory problems in other areas of the code that I've been chasing as well.
    When you malloc for a string, do you remember to add 1?
    char *p = malloc( strlen(str) + 1 );
    Forgetting the + 1 is a common mistake when trying to implement "strdup".
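
    For illustration, a minimal strdup-style helper with the + 1 in place. The name dup_string is hypothetical (chosen so it does not clash with a strdup the library may already provide); the caller is assumed to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of str, or NULL if
   allocation fails. The caller is responsible for calling free(). */
char *dup_string(const char *str)
{
    size_t size;
    char *p;

    assert(str != NULL);

    size = strlen(str) + 1;     /* strlen does not count the '\0', so add 1 */
    p = malloc(size);
    if (p != NULL)
        memcpy(p, str, size);   /* copies the characters and the '\0' */
    return p;
}
```

    Without the + 1, the copy would write the terminator one byte past the end of the allocated block, which is exactly the kind of overrun that can corrupt the memory pool.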

  9. #9
    Registered User
    Join Date
    Apr 2005
    Posts
    8
    Quote Originally Posted by Salem
    When you malloc for a string, do you remember to add 1?
    char *p = malloc( strlen(str) + 1 );
    It's a common mistake in trying to implement "strdup"
    Actually I usually malloc/realloc +2 bytes to cover possible string manipulation errors on my part. One extra byte is cheap insurance.

    Quote Originally Posted by Salem
    > For some reason it apparently does not always move newly (larger) realloc-ed memory to unused space.
    There is no guarantee that the block will move when you increase the size (or decrease for that matter).
    The only thing guaranteed is that the amount of memory preserved will be the smaller of the old and new sizes.
    I understand that - my point was that it did not seem to be moving the block when there was not enough unused space. Sorry, I failed to mention there were page fault errors.

    A realloc test app shows that realloc does seem to be working OK, but it also revealed a quirk (for lack of a better word) with this compiler that I was unaware of (or had forgotten). Normal string declarations must allow an extra char past the normal null. Here's an example with a 20 char string:
    Code:
    // This displays extra trash chars using "test_str[20]"
    UCHAR test_str[20] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    MessageBox(NULL, test_str, "Test", MB_OK|MB_TOPMOST);
    
    // but displays fine using "test_str[21]"
    UCHAR test_str[21] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    MessageBox(NULL, test_str, "Test", MB_OK|MB_TOPMOST);
    This is no doubt causing various problems because I assumed normal capacity.

    Thanks, Mac

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > // This displays extra trash chars using "test_str[20]"
    > UCHAR test_str[20] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    Yeah, you mis-understand the nature of strings.
    This is NOT a string, it is an initialised char array. By placing the exact number of characters in the array, the \0 at the end has been removed (there is no place to store it). Trying to use it as a string results in those 20 x's being output, and as much junk as it takes until the program either finds a \0 in some random memory location, or it touches an illegal memory address and promptly dies.

    > // but displays fine using "test_str[21]"
    > UCHAR test_str[21] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    Yeah, that's because there is room for the \0 in the very last element of the array.

    Most of the time, we just write
    UCHAR test_str[] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    and let the compiler do all the counting for us.
    This construct always has just the right amount of space for the string and a \0.
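
    The three cases can be checked with sizeof. This is only a sketch: a 4-character literal stands in for the 20 x's so the counts are easy to verify by eye, and has_terminator is a made-up helper that looks for a \0 inside the array bounds only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if a '\0' occurs within the first
   size bytes of buf, else 0. It never reads past buf[size - 1]. */
int has_terminator(const char *buf, size_t size)
{
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < size; i++)
        if (buf[i] == '\0')
            return 1;
    return 0;
}

char exact[4]  = "abcd";   /* 4 chars in 4 elements: legal C, but no '\0' is stored */
char roomy[5]  = "abcd";   /* 4 chars plus the '\0' in the last element */
char counted[] = "abcd";   /* compiler counts for us: 5 elements including '\0' */
```

    Passing exact to anything that expects a string (strlen, printf with %s, MessageBox) reads past the end of the array, while roomy and counted are both proper strings.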

    The worst of it is, even if your realloc code is correct, random bugs like this elsewhere in your code can make the memory pool behave oddly. Memory corruption bugs can be really hard to track down because of this.

  11. #11
    Registered User
    Join Date
    Apr 2005
    Posts
    8
    Quote Originally Posted by Salem
    > // This displays extra trash chars using "test_str[20]"
    > UCHAR test_str[20] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    Yeah, you mis-understand the nature of strings.
    This is NOT a string, it is an initialised char array. By placing the exact number of characters in the array, the \0 at the end has been removed (there is no place to store it). Trying to use it as a string results in those 20 x's being output, and as much junk as it takes until the program either finds a \0 in some random memory location, or it touches an illegal memory address and promptly dies.

    > // but displays fine using "test_str[21]"
    > UCHAR test_str[21] = "xxxxxxxxxxxxxxxxxxxx"; // 20 x's
    Yeah, that's because there is room for the \0 in the very last element of the array.
    Uh... I beg to differ - with most compilers the chars are in positions 0-19 (which is 20 chars) and that leaves pos 20 for the null. Not sure why this compiler requires one extra, Borland's C++Builder and Turbo C++ (for DOS) do not.

    And I do understand this is a char (UCHAR in this case) array. I'm not a newbie, I started programming in DOS in the early 90's - just haven't had need of double pointers until now.

    Cheers, Mac

  12. #12
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    A 20 element array has 20 elements (0..19), not 21. Touching element 21 (position 20) or greater to read or write is incorrect and undefined.
    Code:
    char arr[20];
    
    arr[20] = 1; /* Incorrect. */
    a = arr[20]; /* Incorrect. */
    >> Not sure why this compiler requires one extra, Borland's C++Builder and Turbo C++ (for DOS) do not. <<

    That's the thing with undefined code. It may work on some compilers some of the time. That doesn't make it correct and it is not reasonable to blame the compiler which "fails" with incorrect code.

  13. #13
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    char a[20];
    Has NO index 20
    0 - 19 is 20 elements.

    While you're at it, read http://www.eskimo.com/~scs/C-faq/q11.22.html
    Then read the rest of that FAQ.

    > Borland's C++Builder and Turbo C++ (for DOS) do not.
    DOS compilers let you get away with so much it's a wonder anyone learns how to do it properly.
    That coupled with HS being a popular book

    > with most compilers
    What "most" compilers do, or "my" compiler does is irrelevant. Learning an implementation just gets you into all sorts of trouble when you change OS/Compiler (that's why you're here now). All your prior assumptions about how the language worked based on your reverse engineered understanding of your previous compiler are exposed as that - assumptions.

    I mean, your previous statement about adding 2 for "insurance" leads me to suspect that you really don't know what is actually happening.

  14. #14
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > Uh... I beg to differ
    And yet you have no reasonable explanation for why the code is failing, and you disagree with my explanation.

    > I'm not a newbie, I started programming in DOS in the early 90's
    Well we all have our crosses to bear.
    Yours is having to learn C again from the beginning.

  15. #15
    Registered User
    Join Date
    Apr 2005
    Posts
    8
    OK, I'm really glad to learn this - I've been operating under the assumption that an array of [20] would hold 20 chars plus null, and it has worked with no problem in my Borland (and some other DOS) compilers for years. Prolly because I usually allow a few bytes extra rather than try to use the minimum size I can get by with.

    But there's no need to rag on me because Borland compilers have allowed me (possibly led me) to program under a wrong assumption. Lol, I just found one example in the BCB help file (strchr) that copies 16 chars to an array of [15].

    Quote Originally Posted by Salem
    I mean, your previous statement about adding 2 for "insurance" leads me to suspect that you really don't know what is actually happening.
    Lol, no - it just means you have no idea how much string manipulation I do. My current project is a script language that compiles stand-alone EXEs and the string manipulation to make it work is pretty intense. Lol, I must say it's amazing it works at all since you seem to think I'm so dense.

    Quote Originally Posted by Salem
    And yet you have no reasonable explanation for why the code is failing, and you disagree with my explanation.
    Funny - I thought I was the one who said: "This is no doubt causing various problems because I assumed normal capacity." Admittedly I was wrong about what "normal capacity" is, but look how smart you seem now...

    Quote Originally Posted by Salem
    Well we all have our crosses to bear.
    Yours is having to learn C again from the beginning.
    Lol, you guys are no doubt very knowledgeable programmers, but I gotta say sometimes your people skills lack a little finesse. No, I don't have to "learn C again from the beginning" because of one (albeit a gross one) misconception. No offense btw.

    Thanks for the FAQ link.

    Cheers, Mac
    Last edited by MacFromOK; 04-09-2005 at 10:10 PM.

    Last Post: 05-01-2004, 10:58 AM