Thread: Dynamic array help

  1. #16
    Registered User
    Join Date
    Apr 2009
    Posts
    8
    Ok here's what I've got now, I have made some minor revisions to the structure and names of stuff so it might be a little hard to follow from the previous example, but nothing too major.

    The main.c file
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "student.h"
    
    #define BUFSIZE 256
    
    int main()
    {
    	record *array = 0;
    	record std;
    	int number_of_elements = 0;
    	int array_size = 0;
    	
    	char line[BUFSIZE];
    	char first[21];
    	char last[21];
    	int score;
    	
    	/*get the user input line while not EOF*/
    	while (fgets(line, BUFSIZE, stdin) != 0)
    	{
    		sscanf(line, "%s%s%d", last, first, &score);
    		strcpy(std.name.lastname, last);
    		strcpy(std.name.firstname, first);
    		std.score = score;
    		add(&array, &std, &number_of_elements, &array_size);
    	}
    	free(array);
    	return 0;
    }
    The student.h header file (note I made a couple of changes to the structures)
    Code:
    #include <stdlib.h>
    
    #ifndef STUDENT_C
    #define STUDENT_C
    
    #define NAMESIZE 21
    typedef struct name
    {
    	char lastname[NAMESIZE];
    	char firstname[NAMESIZE];
    } name;
    
    typedef struct record
    {
    	name name;
    	int score;
    } record;
    
    void add(record**, record*, int*, int*);
    
    #endif
    and finally the student.c file which includes the dynamic array
    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include "student.h"
    
    void add(record **array, record *item, int *number_of_elements, int *array_size)
    {
    	if (*number_of_elements == *array_size) /*resize array*/
    	{
    		if (*array_size == 0)
    		{
    			*array_size = 1;
    		}
    		else
    		{
    			*array_size = *array_size * 2;
    		}
    		*array = (record*)realloc(*array, ((*array_size) * sizeof(record)));
    		
    		if (*array == 0)
    		{
    			printf("no space for memory");
    			exit(1);
    		}
    	}
    	*array[*number_of_elements] = *item;
    	(*number_of_elements)++;
    }
    To recap, it compiles and runs fine, the dynamic array works fine for the first string I enter, but it segfaults when I enter the second one :S this random behavior kind of leads me to believe it has to do with something not being initalized or something along those lines.

    I'm thinking maybe it is something to do with the main, but whatever it is I've been fooling around with it for hours and can't seem to figure it out.

    TIA

  2. #17
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    As I said last night, it wasn't necessary to make 'item' a pointer, but it's just as well that you did. It's more efficient to send the function a pointer to a struct rather than making an extra copy of the entire struct -- this would be important if you were dealing with a large structure.

    Anyway, your segfault is due to one little detail in student.c. Access the array like this
    Code:
        (*array)[*number_of_elements] = *item;
    or
    Code:
        *(*array + *number_of_elements) = *item;
    By the way, I don't like the idea of reusing a typename to name a field in the structure, as in
    Code:
    name name;
    although apparently the compiler doesn't mind. I just think that's looking for trouble/confusion.

    Also, do you really want the arraysize to double every time it runs out of space? That may be OK going from 1 to 2 and 2 to 4, but do you really want to jump from 16 to 32, and so on?

  3. #18
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by R.Stiltskin View Post
    Also, do you really want the arraysize to double every time it runs out of space? That may be OK going from 1 to 2 and 2 to 4, but do you really want to jump from 16 to 32, and so on?
    Why not - certainly it would NOT be a good idea to grow it by a small constant value. Imagine if you have to grow it to 16000 or so, by steps of (say) 8? On average you'd have copied half the array 2000 times, instead of 11 times with a doubling scheme. [In fact, this very problem existed in a driver a former colleague of mine workd on, where it grew an array by 4KB every time it ran out of space - but in this particular "error report", the software product needed 16MB of space. So it would copy the data 4000 times, averaging 8MB per copy. That takes a while even on the latest greates processors! By doubling instead, it went from 0.01 fps to 5 fps or some such - it still wasn't a fantastic frame-rate, but it was good enough that the application wasn't deemed "broken" any more].

    It obviously does depend on how large you expect the array to grow, and thus how many times you need to copy it. But growing by a factor other than 1 element more every time is certainly not a bad idea.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #19
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by matsp View Post
    It obviously does depend on how large you expect the array to grow, and thus how many times you need to copy it. But growing by a factor other than 1 element more every time is certainly not a bad idea.

    --
    Mats
    I certainly agree with that. I had in mind something more in line with the nature of the application.

    As for your example, granted, doubling each time is an easy way out of estimating what's actually needed, but I'd be pretty annoyed if the next error report was only 4K and cost me 16MB to store it.

  5. #20
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    But doubling gives a maximum unused overhead of (number_of_elements / 2)-1 times the size of the element. And that's only if the number of elements is 2^n+1. For anything else it is less. Since 2^n grows exponentially, it also is exponentially less likely that you will hit that special case [unless it so happens that your data ALWAYS is powers of 2 + 1 of course].

    But it certainly is a trade-off between "how many times do we copy this" and "how much space is wasted".

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #21
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by R.Stiltskin
    I certainly agree with that. I had in mind something more in line with the nature of the application.
    What do you propose as an alternative, given that we have no idea if Perogy needs to store the data of students who took a class test or the data of students who took a national examination?

    Quote Originally Posted by R.Stiltskin
    As for your example, granted, doubling each time is an easy way out of estimating what's actually needed, but I'd be pretty annoyed if the next error report was only 4K and cost me 16MB to store it.
    I do not see how a factor of 2 can possibly lead to 16 MB being used to store 4 KB of data. More likely it would be 8 KB, with a temporary total of 12 KB if realloc() needs to copy to a separate block in memory. Of course, one can choose a smaller factor like 1.5 and still get amortized constant time performance.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #22
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by laserlight View Post
    What do you propose as an alternative, given that we have no idea if Perogy needs to store the data of students who took a class test or the data of students who took a national examination?
    ...which is exactly why I didn't propose an alternative. Presumably Perogy has some idea of his needs.


    Quote Originally Posted by laserlight View Post
    I do not see how a factor of 2 can possibly lead to 16 MB being used to store 4 KB of data. More likely it would be 8 KB, with a temporary total of 12 KB if realloc() needs to copy to a separate block in memory. Of course, one can choose a smaller factor like 1.5 and still get amortized constant time performance.
    Just following the example given by matsp. If the previous "unusual" 16MB report filled the array, a subsequent 4K report would result in doubling the array size.


    (Why are we arguing about this?)

  8. #23
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by R.Stiltskin View Post
    (Why are we arguing about this?)
    Because whilst you are arguing it shouldn't be done this way, I and Laserlight think that doubling is a fine thing to do.

    Yes, if you need 16MB + 4K, then you get 32MB. Compared to the OVERALL needs of a program that uses 16MB of vertices, I'm sure it's peanuts, but I guess there's always someone who will complain about it.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #24
    Registered User
    Join Date
    Apr 2009
    Posts
    8
    Quote Originally Posted by R.Stiltskin View Post
    As I said last night, it wasn't necessary to make 'item' a pointer, but it's just as well that you did. It's more efficient to send the function a pointer to a struct rather than making an extra copy of the entire struct -- this would be important if you were dealing with a large structure.

    Anyway, your segfault is due to one little detail in student.c. Access the array like this
    Code:
        (*array)[*number_of_elements] = *item;
    or
    Code:
        *(*array + *number_of_elements) = *item;
    By the way, I don't like the idea of reusing a typename to name a field in the structure, as in
    Code:
    name name;
    although apparently the compiler doesn't mind. I just think that's looking for trouble/confusion.

    Also, do you really want the arraysize to double every time it runs out of space? That may be OK going from 1 to 2 and 2 to 4, but do you really want to jump from 16 to 32, and so on?
    ahhh that was it, thank you very much. Those precedence errors can really catch a C newbie like me off guard.

    And just for clarification this is a school assignment which states we have to double the array, so that is why i chose to do it that way :-)

  10. #25
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by R.Stiltskin
    ...which is exactly why I didn't propose an alternative. Presumably Perogy has some idea of his needs.
    That's a cop out
    If we consider asymptotic running time, the choice is between allocating just the right amount of space, or increasing the space by some factor greater than unity. Increasing the space by a constant amount risks quadratic time performance, though realloc() may expand the block in-place where possible, so it would not necessarily be so bad. As such, having some idea of your needs means knowing exactly the amount of space needed, or at least knowing the upper bound on space (and thus also wasting space if the upper bound is not reached).

    Quote Originally Posted by Perogy
    And just for clarification this is a school assignment which states we have to double the array, so that is why i chose to do it that way :-)
    Sure, it is good to follow your assignment requirements, but also try to understand why it is done that way.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  11. #26
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by matsp View Post
    Because whilst you are arguing it shouldn't be done this way, I and Laserlight think that doubling is a fine thing to do.

    Yes, if you need 16MB + 4K, then you get 32MB. Compared to the OVERALL needs of a program that uses 16MB of vertices, I'm sure it's peanuts, but I guess there's always someone who will complain about it.

    --
    Mats
    I would like my money to grow exponentially (it doesn't), but I tend to think that I should avoid having my resource requirements grow exponentially absent some compelling reason. Call me stingy.


    Edit: ...although I suppose it's not so bad if we consume only twice as much as we need in the worst case...
    Last edited by R.Stiltskin; 04-08-2009 at 11:03 AM.

  12. #27
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by R.Stiltskin
    I tend to think that I should avoid having my resource requirements grow exponentially absent some compelling reason
    Memory is one resource, and processing time is another resource. Being able to allocate precisely the amount of space needed is indeed ideal. But if you do not know the size of the input before hand, then to achieve the insertion of an element at the back in amortized constant time, you have to increase space consumption by a factor. This space-time trade-off is often worth it, and of course one could resize to exact fit when done if necessary.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  13. #28
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Yes, that is certainly valid. Thanks laserlight and mats for pointing that out.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Dynamic Array Resizing
    By dld333 in forum C++ Programming
    Replies: 13
    Last Post: 11-04-2005, 12:13 AM
  2. need help with dynamic array syntax
    By soldyne in forum C Programming
    Replies: 3
    Last Post: 10-11-2005, 01:59 PM
  3. Class Template Trouble
    By pliang in forum C++ Programming
    Replies: 4
    Last Post: 04-21-2005, 04:15 AM
  4. 2D dynamic array problem
    By scsullivan in forum C Programming
    Replies: 3
    Last Post: 12-30-2002, 10:02 PM
  5. Dynamic array allocation and reallocation
    By purple in forum C Programming
    Replies: 13
    Last Post: 08-01-2002, 11:48 AM