dynamic array

**claudiu** · 04-10-2010

Originally Posted by grumpy

No. The point of the "a = malloc(n * sizeof *a)" technique is that it works correctly regardless of what type a points at.

Using another technique is just an invitation to forget to update the malloc() call when the type of a changes.

The coder is going to need to ask that question for any use of the pointer after the malloc() call anyway.

Your technique would be misleading in this circumstance.

Code:

long *x;      /*   We used to have this as a short  */

/*  and later   */

x = malloc(n * sizeof(short));    /*   Whoops!   This does not match declaration of x */

/*   Assume x points to a short ... after all, the malloc() call here tells us that  */

Yes, it is inconvenient to have to check the type of x to use it. But using a technique that allows the programmer to misinform him/her self is not just inconvenient, it is folly.

I disagree. Certainly, I see the advantages of that method, however there are several things I would like to mention here:

Firstly, as I said previously, x = malloc(n * sizeof(int)) improves the READABILITY of the code which is an undeniable qualitative aspect of the program. I think this is more important than writing code so as to prevent yourself from making mistakes that you should not be making in the first place. (i.e. your long-short example) Simply put, the quality of the code, which is also reflected by its readability cannot be sacrificed for the sake of reducing the time you spend debugging your program.

Secondly, a good programmer will spend a considerable amount of time designing the data structures and data types used in his program well before he starts typing code. Consequently, changes in data types should not be too frequent while developing your code. Data representation is a design decision first and foremost. If you design your program well, you will know before you type that x should be a long and not a short. I personally am more old school and believe in the principle of thinking before typing code, rather than jumping on the keyboard and figuring things out on the way.

Thirdly, the importance of malloc is that it allocates memory, however ironically, using:
x=malloc(n * sizeof *x) gives you absolutely NO IDEA about how much memory you are allocating. All you know is that you are allocating n slots for whatever type x is and that is simply insufficient.

Lastly, the R&K book uses this approach on malloc() and not the one with the pointer. This is not a reason per se, however I think it's important to learn from how the developers of the language use it.

**tabstop** · 04-10-2010

Originally Posted by claudiu

I disagree. Certainly, I see the advantages of that method, however there are several things I would like to mention here:

Firstly, as I said previously, x = malloc(n * sizeof(int)) improves the READABILITY of the code which is an undeniable qualitative aspect of the program. I think this is more important than writing code so as to prevent yourself from making mistakes that you should not be making in the first place. (i.e. your long-short example) Simply put, the quality of the code, which is also reflected by its readability cannot be sacrificed for the sake of reducing the time you spend debugging your program.

Secondly, a good programmer will spend a considerable amount of time designing the data structures and data types used in his program well before he starts typing code. Consequently, changes in data types should not be too frequent while developing your code. Data representation is a design decision first and foremost. If you design your program well, you will know before you type that x should be a long and not a short. I personally am more old school and believe in the principle of thinking before typing code, rather than jumping on the keyboard and figuring things out on the way.

Thirdly, the importance of malloc is that it allocates memory, however ironically, using:
x=malloc(n * sizeof *x) gives you absolutely NO IDEA about how much memory you are allocating. All you know is that you are allocating n slots for whatever type x is and that is simply insufficient.

Lastly, the R&K book uses this approach on malloc() and not the one with the pointer. This is not a reason per se, however I think it's important to learn from how the developers of the language use it.

IMO you have this exactly backwards.

Code:

x = malloc(20);

What's the largest subscript I can use with x? That is, how many objects did I obtain here?

Code:

x = malloc(5 * sizeof(*x));

What's the largest subscript I can use with x? This time, you know the answer -- this line is more readable than the other.

**grumpy** · 04-10-2010

Originally Posted by claudiu

Firstly, as I said previously, x = malloc(n * sizeof(int)) improves the READABILITY of the code which is an undeniable qualitative aspect of the program. I think this is more important than writing code so as to prevent yourself from making mistakes that you should not be making in the first place. (i.e. your long-short example) Simply put, the quality of the code, which is also reflected by its readability cannot be sacrificed for the sake of reducing the time you spend debugging your program.

The primary practical purpose of enhancing readability or any other measure of "quality" is to ensure a programmer can understand the code, in order to minimise likelihood of programmer error.

Using readability to justify a code construct that increases likelihood of programmer error is completely at odds with that.

Software engineering is not about writing aesthetically beautiful but bug-ridden code.

Originally Posted by claudiu

Secondly, a good programmer will spend a considerable amount of time designing the data structures and data types used in his program well before he starts typing code. Consequently, changes in data types should not be too frequent while developing your code. Data representation is a design decision first and foremost. If you design your program well, you will know before you type that x should be a long and not a short. I personally am more old school and believe in the principle of thinking before typing code, rather than jumping on the keyboard and figuring things out on the way.

That's an academic ideal. In practice, designs tend to evolve because - even if the eventual users of your program are specialist software engineers - it is almost guaranteed that requirements will change over time.

A lot of software designs are required to allow for changes of data representations. For example, software that is required to be portable may use a short on one machine and a long on the other. It is better, in practice, to minimise the amount of code that is unique to each machine - and that means using code constructs that do not need to be hand-crafted for each machine.

Originally Posted by claudiu

Thirdly, the importance of malloc is that it allocates memory, however ironically, using:
x=malloc(n * sizeof *x) gives you absolutely NO IDEA about how much memory you are allocating. All you know is that you are allocating n slots for whatever type x is and that is simply insufficient.

Given that the sizes of most types are implementation defined (eg sizeof(int) may be 2 with one compiler/OS, and 4 or more for another) that same argument works equally well against the approach you advocate. The sizes of struct types are always implementation defined, due to padding (and to implementation-defined sizes of basic types).

If you want to report the amount of raw memory being allocated, simply store n * sizeof *x in another variable, and examine that variable. That allows the value to be examined either at run time (eg print it out) or during debugging.

Originally Posted by claudiu

Lastly, the R&K book uses this approach on malloc() and not the one with the pointer. This is not a reason per se, however I think it's important to learn from how the developers of the language use it.

I assume you're referring to K&R here.

No, it is not a reason.

There are a number of books and articles by K&R: some are advanced and some are intended for beginners. One characteristic of good teachers (and I consider K&R to be good teachers) is that they structure things differently for beginner, intermediate, and expert readers.

This means they will introduce things for beginners that are easier to explain, but may not necessarily be preferable in practice. Later they will point out the flaws in the basic treatment and introduce more advanced (and specialised) approaches.

The x = malloc(n*sizeof *x) technique is moderately advanced and specialised. That doesn't mean it's inferior - it just means it is harder to explain to someone before they have reached a certain point in their learning.

**claudiu** · 04-10-2010

>"The primary practical purpose of enhancing readability or any other measure of "quality" is ensure a programmer can understand the code, in order to minimise likelihood of programmer error.
Using readability to justify a code construct that increases likelihood of programmer error is completely at odds with that.
Software engineering is not about writing aesthetically beautiful but bug-ridden code."

Yes and no. Another important goal of code readability is to minimize production costs. You don't want to pay a software developer to spend 90% of his time understanding old code and 10% of his time improving it, or adding new features to it. Programmer error IMO is really the programmer's concern and it is a measure of his skill. Following this logic you could also say that all APIs should be written to take into account any possible misuse. That would be a blunder of astronomical proportions. The sad truth is, no matter how bulletproof or "safe" your code is, some idiot somewhere, sometime will still be able to trip on it. This is more of an academic ideal than anything.

>"That's an academic ideal. In practice, designs tend to evolve because - even if the eventual users of your program are specialist software engineers - it is almost guaranteed that requirements will change over time.

A lot of software designs are required to allow for changes of data representations. For example, software that is required to be portable may use a short on one machine and a long on the other. It is better, in practice, to minimise the amount of code that is unique to each machine - and that means using code constructs that do not need to be hand-crafted for each machine."

Yes, I agree with this.

>"Given that the sizes of most types are implementation defined (eg sizeof int may be 2 with one compiler/OS, and 4 or more for another) that same argument works equally well against the approach you advocate. The size of struct types are always implementation defined, due to padding (and to implementation-defined sizes of basic types).

If you want to report the amount of raw memory being allocated, simply store n * sizeof *x in another variable, and examine that variable. That allows the value to be examined either at run time (eg print it out) or during debugging."

That's just a waste of time, when I can have it right there in front of my eyes. Yes, I may not be able to calculate how much memory I am allocating to the last byte, but given the fact that I probably know what range of machines I am coding for, I could have a pretty good idea.

Another argument against the use of that method of malloc-ing is that it's not as general as you may think. What if, in your long/short example before, I decided that neither would do, and I actually want variable x to be a void*?

**claudiu** · 04-10-2010

Originally Posted by tabstop

IMO you have this exactly backwards.

Code:

x = malloc(20);

What's the largest subscript I can use with x? That is, how many objects did I obtain here?

Code:

x = malloc(5 * *x);

What's the largest subscript I can use with x? This time, you know the answer -- this line is more readable than the other.

I was never advocating using magic numbers in your malloc(), so I don't see the relevance of this. Yes, I agree with you, in that case the other option is certainly superior.

**vart** · 04-10-2010

Code:

x = malloc(5 * *x);

sizeof is missing here

**tabstop** · 04-10-2010

Originally Posted by vart

Code:

x = malloc(5 * *x);

sizeof is missing here

Yes it is, thanks. I've fixed the original.

**whiteflags** · 04-10-2010

Originally Posted by claudiu

Another argument against the use of that method of malloc-ing is that it's not as general as you may think. What if, in your long/short example before, I decided that neither would do, and I actually want variable x to be a void*?

Code:

typedef void **whatever;

whatever x = NULL;
x = malloc(5 * sizeof(*x));

Now you have five void pointers, or, five ints, or whatever. And the typedef here may not be ideal to you in practice but I mean to say it doesn't matter. Just like typedef is an alias for a type, sizeof(*x) is an alias for whatever hard-coded type-name you could give sizeof because it's also evaluated by the compiler.

Edit: And uh, void * doesn't indicate a size by itself, using it to actually allocate doesn't make sense. If you insisted, then yeah, use a void pointer and write something other than sizeof(*x). Ask yourself if the exception is the rule, though.

**claudiu** · 04-10-2010

@whiteflags

You seem to be missing the point here. I said I want my variable to be a void* not a void**. A void** and a void* are not the same thing. A void* is a pointer to anything whereas a void** is a pointer to a pointer to anything, therefore IT IS NOT a pointer to anything itself.

**whiteflags** · 04-10-2010

But you can't allocate to void *. void * is used as a return type for malloc because it points to anything. Using void * for allocation is not only meaningless

x = malloc(sizeof(void)); /* invalid in standard C; GCC accepts it as an extension, yielding 1 */

but x would not be dereferenceable, even if you didn't write something as dumb.

**claudiu** · 04-10-2010

Of course you can, try this:

void *x = malloc(sizeof(int) * 3);

However you can't do:

x = malloc(3 * sizeof(*x)), which is exactly my point!

EDIT: Who says x needs to be dereferenceable?

You can just do:

int *y = (int*)x;

y[1] = 3;

Voila!

**whiteflags** · 04-11-2010

OK, but let's look at a bigger picture.

x = malloc(3 * sizeof(*x));

If this compiles, x is a working, dereferenceable pointer.

void *x = malloc(sizeof(int) * 3);

If this compiles, you have to assign x to a working, dereferenceable pointer before you can use it for anything other than pointer assignment.

void *x;
type *y;

/* wall of text */

x = malloc(sizeof(type) * 3);
y = x;

To me, it doesn't solve any problems you can have finding definitions for y. And y is just an unnecessary alias if x had a real type.

Even better, when is the last time you really allocated that far away from a definition? Shouldn't you just reduce x's scope?

**claudiu** · 04-11-2010

Originally Posted by whiteflags

OK, but let's look at a bigger picture.

x = malloc(3 * sizeof(*x));

If this compiles, x is a working, dereferenceable pointer.

void *x = malloc(sizeof(int) * 3);

If this compiles, you have to assign x to a working, dereferenceable pointer before you can use it for anything other than pointer assignment.

void *x;
type *y;

/* wall of text */

x = malloc(sizeof(type) * 3);
y = x;

To me, it doesn't solve any problems you can have finding definitions for y. And y is just an unnecessary alias if x had a real type.

Not necessarily. In fact x can work as a data template in some generic API, such as a linked list node content for example. You can have a list that you construct with the content void pointer allocating longs and another where you are allocating doubles. In both cases the allocation is done via a void pointer and of course some size_t variable you pass to the list constructor. When you want to USE the data in your user program you already know what kind of list you constructed so you cast it to the appropriate pointer. However, you still have one generic implementation for any type of data.

**whiteflags** · 04-11-2010

That's no refutation. A list may well use void pointers internally to separate data from the list proper but that does not preclude you from using other types of pointers generally, and as arguments to list functions that want void * stuff. You're really doing yourself a disservice if you insist on using void * anywhere but where it must be used. The reason I replied was because you said that sizeof(*x) was a serious weakness if x is void *, and I already proved it was impossible unless you provide a defined type, and if you can provide a defined type, you can use a pointer of that type (and therefore sizeof(*x)). I really don't see a fault in that.

**claudiu** · 04-11-2010

Originally Posted by whiteflags

That's no refutation. A list may well use void pointers internally to separate data from the list proper but that does not preclude you from using other types of pointers generally, and as arguments to list functions that want void * stuff. You're really doing yourself a disservice if you insist on using void * anywhere but where it must be used. The reason I replied was because you said that sizeof(*x) was a serious weakness if x is void *, and I already proved it was impossible unless you provide a defined type, and if you can provide a defined type, you can use a pointer of that type (and therefore sizeof(*x)). I really don't see a fault in that.

NO. You haven't been following the thread. The void* issue was related to something else entirely, and got sidetracked somewhat by your post: "You can't allocate to a void*".

The issue was between the use of:

int *x;
/* lots of lines of code and somewhere way far away*/
x = malloc(sizeof(int) * 100);

or the use of:

int *x;
/* lots of lines of code and somewhere way far away*/
x = malloc(sizeof(*x) * 100);

Someone suggested that the second option is better because you may at some point realize that you don't want an int but actually a long, so all you have to do is change the declaration.

I SAID: What if you realize, FOR WHATEVER REASON, you don't want a long*, but a void*? The second method would imply you need to change the allocation, which is the very reason that was argued that the first option would be inferior in the first case. Hence, I provided an example where you have to change the allocation in the second method as well and argued that this is not sufficient reason to claim that the second method is superior.