Thread: argv and its "formal" type

  1. #1
    Registered User
    Join Date
    Apr 2012
    Posts
    13

    argv and its "formal" type

    The standard variable "argv" in the main function is commonly declared as "char **argv" or "char *argv[]".

    However upon close scrutiny I noticed some semantic incompleteness.
    "char **argv" inherently only expresses a pointer to a pointer to a single char variable and neglects to mention the full data that really is associated with the variable.
    "char *argv[]" inherently only expresses an array of pointers that point to a single char variable.

    I see the true type of argv as an array of pointers to an array of char variables, so a more descriptive type might be as shown here:

    Code:
    #include<stdio.h>
    int
    main(int ac,char (*av[])[]){
      puts(*av[0]); //expected; prints first argument
      return(0);
    }
    But if I wanted to assign this variable, I'd think that I'd need to obtain its address and store that:

    Code:
    #include<stdio.h>
    char (*(*argv)[])[];
    int
    main(int ac,char (*av[])[]){
      //  av is              an array of pointers to an array of chars
      //argv is a pointer to an array of pointers to an array of chars
      argv=&av; //incompatible types
      puts(*(*argv)[0]); //undefined
      return(0);
    }
    I did notice that the following works:

    Code:
    #include<stdio.h>
    char (*(*argv)[])[];
    int
    main(int ac,char (*av[])[]){
      //  av is              an array of pointers to an array of chars
      //argv is a pointer to an array of pointers to an array of chars
      argv=av; //still incompatible types
      argv=&av[0]; //same as above
      puts(*(*argv)[0]); //expected
      return(0);
    }
    Can somebody assist in the rationalization of this? Thanks for any help.
    Last edited by yuklair; 04-18-2012 at 03:20 PM. Reason: grammar

  2. #2
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    In a function declaration, char*[] and char** are identical. They both mean pointer to pointer to char.

    Ok. So:
    I see the true type of argv as an array of pointers to an array of char variables
    This is the primary source of your problem. argv is not an array of pointers to an array of char. It's a pointer to pointer to char. Whether you like that or not, that's what it is. You don't have to like the fact that a char* can point to a string of characters, but it can. Keep in mind that an array is not a pointer. You cannot interpret a pointer to a pointer as a pointer to an array and expect it to work.

    In C there is one level of array “decay”, so an array of T can generally be treated as a pointer to T. As a result, the best you can do is think of argv as an array of char*. You just cannot treat it as an array of arrays because it is not one.

  3. #3
    Registered User
    Join Date
    Dec 2011
    Posts
    795
    > Can somebody assist in the rationalization of this?
    Yes, I can help explain it. :P

    A double pointer is, in its most basic (value) form, an integer that points to a location in memory. At the second location, you will find another value that points to another location in memory. Provided a variable declared as this:
    Code:
    char value;
    char *sptr = &value;
    char **dptr = &sptr;
    The memory will look something like this:

                         dptr           *dptr / sptr    **dptr / value
    address of (&var)    0x7FFFFFFFF    0x80000000      0x60000000
    points to (*var)     0x80000000     0x60000000      (n/a)

    However, this system can also be treated as an array, using pointer arithmetic.
    Code:
    char *sptr[10];
    char **dptr = (char **)sptr;
    The initial access of (*dptr) will act like a normal access to (sptr). But, when you increment dptr, it will point to the next pointer in the array of sptr.

    You could get the same effect with either of the following:
    Code:
    char **dptr;
    
    char *dptr[10];
    The one exception being:
    Code:
    char dptr[10][10];
    This one is different because it is laid out as one linear block of chars, rather than as an array of pointers that are followed to find each row.

  4. #4
    Registered User camel-man's Avatar
    Join Date
    Jan 2011
    Location
    Under the moon
    Posts
    693
    Memcpy, Can you declare char **argv, as char argv[][]?

  5. #5
    Registered User
    Join Date
    Dec 2011
    Posts
    795
    Quote Originally Posted by camel-man View Post
    Memcpy, Can you declare char **argv, as char argv[][]?
    I'm not entirely sure, but I remember reading that "char argv[][]" is invalid syntax, as (per the C99 standard) only the first dimension of an array parameter may be left empty; the other dimensions need a size.

    Either way, it's worth a try (I can't test right now).

  6. #6
    Registered User
    Join Date
    Apr 2012
    Posts
    13
    Thanks for all your replies.
    After some more analysis, I have reached a satisfactory conclusion.

    The C99 standard declares argv in main as "char *argv[]", which happens to be equivalent to "char **argv". First off I consider which of the two, "char **argv" or "char *argv[]", would be more "descriptive":
    The former implies only a single char involved.
    The latter implies an array of pointers, that each still only imply a single char, so neither types are "complete".
    One problem with the latter type, which I found the hard way as described in OP, is declaring a pointer to argv using similar syntax; it is simply wrong as argv is not an array.
    The "char **argv" type is more accurate in this case, and in general; the [] in argument declarations causes nothing but confusion.

    However the standard dictates nothing about defining another, more "descriptive" variable to store main's argv and to be used in its place.
    To help understand the problem, I make a diagram of the data that exists, regardless of how argv is declared. Let's consider an instance that had two command-line arguments "name" and "arg":

    Code:
    [ n][ a][ m][ e][\0][ a][ r][ g][\0][\0]
     10  11  12  13  14  15  16  17  18  19
    
    [10][15][19]
     30  31  32
    
    [30]
     50
    What we see clearly is arrays of data, which main's argv completely neglects.

    In this scenario the program sees a value of 30 at address 50, which would be main's argv declared as "char **argv".
    What is the value at address 50 really pointing to though?
    The top row of data is an array of chars, or "char []".
    The middle row of data is an array of pointers to an array of chars, or "char (*[])[]".
    The bottom datum is a pointer to an array of pointers to an array of chars, or "char (*(*)[])[]".

    Therefore:
    Code:
    #include<stdio.h>
    char (*(*foo)[])[];
    int
    main(int argc,char **argv){
      foo=(void *)argv;
      puts(*(*foo)[0]);
      puts(*(*foo)[1]);
      return(0);
    }
    When evaluating an array type (the arguments in the puts functions are each an array type), C automatically handles it as an argument to the first member so there is no problem with using it directly in functions that expect "char *", and at the same time we get a much more descriptive type than just "char **".

    Incidentally, if you wanted to set foo to argv's second member (for instance, if you wanted to ignore the first command-line argument):
    Code:
    #include<stdio.h>
    char (*(*foo)[])[];
    int
    main(int argc,char **argv){
      foo=(void *)&(*(void *(*)[])argv)[1]; //no need for full type in a typecast that's to be typecasted to a general pointer
      puts(*(*foo)[0]);
      puts(*(*foo)[1]);
      return(0);
    }
    Compare THAT to simply char **argv. :P
    Last edited by yuklair; 04-19-2012 at 03:54 PM.

  7. #7
    Registered User
    Join Date
    Mar 2011
    Posts
    546
    Except your conclusion is incorrect. argv IS an array of pointers to char. That is, in memory, there is an array of pointers, the first being argv[0], the second argv[1], etc. You are showing the data that argv[0] points to, but if you examine the memory at the location referenced by 'argv' by itself, you will see an array of pointers to 'name' and 'arg'.

  8. #8
    Registered User
    Join Date
    Mar 2011
    Posts
    546
    Code:
    #include <stdio.h>

    // assumes the program is run with two arguments: name arg
    int main(int argc,char *argv[])
    {
    	printf("%p\n",(void *)argv);	// value of argv (location of the pointer array)
    	printf("%p\n",(void *)argv[0]);	// contents of argv[0] (points to name of program)
    	printf("%p\n",(void *)argv[1]);	// contents of argv[1] (points to argument 'name')
    	printf("%p\n",(void *)argv[2]);	// contents of argv[2] (points to argument 'arg')
    	printf("%s\n",argv[0]);		// %s dereferences the pointer in argv[0] and prints the name of the program
    	printf("%s\n",argv[1]);		// %s dereferences the pointer in argv[1] and prints 'name'
    	printf("%s\n",argv[2]);		// %s dereferences the pointer in argv[2] and prints 'arg'
    	return 0;
    }
    program output (with comments)
    Code:
    00423380     location of the argv array
    00423390     contents of first element of argv
    004233AE     contents of second element of argv
    004233B3     contents of third element of argv
    e:\home\dh0072\x1.exe
    name
    arg
    memory dump
    Code:
                       argv[0]       argv[1]       argv[2]
    0x00423380  [90 33 42 00] [ae 33 42 00] [b3 33 42 00] 00 00 00 00  .3B.®3B..3B.....
    0x00423390  65 3a 5c 68 6f 6d 65 5c 64 68 30 30 37 32 5c 78  e:\home\dh0072\x
    0x004233A0  5c 44 65 62 75 67 5c 78 31 2e 65 78 65 00 6e 61  \Debug\x1.exe.na
    0x004233B0  6d 65 00 61 72 67 00 fd fd fd fd ab ab ab ab ab  me.arg.

  9. #9
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    OP, you should read this. Question 6.3

  10. #10
    Registered User
    Join Date
    Apr 2012
    Posts
    13
    Typo in my post:
    Quote Originally Posted by yuklair View Post
    When evaluating an array type (the arguments in the puts functions are each an array type), C automatically handles it as a pointer to the first member so there is no problem with using it directly in functions that expect "char *"
    --

    Quote Originally Posted by dmh2000 View Post
    argv IS an array of pointers to char.
    Case 1:
    foo IS an array of pointers to char.
    Code:
    #include<stdio.h>
    char *foo[2]={"foo","bar"};
    int
    main(int argc,char **argv){
      char *(*pointer)[]=&foo;
      printf("%s %s\n",(*pointer)[0],(*pointer)[1]);
      return(0);
    }
    Case 2:
    argv IS an array of pointers to char.
    Code:
    #include<stdio.h>
    int
    main(int argc,char **argv){
      char *(*pointer)[]=&argv;
      printf("%s %s\n",(*pointer)[0],(*pointer)[1]);
      return(0);
    }
    The second case segfaults.

    --

    Also, remember that this topic is about style of C code; about inferences that can be interpreted based on the declaration of variables.
    Last edited by yuklair; 04-19-2012 at 05:32 PM.

  11. #11
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Your arguments would be a lot more persuasive if your code did not present errors.
    Code:
    Comeau C/C++ 4.3.10.1 (Oct  6 2008 11:28:09) for ONLINE_EVALUATION_BETA2
    Copyright 1988-2008 Comeau Computing.  All rights reserved.
    MODE:strict errors C99 
    
    "ComeauTest.c", line 4: error: a value of type "char ***" cannot be used to
              initialize an entity of type "char *(*)[]"
        char *(*pointer)[]=&argv;
                           ^
    
    1 error detected in the compilation of "ComeauTest.c".

  12. #12
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    @yuklair:
    Your problem is (not to be rude) a lack of understanding of C, or an expectation for it to behave the way you want, with regards to pointers, 2-d arrays and how arrays behave when passed to functions. Read this and see if it helps: Arrays and Pointers. Of particular interest might be 6.1-6.4 and 6.13-6.20, but I would read them all just for the heck of it.

    Some of your expectations/assumptions that you need to change:
    > I see the true type of argv as an array of pointers to an array of char variables
    This is wrong, and is probably the greatest source of your problem. argv is just an array of pointer to char. Each pointer in the array may point to the first element of an array of char, but it's still just an array of pointer to char.

    >The former implies only a single char involved.
    >The latter implies an array of pointers, that each still only imply a single char, so neither types are "complete".
    Neither of those implies a single char.
    No pointer ever implies there is only one thing that is being pointed to. It's just an address to the start of one or more consecutive things (or it could be null).

    >
    The "char **argv" type is more accurate in this case, and in general; the [] in argument declarations causes nothing but confusion.
    This is completely subjective, and not true for everybody. Some people like that it reminds them that what is pointed to is an array.

    > What we see clearly is arrays of data, which main's argv completely neglects.
    I don't see any arrays of data that main's argv neglects, even partially. It is not neglect, it is simply how arrays are passed to functions in C, i.e. how they decay to a pointer. argv is a variable that exists on the stack, therefore it has an address on the stack, which you can get by using &argv. It would be 50 in your example. argv without the ampersand is the address of the first element of the array, or &argv[0]. That would be 30 in your example, with &argv[1] and &argv[2] being 31 and 32 respectively. Each element of argv is a pointer to char (really a pointer to one or more consecutive chars). So in your example, argv[0] is 10, argv[1] is 15. You're slightly off with argv[2] though, it points to NULL. Sure, on some crazy architecture, NULL could be 19, but I doubt it. It's best to think of it as zero:
    Code:
    [10][15][0]
     30  31  32
    > To help understand the problem, I make a diagram of the data that exists, regardless of how argv is declared.
    You can't say "regardless of how argv is declared". Any environment that conforms to the standard will pass in the data for argv in a specific format, namely char **argv (or its equivalent, char *argv[]). If you declare argv in an invalid manner, then you can't properly interpret that data through argv. Because of this, the following interpretations are off.

    > The top row of data is an array of chars, or "char []".
    > The middle row of data is an array of pointers to an array of chars, or "char (*[])[]".
    > The bottom datum is a pointer to an array of pointers to an array of chars, or "char (*(*)[])[]".
    The top row of data is not really an array of chars. They are not (AFAIK) required to be sequential. Really, what you have is two distinct arrays of char, one for "name" and one for "arg".
    The middle row of data is an array of pointers to char. Each element points to the first char in an array from row 1, except the last, which points to null.
    The bottom row is a pointer to an array of pointer to char.

    > When evaluating an array type (the arguments in the puts functions are each an array type), C automatically handles it as an argument to the first member so there is no problem with using it directly in functions that expect "char *", and at the same time we get a much more descriptive type than just "char **".
    What you've done in that example is use a type cast to cover up the fact that argv and foo are incompatible types. Anything can be assigned to a void *, and a void * can be assigned to anything, but that doesn't mean it should. Remove the cast and the compiler should complain that you're doing something stupid. The type of the arguments to puts is array type only because you cast argv to something it isn't. Yes, *(*foo)[0] is array type, but it is not an accurate description of what it really is, which is argv[0], which is pointer to char.



    > Incidentally, if you wanted to set foo to argv's second member (for instance, if you wanted to ignore the first command-line argument):
    Your description is a little off, IMO. You basically want foo to be just like argv, except skipping the first element. In that case, what you really want to do is
    set foo to the address of argv's second member. Something like:
    Code:
    char **foo = &argv[1];
    foo points to the second element in argv, so foo[0] == argv[1], foo[1] == argv[2], etc. Setting foo to argv's second member would simply be:
    Code:
    char *foo = argv[1];

  13. #13
    Registered User
    Join Date
    Apr 2012
    Posts
    13
    Thanks for your replies.

    @whiteflags: Typecast the expression to the right of the = to a generic pointer.

    Quote Originally Posted by anduril462 View Post
    >The former implies only a single char involved.
    >The latter implies an array of pointers, that each still only imply a single char, so neither types are "complete".
    Neither of those implies a single char.
    No pointer ever implies there is only one thing that is being pointed to. It's just an address to the start of one or more consecutive things (or it could be null).


    You're missing the point. What I'm talking about is programming style/convention.

    Take variable name conventions for example.
    If we followed the convention that the name of every variable of integer type start with the letter "i", then when we first glance at a variable with the letter "i" we can assume immediately that it is of integer type.

    Similarly, consider this example:
    Code:
    before:
    //THIS FUNCTION MUST TAKE A POINTER TO ONE CHAR ONLY.
    void foo(char *a);
    //This function could take a pointer to one or more chars. Doesn't matter.
    void bar(char *a);
    
     after:
    //I don't need to comment my prototypes. My declaration style says it for me. :)
    void foo(char *a);
    void bar(char (*a)[]);


    Quote Originally Posted by anduril462 View Post
    > The "char **argv" type is more accurate in this case, and in general; the [] in argument declarations causes nothing but confusion.
    This is completely subjective, and not true for everybody. Some people like that it reminds them that what is pointed to is an array.

    I agree that was a little rash on my part.

    Quote Originally Posted by anduril462 View Post
    > What we see clearly is arrays of data, which main's argv completely neglects.
    I don't see any arrays of data that main's argv neglects, even partially. It is not neglect, it is simply how arrays are passed to functions in C, i.e. how they decay to a pointer. argv is a variable that exists on the stack, therefore it has an address on the stack, which you can get by using &argv. It would be 50 in your example. argv without the ampersand is the address of the first element of the array, or &argv[0]. That would be 30 in your example, with &argv[1] and &argv[2] being 31 and 32 respectively. Each element of argv is a pointer to char (really a pointer to one or more consecutive chars). So in your example, argv[0] is 10, argv[1] is 15. You're slightly off with argv[2] though, it points to NULL. Sure, on some crazy architecture, NULL could be 19, but I doubt it. It's best to think of it as zero:
    Code:
    [10][15][0]
     30  31  32
    main's argv does indeed neglect any array, according to MY arbitrary style/convention that I have defined.
    You're right about the NULL thing, slight slip up on my part.

    Quote Originally Posted by anduril462 View Post
    > The top row of data is an array of chars, or "char []".
    > The middle row of data is an array of pointers to an array of chars, or "char (*[])[]".
    > The bottom datum is a pointer to an array of pointers to an array of chars, or "char (*(*)[])[]".
    The top row of data is not really an array of chars. They are not (AFAIK) required to be sequential. Really, what you have is two distinct arrays of char, one for "name" and one for "arg".
    The middle row of data is an array of pointers to char. Each element points to the first char in an array from row 1, except the last, which points to null.
    The bottom row is a pointer to an array of pointer to char.
    Perhaps the two strings in the top row aren't right next to each other; but each individual string has to be sequential in order for it to be a string, and that's all that matters. The middle row of data is an array of pointers to sequential chars. Arrays of chars are sequential chars. Therefore, if we interpret (typecast) sequential chars to an array, there is no problem.

    Quote Originally Posted by anduril462 View Post
    > When evaluating an array type (the arguments in the puts functions are each an array type), C automatically handles it as an argument to the first member so there is no problem with using it directly in functions that expect "char *", and at the same time we get a much more descriptive type than just "char **".
    What you've done in that example is use a type cast to cover up the fact that argv and foo are incompatible types. Anything can be assigned to a void *, and a void * can be assigned to anything, but that doesn't mean it should. Remove the cast and the compiler should complain that you're doing something stupid. The type of the arguments to puts is array type only because you cast argv to something it isn't. Yes, *(*foo)[0] is array type, but it is not an accurate description of what it really is, which is argv[0], which is pointer to char.
    I only typecast it to a generic pointer for brevity of code. I could have written out the full type for foo. If I remove the cast the compiler would indeed complain because I removed the cast (as is the case with most typecasts...).

    Why is it bad to typecast a variable to an incompatible type? That's the nature of typecasting, and it's used all the time. It just gives another way to manipulate the data associated with the variable, which in my case works fine.

    Quote Originally Posted by anduril462 View Post
    > Incidentally, if you wanted to set foo to argv's second member (for instance, if you wanted to ignore the first command-line argument):
    Your description is a little off, IMO. You basically want foo to be just like argv, except skipping the first element. In that case, what you really want to do is
    set foo to the address of argv's second member. Something like:
    Code:
    char **foo = &argv[1];
    foo points to the second element in argv, so foo[0] == argv[1], foo[1] == argv[2], etc. Setting foo to argv's second member would simply be:
    Code:
    char *foo = argv[1];
    Wording slip up on my part.
    Last edited by yuklair; 04-19-2012 at 07:40 PM.

  14. #14
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by yuklair
    Similarly, consider this example:
    Code:
    before:
    //THIS FUNCTION MUST TAKE A POINTER TO ONE CHAR ONLY.
    void foo(char *a);
    //This function could take a pointer to one or more chars. Doesn't matter.
    void bar(char *a);
     
     after:
    //I don't need to comment my prototypes. My declaration style says it for me. :)
    void foo(char *a);
    void bar(char (*a)[]);
    I think you have a potential problem though. You can call foo like this:
    Code:
    char b[10];
    foo(b);
    However, you would likely call bar like this:
    Code:
    bar(&b);
    The trouble is, when you define bar, you would need to specify the complete type of the parameter, e.g.,
    Code:
    void bar(char (*a)[5]) {}
    That is, a is a pointer to an array of N chars, and you need to decide just what N should be. Once you do that, passing a pointer to an array of M chars, where N != M, is incompatible. An array is convertible to a pointer to its first element, but a pointer to an entire array of N chars is not convertible to a pointer to an entire array of M chars.

    This is fine if you really do only want to deal with an array of N chars, but for many applications the array size should be variable and specified either directly by another parameter or indirectly by a special sentinel value (such as the null character), in which case your approach does not apply. You would probably have to keep the incomplete type in the parameter list of the definition then type cast to a variable length array for this to work, but is that really worth it compared to just using a pointer to char parameter?

    Where your approach might be used is when we are dealing with a 2D array with a known size. An array of arrays is converted to a pointer to its first element, hence an array of arrays is converted to a pointer to an array, thus having a pointer to an entire array as a parameter may be useful.

  15. #15
    Registered User
    Join Date
    Apr 2012
    Posts
    13
    Quote Originally Posted by laserlight View Post
    This is fine if you really do only want to deal with an array of N chars, but for many applications the array size should be variable and specified either directly by another parameter or indirectly by a special sentinel value (such as the null character), in which case your approach does not apply.
    In my case, with argv, I exclusively use the special sentinel value (NULL), because it deals with C strings. Why does my approach not apply?

    I repeat myself to the point of exhaustion, this is a matter of programming style/convention.
