Thread: parsing command line strings

  1. #1
    Registered User
    Join Date
    Sep 2007
    Posts
    119

    parsing command line strings

    What's the best way to parse command line arguments for a shell program? Also, what's the maximum number of arguments a command such as "ls" or "rm" can take? I'm using execvp( cmd, args ) to run the command, where cmd is the command and args is the parsed array of arguments and options with a NULL value at the end. Assume the path variable is set in the environment and all variables have been initialized.

    Code:
    //Assume the cmd variable contains the full string "ls -l"

    token = strtok( cmd, " " ); //holds tokens in the string
    cmd = token; //the first token is the command

    //store the rest of the tokens in the arguments array
    while ( token != NULL )
    {
        token = strtok( NULL, " " );
        args[ count ] = token;
        count++;
    }

    code = execvp( cmd, args );
    Last edited by John_L; 05-26-2008 at 07:29 PM.

  2. #2
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    > Also, what's the most amount of arguments a command can have
    That's system dependent, but on Linux it can be found in /usr/include/linux/limits.h

    On my system it's defined as, #define ARG_MAX 131072 /* # bytes of args + environ for exec() */

    Are you going to support full parsing? I.e., ls --format=commas OR ls --format='commas' OR ls --format="commas" are all valid.
    Last edited by zacs7; 05-26-2008 at 10:33 PM.

  3. #3
    Registered User
    Join Date
    Apr 2007
    Location
    Sydney, Australia
    Posts
    217
    This is how I do it:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char* argv[])
    {
        char fName[1024] = "";
        int width = 0;
        int height = 0;
        
        //Loop through command arguments, skipping argv[0] (the program name)
        for(int i = 1; i < argc; ++i)
        {
            //Check if it starts with a "-"
            if(argv[i][0] == '-')
            {
                switch(argv[i][1])
                {
                    //If we're setting the width variable
                    case 'w':
                    case 'W':
                        if(++i < argc)
                          width = atoi(argv[i]);
                    break;
    
                    //If we're setting the height variable
                    case 'h':
                    case 'H':
                        if(++i < argc)
                          height = atoi(argv[i]);
                    break;
                }
                
            //If it is not a command option (an option starting with '-'),
            //it's probably the name of the file that was dragged onto the exe
            } else if(i == 1)
                strncpy(fName, argv[i], sizeof(fName) - 1); //fName is zero-filled, so it stays terminated
        }
        printf("File: %s\n", fName);
        printf("Width: %i\n", width);
        printf("Height: %i\n", height);
    
        if(strlen(fName))
        {
            //Do something
        }
        return 0;
    }
    So you could pass this: myProg.exe fileToOpen.txt -W 32 -H 64

    This would print

    File: fileToOpen.txt
    Width: 32
    Height: 64

    EDIT: Now that I look again, this may not be what you want
    Last edited by 39ster; 05-26-2008 at 11:27 PM.

  4. #4
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    zacs7, I'm probably going to add more functionality as I go. Right now the main concern is just being able to parse options beginning with '-' and arguments, i.e. files for a command or something.

    39ster, I'm trying to build a simple shell. My program loops until exited, and reads in commands from standard input. It executes the commands available in the bin folder, but doesn't account for options or arguments yet. I thought I could simply parse the command string and store the options/arguments in an array and pass it to the version of the exec command. But that doesn't seem to be the case. Hope this is slightly clearer, any ideas?

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    What exactly is it that doesn't work, and how is it "not working"?

    I don't see anything obvious that is wrong in your code.

    Edit: Ah, except you are missing the first argument (argv[0]), which should be the name of the command (e.g. "ls").

    Just add:
    Code:
    	   		args[ count ] = token;
    	   		count++;
    before your while ( token != NULL ) line.


    --
    Mats
    Last edited by matsp; 05-27-2008 at 06:51 AM.
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #6
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    I already take care of the first command, from my original code.

    Code:
             token = strtok( cmd, " " ); //holds tokens in the string
    	   cmd = token; //the first token is the command
    So say I enter the string "ls -l": cmd = ls, and -l is stored in the args array, which is ended by a NULL value at the last position in the array.

    Is it that simple though: pass an array of arguments, then I'm done? Because it's not working so far.

    Code:
             //or maybe it should be like this?  using a different string to contain the standard input?
              token = strtok( input, " " ); //holds tokens in the string
    	   cmd = token; //the first token is the command
    Last edited by John_L; 05-27-2008 at 07:27 AM.

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    First of all, I asked in the previous post what is not working. In what way is it not working? If you explain what is going wrong, we may be better placed to answer what to do about it.

    And what I was trying to say is that your string comes in "ls -l", and you should produce the following:
    Code:
    cmd = "ls";
    argv[] = { "ls", "-l", NULL };
    So cmd is copied in two places.

    --
    Mats

  8. #8
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    Oh, so then that's probably what's wrong. As for what's going wrong, it just doesn't execute the command ls -l properly. It will execute ls by itself though. Since it compiles and runs fine, there are no errors or anything generated. Does the arguments array I pass have to be EXACTLY the size of the number of arguments? Or can it have blank entries as long as it's ended by a NULL value?

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    You can have:
    Code:
    char *argv[100000] = { "ls" , "-l", NULL };
    As long as it's big enough for the actual number of arguments you are using, it doesn't matter. It is passed as the address of the "vector" to execvp(), and execvp() will read the vector to find how many arguments there really are.

    So do I get it right that you get "ls" to work, but if you do "ls -l" it does exactly the same as "ls"? In that case, my supposition is that argv[1] is not "-l", but NULL, and argv[0] contains the "-l" - since ls itself skips argv[0], it sees no arguments and just does normal "ls" instead of the long version you expect from "-l".

    --
    Mats

  10. #10
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    Ok, after trying a couple of things out I know what's wrong. If I make the array holding the arguments some generic size, say 20 (to account for any number of possible arguments), I run into problems, even though I do null terminate it. But when I make the array a set size, say 3, and then test out writing "ls -l", it works. But the problem is that now it no longer works for just "ls" or a longer command with more options. Is it supposed to matter if I pass the exec() command an array with empty values as long as it's null terminated (which doesn't seem to work)?
    Last edited by John_L; 05-27-2008 at 02:56 PM.

  11. #11
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    Hmmm. Worked fine for me.

    Code:
    me@initech:~/yup> gcc -g -o yup yup.c
    me@initech:~/yup> ./yup
    ls -l
    total 16
    -rwxr-xr-x 1 me users 9731 2008-05-27 16:57 yup
    -rw-r--r-- 1 me users 1737 2008-05-27 16:56 yup.c
    
    
    me@initech:~/yup>
    EDIT:
    If you compile this with optimizations, though, your method for NULL-terminating the args array may fail spectacularly, because unused entries between the last actual entry and the one you've set to NULL, which is the last in the array, will likely be filled with garbage. I would suggest initializing the args array with zeros to prevent this from happening. Here's what happens:

    Code:
    me@initech:~/yup> gcc -Wall -O3 -o yup yup.c
    me@initech:~/yup> ./yup
    ls -l
    ls: ▒▒▒▒▒: No such file or directory
    ls: ▒▒: No such file or directory
    ls: \▒: No such file or directory
    ls: ▒▒: No such file or directory
    ls: (▒|(▒: No such file or directory
    ls: ▒▒ ▒▒▒▒▒ ▒▒▒)▒▒▒▒E▒t%1▒▒֋E▒▒▒D▒E
    ▒D$▒▒$▒▒▒9}▒u߃▒[^_]Ë$▒U▒▒S▒▒▒▒▒▒▒t1▒▒Ћ▒▒▒▒▒▒▒u▒▒[]Ð▒▒U▒▒S▒▒▒: No such file or directory
    ls: : No such file or directory
    
    me@initech:~/yup>
    Last edited by rags_to_riches; 05-27-2008 at 03:07 PM. Reason: More info...

  12. #12
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    What was wrong was that I was using an array of arbitrary size to store the commands in, and some values were left unfilled even though it was still null terminated. So instead I created a new array of set size (after counting the number of arguments), copied parts of the old array into the new one, null terminated it and passed it to the exec() command. Works great now. I just thought I could use an array of any size as long as it was null terminated.

  13. #13
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    If I wanted to take care of "globbing" now, how could I easily tie it in? I'm sure all the files/source already exist; can I invoke it somehow? Or do I have to write my own routine for handling it?

  14. #14
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by John_L
    What was wrong was that I was using an array of arbitrary size to store the commands in, and some values were left unfilled even though it was still null terminated. So instead I created a new array of set size (after counting the number of arguments), copied parts of the old array into the new one, null terminated it and passed it to the exec() command. Works great now. I just thought I could use an array of any size as long as it was null terminated.
    You can use any size array. Perhaps you can post your code and we can discuss how you should solve the problem.

    Also, it may make sense to write an application that shows its own arguments, e.g.
    Code:
    #include <stdio.h>
    
    int main(int argc, char **argv)
    {
        int i;
        printf("argc = %d, argv = %p\n", argc, (void *)argv);
        for(i = 0; i <= argc; i++)
           printf("argv[%d]=%s (%p)\n", i, argv[i]?argv[i]:"(NULL)", (void *)argv[i]);
        return 0;
    }
    That way, when you are playing with argument parsing and such things, you can see what it's getting. Note that it's showing both argv[0] (should be the name of the program) and argv[argc] (should be NULL).

    --
    Mats

  15. #15
    Registered User
    Join Date
    Sep 2007
    Posts
    119
    It's fine now. It's really not much extra effort creating a new array of set size to hold the actual number of arguments there are. Everything executes fine and the commands work properly now with options and arguments. I'm concerned now with implementing globbing. The documentation I find on the net isn't very helpful. Is it a pretty easy function to implement?
