Thread: Functions for C strings

  1. #16
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by officedog View Post
    Code:
    void getUserInputLine(FILE * fp, char * ps) {
        char buf[100];
        if ( ((ps = fgets(buf, sizeof(buf), fp))) != NULL) {
            printf("\n%s", buf);
            printf("\n%s", ps);
        }
    }
    
    int main (int argc, const char * argv[]) {
        char * pString;
        getUserInputLine(stdin, pString);
        return 0;
    }
    That does absolutely nothing. The pointer is passed to the function by value, so the assignment 'ps = ' is applied to the copy, not the variable declared within the scope of main. To get all that to work would require passing a char**.

  2. #17
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    The second option is bad because it assumes the pString declared in main() is valid after the call to getUserInputLine, but it's not.

    In your first option, the declaration of pString is redundant and does not add anything. What you probably want is something more like:
    Code:
    /* There is no point passing in the FILE* pointer, because the function name implies stdin will be used */
    void getUserInputLine(char* buffer, size_t buffer_size)
    {
        char* p;
    
        /* Read in a line from the user */
        if(fgets(buffer, buffer_size, stdin) == NULL)
            buffer[0] = '\0';
    
        /* Now remove the trailing newline character if it exists */
        p = strchr(buffer, '\n');
        if(p != NULL)
            *p = '\0';
    }
    
    int main(void)
    {
        char input[100];
        getUserInputLine(input, sizeof(input));
        printf("User entered: %s\n", input);
        return 0;
    }
    bit∙hub [bit-huhb] n. A source and destination for information.

  3. #18
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    Thanks for replies. This a great forum, very helpful. I remember how painful arrays and array pointers are! I know this is all really basic stuff, and I appreciate the help to remember some of it.

    I try and make a summary of this (please correct if this is wrong)

    If declare a char array and pass the array to the function, this will effectively be passed by reference as the name of the array is the same as the address of array[0]. Although the function copies the address, the address still refers to the array which was created outside the function. The function can therefore modify the array and the modifications will exist outside the function.

    I'm still more confused about the char * and why it is a problem passing it into a function - I'll have to read up on this.

    So my minimal template function now looks like this.

    Code:
    void getUserInputLine(char buf[ ], int bufSize) {
        if ( ((fgets(buf, bufSize, stdin))) != NULL) {
            printf("\n%s", buf);
        }
        //strchr function here
    }
    
    int main (int argc, const char * argv[]) {
        int bufSize = 100;
        char buf[bufSize];
        getUserInputLine(buf, bufSize);
        printf("\nbuf[0] = %c", buf[0]); //tests result
        return 0;
    }
    A downside of this is being restricted to the self imposed buffer size. What views are there on the use of malloc and how it compares to passing a char array? I'm trying to get a grasp of objective C at the same time, so returning an NSString is also an option...

  4. #19
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    Just to add a progress notes:

    I have read up on the difference between
    char array[] = "a string";
    and
    char * pString = "a string"

    Some of the things I read were
    a) "a string" is a string literal / string constant and is stored in protected/read only memory
    b) char array[] makes a copy of the string literal on the stack and can be altered
    c) char * aString simply holds the address of the string literal. The string cannot be altered using for example aString[0] = 's'
    d) sizeof(array) will return the size of the array, but cannot do this if the array is passed into a function - in this case 'array' will be treated as a char *.

    Current progress with my generic function looks like this. I returned to fgetc as I like its simplicity and am getting a better feel for how it works. Would be interested to know if anyone can spot any problems with this. It seems to work OK I think

    Code:
    //Initialise all elements of array to '\0'
    //Note - best to pass array size using sizeof(charArray)
    void GC_InitCharArray(char charArray[], int arraySize) {
        int index;
        for (index = 0 ; index < arraySize ; index++) {
            charArray[index] = '\0';
        }
    }
    
    void GC_GetUserInput(char buf[], int bufSize) {
        int index = 0;
        int c;
        //initialise array so all elements are '\0'
        GC_InitCharArray(buf, bufSize);
        
        //while fgetc(stdin) ! = '\n' designed to clear input buffer
        while ( (c = fgetc(stdin) ) != '\n') {
            
            //if there is space in array, add character
            // bufSize-1 is sentry value to leave a terminating '\0'
            if (index < (bufSize - 1) ) {
                buf[index] = c;
            }
            //for debugging
            else if (index == bufSize -1) {
                printf("\nBuffer full");
            }
            index++;
        }
        
        //if all characters from input stream have been read, replace the terminating '\n' with '\0'
        if (c == '\n') {
            buf[index] = '\0';
        }
    }
        
    
    //MAIN
    int main (int argc, const char * argv[]) {
        int bufSize = 5;
        char buf[bufSize];
        
        GC_GetUserInput(buf, sizeof(buf));
        printf("\n%s...", buf);
           
        return 0;
    }

  5. #20
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    There is one problem. You will overwrite your buffer if the string is too long by 1 character.
    Code:
            else if (index == bufSize -1) {// Let's say we hit this location.  Buffer is full.
                printf("\nBuffer full");
            }
            index++; // Now we increment index past the buffer
        }
        
        //if all characters from input stream have been read, replace the terminating '\n' with '\0'
        if (c == '\n') {
            buf[index] = '\0'; // Now we write past the end of the buffer
        }
    EDIT: I spoke too soon. The null only gets added if the character is '\n', and the loop wont execute if fgetc() returns '\n'.
    Last edited by bithub; 09-23-2009 at 04:34 PM.
    bit∙hub [bit-huhb] n. A source and destination for information.

  6. #21
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    Thanks for pointing this out bithub, I appreciate your help. Odd how such a simple loop can hurt the brain. I think you are right and have modified the code but putting index++ into the initial if(...) statement.

    When I think more about it, the problem seems to be that, because the while loop always reads through the stdin stream, the final statement will always evaluate to true and will be executed. I put it there, thinking I needed to have something that replaced the '\n' with '\0'. However, because the char array is initialised to '\0' AND the initial while loop exits on '\n', there is actually no need for it. I think this is right anyway. I'll post the current version without all the printf messages and comments

    Code:
    void GC_GetUserInput(char buf[], int bufSize) {
        int index = 0;
        int c;
        
        GC_InitCharArray(buf, bufSize);
        
        while ( (c = fgetc(stdin) ) != '\n') {
            if (index < (bufSize - 1) ) {
                buf[index] = c;
                index++;  
            }
        }
    }
    EDIT: was just wondering if assigning an int to a char array will always work, or whether I should convert it before assignment?
    Last edited by officedog; 09-24-2009 at 11:21 AM.

  7. #22
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    EDIT: was just wondering if assigning an int to a char array will always work, or whether I should convert it before assignment?
    It will always work as long as it is an int returned from fgetc(). The only value fgetc() returns that cannot be assigned to a char is EOF, so you should check for that. Maybe change your while loop to:
    Code:
    while ( (c = fgetc(stdin) ) != '\n' && c != EOF) {
    bit∙hub [bit-huhb] n. A source and destination for information.

  8. #23
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by officedog View Post
    Thanks for pointing this out bithub, I appreciate your help. Odd how such a simple loop can hurt the brain. I think you are right and have modified the code but putting index++ into the initial if(...) statement.

    When I think more about it, the problem seems to be that, because the while loop always reads through the stdin stream, the final statement will always evaluate to true and will be executed. I put it there, thinking I needed to have something that replaced the '\n' with '\0'. However, because the char array is initialised to '\0' AND the initial while loop exits on '\n', there is actually no need for it. I think this is right anyway. I'll post the current version without all the printf messages and comments

    Code:
    void GC_GetUserInput(char buf[], int bufSize) {
        int index = 0;
        int c;
        
        GC_InitCharArray(buf, bufSize);
        
        while ( (c = fgetc(stdin) ) != '\n') {
            if (index < (bufSize - 1) ) {
                buf[index] = c;
                index++;  
            }
        }
    }
    EDIT: was just wondering if assigning an int to a char array will always work, or whether I should convert it before assignment?
    Don't forget to null-terminate the string after the loop (you can use 'index' as the position, obviously).

    >> GC_InitCharArray

    For strings, this really isn't necessary. Just set the first character to 0 and you're good to go.

    I'm still more confused about the char * and why it is a problem passing it into a function - I'll have to read up on this.
    Think of it this way. A pointer is really just an integer whose value is an address. So naturally, passing one to a function allows you to read/write to the data at that address. But the pointer itself is passed by value, so you can't alter the address it contains from the caller's standpoint, since you just have a copy of it. In order to do so requires passing a pointer to a pointer, eg:

    Code:
    /*
        Only modifies the copy of the pointer
    */
    void foo( char* ptr )
    {
        ptr = "Unread message";
    }
    
    /*
        Modifies the caller's pointer
    */
    void bar( char** ptr )
    {
        *ptr = "New message";
    }
    
    int main( void )
    {
        char* str = "Message";
        printf( "'str': %s\n", str );
        foo( str );
        printf( "'str': %s\n", str );
        bar( &str );
        printf( "'str': %s\n", str );
        return 0;
    }

  9. #24
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    Thanks again bithub. I think this is also what tabstop said on page 1. I get it now. The reason I didn't think it was necessary was because '\n' is the trigger for flushing to the input buffer. But EOF is also returned if there is an error reading the stream (or if it is input with ctrl-D). I've added it now.

    Thanks for your time in explaining sebastiani. I'm getting there! slowly. About the idea of initialising the array to '\0' - I read somewhere that this was a good idea because then you know exactly what is there. But simply adding a terminating '\0' seems a whole lot simpler and more efficient. I'm assuming that when I pass the array, to any string functions, they simply stop once they get to the first '\0' and ignore the rest. Is there any point at all in initialising the array then?

    I have added another strand to the project. This time I am having a little experiment with using dynamic memory to store the user input. I'm trying to get to the stage that I can resize the memory if the user sends more than the original static array could hold

    Here's what I got as a first step. It seems to work so far, but this does not mean it's right of course

    Code:
    char * GC_GetUserInputDyanmic() {
        char * s;
        int baseSize = 100;
        s = malloc(sizeof(char) * baseSize);
        if (s == NULL) {
            return s;
        }
        int len;
        if ( fgets(s, baseSize, stdin) != NULL) {
            len = strlen(s);
            printf("\nstrlen = %d, string: %s", len, s);
        }
        return s;
    }
            
    
    //MAIN
    int main (int argc, const char * argv[]) {
    
        char * pDynamicString = GC_GetUserInputDyanmic();
        //EDIT - I should do something like this too I think
        if (pDynamicString == NULL) {
            return 0;
        }
        //End EDIT
        printf("\nIn Main: dynamic string = %s", pDynamicString);
        free(pDynamicString);
        pDynamicString = NULL;
           
        return 0;
    }
    Last edited by officedog; 09-24-2009 at 02:12 PM.

  10. #25
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    I have worked further on my efforts at making a function which dynamically allocates memory to store a string from user input. This time I have added an extra bit so it will resize (I hope) depending on the input from the user. It seems to work, but this is new territory for me. Can anyone see problems with this?

    Code:
    char * GC_GetUserInputDyanmic() {
        int baseSize = 2;  //set to very small level for test runs
        char c;
        
        char * spDynamic = malloc(sizeof(char) * baseSize);
        if (spDynamic == NULL) {
            return spDynamic;
        }
        
        int index = 0;
        
        while ((c = fgetc(stdin)) != '\n'  && c != EOF) {
            if(index < (baseSize - 1)) {
                spDynamic[index] = c;
                index++;
            }
            else {
                printf("\nReallocating space: baseSize = %d, index = %d", 
                       baseSize, index);
                baseSize = baseSize * 2;
                realloc(spDynamic, sizeof(char) * baseSize);
                spDynamic[index] = c;
                index++;
            }
        }
        spDynamic[index] = '\0';
        
        return spDynamic;
    }

  11. #26
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by officedog View Post
    Can anyone see problems with this?
    Only that your basesize increases exponentially. Why don't you use +2 instead of *2?

    Another point is that these are very trivial amounts of memory. You might as well just work with kb, not bytes. So initially allocate 1024, then another. Generally, allocating in lots smaller than that would be considered inefficient, unless you are working in a very limited special environment.

    An option you could use is to read the data into a large block local to the function (eg, 8196 bytes), then strlen it, malloc a seperate pointer the same amount, and strcpy. Then you have a returnable heap variable of the exact right size -- the 8k local to the stack will be gone when the function ends. This is probably less work for the processor than a string of realloc calls, and is pretty much optimal, memory use wise.

    Make sense?

    [later] also read Sebastiani's post below -- you should be using the return value of realloc

    Code:
    spDynamic=realloc(spDynamic, sizeof(char) * baseSize);
    Last edited by MK27; 09-25-2009 at 01:19 PM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  12. #27
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> realloc(spDynamic, sizeof(char) * baseSize);

    Again, remember that in C all parameters are passed by value, and so all that realloc gets is a *copy* of your pointer - there is no way for it to modify the caller's pointer (eg: set it to the new block of memory). Fortunately, realloc knows this, and conveniently returns a pointer to the new block (or NULL, if unavailable).

    Other than that, though, I'd say the function looks fine.

    One suggestion, though. Look at this part of the loop:

    Code:
                spDynamic[index] = c;
                index++;
    Instead of writing that twice, you can simply perform the size check first, resize if necessary, and then at the bottom of the loop (outside of any conditional) have the assignment/increment (the above code fragment, that is).

  13. #28
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    Thank you for this feedback. It's helping a lot to build some confidence

    I rewrote the function with the larger temporary storage, rather than resizing with realloc. It still seems to work OK. I decided to keep the option of resizing, just in case the 'user' has a lot to say! But with such a large buffer, it probably won't be used. And then tidied up the while loop.

    Code:
    char * GC_GetUserInputDyanmicV2() {
        int baseSize =  2048;
        char c;
        
        char * spDynamic = malloc(sizeof(char) * baseSize);
        if (spDynamic == NULL) {
            return spDynamic;
        }
        
        int index = 0;
        
        while ((c = fgetc(stdin)) != '\n'  && c != EOF) {
            if (index > baseSize -2) {
                printf("\nReallocating space: baseSize = %d, index = %d", 
                       baseSize, index);
                baseSize += baseSize;
                realloc(spDynamic, sizeof(char) * baseSize);
                
            }
            spDynamic[index] = c;
            index++;
        
        }
        spDynamic[index] = '\0';
        
        int len = strlen(spDynamic);
        
        char * stringReturn = malloc(sizeof(char) * (len + 1) );
        strncpy(stringReturn, spDynamic, len+1);
        stringReturn[len] = '\0';
        
        free(spDynamic);
        spDynamic = NULL;
        
        return stringReturn;
    }

  14. #29
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> realloc(spDynamic, sizeof(char) * baseSize);

    You're still not capturing the return value correctly.

  15. #30
    Registered User officedog's Avatar
    Join Date
    Oct 2008
    Posts
    77
    OK I understand now. Thanks for keeping an eye on this, sebastiani. Really appreciate it. So, the reason it worked fine was because my call to realloc didn't need to find a block of memory somewhere else and could simply extend the existing block.... meaning the pointer was still valid.

    Code:
    buffer = realloc(buffer, sizeof(char) * baseSize);
    I guess I should also add a test for a NULL pointer too

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. strings and functions
    By simo_mon in forum C Programming
    Replies: 8
    Last Post: 09-16-2008, 05:18 AM
  2. Passing strings to functions and modifying them
    By diddy02 in forum C Programming
    Replies: 6
    Last Post: 08-11-2008, 01:07 AM
  3. Reading strings input by the user...
    By Cmuppet in forum C Programming
    Replies: 13
    Last Post: 07-21-2004, 06:37 AM
  4. Attaching functions to a class/struct
    By VirtualAce in forum Tech Board
    Replies: 2
    Last Post: 08-04-2003, 10:56 AM
  5. Replies: 2
    Last Post: 04-13-2003, 08:40 PM