Thread: What is the best example of obtaining a string from a user/filestream?

  1. #1
    Registered User
    Join Date
    Sep 2008
    Location
    Nort East of Great Britain!
    Posts
    8

    What is the best example of obtaining a string from a user/filestream?

    First post here, apologies in advance for any blasphemous/offensive breaches of posting etiquette.

    In a nutshell, I would like a foolproof and efficient method of obtaining a string from the user and/or file streams.

    Ideally, I'd like to be able to allocate an array of characters with exactly the right amount of space for that string. I believe this is achievable using the "malloc" function, but as yet I can't get my head around the concept. I understand what it does, just not how to implement it.

    Below is a small program I wrote to experiment with the various ways of getting input from "stdin".

    Code:
    /* input.c
       Various methods of getting input from the user */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main() {
      char sbuf[3], buf[80], *message;
    
       /* Input a string. */
       puts("Enter 1st line of text. To be placed in small buffer.");
       gets(sbuf);
       puts(sbuf);
       printf("%lu \n", (unsigned long)strlen(sbuf));   /* strlen() returns size_t, not int */
       
       /* Input a string. */
       puts("Enter 2nd line of text. To be placed in big buffer.");
       gets(buf);
       puts(buf);
       
       /* Input a string. */
       puts("Enter 3rd line of text. Dynamically allocated space");
       gets(buf);
       puts(buf);
       
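       /* Allocate a block of exactly the right size and copy the string to it. */
       message = realloc(NULL, strlen(buf)+1);   /* behaves like malloc() when passed NULL */
       if (message != NULL) {
          strcpy(message, buf);
          puts(message);
          free(message);
       }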
       
        /* Input a string. */
       puts("Enter 4th line of text. Fgets, 4 char buffer");
       fgets(buf, 4, stdin);
       puts(buf);
       
      getch();   /* non-standard (conio.h on Windows); keeps the console window open */
      return 0;
    }
    The above code compiled successfully in Dev-C++ 4.9.9.2 running on Windows XP Home Edition.

    All 4 methods work, in the sense that I can display the array in the terminal with no 'junk' characters appearing and without losing any of the message I typed in, however long.

    Here is where it all goes pear-shaped.

    I set aside 3 characters as a "small buffer", and called it sbuf. Using the "gets" function, if I type more than 3 characters I can still display the entire message.

    Where are my "extra" characters being stored?

    Is this some delightful feature of my compiler, protecting the programmer from himself?

    I have no idea what is going on here.

    I hope someone here can shed some light on this. I'd like a solution, but also to understand the solution.

    Thanks in advance,

    Clint

  2. #2
    The superhaterodyne twomers's Avatar
    Join Date
    Dec 2005
    Location
    Ireland
    Posts
    2,273
    Well, you probably shouldn't use gets(). It's not your compiler. The function has no buffer overflow control [ http://faq.cprogramming.com/cgi-bin/...&id=1043284351 ]

  3. #3
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    1. You don't have to implement malloc, unless you plan on writing your own compiler.
    2. Your "extra" characters go ... somewhere. Here you're probably being saved because most modern OS's simply can't give you "only three bytes", so you probably have some extra room because the OS actually handed you 4K or something similar. Of course that extra room isn't yours and can get reallocated if you do another malloc or such, so you don't want to rely on that. That is the downfall of gets -- it will read as much as you're willing to type at it, and it will put that extra data ... somewhere (on top of your other variables, on top of some other program, ...).
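
To make the point about gets() concrete, here is a minimal sketch of a bounded read built on fgets(), which is told the buffer size and so cannot write past the end of it. The helper name read_line_bounded is made up for this illustration and does not come from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read at most size-1 characters of one line into dst.
   Returns the stored length, or -1 on end-of-file or read error.
   Unlike gets(), fgets() is told the buffer size and never writes past it. */
long read_line_bounded(char *dst, size_t size, FILE *fp)
{
    assert(dst != NULL && size > 0 && fp != NULL);

    if (fgets(dst, (int)size, fp) == NULL)
        return -1;

    /* Cut the string at the newline, if one was stored. */
    dst[strcspn(dst, "\n")] = '\0';
    return (long)strlen(dst);
}
```

With char sbuf[3], a call such as read_line_bounded(sbuf, sizeof sbuf, stdin) stores at most two characters plus the terminator. Anything else that was typed stays in the input stream and is picked up by the next read.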

  4. #4
    Registered User
    Join Date
    Sep 2008
    Location
    Nort East of Great Britain!
    Posts
    8

    Thanks

    Thanks for that FAQ.

    It explained clearly the dangers of using gets, and how it can cause a lot of bother.

    I even wrote an example myself (using a structure) and got the same results, just to clarify.


    So, if you were to obtain input from a user, how would you do it? Just set aside a buffer limit to accept it? It sounds sensible, but is there another way?

    I've figured out that I can read a file character by character, for example, count the chars, and declare a buffer setting aside enough space for the number of characters I have counted.

    Rather than counting and then resetting my file pointer to the beginning of the file, is there a way to "add" to an array as it goes along, reading in the characters one by one?


    Cheers for clearing up the whole "gets()" mystery.

    Clint

  5. #5
    The superhaterodyne twomers's Avatar
    Join Date
    Dec 2005
    Location
    Ireland
    Posts
    2,273
    No problem.

    >> So, if you were to obtain input from a user, how would you do it? Just set aside a buffer limit to accept it? It sounds sensible, but is there another way?
    To be fair, I program more in C++, where this isn't as big an issue. I don't think adding a limit is that bad. Consider reading someone's name from input: it's probably not going to exceed a certain maximum length. Adding an upper limit is, when it comes to it, sensible. Guess what you might consider a reasonable maximum, and add half as much again.

    >> Rather than counting and then resetting my file pointer to the beginning of the file, is there a way to "add" to an array as it goes along, reading in the characters one by one?
    It would probably be better to read the file word by word and work with that. Create a variable called, I dunno, file_buffer that's large enough for general-purpose long strings, and copy from that, as it were. You can get the size requirements from that. I know it's not ideal, but as I said, I generally use C++ file I/O, so some other heads around here probably know better tricks; what I suggested would require a lot of dynamic memory allocation...

    >> Cheers for clearing up the whole "gets()" mystery.
    I said something similar once to my 7 year old nephew (not about gets(), mind) and he said ~"A mystery is something that nobody knows the answer to but a secret is something somebody knows the answer to. It's hard to find somebody who knows the answers". Damn kid's wiser than I am.
    Last edited by twomers; 09-15-2008 at 05:15 PM.
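
The word-by-word idea above could be sketched roughly as follows, assuming a cap of 255 characters per word. The function name read_word_dup and the buffer size are assumptions made for this example; only the name file_buffer comes from the post.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read the next whitespace-delimited word from fp
   into a fixed scratch buffer, then return a heap copy sized to fit.
   Returns NULL at end-of-file or if malloc fails; the caller must free(). */
char *read_word_dup(FILE *fp)
{
    char file_buffer[256];   /* assumed upper limit: 255 characters per word */
    char *word;

    assert(fp != NULL);

    /* The field width 255 keeps fscanf from overflowing file_buffer. */
    if (fscanf(fp, "%255s", file_buffer) != 1)
        return NULL;

    word = malloc(strlen(file_buffer) + 1);
    if (word != NULL)
        strcpy(word, file_buffer);
    return word;
}
```

A word longer than the assumed limit is not lost; it simply comes back in several pieces on successive calls, so the limit trades exactness for simplicity.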

  6. #6
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    You could write a function:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    
    char *fgetsline (int bytes, FILE *fst ) {       // you must free() the line returned!
            char buffer[bytes], *string;
                
            fgets(buffer,bytes,fst);
            string=malloc(strlen(buffer)+1);
            strncpy(string,buffer,strlen(buffer));
            return string;
    }
          
    
    
    int main () {
            char *line=fgetsline(4096,stdin);
            puts(line);
            free(line);
    }
    The first argument to fgetsline could be obtained from the file size or set to a ridiculous limit, since the buffer only exists at that size for the duration of the function. I use GNU getline so I actually don't have to do this... but it seems to work...

  7. #7
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    The above will work to read one line. But if you want a string to contain more than one line then you need to use realloc() in order to re-allocate memory to add each line to the string.

    1) You can read one character at a time and use realloc() to allocate one more byte, then store the character there. You'll use a variable to track the size, increasing it by one each time.
    2) The above is the obvious thing to do, though it won't be efficient: calling realloc() that many times isn't good. So you can read one line at a time, re-allocate memory and store the data there. You could even read every X characters, re-allocate memory and store them.
    3) The above describes how to get data from a file or stdin. If you want a single string, that is, one ending at whitespace (space, newline) or EOF, you can process the data accordingly. Use fgets() and check for whitespace, then reallocate memory and you have your solution.
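
Options 1 and 2 can be combined in a rough sketch like the one below: read one character at a time, but double the allocation whenever it fills up instead of growing it by one, so realloc() is called only a handful of times per line. The name read_line_dynamic and the starting capacity of 16 are assumptions for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from fp.
   Returns a malloc'd string without the newline, or NULL at end-of-file
   with nothing read, or on allocation failure. Caller must free(). */
char *read_line_dynamic(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf, *tmp;
    int c;

    assert(fp != NULL);

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {            /* keep one slot free for '\0' */
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {          /* nothing left to read */
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    return buf;
}
```

Assigning the result of realloc() to a temporary first matters: if it fails it returns NULL but leaves the old block allocated, so overwriting buf directly would leak it.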

  8. #8
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    char buffer[bytes],
    this is not C89

    strncpy(string,buffer,strlen(buffer));
    what about nul-terminator?
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler
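
Both points can be addressed with small changes. The following is a sketch of one possible C89-friendly variant, not MK27's own fix: the scratch buffer comes from malloc() instead of a variable-length array, and the copy includes the terminating '\0'. The name fgetsline_c89 is invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C89-friendly variant of fgetsline(): no variable-length
   array, and the copy includes the terminating '\0'.
   Returns NULL at end-of-file or on allocation failure; caller must free(). */
char *fgetsline_c89(int bytes, FILE *fst)
{
    char *buffer, *string = NULL;

    assert(bytes > 0 && fst != NULL);

    buffer = malloc((size_t)bytes);      /* heap scratch space instead of a VLA */
    if (buffer == NULL)
        return NULL;

    if (fgets(buffer, bytes, fst) != NULL) {
        size_t len = strlen(buffer);
        string = malloc(len + 1);
        if (string != NULL)
            memcpy(string, buffer, len + 1);   /* +1 copies the '\0' too */
    }

    free(buffer);
    return string;
}
```

Like fgets() itself, this keeps the trailing newline in the returned string when the whole line fits.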

  9. #9
    Registered User
    Join Date
    Sep 2008
    Location
    Nort East of Great Britain!
    Posts
    8

    Buffer Length

    I'm just working my way through your information and various approaches. I was pleasantly surprised by the depth and amount of information I got back!

    I wrote this little program below:

    Code:
    /* buffer_len.c
       16/09/08 12:03
       This saves input to a buffer, and returns length of input string.
    */
    
    #include <stdio.h>
    #include <string.h>
    
    /* Input Buffer Size */
    #define IBS 150
    
    void putn(int i) {
        printf("%d", i);   
    }
    
    int main() {
       char szbuf[IBS];
       fgets(szbuf, IBS, stdin );
       int fnl = strlen(szbuf);
       putn(fnl);
       
       getchar();
       return 0;
    }
    It reads a string from "stdin" into a buffer [100+'\0'], then outputs the string's length.

    If I type in something really imaginative like:

    TEST

    The output will look like

    Code:
    TEST
    5
    Only 5 characters, including the '\0'. Where are my other 96? Is that memory still reserved for szbuf?

    Also,

    If I enter more than 100 characters, my program terminates/crashes. It just closes. Is that "fgets" ending the program? Or something else?

    Thanks again,

    Clint

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I can't make that code crash (I entered approximately 3.5 lines at 80 chars per line) - and yes, the space you have is still there in the string, you just don't use it. The strlen() function shows how much space is consumed by "TEST\n", which is 5 characters. A sixth character '\0' is marking the end of the string, so in total you have used 6 positions.

    Since IBS is 150, you have space for another 144 characters.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.
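
The arithmetic above can be checked with a short sketch. The helper bytes_unused is hypothetical; IBS is the buffer size from the buffer_len.c program.

```c
#include <assert.h>
#include <string.h>

#define IBS 150   /* same buffer size as in buffer_len.c */

/* Hypothetical helper: bytes of an IBS-sized buffer still unused after
   the string in it, counting the terminating '\0' as used. */
size_t bytes_unused(const char *szbuf)
{
    size_t used;

    assert(szbuf != NULL);

    used = strlen(szbuf) + 1;    /* visible characters plus '\0' */
    assert(used <= IBS);
    return IBS - used;
}
```

For a buffer holding "TEST\n", strlen() gives 5, the terminator makes 6 bytes used, and 150 - 6 leaves 144 unused. The array itself stays 150 bytes long whatever is stored in it; sizeof reports the array size and strlen() reports only the string.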

  11. #11
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    I assume you mean you enter more than 150 characters, not 100. In that case, fgets will read 150 characters and crash because it won't be able to enter the last \0. Right?
    If that is the case, you should read IBS-1 characters with fgets() so you have room for the \0
    EDIT: Looking again at fgets(), I was wrong. It reads IBS-1 characters if you pass IBS as the parameter, so it shouldn't crash.

  12. #12
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    OK, a wild guess.
    But did you actually put getchar() at the end so the window does not close in VS? I have done that myself, which is why I am asking.
    If that is the case, then fgets() won't read everything, so getchar() will read one of the leftover characters straight away and the window will close, since it won't "freeze" waiting for input. So it doesn't actually crash.

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by C_ntua View Post
    I assume you mean you enter more than 150 characters, not 100. In that case, fgets will read 150 characters and crash because it won't be able to enter the last \0. Right?
    If that is the case, you should read IBS-1 characters with fgets() so you have room for the \0
    But that's what fgets does already - it allows "max-1" characters so that there is space for the terminating zero. For example, if I enter a very long string, I get 149 as the result from the above code. What you DON'T get is a newline at the end of the string, so you can actually determine that the string didn't fit.

    [Of course, it is possible that some old implementation of fgets() doesn't work correctly and causes some strange problems - although that seems rather unlikely for any commonly used compiler].

    --
    Mats

  14. #14
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    Quote Originally Posted by matsp View Post
    But that's what fgets does already - it allows "max-1" characters so that there is space for the terminating zero. For example, if I enter a very long string, I get 149 as the result from the above code. What you DON'T get is a newline at the end of the string, so you can actually determine that the string didn't fit.

    [Of course, it is possible that some old implementation of fgets() doesn't work correctly and causes some strange problems - although that seems rather unlikely for any commonly used compiler].

    --
    Mats
    Yeah, that is why I edited my post. I'll bet my money that the window just closes in VS.

  15. #15
    Registered User
    Join Date
    Sep 2008
    Location
    Nort East of Great Britain!
    Posts
    8
    Quote Originally Posted by C_ntua View Post
    I assume you mean you enter more than 150 characters, not 100. In that case, fgets will read 150 characters and crash because it won't be able to enter the last \0. Right?
    If that is the case, you should read IBS-1 characters with fgets() so you have room for the \0
    EDIT: Looking again at fgets(), I was wrong. It reads IBS-1 characters if you pass IBS as the parameter, so it shouldn't crash.
    Yes more than 150, sorry.

    It's not so much about entering exactly as many characters as the buffer holds; it's exceeding the buffer in general, be it one hundred times over or by just 1 character.

    I was just under the impression that fgets() would 'cope' with exceeding the buffer. In my first program (1st post, this thread), when I exceeded the buffer that I gave fgets() for the input, it simply "lopped" off the end of my message.

    But in this more recent example, it just closes instantly. I don't know whether that is because the program terminated itself gracefully, or just "bummed out" and was terminated by the OS etc.

    Quote Originally Posted by C_ntua View Post
    OK, a wild guess.
    But did you actually put getchar() at the end so the window does not close in VS? I have done that myself, which is why I am asking.
    If that is the case, then fgets() won't read everything, so getchar() will read one of the leftover characters straight away and the window will close, since it won't "freeze" waiting for input. So it doesn't actually crash.

    I'm using Bloodshed's Dev-C++ compiler, and the only reason the "getchar()" is there is so that the program waits for input before closing. That means I get a chance to read any output from the program before my command prompt window closes.

    I should really comment that; it's just easy to forget such things.
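
If the explanation suggested in this thread is right (fgets() stops at IBS-1 characters and the final getchar() immediately consumes one of the leftovers), one way to keep the window open is to throw away the rest of an over-long line before waiting. This is a rough sketch under that assumption; the helper name read_line_discard_rest is made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf with fgets(); if the line
   did not fit, throw away the rest of it so that a later getchar()
   waits for fresh input instead of consuming the leftovers.
   Returns 1 if the whole line fit, 0 if it was truncated, -1 on EOF. */
int read_line_discard_rest(char *buf, size_t size, FILE *fp)
{
    size_t len;
    int c, discarded = 0;

    assert(buf != NULL && size > 1 && fp != NULL);

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';             /* complete line: drop the newline */
        return 1;
    }

    /* No newline stored: skip ahead to the end of the line, noting
       whether any real characters had to be thrown away. */
    while ((c = fgetc(fp)) != EOF && c != '\n')
        discarded = 1;

    return discarded ? 0 : 1;
}
```

In buffer_len.c this would replace the plain fgets(szbuf, IBS, stdin) call; the getchar() at the end then has nothing pending to read and blocks until a key is pressed.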
