Thread: Need regexp help

  1. #31
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by MK27 View Post
    okay, okay, "linein" is probably the first function I ever wrote. Or close to it. And the "Teach Yourself C in 21 Days" I got from the library (1990) doesn't mention fgets, but it does warn against the use of "gets".
    I've never read the book, so that's entirely possible.

    Quote Originally Posted by MK27 View Post
    Excuses aside, however, I will defend myself by noting that with fgets "You must supply count characters worth of space in [the char pointer]" (GNU documentation). So linein seems more versatile to me -- ie. without testing it looks like you have to do your own malloc using fgets.

    So long live linein. Or else I'm still wrong.
    There are two advantages to fgets in this situation (from my perspective):

    1. Since I am managing my own memory, I know exactly where everything is. I know when my memory is created, and when I need to free it. I can't leak memory on every pass through a loop (as in the example above, which leaks on every single line of a file) without making it tremendously obvious that malloc is being called 10000000 times and free once. I get to see the matching malloc/free calls, rather than malloc being literally three-times-removed from my code. I know that I have a temporary buffer that I have to strcpy out of, but that I can then reuse immediately for the next line. (I realize that programming has Moved On from any of this sort of thing with memory management, gc, etc. Fortunately I have a second point.)

    2. Using fgets (or whichever other standard function you choose to use) means that any computer programmer off the street -- such as, I don't know, someone who's having trouble with a regex, for instance -- knows how it works, what's necessary to use it, and exactly what it does and does not do. No surprises like "oh and it calls malloc for you too".
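A minimal sketch of the caller-owned-buffer pattern described in point 1, assuming nothing beyond standard C. The name count_lines is hypothetical and is not code from this thread. The caller supplies the buffer, fgets reuses it for every line, and nothing is allocated behind the caller's back.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in fp, reading each one into a
   buffer owned by the caller. No allocation happens anywhere in here. */
size_t count_lines(FILE *fp, char *buf, size_t bufsize)
{
    size_t lines = 0;
    int at_line_start = 1;

    assert(fp != NULL && buf != NULL && bufsize > 1);
    while (fgets(buf, (int)bufsize, fp) != NULL) {
        size_t len = strlen(buf);

        if (at_line_start)
            lines++;
        /* No trailing newline means the line was longer than the buffer,
           so the next chunk belongs to the same line. */
        at_line_start = (len > 0 && buf[len - 1] == '\n');
    }
    return lines;
}
```

A caller would declare something like char buf[256]; on the stack and pass it in along with sizeof buf. The trade-off is the one mentioned above: any line that has to outlive the next fgets call must be copied out of the buffer first.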

    Edit: The advantage of getline is what I alluded to at the end of point one -- since each call gets its own slice of memory, you can just store pointers in an array or whatever you want -- no strcpying! Unfortunately, that isn't what we ended up doing.
    Last edited by tabstop; 09-05-2008 at 09:59 PM.

  2. #32
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300


    Obviously, I'm not trying to convince the world that linein is really the future, esp. since it relies on getline which is GNU specific I think.

    Still, just because you didn't call malloc yourself doesn't mean you can't call free if you know malloc is being called -- which is why in post #21 I corrected myself by adding a "free" call in linefile. Which only needs to be called once anyway, since even if the while loop does iterate 1000000 times, it just reassigns to the same local variable (line).

    As for our friend needing regexp help, I think he got it...

  3. #33
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by MK27;784975
    Still, just because you didn't call malloc yourself doesn't mean you can't call free if you know malloc is being called -- which is why in post #21 I corrected myself by adding a "free" call in linefile. Which only needs to be called once anyway, since even if the while loop does iterate 1000000 times, it just reassigns to the same local variable (line).
    Eh, you only have to type it once, but it needs to be inside the loop (if getline happens 1000000 times, free needs to happen 1000000 times, or else you've leaked the first 999999 of them).
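A sketch of what "inside the loop" looks like, under stated assumptions: read_line_alloc is a hypothetical stand-in for linein (whose source is not shown in this part of the thread), assumed to hand back a freshly allocated line on each call via POSIX getline.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline(); must come before the includes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for linein: returns a freshly allocated line,
   or NULL at end of file. The caller owns the result and must free it. */
char *read_line_alloc(FILE *fp)
{
    char *line = NULL;   /* NULL pointer + zero size asks getline to allocate */
    size_t cap = 0;

    assert(fp != NULL);
    if (getline(&line, &cap, fp) == -1) {
        free(line);      /* getline may have allocated even though it failed */
        return NULL;
    }
    return line;
}

/* Reads every line and frees each one inside the loop.
   Returns the number of lines read, which is also the number of frees. */
size_t consume_lines(FILE *fp)
{
    size_t n = 0;
    char *line;

    while ((line = read_line_alloc(fp)) != NULL) {
        n++;
        free(line);      /* one free per allocation, typed once */
    }
    return n;
}
```

Moving the free(line) call to after the loop would still compile and run, but only the last block would be released.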

  4. #34
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by tabstop View Post
    Eh, you only have to type it once, but it needs to be inside the loop (if getline happens 1000000 times, free needs to happen 1000000 times, or else you've leaked the first 999999 of them).
    Well. This is probably due to my misunderstanding of this (boldface) statement in The GNU C Library Reference Manual:

    Do not expect to find any data (such as a pointer to the next block in a chain of blocks) in the block after freeing it.

    to mean that free just plain eliminates the variable. So I thought this would fail
    Code:
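    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
            char *this;
            while (1) {                     /* runs until interrupted */
                    this = malloc(5);       /* room for "that" plus the terminating '\0' */
                    if (this == NULL)
                            return 1;
                    strcpy(this, "that");
                    puts(this);
                    free(this);             /* gives the block back; the variable stays usable */
            }
    }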
    which it doesn't. More thinking for me to do...thanx tabstop

    ps. by "leaking" memory you mean using more than necessary -- or could something genuinely bad happen, like an overwrite? And isn't that counter-intuitive -- shouldn't multiple mallocs on the same variable deal with the same starting block? Or is that impossible because the size will vary in the stack (or wherever)?
    Last edited by MK27; 09-05-2008 at 11:16 PM. Reason: ps

  5. #35
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by MK27 View Post
    ps. by "leaking" memory you mean using more than necessary -- or could something genuinely bad happen, like an overwrite?
    All free does is say "the memory on the other end of this pointer -- I don't need it any more, so you can have it back". It doesn't invalidate the variable name; you shouldn't expect anything useful on the other end until you malloc again, though.

    A memory leak is when you have memory allocated somewhere (you called malloc, or getline, or whatever) and then you throw away the pointer to it (in this case, by assigning a new value to the pointer). The memory is still ours, it's still patiently holding our string, but we've lost all ability to reference that memory (since we don't have a pointer to it any more). That memory is gone until program end/reboot (I'm pretty sure program end on most modern OS returns the entire program's memory space to the OS). Do it enough times and you'll run out of memory. So in our hypothetical million-line file, if each line was 80 characters, we would have leaked 80 megs of RAM from the system. To prevent this, we need to free(line) each time through -- we get a new piece of memory, which we need to give back before we get the next one.
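To make the counting concrete, here is a small sketch that is not from the thread; every name in it is hypothetical. It keeps a side table of live blocks purely so the demo can count what was lost and then clean up after itself. A real program has no such table, which is exactly why a lost pointer cannot be recovered.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_BLOCKS 1024

/* Demo-only bookkeeping: every block that is allocated and not yet freed. */
static void *live[MAX_BLOCKS];
static size_t live_count = 0;

static void *tracked_malloc(size_t n)
{
    void *p;

    assert(live_count < MAX_BLOCKS);
    p = malloc(n);
    if (p != NULL)
        live[live_count++] = p;
    return p;
}

static void tracked_free(void *p)
{
    for (size_t i = 0; i < live_count; i++) {
        if (live[i] == p) {
            live[i] = live[--live_count];  /* fill the hole with the last entry */
            free(p);
            return;
        }
    }
}

/* Frees whatever is still live and reports how many blocks that was. */
static size_t reclaim_leaks(void)
{
    size_t leaked = live_count;

    while (live_count > 0)
        free(live[--live_count]);
    return leaked;
}

/* free only after the loop: every block except the last one is lost. */
size_t leaked_when_freeing_once(size_t iterations)
{
    char *line = NULL;

    for (size_t i = 0; i < iterations; i++)
        line = tracked_malloc(80);  /* overwrites the only pointer to the previous block */
    tracked_free(line);
    return reclaim_leaks();
}

/* free inside the loop: each block is given back before the next arrives. */
size_t leaked_when_freeing_each_time(size_t iterations)
{
    for (size_t i = 0; i < iterations; i++) {
        char *line = tracked_malloc(80);
        tracked_free(line);
    }
    return reclaim_leaks();
}
```

With 1000 iterations the first function reports 999 lost blocks and the second reports none, which is the same N-1 pattern as the million-line case scaled down.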
