Thread: Question about K&R program

  1. #1
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115

    Question about K&R program

    After having been busy a couple months, I'm going back through K&R to get back in the habit of typing C, and to get used to the conventions/syntax, etc... so I can pick up where I left off again...

    Anyway, I found a weird set of behavior in one program I typed out from the book.
    I haven't modified it at all (well, I added a \n in the output to make it more sane), but other than that the code is exactly as it appeared in the book, minus comments.

    Code:
    #include <stdio.h>
    #define MAXLINE 100
    
    int getline(char [], int);
    void copy(char [], char []);
    
    int main(void)
    {
    	int len, max;
    	char line[MAXLINE];
    	char longest[MAXLINE];
    
    	max = 0;
    	while((len = getline(line, MAXLINE)) > 0)
    		if(len > max) {
    			max = len;
    			copy(longest, line);
    		}
    
    	if(max)
    		printf("\n%s\n", longest);
    	return 0;
    }
    
    int getline(char s[], int lim)
    {
    	int c, i;
    
    	for(i = 0; i< lim-1 && (c = getchar()) != EOF && c != '\n'; i++)
    		s[i] = c;
    	if(c == '\n') {
    		s[i] = '\0';
    		++i;
    	}
    	return i;
    }
    
    void copy(char to[], char from[])
    {
    	int i;
    
    	i = 0;
    	while((to[i] = from[i]) != '\0')
    		++i;
    
    }
    Now, normally this code behaves exactly as I expect it to. However, there is one exception:

    If I type the program name and then press Ctrl+D, the program exits immediately, as expected.

    If I type any text, then press Enter, then Ctrl+D, it prints the longest line, then exits.

    However, if I type any amount of text, including any number of carriage returns, but _don't_ press Enter at the end, and then press Ctrl+D, nothing happens. If I press Ctrl+D another 2 times (3 presses in total), this happens:

    Code:
    $ ./ex1-16
    abc  #I'm pressing abc^D^D^D; the first two appear to do nothing.
    abc@ô@ÿwÐôÿ¿Ðd@/
    What I want to know is, how does that garbage get into the array? In other words, what's actually going on underneath the code?
    Last edited by Aerie; 04-23-2005 at 03:23 AM.
    I live in a giant bucket.

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > if(c == '\n')
    There are several ways out of this function which don't result in a \0 being written in the correct place to mark the end of the string.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    I kind of figured that out. However, not knowing much about how the underlying model of buffered input works (i.e., at what points the value of c will be checked, and what that value will be at that exact time), I'm pretty much in the dark about which of those exact ways is causing this behavior.

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Well, look at the condition in your for loop: what values of c cause it to exit?

    > However, not knowing much about how the underlying model of buffered input works
    It doesn't matter how it works, the code is still broken.
    It assumes that EOF is only signalled at the start of a line (which would cause getline() to return 0 and safely exit the while loop in main).
    An EOF at any other time returns a broken line to main.
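
    For comparison, a minimal sketch of a version that cannot hand back an unterminated string: the '\0' is written unconditionally after the loop, which (as far as I recall) is also how the book's own listing ends. The name get_line and the FILE * parameter are assumptions of mine, chosen so the sketch can be fed from any stream and does not clash with the POSIX getline() that some <stdio.h> headers declare.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical variant of the thread's getline(): reads one line from `in`
 * into s (at most lim-1 characters) and ALWAYS terminates it.
 * Returns the number of characters stored, 0 at end of input. */
int get_line(FILE *in, char s[], int lim)
{
	int c = 0, i;

	assert(lim > 0);	/* need room for at least the '\0' */
	for (i = 0; i < lim - 1 && (c = getc(in)) != EOF && c != '\n'; i++)
		s[i] = c;
	if (c == '\n') {
		s[i] = c;	/* keep the newline, so an empty line has length 1 */
		++i;
	}
	s[i] = '\0';		/* written on every exit path, EOF included */
	return i;
}
```

    With this shape, a final line that ends in EOF instead of a newline still comes back as a proper string; it is simply one character shorter because there is no '\n' to store.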

  5. #5
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    That comes in under "exactly when c is examined."

    See, the way I've always had for loops explained to me is that the process occurs like this:
    -If first iteration, perform initialization
    -Check the conditions in order until one is false, or until you run out of conditions to test.
    -If all were true, perform the instructions in the loop
    -Perform any increment/post-loop instructions (in this case, i++)

    So my understanding was that this exact sequence would happen, assuming that all conditions remained true:
    1. Return of getchar() is copied to c
    2. c is compared to EOF
    3. c is compared to '\n'
    4. s[i] is set to c
    5. i is incremented by 1.
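
    The five steps above can be sketched by writing the for loop out as an explicit while loop. This is only an illustration: fill_for and fill_while are hypothetical names, and both read from a FILE * with getc() rather than from stdin with getchar() so that they can be run on any stream. The && operators are evaluated left to right and stop at the first false condition, so no character is read once the buffer is full, and c is compared with '\n' only when it is not EOF.

```c
#include <stdio.h>

/* The loop as written in the thread (getc(in) stands in for getchar()). */
int fill_for(FILE *in, char s[], int lim)
{
	int c, i;

	for (i = 0; i < lim - 1 && (c = getc(in)) != EOF && c != '\n'; i++)
		s[i] = c;
	return i;
}

/* The same loop spelled out one step at a time. */
int fill_while(FILE *in, char s[], int lim)
{
	int c, i;

	i = 0;				/* initialization, done once */
	while (1) {
		if (!(i < lim - 1))	/* first test: room left in s? */
			break;
		c = getc(in);		/* step 1: read into c */
		if (!(c != EOF))	/* step 2: compare c with EOF */
			break;
		if (!(c != '\n'))	/* step 3: compare c with '\n' */
			break;
		s[i] = c;		/* step 4: loop body */
		i++;			/* step 5: increment */
	}
	return i;
}
```

    Both functions stop with the same i on the same input; neither one writes a '\0', which is left to whatever code follows the loop.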

    Since this is my understanding of what occurs at each iteration of the loop, I don't understand how EOF would only be checked for at the beginning of a specific line.

    Can you explain what is _really_ going on in this sequence, if my understanding is somehow mistaken?

    And -- this isn't "my" code. I wouldn't have structured it quite this way anyhow, though it is possible I would have made the exact same mistake, since I obviously don't fully grasp what is really going on here.

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > See, the way I've always had for loops explained to me is that the process occurs like this:
    Yes, that's it.

    > this exact sequence would happen, assuming that all conditions remained true:
    Also true.

    > I don't understand how EOF would only be checked for at the beginning of a specific line.
    EOF is checked at every character read by getchar()

    > If I type any text, then press Enter, then Ctrl+D
    So the sequence of values returned by getchar() would be, say:
    h
    e
    l
    l
    o
    \n
    EOF

    It is only by happy coincidence that i is still 0 at that point, so when the loop exits immediately because "(c = getchar()) != EOF" fails, you hit "return i;" with the correct value (0) returned to the caller.

    > but _don't_ press Enter at the end,
    So the sequence of values returned by getchar() would be, say:
    h
    e
    l
    l
    EOF

    Since i will have been incremented to 4 (and c==EOF), you both skip the s[i]='\0'; to end the string (you only do this if c=='\n'), AND you end up returning the value 4 (return i;) to the caller.

    Since the caller is using the convention that non-zero results mean a valid line "while((len = getline(line, MAXLINE)) > 0)", the failure to add a \0 in the correct place means a whole bunch of junk gets printed as well.

    > I'm pressing abc^D^D^D; the first two appear to do nothing.
    This is nothing to do with your code; it is a feature of your terminal driver, which buffers up data before passing it to your program. At the start of a line a single ^D is taken as the EOF indication, but in the middle of a line the first ^D only hands the characters typed so far to your program, so you need to be persistent if you want EOF at some other time.
    This however doesn't matter to your code, since getchar() simply sees all the characters and then an EOF.
    At the moment, the code works best when that EOF immediately follows a newline. At any other time, an unterminated string is created.
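
    The two cases can be reproduced without a terminal. A small sketch, under a few assumptions of mine: buggy_get_line is the getline() from the first post with only the name and the input source changed (it reads from a FILE * instead of stdin), has_terminator is a hypothetical helper, and the caller is expected to pre-fill the buffer with '#' so that a missing terminator shows up as a visible character rather than as undefined behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* getline() from post #1, reading from `in` instead of stdin.
 * Deliberately left broken: '\0' is stored only after a newline. */
int buggy_get_line(FILE *in, char s[], int lim)
{
	int c, i;

	for (i = 0; i < lim - 1 && (c = getc(in)) != EOF && c != '\n'; i++)
		s[i] = c;
	if (c == '\n') {
		s[i] = '\0';
		++i;
	}
	return i;
}

/* Returns 1 if s[0..lim-1] contains a '\0' anywhere, else 0. */
int has_terminator(const char s[], int lim)
{
	assert(lim > 0);
	return memchr(s, '\0', (size_t)lim) != NULL;
}
```

    Fed "hello\n" into a '#'-filled buffer, buggy_get_line returns 6 and has_terminator reports 1. Fed "hell" followed directly by EOF, it returns 4 and has_terminator reports 0: the caller is told there are 4 valid characters, but printf("%s") would carry on past them into whatever the array happened to hold.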

  7. #7
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    Oh, I see. I guess I stared at it so long I totally stopped seeing what was going on, and was seeing some bizarre internal representation of what I thought was going on.

    Thanks for explaining it out, helped jar me back into the real world a bit...

    Edit: hopefully silliness such as this on my part will become more rare as I start being able to focus on the logic underlying the code, and not the mere animal act of getting the syntax to express what I actually want to happen...
    Last edited by Aerie; 04-23-2005 at 01:27 PM.

  8. #8
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    This is probably ugly and awful, but I tried to write a variant version of getline that would operate the same way as far as the program calling it was concerned, but was a little more robust.

    I tried to stick to using the same sorts of techniques that the book was using in surrounding texts, though I admit to cheating a bit with the use of break.

    Code:
    int getline(char charbox[], int lim)
    {
    	int i, c;
    	
    	i = 0;
    	while((c = getchar()) != EOF) {
    		if(c == '\n') {
    			break;
    		}
    		else if(i >= lim - 1)
    			break;
    		else {
    			charbox[i] = c;
    			i++;
    		}
    	}
    
    	charbox[i] = '\0';
    
    	if(c != '\n' && i == 0)
    		return -1;
    	else
    		return i;
    }
    Edit: slightly better version. No longer returns the same value for an empty line as it does for no input at all. Now returns -1 for no input, 0 for an empty line, and a positive number for anything else.
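
    To illustrate the three kinds of return value, a small sketch under my own assumptions: read_line is the function above with a different name and with the input source made a parameter (a FILE * instead of stdin), so it can be run on prepared input, and show_lengths is a hypothetical driver.

```c
#include <stdio.h>

/* Variant of the getline() above: same logic, but reads from `in`.
 * Returns -1 at end of input, 0 for an empty line, else the length. */
int read_line(FILE *in, char charbox[], int lim)
{
	int i, c;

	i = 0;
	while ((c = getc(in)) != EOF) {
		if (c == '\n')
			break;
		else if (i >= lim - 1)
			break;		/* buffer full; note this c is dropped */
		else {
			charbox[i] = c;
			i++;
		}
	}

	charbox[i] = '\0';	/* always terminated, even on EOF */

	if (c != '\n' && i == 0)
		return -1;
	else
		return i;
}

/* Prints the return value and text for each call until end of input. */
void show_lengths(FILE *in)
{
	char buf[64];
	int len;

	while ((len = read_line(in, buf, 64)) >= 0)
		printf("%d: \"%s\"\n", len, buf);
}
```

    On the input "abc\n\nxy" (no trailing newline), successive calls return 3, 0, 2 and then -1, so the caller can tell an empty line apart from the end of input.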
    Last edited by Aerie; 04-24-2005 at 03:17 AM.

  9. #9
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    Here, by the way, is the solution I came up with for the exercise in question... It's really a step backwards for me in terms of actual information, these early chapters, but I'm learning all sorts of trip-points that would really screw me over later on if I didn't study now...

    Code:
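    #include <stdio.h>
    #define MAXLINE 1024
    
    int getline(char [], int);	/* declared before use in main */
    
    int main(void)
    {
    	int c;	/* length returned by getline, or -1 at end of input */
    	char charbox[MAXLINE];
    
    	while((c = getline(charbox, MAXLINE)) >= 0) {
    		if(c > 80)	/* only lines that exceed 80 characters */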
    			printf("\n%s\n", charbox);
    		else
    			printf("Not long enough...(%d)\n", c);
    	}
    	return 0;
    }
    
    int getline(char charbox[], int lim)
    {
    	int i, c, d;
    	
    	i = 0;
    
    	printf("\nInput text, press Enter when done.\n");
    
    	while((c = getchar()) != EOF) {
    		if(c == '\n') {
    			break;
    		}
    		else if(i >= lim - 1) {
    			while((d = getchar()) != EOF && d != '\n')
    				;
    			break;
    		}
    		else {
    			charbox[i] = c;
    			i++;
    		}
    	}
    
    	charbox[i] = '\0';
    
    	if(c != '\n' && i == 0)
    		return -1;
    	else
    		return i;
    }
    If you want to chew me out about how stupid/lame/broken my solution is, have at; feel welcome. Just... tell me exactly what I -should- be doing, and how what I actually did differs. Don't tell me how to do it, unless I ask, because I want to do the actual problem solving myself.

    Thanks in advance.

    Edit: the exercise is to echo all input lines that exceed 80 characters, of course.

    One final question: should I post solutions to subsequent exercises in this thread, or create a new thread for each, or maybe a new thread every day or so of the solutions I do then?

    I want to show off my work so I can hear what I'm doing wrong, but I don't want to clutter up the forum.

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Code:
    	if(c == '\n') {
    		s[i] = '\0';
    		++i;
    	} else {
    		s[i] = '\0';
    	}
    Perhaps?

  11. #11
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    A little context, please.

    I'm probably just being dense, but I don't see what you're hinting at.

  12. #12
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Well, the whole problem is that in some cases it's possible to return with a string which has no \0.
    This is just making sure it always has a \0.

  13. #13
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    The only time I can see it not containing a '\0' is when it's empty anyway, in which case the return value specifies that the contents are to be ignored.

    I can code it to make SURE there is ALWAYS an in-bounds '\0', if that's considered good form, but I figured it was sort of silly to take the trouble if I could just signal the whole array was full of junk.

  14. #14
    Widdle Coding Peon Aerie's Avatar
    Join Date
    Dec 2004
    Posts
    115
    And come to think of it, there is an unqualified line that sets charbox[i] to '\0', and since i should never go below 0, and shouldn't ever be incremented beyond bounds, I thought that there wasn't any way for that to occur.

  15. #15
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > I can code it to make SURE there is ALWAYS an in-bounds '\0', if that's considered good form
    Well it helps to head bugs "off at the pass" as it were, by eliminating a potential opportunity for a bug to arise.
    Besides, the code to do it all the time (even if it is slightly redundant occasionally) is a lot simpler than code which specifically avoids the work when it technically doesn't matter.
