Thread: Problem with K&R Exercise 1-23

  1. #1
    Registered User
    Join Date
    Oct 2013
    Posts
    5

    Problem with K&R Exercise 1-23

    I am writing a solution that also handles nested comments and special cases, since I don't like half-assing it.

    I wrote a test string that contains some special cases.

    Code:
    printf("\"test/*iello*/"); /*hello*/ //hello
    The output is:

    Code:
    printf("\"test/*iello*/"); /
    My code is:

    Code:
        void remcomments( char s[] )
        {
        	printf("Removing comments\n");
        	int slcflag, mlcflag, qflag;
        	int readindex, writeindex;
        	slcflag = mlcflag = readindex = qflag = writeindex = 0;
        	int loop = 0;
        	while( s[readindex] != EOF )
        	{
        		printf("Loop %d\n", loop);
        		if( s[readindex] == '\n' && mlcflag == 0 )
        		{
        			s[writeindex] = s[readindex];
        			slcflag = 0;
        			writeindex++;
        		}
        		else
        		{
        			if( qflag == 0 )
        			{
        				if( s[readindex] == '"' )
        				{
        					if( s[readindex - 1] != '\\' )
        					{
        						qflag = 1;
        					}
        					else if( s[readindex - 1] == '\\' && s[readindex - 2] == '\\' )
        					{
        						qflag = 1;
        					}
        				}
        				if( qflag == 0 )
        				{
        				    if( slcflag == 0 && mlcflag == 1 && s[readindex] == '*' && s[readindex + 1] == '/' )
        					{
        						readindex++;
        						mlcflag = 0;
        					}
        					else if( s[readindex] == '/' )
        					{
        						if( mlcflag == 0 && s[readindex + 1] == '/' )
        						{
        							readindex++;
        							slcflag = 1;
        						}
        						else if( slcflag == 0 && s[readindex + 1] == '*' )
        						{
        							readindex++;
        							mlcflag = 1;
        						}
        					}
        				}
        			}
        			else if( qflag == 1 )
        			{
        				if( s[readindex] == '"')
        				{
        					if( s[readindex - 1] != '\\')
        					{
        						qflag = 0;
        					}
        					else if( s[readindex - 1] == '\\' && s[readindex - 2] == '\\')
        					{
        						qflag = 0;
        					}
        				}
        			}
        			if( mlcflag == 0 && slcflag == 0 )
        			{
        				s[writeindex] = s[readindex];
        				writeindex++;
        			}
        		}	
        		loop++;
        		readindex++;
        		printf("end of loop");
        	}
        	s[writeindex] = '\0';
        }
    Where am I going wrong? I can't figure it out.

  2. #2
    whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    I think the error is very simple. Check around lines 34 through 38 of your program. When you are coming out of a multi-line comment, you are not moving readindex far enough into the string, so you end up writing part of the comment.

    I noticed this while watching your function execute a few times. Here's the proof:
    Code:
    (gdb) c
    Continuing.
    end of loopLoop 33
    Hardware watchpoint 3: writeindex
    
    
    Old value = 27
    New value = 28
    remcomments (s=0x61fb28 "printf(\"\\\"test/*iello*/\"); /*hello*/ //hello") at remcomments.c:78
    78              loop++;
    (gdb) print readindex
    $13 = 35
    (gdb) p s+readindex
    $14 = 0x61fb4b "/ //hello"
    (gdb) p s+writeindex
    $15 = 0x61fb44 "*hello*/ //hello"
    (gdb) p slcflag
    $16 = 0
    (gdb) p mlcflag
    $17 = 0
    Notice where $14 and $15 start. Since both slcflag and mlcflag are 0, your function wrote the slash at the end of the comment here.
    Code:
    printf("\"test/*iello*/"); /*hello*/ //hello
    So, once again, if you bump readindex far enough to skip past the whole comment token, I think you will have less of a problem.
    Last edited by whiteflags; 08-14-2016 at 06:31 PM.
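    A minimal sketch of what "skipping the whole token" can look like; this is not the original poster's code. strip_block_comments is a hypothetical helper that handles only /* */ comments and ignores string literals and // comments to stay short. When it sees either delimiter, it advances the read index past both characters before anything else happens, so the trailing slash never reaches the write step. It also loops until '\0', which is what terminates a C string, rather than EOF.

```c
#include <stddef.h>

/* Hypothetical helper: removes only block comments, in place.
   String literals and // comments are deliberately not handled. */
void strip_block_comments(char s[])
{
    size_t r = 0;          /* read index */
    size_t w = 0;          /* write index, never ahead of r */
    int in_comment = 0;

    while (s[r] != '\0') {
        if (!in_comment && s[r] == '/' && s[r + 1] == '*') {
            in_comment = 1;
            r += 2;        /* skip both characters of the opener */
        } else if (in_comment && s[r] == '*' && s[r + 1] == '/') {
            in_comment = 0;
            r += 2;        /* skip both characters of the terminator */
        } else {
            if (!in_comment)
                s[w++] = s[r];
            r++;
        }
    }
    s[w] = '\0';
}
```

    Because the increment lives inside each branch, there is no separate increment at the bottom of the loop that could advance the index a second time.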

  3. #3
    Registered User
    Join Date
    Oct 2013
    Posts
    5
    Quote Originally Posted by whiteflags View Post
    I think the error is very simple. Check around lines 34 through 38 of your program. When you are coming out of a multi-line comment, you are not moving readindex far enough into the string, so you end up writing part of the comment.

    To fix it, try moving readindex ahead by two characters instead of one when you detect */; this will properly skip over those characters in the text.
    I did that before, but I'm pretty sure there's an edge case where I would miss something. So instead I added a skip flag: I set it at the line where I would have added the readindex++, the write section at the end only writes when the skip flag is 0, and after that I reset the skip flag to 0. I'm pretty sure reading ahead twice would skip a character, since I also increment readindex at the end of the loop.

  4. #4
    Registered User
    Join Date
    Oct 2013
    Posts
    5
    You're right. I just did that and ran a much more complex test, and hell, it worked. Thanks a lot for your help. Now I just want to figure out how to remove the newline character when the only thing on that line was a comment, but I think that is going to make my program much larger.


  5. #5
    Registered User
    Join Date
    Oct 2013
    Posts
    5
    So, there was an edge case where it would fail:
    Code:
    "printf(/*test*/\"\\\"test/*iello*/  //more\"/*test*/);"
    It would jump over the \, then trigger an "out of quote" state and output printf(""test, because it treated the /* inside the string as the start of a multi-line comment.
    I fixed it, though. I fixed it so hard that it can now even skip blank lines, and I don't have to do readindex++.
    The new code:
    Code:
    void remcomments(char s[])
    {
    	int slcflag, mlcflag, qflag;
    	int readindex, writeindex;
    	slcflag = mlcflag = qflag = readindex = writeindex = 0;
    	while (s[readindex] != '\0')
    	{
    		if (slcflag == 1 && s[readindex] == '\n')
    			slcflag = 0;
    		if (mlcflag == 0)
    		{
    			if (s[readindex] == '"' && (s[readindex - 1] != '\\' || (s[readindex - 1] == '\\' && s[readindex - 2] == '\\')))
    			{
    				if (qflag == 0) qflag = 1;
    				else qflag = 0;
    			}
    			else if (qflag == 0)
    			{
    				if (s[readindex] == '/' && s[readindex + 1] == '/')
    				{
    					readindex++;
    					slcflag = 1;
    				}
    				if (s[readindex] == '/' && s[readindex + 1] == '*')
    				{
    					readindex++;
    					mlcflag = 1;
    				}
    			}
    			if (slcflag == 0 && mlcflag == 0 && !(s[readindex] == '\n' && s[writeindex - 1] == '\n'))
    			{
    				s[writeindex] = s[readindex];
    				writeindex++;
    			}
    		}
    		else if (s[readindex - 1] == '*' && s[readindex] == '/')
    				mlcflag = 0;
    		readindex++;
    	}
    	s[writeindex] = '\0';
    }
    Quote Originally Posted by CjStaal View Post
    I did that before, but I'm pretty sure there's an edge case where I'm going to miss something. So instead I added a skip flag at the line I'd add the readindex++ and at the write section at the end I just make sure skip flag == 0, and then after that I set skip flag to 0. I'm pretty sure reading ahead twice is going to skip a char since at the end of the function I increment the read index also.
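    A sketch of an alternative to looking back at s[readindex - 1] and s[readindex - 2], offered as an assumption about how this could be restructured rather than as the poster's method. Looking back a fixed two characters misreads a quote preceded by three backslashes (an escaped backslash followed by an escaped quote), and it reads before the start of the array when the quote is the first character. The hypothetical remcomments_sm below instead tracks an explicit state and, inside a literal, copies a backslash together with the character it escapes, so it never needs to look backwards.

```c
#include <stddef.h>

enum state { CODE, STRING, CHARLIT, LINE_COMMENT, BLOCK_COMMENT };

/* Hypothetical state-machine variant: strips comments in place. */
void remcomments_sm(char s[])
{
    enum state st = CODE;
    size_t r = 0, w = 0;   /* read and write indexes, w <= r always */

    while (s[r] != '\0') {
        char c = s[r];
        switch (st) {
        case CODE:
            if (c == '/' && s[r + 1] == '/') { st = LINE_COMMENT; r += 2; continue; }
            if (c == '/' && s[r + 1] == '*') { st = BLOCK_COMMENT; r += 2; continue; }
            if (c == '"')  st = STRING;
            if (c == '\'') st = CHARLIT;
            break;
        case STRING:
        case CHARLIT:
            if (c == '\\' && s[r + 1] != '\0') {
                /* copy the backslash now; the escaped character is
                   copied by the common write step after the switch */
                s[w++] = s[r++];
            } else if ((st == STRING && c == '"') ||
                       (st == CHARLIT && c == '\'')) {
                st = CODE;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') { st = CODE; break; }  /* keep the newline */
            r++;
            continue;
        case BLOCK_COMMENT:
            if (c == '*' && s[r + 1] == '/') { st = CODE; r += 2; }
            else r++;
            continue;
        }
        s[w++] = s[r++];   /* common write step for kept characters */
    }
    s[w] = '\0';
}
```

    With the edge-case string from this post as input, the sketch should leave the /* and // inside the string literal untouched and remove only the two /*test*/ comments. It does not collapse blank lines; that would be a separate pass.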

  6. #6
    whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Code:
    if (slcflag == 0 && mlcflag == 0 && !(s[readindex] == '\n' && s[writeindex - 1] == '\n'))
    This looks like it works, until you consider a string that starts with \n. In that case, this test gets to s[writeindex - 1], and since writeindex == 0, you read s[-1] and potentially crash.

    It is much safer to write the extra newlines and overwrite duplicates instead.
    Code:
    if (!slcflag && !mlcflag) 
    {
        s[writeindex] = s[readindex];
        writeindex++;
        if (writeindex >= 2 && s[writeindex - 2] == '\n' && s[writeindex - 1] == '\n')
        {
            writeindex--;
        }
    }
    In one of the simplest cases, "\n//comments\n", the result should be "\n", because s is null-terminated later at writeindex, just past the last character that was kept.
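    To try the write-then-overwrite idea in isolation, here is a small sketch. squeeze_newlines is a hypothetical standalone helper, not part of the program above; it applies the same bounds-checked test to every character it copies.

```c
#include <stddef.h>

/* Hypothetical helper: collapses each run of '\n' to a single '\n', in place. */
void squeeze_newlines(char s[])
{
    size_t r = 0, w = 0;

    while (s[r] != '\0') {
        s[w++] = s[r++];
        /* w >= 2 guarantees both indexes below are inside the array */
        if (w >= 2 && s[w - 2] == '\n' && s[w - 1] == '\n')
            w--;           /* the next write overwrites the duplicate */
    }
    s[w] = '\0';
}
```

    A string that starts with '\n' is safe here: the first newline is written at index 0, and the comparison is skipped until at least two characters have been written.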
