# Thread: Problem with K&R Exercise 1-23

1. ## Problem with K&R Exercise 1-23

So I am writing the solution [including nested comments and special cases, since I don't like half-assing]

I wrote a test string that covers the special cases.

Code:
`printf("\"test/*iello*/"); /*hello*/ //hello`
It comes out as

Code:
`printf("\"test/*iello*/"); /`
My code is

Code:
```
void remcomments( char s[] )
{
    int slcflag, mlcflag, qflag;
    slcflag = mlcflag = readindex = qflag = writeindex = 0;
    int loop = 0;
    while( s[readindex] != EOF )
    {
        printf("Loop %d\n", loop);
        if( s[readindex] == '\n' && mlcflag == 0 )
        {
            slcflag = 0;
            writeindex++;
        }
        else
        {
            if( qflag == 0 )
            {
                if( s[readindex] == '"' )
                {
                    if( s[readindex - 1] != '\\' )
                    {
                        qflag = 1;
                    }
                    else if( s[readindex - 1] == '\\' && s[readindex - 2] == '\\' )
                    {
                        qflag = 1;
                    }
                }
                if( qflag == 0 )
                {
                    if( slcflag == 0 && mlcflag == 1 && s[readindex] == '*' && s[readindex + 1] == '/' )
                    {
                        mlcflag = 0;
                    }
                    else if( s[readindex] == '/' )
                    {
                        if( mlcflag == 0 && s[readindex + 1] == '/' )
                        {
                            slcflag = 1;
                        }
                        else if( slcflag == 0 && s[readindex + 1] == '*' )
                        {
                            mlcflag = 1;
                        }
                    }
                }
            }
            else if( qflag == 1 )
            {
                if( s[readindex] == '"')
                {
                    if( s[readindex - 1] != '\\')
                    {
                        qflag = 0;
                    }
                    else if( s[readindex - 1] == '\\' && s[readindex - 2] == '\\')
                    {
                        qflag = 0;
                    }
                }
            }
            if( mlcflag == 0 && slcflag == 0 )
            {
                writeindex++;
            }
        }
        loop++;
        printf("end of loop");
    }
    s[writeindex] = '\0';
}
```
Where am I going wrong? I can't figure it out.

2. I think the error is very simple. Check around lines 34 through 38 of your program. When you are coming out of a multi-line comment, you are not moving readindex far enough into the string, so you end up writing part of the comment.

I noticed this while watching your function execute a few times. Here's the proof:
Code:
```
(gdb) c
Continuing.
end of loopLoop 33
Hardware watchpoint 3: writeindex

Old value = 27
New value = 28
remcomments (s=0x61fb28 "printf(\"\\\"test/*iello*/\"); /*hello*/ //hello") at remcomments.c:78
78              loop++;
$13 = 35
$14 = 0x61fb4b "/ //hello"
(gdb) p s+writeindex
$15 = 0x61fb44 "*hello*/ //hello"
(gdb) p slcflag
$16 = 0
(gdb) p mlcflag
$17 = 0
```
Notice where $14 and $15 start. Since both slcflag and mlcflag are 0, your function wrote the slash at the end of the comment here.
Code:
`printf("\"test/*iello*/"); /*hello*/ //hello`
So once again, if you bump readindex far enough to skip past the whole comment token, I think you will have less of a problem.
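
To make the "skip past the whole comment token" idea concrete, here is a minimal sketch. It is not the original remcomments(): the name strip_block_comments and the indices r and w are made up for illustration, and it only handles block comments (no string literals, no // comments) so that the two-character skip stays easy to see.

```c
#include <stddef.h>

/* Hypothetical helper, not the poster's remcomments(): removes only
   block comments from s in place. String literals and // comments are
   deliberately ignored to keep the sketch short. */
void strip_block_comments(char s[])
{
    size_t r = 0, w = 0;    /* read and write positions */
    int in_comment = 0;

    while (s[r] != '\0') {
        if (!in_comment && s[r] == '/' && s[r + 1] == '*') {
            in_comment = 1;
            r += 2;             /* consume both characters of the opener */
        } else if (in_comment && s[r] == '*' && s[r + 1] == '/') {
            in_comment = 0;
            r += 2;             /* consume both characters of the closer */
        } else if (in_comment) {
            r++;                /* drop comment text */
        } else {
            s[w++] = s[r++];    /* keep everything else */
        }
    }
    s[w] = '\0';
}
```

Called on `x = 1; /*hello*/ y = 2;`, this leaves `x = 1;  y = 2;` with both surrounding spaces intact. Reading s[r + 1] is safe here because s[r] is known not to be the terminator.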

3. Originally Posted by whiteflags
To fix it, try moving readindex ahead by two characters instead of one when you detect */; this will properly skip over those elements in the text.

I did that before, but I'm pretty sure there's an edge case where I'd miss something. So instead I added a skip flag at the line where I would have put the readindex++. In the write section at the end I only write when the skip flag is 0, and after that I reset the skip flag to 0. I'm pretty sure advancing twice would skip a char, since I also increment readindex at the end of the loop.
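
One possible reading of that skip-flag idea, as a sketch only. The revised code was not posted at this point, so the function name strip_with_skip_flag, the indices r and w, and the place where the flag is tested are all assumptions. The point it illustrates is that the read index advances exactly once per pass, and the flag suppresses the second character of a comment delimiter on the next pass.

```c
#include <stddef.h>

/* Hypothetical sketch of a skip flag: block comments only,
   no string literals, no // comments. */
void strip_with_skip_flag(char s[])
{
    size_t r, w = 0;
    int in_comment = 0;
    int skip = 0;   /* 1 means: the next character finishes a delimiter */

    for (r = 0; s[r] != '\0'; r++) {    /* exactly one increment per pass */
        if (skip) {
            skip = 0;                   /* second delimiter char, drop it */
            continue;
        }
        if (in_comment) {
            if (s[r] == '*' && s[r + 1] == '/') {
                in_comment = 0;
                skip = 1;               /* next pass drops the '/' */
            }
            continue;                   /* comment text is never written */
        }
        if (s[r] == '/' && s[r + 1] == '*') {
            in_comment = 1;
            skip = 1;                   /* next pass drops the '*' */
            continue;
        }
        s[w++] = s[r];
    }
    s[w] = '\0';
}
```

Setting the flag on the opener as well keeps an input such as `a/*/b*/c` from closing the comment on its third character.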

4. You're right. I just did that and ran a much more complex test and, hell, it worked. Thanks a lot for your help. Now I just want to figure out how to remove the newline character when the only thing on that line was a comment. But I think that is going to make my program much larger.


5. So, there was an edge case where it would produce the wrong output.
Code:
`"printf(/*test*/\"\\\"test/*iello*/  //more\"/*test*/);"`
It would jump over the \, then the next " would trigger an "out of quote" state, and it would output printf(""test, because it treated the /* inside the string as a valid start of a multi-line comment.
I fixed it though.
I fixed it so hard, now it can even skip blank lines,
and I don't have to do the extra readindex++.
The new code:
Code:
```
void remcomments(char s[])
{
    int slcflag, mlcflag, qflag;
    slcflag = mlcflag = qflag = readindex = writeindex = 0;
    while (s[readindex] != '\0')
    {
        if (slcflag == 1 && s[readindex] == '\n')
            slcflag = 0;
        if (mlcflag == 0)
        {
            if (s[readindex] == '"' && (s[readindex - 1] != '\\' || (s[readindex - 1] == '\\' && s[readindex - 2] == '\\')))
            {
                if (qflag == 0) qflag = 1;
                else qflag = 0;
            }
            else if (qflag == 0)
            {
                if (s[readindex] == '/' && s[readindex + 1] == '/')
                {
                    slcflag = 1;
                }
                if (s[readindex] == '/' && s[readindex + 1] == '*')
                {
                    mlcflag = 1;
                }
            }
            if (slcflag == 0 && mlcflag == 0 && !(s[readindex] == '\n' && s[writeindex - 1] == '\n'))
            {
                writeindex++;
            }
        }
        else if (s[readindex - 1] == '*' && s[readindex] == '/')
            mlcflag = 0;
    }
    s[writeindex] = '\0';
}
```
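
A note on the quote check: looking back one or two characters covers `\"` and `\\"`, but a run of three backslashes before a quote is misread, and the look-behind reads s[-1] when the quote is the first character. An alternative is to track an "escaped" state while scanning forward. The following is a minimal sketch of that idea, not the code above: the name strip_comments_sketch, the enum, and the indices r and w are hypothetical, and character literals such as '"' are not handled.

```c
#include <stddef.h>

enum state { CODE, STRING, LINE_COMMENT, BLOCK_COMMENT };

/* Hypothetical sketch: strips // and block comments in place,
   leaving the contents of string literals untouched. */
void strip_comments_sketch(char s[])
{
    enum state st = CODE;
    int escaped = 0;    /* 1 if the previous string char was an unescaped backslash */
    size_t r = 0, w = 0;

    while (s[r] != '\0') {
        char c = s[r];
        switch (st) {
        case CODE:
            if (c == '/' && s[r + 1] == '/') {
                st = LINE_COMMENT;
                r += 2;
            } else if (c == '/' && s[r + 1] == '*') {
                st = BLOCK_COMMENT;
                r += 2;
            } else {
                if (c == '"')
                    st = STRING;
                s[w++] = c;
                r++;
            }
            break;
        case STRING:
            if (escaped)
                escaped = 0;        /* this char was escaped by the backslash */
            else if (c == '\\')
                escaped = 1;
            else if (c == '"')
                st = CODE;
            s[w++] = c;
            r++;
            break;
        case LINE_COMMENT:
            if (c == '\n')
                st = CODE;          /* do not advance: CODE copies the newline */
            else
                r++;
            break;
        case BLOCK_COMMENT:
            if (c == '*' && s[r + 1] == '/') {
                st = CODE;
                r += 2;
            } else {
                r++;
            }
            break;
        }
    }
    s[w] = '\0';
}
```

On the test string from this post it leaves `printf("\"test/*iello*/  //more");`, since the two comments outside the string are removed and the comment-like text inside the string is kept. Because the escape state is carried forward, no s[r - 1] or s[r - 2] access is needed.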

6. Code:
`if (slcflag == 0 && mlcflag == 0 && !(s[readindex] == '\n' && s[writeindex - 1] == '\n'))`
This looks like it works until you consider a string that starts with \n. In that case the test evaluates s[writeindex - 1] while writeindex == 0, so you read s[-1], which is out of bounds and can crash.
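
For comparison, the out-of-bounds read itself can be avoided by checking the write position before looking behind it. A minimal sketch, with a made-up helper name squeeze_newlines and indices r and w that are not from the code above; it only collapses runs of newlines and does nothing about comments:

```c
#include <stddef.h>

/* Hypothetical helper: collapses runs of '\n' in place. */
void squeeze_newlines(char s[])
{
    size_t r, w = 0;

    for (r = 0; s[r] != '\0'; r++) {
        /* w > 0 is tested first, so s[w - 1] is never evaluated at w == 0 */
        if (s[r] == '\n' && w > 0 && s[w - 1] == '\n')
            continue;
        s[w++] = s[r];
    }
    s[w] = '\0';
}
```

The guard relies on && short-circuiting left to right. It removes the crash but keeps the look-behind on the output buffer.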

It is much safer to write the extra newlines and overwrite duplicates instead.
Code:
```if (!slcflag && !mlcflag)
{