[C] remove comments


  1. #1
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272

    [C] remove comments

    K&R2 1-23: Write a program to remove all comments from a C program.
    Don't forget to handle quoted strings and character constants
    properly. C comments do not nest.

    So I have to remove all comments from a C program.

    I don't understand how I'm supposed to handle quoted strings and character constants. Am I supposed to delete them as well, or what does the exercise mean?

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,401
    I think it means that you cannot blindly remove the contents of strings, even if they appear to contain comments.
    C + C++ Compiler: MinGW port of GCC
    Version Control System: Bazaar

    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    So basically, I need four states (normal, comment, quoted string, and character constant), and I only delete comments that start in the normal state.

    Or is there an easier way to solve this problem?
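
    For readers following along, here is a minimal sketch of the four-state idea described in this post. It is not code from the thread. The names strip_comments and enum state are made up for illustration, and it assumes that each comment is replaced by a single space, that only /* */ comments matter (no // comments, no trigraphs), and that the caller supplies an output buffer at least as large as the input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The four states discussed in the thread. */
enum state { NORMAL, COMMENT, STRING, CHAR_CONST };

/* Copies src to dst, replacing every block comment with one space
   (an assumption of this sketch). dst needs strlen(src) + 1 bytes. */
void strip_comments(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    enum state st = NORMAL;
    size_t n = strlen(src);
    size_t i = 0;

    while (i < n) {
        char c = src[i];
        switch (st) {
        case NORMAL:
            if (c == '/' && src[i + 1] == '*') {
                st = COMMENT;          /* comment opens only from NORMAL */
                i += 2;
                continue;
            }
            if (c == '"')  st = STRING;
            if (c == '\'') st = CHAR_CONST;
            *dst++ = c;
            break;
        case COMMENT:
            if (c == '*' && src[i + 1] == '/') {
                st = NORMAL;
                *dst++ = ' ';          /* comment collapses to one space */
                i += 2;
                continue;
            }
            break;                     /* drop everything inside a comment */
        case STRING:
        case CHAR_CONST:
            *dst++ = c;
            if (c == '\\' && src[i + 1] != '\0') {
                *dst++ = src[++i];     /* copy the escaped character as-is */
            } else if ((st == STRING && c == '"') ||
                       (st == CHAR_CONST && c == '\'')) {
                st = NORMAL;           /* closing quote ends the literal */
            }
            break;
        }
        i++;
    }
    *dst = '\0';
}
```

    Calling strip_comments on the text s = "/* no */"; leaves it untouched, because the /* is seen in the STRING state rather than the NORMAL state.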

  4. #4
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,401
    Well, that sounds about right to me, actually.

  5. #5
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    But I'm curious about the character constant state.

    I don't think a comment start can fit into a character constant:

    If you write '/*', the compiler reports an error. So why should I bother with character constants?

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,401
    Quote Originally Posted by Tool
    I don't think a comment start can fit into a character constant:

    If you write '/*', the compiler reports an error. So why should I bother with character constants?
    There are multi-character constants, though I am not well read on them.

  7. #7
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    I know about escape sequences in character constants, for example

    '\n'.

    But that only works for the escape character \ followed by another character.

    I don't know of any other multi-character constants. Are there any? One that could contain a comment start?

  8. #8
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    I would say both constants and defines must use QUOTED string literals, so all you have to do is ignore "comments" that occur inside quotes.

    AFAIK you cannot extend a quote over multiple lines in C code. Also, a single quote will not escape a comment*, so all you have to worry about is double quoted strings on a single line. I guess the thing about constants and defines is just to draw your attention to the issue.

    * '\/*' is nonsense anyway, and it WILL open a comment.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,401
    Quote Originally Posted by Tool
    I know about escape sequences in character constants, for example

    '\n'.

    But that only works for the escape character \ followed by another character.
    Yes, but that is nonetheless a single character constant.

    Quote Originally Posted by Tool
    I don't know of any other multi-character constants. Are there any?
    The standard allows for them.

    Quote Originally Posted by Tool
    One that could contain a comment start?
    I have my doubts. I think MK27's advice is sound: just concentrate on string literals. You can get back to handling multi-character constants '/*' and '*/' (and '//' if you are dealing with C99) as a remote possibility for extra credit or something later.

    Quote Originally Posted by MK27
    AFAIK you cannot extend a quote over multiple lines in C code.
    Besides automatic concatenation, a string literal can span multiple lines in the source code by using a backslash at the end of the lines that it spans (other than the last). However, if I remember correctly this would become a single line after preprocessing.
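
    To make the line-continuation point concrete, here is a small sketch, not taken from the thread, of the splicing step: a backslash immediately followed by a newline is deleted, which joins the two physical lines before the string literal is tokenized. The function name splice_lines is hypothetical, and the sketch ignores trigraphs such as ??/ entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Removes every backslash that is immediately followed by a newline,
   mimicking the line splicing a C compiler performs before tokenizing.
   dst needs strlen(src) + 1 bytes. */
void splice_lines(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t n = strlen(src);

    for (size_t i = 0; i < n; i++) {
        if (src[i] == '\\' && src[i + 1] == '\n') {
            i++;               /* skip the newline as well */
            continue;
        }
        *dst++ = src[i];
    }
    *dst = '\0';
}
```

    A comment remover that works on the raw source has two options: run a step like this first, or treat backslash-newline inside a string as "still in the string" rather than as the end of the literal.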

  10. #10
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight View Post
    Besides automatic concatenation, a string literal can span multiple lines in the source code by using a backslash at the end of the lines that it spans (other than the last). However, if I remember correctly this would become a single line after preprocessing.
    Point.

    Code:
    const char *eg = "this and \
         that";

  11. #11
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    * '\/*' does not open a comment with my compiler.

    So the only state I have to worry about is the quoted string? And if I'm already in the string state and I reach a newline (\n), I have to return to the normal state, right? Unless the newline is preceded by a backslash, in which case I stay in the string state, right?

    When I wrote exercise 1-24 today, I didn't include the rule that reaching a newline while in the double-quote state throws you out of that state.
    Should I add that part as well?

    Sorry, but what are string literals?
    Last edited by Tool; 11-14-2009 at 11:29 AM.

  12. #12
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Tool View Post
    * '\/*' does not open a comment with my compiler.
    Well, it is some kind of error or mistake in any case.

    Quote Originally Posted by Tool
    So the only state I have to worry about is the quoted string? And if I'm already in the string state and I reach a newline (\n), I have to return to the normal state, right? Unless the newline is preceded by a backslash, in which case I stay in the string state, right?
    The first one would also be some kind of error/mistake. The ONLY exception to that would be '"' (single quoted double quote). I don't think you should have to account for how the code could appear if it is not going to compile properly anyhow.

    String literals are strings which appear literally "as is" in the code. In C these are always double quoted (so a string literal could appear in a define or a variable assignment). AFAIK, the only place you can escape comments is in a string literal. Pretty nearly positive.
    Last edited by MK27; 11-14-2009 at 11:33 AM.

  13. #13
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    Quote Originally Posted by MK27
    The first one would also be some kind of error/mistake. The ONLY exception to that would be '"' (single quoted double quote). I don't think you should have to account for how the code could appear if it is not going to compile properly anyhow.
    Can you please simplify what you're saying?

    I don't understand what you said here. What would be an error?

    What do I have to deal with here, then?

    I have to make a parser with states. The states are normal and double_quote; you said I should ignore single-quoted constants.

    I only delete comments from the normal state.

    The only special case in the double-quote state is the escape character \, which I know how to handle and will add.

    Other than that, are there any other exceptions I need to take care of?

    Quote Originally Posted by MK27
    AFAIK, the only place you can escape comments is in a string literal. Pretty nearly positive.
    Yes.
    Last edited by Tool; 11-14-2009 at 11:57 AM.

  14. #14
    Registered User
    Join Date
    Apr 2006
    Posts
    2,021
    What you need to do is come up with a list of test cases that you want to support. The list should include all the complex possibilities with single quotes, double quotes, multi-line comments, and anything else you can think of. Do you need to worry about ??/ used in place of \? What about C++/C99 single-line comments? If yes, write a few examples.

    Only once you have that can you start thinking about how many states you need and what constitutes a transition between them. Though I suspect that your original four states are correct and all needed. Five if you support // comments.
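
    One possible way to write such a list down, offered only as a sketch: a table of input/expected pairs plus a tiny runner. Everything here (struct test_case, cases, strip_fn, count_failures, copy_unchanged) is hypothetical, and the expected strings assume that each comment is replaced by exactly one space. The do-nothing copy_unchanged function is included only to show that the table rejects a program that removes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct test_case {
    const char *input;     /* source text fed to the comment remover */
    const char *expected;  /* text that should come out              */
};

/* Hypothetical cases; expected output assumes comment -> one space. */
static const struct test_case cases[] = {
    { "a /* c */ b",           "a   b" },                 /* plain comment   */
    { "s = \"/* keep */\";",   "s = \"/* keep */\";" },   /* inside a string */
    { "c = '\"'; /* x */",     "c = '\"';  " },           /* '"' constant    */
    { "\"a \\\" /* keep */\"", "\"a \\\" /* keep */\"" }, /* escaped quote   */
    { "/* a\nb */x",           " x" },                    /* multi-line      */
};

typedef void (*strip_fn)(const char *src, char *dst);

/* Runs every case through strip and returns how many did not match. */
int count_failures(strip_fn strip)
{
    assert(strip != NULL);
    int failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        char out[128];
        strip(cases[i].input, out);
        if (strcmp(out, cases[i].expected) != 0)
            failures++;
    }
    return failures;
}

/* A do-nothing "remover": the table should reject it. */
void copy_unchanged(const char *src, char *dst)
{
    strcpy(dst, src);
}
```

    Cases for // comments or ??/ would be added to the same table once you decide whether to support them.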
    Last edited by King Mir; 11-14-2009 at 12:22 PM.
    It is too clear and so it is hard to see.
    A dunce once searched for fire with a lighted lantern.
    Had he known what fire was,
    He could have cooked his rice much sooner.

  15. #15
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Tool View Post
    Can you please simplify what you're saying?

    I don't understand what you said here. What would be an error?
    My point was just '\/*' will not appear in a legal, sensible C program. The only purpose of single quotes in C that I can think of is to indicate character values ('\0', 'x', '\\', '"', etc) which is a finite set that does not include '\/*', '/*' or '//'.
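
    Whether or not '/*' ever shows up in real code, skipping a single-quoted constant costs about the same as skipping a string, so a comment remover does not have to settle the question. A hedged sketch follows, with a made-up helper name skip_literal. It assumes the literal sits on one line and does not validate what is between the quotes.

```c
#include <assert.h>
#include <stddef.h>

/* Given p pointing at an opening ' or ", returns a pointer just past
   the matching closing quote, or at the terminating '\0' if the
   literal is never closed. Backslash escapes are skipped as pairs. */
const char *skip_literal(const char *p)
{
    assert(p != NULL && (*p == '\'' || *p == '"'));
    char quote = *p++;

    while (*p != '\0') {
        if (*p == '\\' && p[1] != '\0') {
            p += 2;            /* skip the backslash and the escaped char */
        } else if (*p++ == quote) {
            break;             /* consumed the closing quote */
        }
    }
    return p;
}
```

    With this, the text '/*' x is skipped as one unit and scanning resumes at the space before x, so the /* inside the quotes is never looked at as a comment start.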
