Thread: Writing a program to remove comments. Not working

  1. #1
    Registered User
    Join Date
    Apr 2011
    Posts
    55

    Writing a program to remove comments. Not working

    My problem is based on an exercise from the K&R book. To quote the book:

    Write a program to remove all comments from a C program. Don't forget to
    handle quoted strings and character constants properly. C comments don't nest.
    To begin with, I am trying to keep things simple, so I am only checking for the * (of /*) and " characters. Here is my code:

    Code:
    #include <stdio.h>
    
     main()
    {
    int c =0;
    int i =0;
    int m =0;
    int s[1000];
    int in_quotes = 0;
    
    
    for (i=0; i < 999 && (c=getchar())!=EOF && c!='\n'; ++i)
    {
    
    	s[i] = c;
    	if (s[i] == '"' || s[i] == '*') /*within the quotes*/
    {
    
    		in_quotes = 1;
    		i++;
    		if (s[i] == '"' || s[i] == '*') /*reaching the end of quotes*/
    		{
    			
    	in_quotes = 0;
    }
    if (in_quotes = 0)
    {
    	for (m = 0; m<=i; m++)
    {
    	s[m] = '/b ';
    }
    }
    
    }
    printf ("%c",s[i]);
    }
    }
    Output:
    Code:
    #include <stdio.h>
    Also, I would be thankful if the difference between:

    printf ("%c",s[i]);
    and
    printf ("%s",s[i]);
    and
    printf ("%s",s);
    and
    printf ("%c",s);
    can be explained.

    To summarize, I am trying to replace all the comments (in the input) with blanks. Can someone please help?

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by alter.ego
    Here is my code:
    Good to see that you put in the effort, but you should indent your code properly to make it readable.

    Note that the main function should be declared as returning an int, and that there is a difference between = and ==.

    Quote Originally Posted by alter.ego
    Also I would be thankful if difference between:
    What do the %c and %s mean to you?
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    The great thing about programming is that you can easily explore and find these answers right on the same PC you program on.

    Keep in mind that a string is just a collection of adjacent chars followed by an end-of-string char: '\0' (the null character, which is not the same thing as the NULL pointer).

    abcdefgh is just a bunch of chars, not a string. abcdefgh'\0' is a string.
    Code:
    char mystring[] = {"Some string"};
    will have an end-of-string char added to it automatically. It is a string.

    The end-of-string char is never printed, and if you use a string function to detect its length, it will not count the end-of-string char. But if you lose the end-of-string char in your program, then the string is no longer a string; it's just a bunch of chars.

  4. #4
    Registered User Maz's Avatar
    Join Date
    Nov 2005
    Location
    Finland
    Posts
    194
    Quote Originally Posted by Adak View Post
    The great thing about programming is that you can easily explore and find these answers right on the same PC you program on.

    Keep in mind that a string is just a collection of adjacent chars followed by an end-of-string char: '\0' (the null character, which is not the same thing as the NULL pointer).

    abcdefgh is just a bunch of chars, not a string. abcdefgh'\0' is a string.
    Code:
    char mystring[] = {"Some string"};
    will have an end-of-string char added to it automatically. It is a string.

    The end-of-string char is never printed, and if you use a string function to detect its length, it will not count the end-of-string char. But if you lose the end-of-string char in your program, then the string is no longer a string; it's just a bunch of chars.
    This is true, and by no means am I saying that there is something wrong with it. However, I personally hate this approach.
    For me there is no such fundamental type as string in C. There is just the type char: an 8-bit-wide piece of data, often interpreted as an ASCII value and displayed as a character.
    Then there is char *, a pointer to a character. On 32-bit Intel architecture that is a 32-bit-wide number representing an address that should contain 8-bit-wide char data.
    Then there is the str family of functions. Those are meant to handle human-readable character data. They are built on the assumption that the character data they handle lies in a contiguous memory block, and that the end of the block is marked with a '\0' character. The start of the block is passed to these functions as a char * pointer, which contains the address of the 8-bit-wide memory block holding the first character.

    There is no way in C to distinguish a char * pointer that points to an arbitrary memory location containing any data whatsoever (or to an invalid memory location) from a pointer to such a "character array".

    Basically, as long as care is taken that the data in memory is accessible, contiguous, and terminated with '\0', it is safe to use the str family of functions (or the %s format specifier) to manipulate or display it. However, the compiler does not take care of fulfilling these conditions, nor does any other part of the system. You as the code writer must do that. Nothing magically notes that your char * pointer is not a string. If you create a pointer that does not point to a valid memory area, or if you forget the '\0' termination, then there is a good chance that the memory management unit is what notifies you of your error, by terminating your program. I believe it is safer to remember this than to think that we have some magical string objects in C. Actually, almost all that C is, is writing and reading data at different addresses. That is what a low-level language is, because that is what the processor does: it writes and reads data at different locations.

    To answer a thing:

    %c means your variable is 8-bit-wide data containing one ASCII value, interpreted as a character.
    %s means your variable is address-length-wide data (typically 32/64 bits on modern PCs) containing an address. The data starting at this address is interpreted as 8-bit-wide values forming ASCII characters, up to an 8-bit block where none of the bits are set ('\0').

    EDIT

    ...None of the bits are zero => none of the bits are set.
    Also, laserlight has a valid point in the post below. Using the term "C string" in spoken language eases things. However, understanding that there is no magical string object in C, just memory containing data, is crucial to using these C strings successfully.
    /EDIT
    Last edited by Maz; 09-27-2011 at 05:19 AM. Reason: Corrected typo + added note

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Maz
    For me there is no such fundamental type as string in C.
    This is true according to the C standard, not just for you.

    However, ignoring the concept of a null terminated string, which is ingrained in the C standard, just makes it more difficult to express yourself when common C terminology is available.

  6. #6
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    You're absolutely right, Maz. Strings in C are what I would call a concept, or an elevated data type, rather than a fundamental or basic data type.

  7. #7
    Registered User
    Join Date
    May 2011
    Location
    Around 8.3 light-minutes from the Sun
    Posts
    1,949
    Quote Originally Posted by Maz View Post
    To answer a thing:

    %c means your variable is 8-bit-wide data containing one ASCII value, interpreted as a character.
    %s means your variable is address-length-wide data (typically 32/64 bits on modern PCs) containing an address. The data starting at this address is interpreted as 8-bit-wide values forming ASCII characters, up to an 8-bit block where none of the bits are set ('\0').
    OK, first, that was overly pedantic, was not required, and only served to confuse the topic. Second, a char is not guaranteed to be 8 bits by the standard, so if you are going that route, be correct about it. Third, you would be surprised at how few native types actually exist in C, so let's just not begin that discussion.
    Quote Originally Posted by anduril462 View Post
    Now, please, for the love of all things good and holy, think about what you're doing! Don't just run around willy-nilly, coding like a drunk two-year-old....
    Quote Originally Posted by quzah View Post
    ..... Just don't be surprised when I say you aren't using standard C anymore, and as such,are off in your own little universe that I will completely disregard.
    Warning: Some or all of my posted code may be non-standard and as such should not be used and in no case looked at.

  8. #8
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I don't know about the OP, but I liked the discussion and would rate it valuable, if slightly off-topic from what the OP requested. For a thread in a forum, it seemed unusually thorough.

    alter.ego, what problem(s) did you find when you tested your program?
    Last edited by Adak; 09-27-2011 at 09:02 AM.

  9. #9
    Registered User
    Join Date
    May 2011
    Location
    Around 8.3 light-minutes from the Sun
    Posts
    1,949
    Quote Originally Posted by Adak View Post
    I don't know about the OP, but I liked the discussion, and would rate it valuable. For a thread in a forum, it seemed unusually thorough.
    I took issue with a couple of things. First, although it was not incorrect, being overly pedantic often distracts from the topic at hand, as it did in this case, and it was not needed to correct any statement made previously. Second, if you are going to be pedantic to that level, you should be correct. Third, although not a "native type", a C string is a composite and built-in type which has been ingrained in the standard since there was a standard.

  10. #10
    Registered User Maz's Avatar
    Join Date
    Nov 2005
    Location
    Finland
    Posts
    194
    Quote Originally Posted by AndrewHunter View Post
    OK, first, that was overly pedantic, was not required, and only served to confuse the topic. Second, a char is not guaranteed to be 8 bits by the standard, so if you are going that route, be correct about it. Third, you would be surprised at how few native types actually exist in C, so let's just not begin that discussion.
    Actually, I never said I was writing the C standard here. Nor did I say one must not simplify things. However, in my experience, in order to use C effectively you need to understand how data really resides in memory. That is what I tried to explain. And in order to understand the %s specifier and "C strings" you need to know what I tried to explain. I've seen projects where abstraction layers were written on top of standard C so that one could hire a huge quantity of people who do not fully understand the language or the OS. Unfortunately, quantity has not compensated for quality so far. These projects have ended up being a mess that never makes it into a product...

    And it may well be that char is not guaranteed to be 8 bits. But I've never seen a system where it is something else.

  11. #11
    Registered User
    Join Date
    May 2011
    Location
    Around 8.3 light-minutes from the Sun
    Posts
    1,949
    Quote Originally Posted by Maz View Post
    Actually, I never said I was writing the C standard here. Nor did I say one must not simplify things. However, in my experience, in order to use C effectively you need to understand how data really resides in memory. That is what I tried to explain. And in order to understand the %s specifier and "C strings" you need to know what I tried to explain. I've seen projects where abstraction layers were written on top of standard C so that one could hire a huge quantity of people who do not fully understand the language or the OS. Unfortunately, quantity has not compensated for quality so far. These projects have ended up being a mess that never makes it into a product...
    I have no issue with pushing understanding; however, we deal with newbies here all the time, and as such material should be presented a certain way. As Adak has commented, though, the discussion in and of itself is a good one. I may have jumped the gun a little there, and no offense was intended. However:

    Quote Originally Posted by Maz View Post
    And it may well be that char is not guaranteed to be 8 bits. But I've never seen a system where it is something else.
    If you are going to be pedantic, then be correct. This is a pet peeve of mine.
    Last edited by AndrewHunter; 09-27-2011 at 12:18 PM. Reason: spelling

  12. #12
    Registered User Maz's Avatar
    Join Date
    Nov 2005
    Location
    Finland
    Posts
    194
    Furthermore, explaining C strings as a type gives a false idea of what they are. It encourages you to use assignment with strings. It encourages using sizeof() badly. It is a plain ugly way to explain things. It is better that a human confuses you and then explains than that the compiler or a segmentation fault does it.

  13. #13
    Registered User
    Join Date
    May 2011
    Location
    Around 8.3 light-minutes from the Sun
    Posts
    1,949
    Quote Originally Posted by Maz View Post
    furthermore, explaining C strings as type gives false idea of what they are.
    Ok, what are they? Because I am pretty sure they are in fact a composite built-in type.

    Quote Originally Posted by Maz View Post
    It encourages you to use assignment with strings. It encourages using sizeof() badly.
    Not if you understand how assignments work, as well as the sizeof operator.

    Quote Originally Posted by Maz View Post
    It is a plain ugly way to explain things. It is better that a human confuses you and then explains than that the compiler or a segmentation fault does it.
    I don't think the explanation that Adak gave was in any way incorrect or 'ugly'.

  14. #14
    Registered User Maz's Avatar
    Join Date
    Nov 2005
    Location
    Finland
    Posts
    194
    Actually, I would not say that C strings are a composite type. I'd say they are not a data type; they are a data object.
    For (compatible) types, assignment would work. For types, comparison would work. And sizeof would work too, unless the type is incomplete.

    I guess you can see the fundamental difference between character arrays and C-strings?
    But anyway, you're correct that if one wants to be pedantic, then one should be correct. A point for you on that one. And I am sorry for implying that Adak's explanation was ugly. I do appreciate the help Adak offers. I'm sorry, I got out of line there.

    I just wanted to say that I would prefer people to explain C from the point of view that integers, characters, <nameithere> are nothing more than data of a certain size in memory. A pointer is nothing more than a way to access data at some address. pointer + 1 is a way to access the data in the next memory block. The sooner a new programmer understands that what really matters is the data in memory, the better: when you know the size and representation of that data, you can point to address locations and do casts knowing what happens under the hood. Most of the C code out there uses these tricks, and it can really be understood only after this is understood. Hence a detailed explanation of how data resides in memory and how it is interpreted is better than pointing out "it should be int main(), not void main()". The compiler can tell you it should be int main, but it won't explain to you what is going on under the hood.

  15. #15
    Registered User
    Join Date
    May 2011
    Location
    Around 8.3 light-minutes from the Sun
    Posts
    1,949
    Like I said, I jumped the gun on that one and I meant no offense. However, for your own edification:

    Quote Originally Posted by Maz View Post
    Actually, I would not say that C strings are a composite type. I'd say they are not a data type; they are a data object.
    C strings are, in fact, a composite built-in data type. (It isn't an opinion; it is a statement of fact.) This is an important fact to understand when discussing the language and what the compiler is actually doing.
    Last edited by AndrewHunter; 09-28-2011 at 12:40 AM.
