Thread: if white spaces are not tokens why are they required in declarations?

  1. #1
    Registered User
    Join Date
    Sep 2011
    Posts
    71

    if white spaces are not tokens why are they required in declarations?

    I was thinking about how the Unicode characters of a C# source file combine into different lexical elements. White space, line terminators, and comments can separate other lexical elements, but only tokens (keywords, identifiers, literals, operators, and punctuators) define the actual syntactic structure of the file. It is possible to omit most of the non-token elements and write a program on a single line, with tokens directly next to each other, as long as there is at least one punctuator or operator between each keyword, literal, and identifier. So I couldn't help but wonder whether it's possible to concatenate everything, including the tokens that make up a declaration, without using a single white space (except inside a string or character literal) or delimited comment.

    For example, this is a syntactically correct source file:

    class C{static/**/void M(){if(true){int x=10;}}}

    But notice that I couldn't get away without the spaces or comments between the tokens that make up the type, type member, and variable declarations. White space, line breaks, comments, and pre-processing directives can appear between any tokens, so why is white space (or a delimited comment) required to separate the tokens in certain statements, and yet still considered a mere non-token lexical element?

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Think about what you are asking. How would it know that 'static' and 'void' were not a single identifier called 'staticvoid'?


    Quzah.
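    The 'staticvoid' point can be tried out with a toy lexer. The sketch below is Python rather than C#, and its keyword set, token pattern, and the `tokenize` helper are all assumptions made for illustration, not the real C# grammar. It only models the rule that a lexer forms the longest possible lexical element at each position, so a run of letters becomes one word unless something that is not a letter interrupts it.

```python
import re

# Simplified subset of C#-style lexical rules (an assumption for illustration,
# not the full C# grammar).
KEYWORDS = {"class", "static", "void", "int", "if", "true"}

TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/)              # delimited comment: not a token
  | (?P<whitespace>\s+)                 # white space: not a token
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)    # longest run of identifier characters
  | (?P<number>\d+)                     # integer literal
  | (?P<punct>[{}()\[\];,=+\-*/<>!.])   # one-character punctuator or operator
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return (kind, text) pairs for tokens only; separators are dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind in ("comment", "whitespace"):
            continue  # it ends the previous token but is not a token itself
        if kind == "word":
            kind = "keyword" if text in KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens


print(tokenize("static void M"))     # two keywords and an identifier
print(tokenize("static/**/void M"))  # same tokens: the comment only separates
print(tokenize("staticvoid M"))      # 'staticvoid' lexes as one identifier
```

    In this model the space and the empty comment do the same job: they stop the word pattern from running on, and are then thrown away before the parser ever sees them.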

  3. #3
    Registered User
    Join Date
    Sep 2011
    Posts
    71
    It's not how I would know, but how the compiler would know. And why should those tokens be separated by something that's not another token, ie: some type of punctuator or even operator?

  4. #4
    Registered User
    Join Date
    Jun 2003
    Posts
    129
    Think of white space in code as being like paragraph breaks, which keep prose from turning into a wall of text.

  5. #5
    Registered User
    Join Date
    Sep 2011
    Posts
    71
    Quote Originally Posted by DanFraser
    Think of white space in code as being like paragraph breaks, which keep prose from turning into a wall of text.
    And you think I dispute the fact that tokens (or words, in your example) need to be separated somehow? In a C# program, most tokens don't need to be separated by spaces, since the separation can be inferred from the position of operators and punctuators, which are also tokens. But in some cases the tokens are separated by white space, which by definition is a lexical element but not a token.

    For example, if you write i=10; you have 4 tokens that are not separated by anything, and yet the compiler can tell that they are 4 tokens rather than one by looking at the operator = and the punctuator ; in relation to what's around them, which happens to be the identifier i and the literal 10. But if you want to declare i, the compiler has no way of knowing what inti; means. It would only see an invalid expression (since an undeclared identifier is being referenced) as well as an invalid expression statement, since only certain expressions (invocation, assignment, increment and decrement, and object creation expressions) can be followed by a ; to form an expression statement, which is just a specific kind of statement.

    Regardless, my point is not that the compiler should be able to 'guess' that inti; is actually the declaration statement int i;. My point is that some statements require tokens that the compiler cannot separate the 'typical' way (ie: by inference from the position of operator and punctuator tokens relative to other tokens). So either whatever separates those tokens should be another token (a new, special token, say a punctuator or sequence of punctuators), or white space appearing in those contexts should be considered something that, even if it's not a 'token', is at least more than just a lexical element.

    Look at it this way: if you write a sentence on a piece of paper, you separate the words using spaces. But those are probably not the only 'spaces' on the page. There are likely spaces in the margins too, and yet those can hardly be said to belong to the same 'category' as the spaces that separate your words.
    Last edited by y99q; 12-03-2011 at 05:21 PM.
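    One way to look at the 'required' versus 'optional' white space distinction: a separator is needed between two tokens exactly when gluing them together would lex as something else. The Python sketch below checks that for a simplified token set. The pattern and the `lex` and `needs_separator` helpers are hypothetical names made up for this example, and the operator list is deliberately tiny.

```python
import re

# Simplified lexer: identifiers/keywords, integers, and a few operators and
# punctuators. Multi-character operators come first so the longest one wins.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\+\+|--|==|[-+*/=;(){}<>]")


def lex(text):
    """Split text into tokens, skipping white space (hypothetical helper)."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"cannot lex {text[pos]!r}")
        tokens.append(m.group())
        pos = m.end()
    return tokens


def needs_separator(left, right):
    """True if writing the two tokens back to back changes how they lex."""
    return lex(left + right) != [left, right]


print(lex("i=10;"))                  # four tokens, no separators needed
print(lex("inti;"))                  # two tokens: 'inti' and ';'
print(needs_separator("int", "i"))   # True
print(needs_separator("i", "="))     # False
print(needs_separator("+", "+"))     # True: '++' is a different operator
```

    Under these assumptions the requirement is not specific to declarations. It shows up wherever two adjacent tokens would merge, which includes operator pairs such as + followed by +, so the same discarded white space is 'required' in some spots and 'optional' in others depending only on its neighbours.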

  6. #6
    Registered User
    Join Date
    Jun 2003
    Posts
    129
    Soeverythingshouldbelikethiswhenyoucodec#then?Thehumanmindismuchmorecapableofunderstandingcompressedinformationinhumanformthanmostcomputersarecapableofmanaging.Thisistheunderlyingreasonwhywhitespaceisneeded,toallowthecompilertounderstandyourintentionwithyourcode.

  7. #7
    Registered User
    Join Date
    Sep 2011
    Posts
    71
    well, it seems to me that this conversation is becoming more 'intellectual' than I intended it to be, so I think I'm done discussing this subject.

    but I'll add that I think it's funny how my question concerns the classification of the lexical and syntactical 'things' that make up the code, but for some reason some people, after reading my question, somehow ended up with the idea that I propose that there should be no separation between the 'things' that make up the code.
    Last edited by y99q; 12-03-2011 at 06:16 PM.

  8. #8
    [](){}(); manasij7479's Avatar
    Join Date
    Feb 2011
    Location
    *nullptr
    Posts
    2,657
    Quote Originally Posted by DanFraser
    Soeverythingshouldbelikethiswhenyoucodec#then?Thehumanmindismuchmorecapableofunderstandingcompressedinformationinhumanformthanmostcomputersarecapableofmanaging.Thisistheunderlyingreasonwhywhitespaceisneeded,toallowthecompilertounderstandyourintentionwithyourcode.
    You really need to learn Whitespace.
    Code:
    Thisisacomment.
               
       
          
                
    Allvisiblecharactersareignored.
    Last edited by manasij7479; 12-03-2011 at 06:34 PM.

  9. #9
    [](){}(); manasij7479's Avatar
    Join Date
    Feb 2011
    Location
    *nullptr
    Posts
    2,657
    Quote Originally Posted by y99q
    well, it seems to me that this conversation is becoming more 'intellectual' than I intended it to be, so I think I'm done discussing this subject.

    but I'll add that I think it's funny how my question concerns the classification of the lexical and syntactical 'things' that make up the code, but for some reason some people, after reading my question, somehow ended up with the idea that I propose that there should be no separation between the 'things' that make up the code.
    Classification?
    It all comes down to the grammar. (I can't swallow most of the material there.)

  10. #10
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    So you want to introduce a new token that performs the same function that whitespace currently does just for the sake of being able to differentiate between necessary and unnecessary token separation?

  11. #11
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by y99q
    well, it seems to me that this conversation is becoming more 'intellectual' than I intended it to be, so I think I'm done discussing this subject.
    So you really wanted to just blurt out some garbage and have everyone agree with you? Try some place else.
    Quote Originally Posted by y99q
    but I'll add that I think it's funny how my question concerns the classification of the lexical and syntactical 'things' that make up the code, but for some reason some people, after reading my question, somehow ended up with the idea that I propose that there should be no separation between the 'things' that make up the code.
    That is what you said:
    Quote Originally Posted by y99q
    i couldn't help but wonder if it's actually possible to concatenate everything including the tokens that make up a declaration statement without using a single white-space
    Which means, "I want to take out all the white space and have it work!"
    Quote Originally Posted by y99q
    but notice that I couldn't get away without the spaces or comments between the tokens that make up the type, type member and variable declaration statements.
    Code:
    inttint;
    You want that to be valid. Which means that you did in fact "propose that there should be no separation between the 'things' that make up the code."


    Quzah.

    Last Post: 11-18-2001, 08:24 PM