if white spaces are not tokens why are they required in declarations?

**y99q** · 12-03-2011

I've been thinking about how the Unicode characters of a C# source file are combined into different lexical elements. White space, line breaks, and comments can be used to separate other lexical elements such as tokens, which are the elements that define the actual syntactic structure of a source file (i.e. keywords, identifiers, literals, operators, and punctuators). It is possible to omit all the other lexical elements and write a program on a single line, with the tokens directly next to each other, as long as there is at least one punctuator or operator between each keyword, literal, and identifier.

So I couldn't help but wonder if it's actually possible to concatenate everything, including the tokens that make up a declaration statement, without using a single white space (unless it appears inside a string or character literal) or delimited comment. For example, this is a syntactically correct source file:

Code:

class C{static/**/void M(){if(true){int x=10;}}}

But notice that I couldn't get away without the spaces or comments between the tokens that make up the type, type member, and variable declaration statements.

White space, line breaks, comments, and pre-processing directives can all appear between tokens. So why is it that white space (or, alternatively, a delimited comment) is required to separate the tokens in certain statements, and yet it is still considered merely a non-token lexical element?

**quzah** · 12-03-2011

Think about what you are asking. How would it know that 'static' and 'void' was not a single object called 'staticvoid'.
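A rough sketch of why, assuming the scanner follows the usual longest-match rule for identifiers. The names `LongestMatchDemo` and `ScanWord` are made up for illustration, and this is not meant to show how the real C# compiler is implemented:

```csharp
using System;

static class LongestMatchDemo
{
    // Hypothetical helper: returns the longest run of identifier characters
    // beginning at 'start'. The scan keeps consuming letters, digits and '_'
    // until it reaches a character that cannot be part of an identifier.
    public static string ScanWord(string source, int start)
    {
        int end = start;
        while (end < source.Length &&
               (char.IsLetterOrDigit(source[end]) || source[end] == '_'))
        {
            end++;
        }
        return source.Substring(start, end - start);
    }

    static void Main()
    {
        // Nothing interrupts the run of letters, so this is one word.
        Console.WriteLine(ScanWord("staticvoid M()", 0));     // prints "staticvoid"
        // A space ends the word.
        Console.WriteLine(ScanWord("static void M()", 0));    // prints "static"
        // So does the start of a delimited comment.
        Console.WriteLine(ScanWord("static/**/void M()", 0)); // prints "static"
    }
}
```

With nothing between the two keywords, the run of letters never ends, so the scanner has no reason to stop after 'static'.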

Quzah.

**y99q** · 12-03-2011

It's not how I would know, but how the compiler would know. And why should those tokens be separated by something that's not another token, i.e. some type of punctuator or even an operator?

**DanFraser** · 12-03-2011

Think of white space in code the same as paragraphs not being in walls of text.

**y99q** · 12-03-2011

Originally Posted by DanFraser

Think of white space in code the same as paragraphs not being in walls of text.

And you think I dispute the fact that tokens (or words, in the example you provided) need to be separated somehow?

In a C# program, most tokens don't need to be separated by spaces, since the separation can be inferred from the relative position of operators and punctuators, which are also tokens. But in some cases the tokens are separated by white space, which by definition is a lexical element but not a token. For example, if you write i=10; you have 4 tokens which are not separated by anything, and yet the compiler can tell that they are 4 tokens rather than a single token by observing the position of the operator = and the punctuator ; in relation to what's in between, which happens to be the identifier i and the literal 10.

But if you want to declare i, then obviously the compiler has no way of knowing what inti; means. In fact it would only see an invalid expression (since an undeclared identifier is being referenced) as well as an invalid expression statement, since only invocation, assignment, increment, decrement, and object creation expressions can be used with a ; to form an expression statement, which is just a specific kind of statement.

Regardless, my point is not that the compiler should be able to 'guess' that inti; is actually the declaration statement int i;. My point is that some statements require tokens that the compiler cannot separate the 'typical' way (i.e. by making an inference based on the position of operator and punctuator tokens relative to other tokens). So either whatever is used to separate those tokens should be another token (a new, special token, which could be, say, a punctuator or a sequence of punctuators), or white space appearing in the contexts described should be considered something that, even if it's not a 'token', is at least more than just a lexical element.

Look at it this way: if you write a sentence on a piece of paper, you separate the words using spaces. But those spaces are probably not the only 'spaces' on the piece of paper. There are likely spaces in the margins of the page, and yet those 'spaces' can hardly be said to belong to the same 'category' as the spaces that separate your words.
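To make the i=10; versus inti; comparison concrete, here is a minimal sketch of a scanner. It assumes a drastically simplified lexical grammar (words and digit runs, single-character operators and punctuators, white space, and delimited comments only), and `MiniLexer` and `Tokenize` are names invented for this illustration rather than anything from the C# compiler:

```csharp
using System;
using System.Collections.Generic;

static class MiniLexer
{
    // Splits 'source' into tokens. White space and /* ... */ comments are
    // consumed but never added to the result: all they do is end the token
    // that precedes them.
    public static List<string> Tokenize(string source)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            if (char.IsWhiteSpace(c))
            {
                i++; // a lexical element, but not a token
            }
            else if (c == '/' && i + 1 < source.Length && source[i + 1] == '*')
            {
                // Skip the whole delimited comment (or the rest of the input
                // if the comment is never closed).
                int close = source.IndexOf("*/", i + 2, StringComparison.Ordinal);
                i = close < 0 ? source.Length : close + 2;
            }
            else if (char.IsLetterOrDigit(c) || c == '_')
            {
                // Simplification: identifiers, keywords and integer literals
                // are all scanned here as one run of letters, digits and '_'.
                int start = i;
                while (i < source.Length &&
                       (char.IsLetterOrDigit(source[i]) || source[i] == '_'))
                {
                    i++; // longest match: keep going while the run can grow
                }
                tokens.Add(source.Substring(start, i - start));
            }
            else
            {
                tokens.Add(c.ToString()); // single-character operator or punctuator
                i++;
            }
        }
        return tokens;
    }

    static void Main()
    {
        foreach (string text in new[] { "i=10;", "inti;", "int i;", "int/**/i;" })
        {
            Console.WriteLine(text + " -> " + string.Join(" | ", Tokenize(text)));
        }
    }
}
```

Under these assumptions, i=10; comes out as four tokens and inti; as two, while int i; and int/**/i; both come out as the same three tokens. The space and the comment never appear in the output. Their only effect is to stop the scan of int before it swallows the i.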

**DanFraser** · 12-03-2011

Soeverythingshouldbelikethiswhenyoucodec#then?Thehumanmindismuchmorecapableofunderstandingcompressedinformationinhumanformthanmostcomputersarecapableofmanaging.Thisistheunderlyingreasonwhywhitespaceisneeded,toallowthecompilertounderstandyourintentionwithyourcode.

**y99q** · 12-03-2011

well, it seems to me that this conversation is becoming more 'intellectual' than I intended it to be, so I think I'm done discussing this subject.

but I'll add that I think it's funny how my question concerns the classification of the lexical and syntactical 'things' that make up the code, but for some reason some people, after reading my question, somehow ended up with the idea that I propose that there should be no separation between the 'things' that make up the code.

**manasij7479** · 12-03-2011

Originally Posted by DanFraser

Soeverythingshouldbelikethiswhenyoucodec#then?Thehumanmindismuchmorecapableofunderstandingcompressedinformationinhumanformthanmostcomputersarecapableofmanaging.Thisistheunderlyingreasonwhywhitespaceisneeded,toallowthecompilertounderstandyourintentionwithyourcode.

You really need to learn Whitespace.

Code:

Thisisacomment.
           
   
      
            
Allvisiblecharactersareignored.

**manasij7479** · 12-03-2011

Originally Posted by y99q

well, it seems to me that this conversation is becoming more 'intellectual' than I intended it to be, so I think I'm done discussing this subject.

but I'll add that I think it's funny how my question concerns the classification of the lexical and syntactical 'things' that make up the code, but for some reason some people, after reading my question, somehow ended up with the idea that I propose that there should be no separation between the 'things' that make up the code.

Classification?
It all comes down to the grammar. (I can't swallow most of the material there.)

**itsme86** · 12-03-2011

So you want to introduce a new token that performs the same function that whitespace currently does just for the sake of being able to differentiate between necessary and unnecessary token separation?

**quzah** · 12-04-2011

Originally Posted by y99q

well, it seems to me that this conversation is becoming more 'intellectual' than I intended it to be, so I think I'm done discussing this subject.

So you really wanted to just blurt out some garbage and have everyone agree with you? Try some place else.

Originally Posted by y99q

but I'll add that I think it's funny how my question concerns the classification of the lexical and syntactical 'things' that make up the code, but for some reason some people, after reading my question, somehow ended up with the idea that I propose that there should be no separation between the 'things' that make up the code.

That is what you said:

Originally Posted by y99q

i couldn't help but wonder if it's actually possible to concatenate everything including the tokens that make up a declaration statement without using a single white-space

Which means, "I want to take out all the white space and have it work!"

Originally Posted by y99q

but notice that I couldn't get away without the spaces or comments between the tokens that make up the type, type member and variable declaration statements.

Code:

inttint;

You want that to be valid. Which means that you did in fact "propose that there should be no separation between the 'things' that make up the code."

Quzah.
