Scanner?
Lexical analyzer (Lexer)?
Tokenizer?
Is their job just to group characters into tokens based on regular expressions?
Are they the same thing?
I've read so many books, and they have only left me more confused.
WTH.
Just GET it OFF out my mind!!
I don't know much about this stuff.
But I think they are different phases of a compiler.
I could be wrong.
Edit:
According to Wikipedia:
http://en.wikipedia.org/wiki/Lexical_analysis
"In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. Programs performing lexical analysis are called lexical analyzers or lexers. A lexer is often organized as separate scanner and tokenizer functions, though the boundaries may not be clearly defined."
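To make "converting a sequence of characters into a sequence of tokens" concrete, here is a minimal sketch in Python. The token categories and the function name lex are made up for illustration, not taken from any real compiler, and a real lexer would handle far more cases.

```python
import re

# Hypothetical token categories for a toy language (an assumption for this sketch).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
# One combined pattern with a named group per category.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def lex(text):
    """Convert a string of characters into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":  # whitespace separates tokens but is not one itself
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(lex("x = 42 + y"))
```

Under these assumptions, the ten characters of "x = 42 + y" come out as five tokens: an identifier, an operator, a number, an operator, and another identifier.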
Last edited by stevesmithx; 12-23-2008 at 12:12 PM.
Not everything that can be counted counts, and not everything that counts can be counted
- Albert Einstein.
No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.
- Herbert Mayer
Originally Posted by audinue: Is their job just to group characters into tokens based on regular expressions?
It might not actually be based on a regular expression, but I think that is the general idea.
Originally Posted by audinue: Are they the same thing?
I would say yes, though I suppose it depends on context, since "scanner" is a rather generic word.
Look up a C++ Reference and learn How To Ask Questions The Smart Way
Originally Posted by Bjarne Stroustrup (2000-10-14)
And then you need a semantic analyzer as well, to work out what the code actually means.
But yes, a lexical analyzer and a tokenizer are essentially the same thing, although that also depends on who you talk to.
--
Mats
Compilers can produce warnings - make the compiler programmers happy: Use them!
Please don't PM me for help - and no, I don't do help over instant messengers.
Oh, and Laserlight's point about "not a regular expression" is quite clear if you consider that sometimes a = terminates a token, while in other cases it doesn't (+= or ==, for example).
Some characters ALWAYS terminate a token; for others, it depends on the context. So a lexical analyzer (for C or a similar language) is more complex than simple regular-expression termination, although with a reasonable set of regexps you may be able to tokenize all of C.
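The = versus += versus == case is usually handled with a "longest match" rule (often called maximal munch): at each position, take the longest operator that fits. A minimal sketch in Python, assuming a made-up operator set that is only a small subset of C's, and a hypothetical helper that handles operator characters and whitespace only:

```python
# Hypothetical operator set: a small subset of C's operators, for illustration only.
OPERATORS = {"=", "==", "+", "+=", "++"}
# Longest operators first, so "==" is tried before "=".
BY_LENGTH = sorted(OPERATORS, key=len, reverse=True)

def split_operators(text):
    """Split a string of operator characters using longest match (maximal munch)."""
    result = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for op in BY_LENGTH:
            if text.startswith(op, pos):
                result.append(op)
                pos += len(op)
                break
        else:
            # No operator in the set starts at this position.
            raise SyntaxError(f"unknown operator at position {pos}")
    return result

print(split_operators("= == +="))
print(split_operators("+++"))
```

So whether a = ends the token depends on what follows it: in "= ==" the first = stands alone, while the next two are taken together. With this rule "+++" splits as "++" followed by "+", which is also how C treats a+++b.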
--
Mats
So,
Code:
lexer == scanner == tokenizer
And... a...
"A lexer is often organized as separate scanner and tokenizer functions"
Scanner: scan; its job is to scan things.
Tokenizer: tokenize; its job is to tokenize things.
But a tokenizer needs a scanner; without one, what is there to tokenize?
Code:
Scanner | Tokenizer
And is a tokenizer always a scanner?
So does that mean Tokenizer == Lexer and Scanner != Lexer, but the scanner is part of the lexer?
I think the scanner and the tokenizer are both parts of the lexer...
What does it mean to scan and what does it mean to tokenize?
Along the same lines, what is a lexeme and what is a token?
"So, it means a Tokenizer == Lexer and Scanner != Lexer, but it's part of Lexer"
I think you got it correct.
That's what the dragon book on compilers says.
The lexical analyzer has a scanner, which scans the source program and produces tokens as output; these are later parsed by a parser to build a parse tree.
So the scanner is the part of the lexer that performs the tokenizing.
However, as the Wikipedia article suggests, the boundaries of what is what may vary depending on the context.
This is also mentioned in the dragon book.
PS: If you wonder what the dragon book is, it is a book by Alfred Aho et al. titled
Compilers: Principles, Techniques, and Tools, which many consider an authoritative source on compilers.
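One possible reading of the scanner/tokenizer split, sketched in Python: the scanner groups characters into lexemes (raw substrings), and the tokenizer attaches a category to each lexeme to make a token. The function names scan and tokenize, the keyword set, and the category names are all assumptions for illustration; as noted above, real lexers often blur this boundary.

```python
KEYWORDS = {"if", "else", "while", "return"}  # hypothetical keyword set

def scan(text):
    """Scanner: group characters into lexemes (raw substrings)."""
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
            continue
        start = pos
        if ch.isalpha() or ch == "_":
            while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
                pos += 1
        elif ch.isdigit():
            while pos < len(text) and text[pos].isdigit():
                pos += 1
        else:
            pos += 1  # any other character is a one-character lexeme
        yield text[start:pos]

def tokenize(lexemes):
    """Tokenizer: classify each lexeme, producing (kind, lexeme) tokens."""
    for lexeme in lexemes:
        if lexeme in KEYWORDS:
            kind = "KEYWORD"
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            kind = "IDENT"
        elif lexeme[0].isdigit():
            kind = "NUMBER"
        else:
            kind = "PUNCT"
        yield (kind, lexeme)

for token in tokenize(scan("if (x1 > 10) return x1;")):
    print(token)
```

In this sketch "return" is a lexeme, and ("KEYWORD", "return") is the token built from it. The parser would then consume the token stream rather than the raw characters.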
"PS: If you wonder what the dragon book is, it is a book by Alfred Aho et al. titled
Compilers: Principles, Techniques, and Tools, which many consider an authoritative source on compilers."
I read the second edition three times LOL.
Compilers: Design and Principles.