Need something about parsing
Hello, I'm trying to make an emulator! The simplest form of the emulator has been coded so far, but it seems that for further development I need to know about parsing and that kind of stuff (parsing, scanning, tokens...).
I just know some basic definitions about parsing, nothing more!
So first I googled parsing and found some related articles, but those were about XML, so again no luck!
I couldn't find a proper article concerning C++ programming.
So I thought it could be a good idea to ask you for help.
Anything (comments, sample source code, articles, etc.) is welcome.
What are you emulating/parsing?
It makes a whole lot of difference.
But essentially, a parser consists of two or three parts:
a lexical analyzer/tokenizer that takes the input and produces tokens or lexemes.
a syntactic/semantic analyzer that takes the tokens and "makes sense out of it" (which may be to generate an error, e.g. we found a token that doesn't belong here), e.g in C:
should give an error about either ; when it's not expected, or a ) expected before the ; .
The output of the syntactic/semantic analyzer is something the parser proper can interpret (e.g. a binary tree of expressions, or a sequence of commands in some other form).
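To make the split between the two stages concrete, here is a rough C++ sketch, not a definitive design. The tiny grammar (numbers, '+', parentheses, a closing ';') and all names (TokKind, Token, tokenize, Parser) are made up for illustration. For brevity the parser evaluates the expression directly instead of building a tree.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Token kinds for a tiny, hypothetical expression language.
enum class TokKind { Number, Plus, LParen, RParen, Semicolon, End };

struct Token {
    TokKind kind;
    std::string text;
};

// Stage 1, lexical analysis: turn raw characters into tokens.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({TokKind::Number, src.substr(start, i - start)});
            continue;
        }
        switch (src[i]) {
            case '+': out.push_back({TokKind::Plus, "+"}); break;
            case '(': out.push_back({TokKind::LParen, "("}); break;
            case ')': out.push_back({TokKind::RParen, ")"}); break;
            case ';': out.push_back({TokKind::Semicolon, ";"}); break;
            default:
                throw std::runtime_error(std::string("unexpected character '") + src[i] + "'");
        }
        ++i;
    }
    out.push_back({TokKind::End, ""});  // sentinel so the parser never runs off the end
    return out;
}

// Stage 2, syntactic analysis, using this assumed grammar:
//   statement := expr ';'
//   expr      := term ('+' term)*
//   term      := Number | '(' expr ')'
class Parser {
public:
    explicit Parser(std::vector<Token> toks) : toks_(std::move(toks)) {}

    int parseStatement() {
        int value = parseExpr();
        expect(TokKind::Semicolon, "';'");
        return value;
    }

private:
    const Token& peek() const { return toks_[pos_]; }

    // Consume a token of the given kind or report what was expected instead.
    void expect(TokKind kind, const std::string& what) {
        if (peek().kind != kind)
            throw std::runtime_error("expected " + what + " before '" + peek().text + "'");
        ++pos_;
    }

    int parseExpr() {
        int value = parseTerm();
        while (peek().kind == TokKind::Plus) { ++pos_; value += parseTerm(); }
        return value;
    }

    int parseTerm() {
        if (peek().kind == TokKind::Number) return std::stoi(toks_[pos_++].text);
        if (peek().kind == TokKind::LParen) {
            ++pos_;
            int value = parseExpr();
            expect(TokKind::RParen, "')'");
            return value;
        }
        throw std::runtime_error("unexpected '" + peek().text + "'");
    }

    std::vector<Token> toks_;
    std::size_t pos_ = 0;
};
```

With this sketch, Parser(tokenize("(1 + 2) + 3;")).parseStatement() returns 6, while the input "(1 + 2;" throws an error saying a ')' was expected before the ';'.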
Thanks, dear Mats. As you may know, I started by coding the Deitel exercise "The Simpletron computer emulation". It's OK now: a nice console app that covers that exercise.
After that, it made me want to write an emulator for other, let's say, "abstract" or conceptual computer systems, such as CLA (a single-address computer with an octal base, featuring a 16-bit accumulator, a 9-bit program counter, 512 memory lines (each 16 bits), 2^4 = 16 instructions, two flags, z and n (the zero and negative flags), and a 9-bit address line).
| Opcode | Address |
  4 bit    9 bit
| Opcode | M | I | X | Address |
           0   0   0
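One possible way to hold this machine in C++ is sketched below. It assumes the 16-bit word is laid out as in the second diagram (4-bit opcode, then the M, I and X bits, then a 9-bit address) and that bit 15 of the accumulator is the sign bit. The names ClaMachine, DecodedWord and decode are hypothetical.

```cpp
#include <array>
#include <cstdint>

// 9 address bits give 512 addressable words.
constexpr std::uint16_t kAddressMask = 0x1FF;

// Hypothetical container for the CLA state described above.
struct ClaMachine {
    std::uint16_t accumulator = 0;             // 16-bit accumulator
    std::uint16_t programCounter = 0;          // only the low 9 bits are meaningful
    std::array<std::uint16_t, 512> memory{};   // 512 words of 16 bits, zero-initialized
    bool zeroFlag = false;                     // z
    bool negativeFlag = false;                 // n

    // Refresh z and n from the accumulator.
    // Treating bit 15 as the sign bit is an assumption, not something stated in the thread.
    void updateFlags() {
        zeroFlag = (accumulator == 0);
        negativeFlag = (accumulator & 0x8000) != 0;
    }
};

// Fields of one instruction word: | Opcode | M | I | X | Address |
struct DecodedWord {
    unsigned opcode;   // bits 15..12
    bool m;            // bit 11
    bool i;            // bit 10
    bool x;            // bit 9
    unsigned address;  // bits 8..0
};

DecodedWord decode(std::uint16_t word) {
    return DecodedWord{
        static_cast<unsigned>(word >> 12) & 0xFu,
        (word & 0x0800) != 0,
        (word & 0x0400) != 0,
        (word & 0x0200) != 0,
        static_cast<unsigned>(word) & kAddressMask
    };
}
```

Keeping the decoding in one small function means the rest of the emulator never has to shift and mask bits by hand.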
HLT // halts the program; used to mark the end of the program. Only one instance is allowed (you can't use HLT twice!)
LDA M // loads the value from M
STA Y // stores the accumulator value to Y
SBA Y // subtracts the accumulator value from Y
ORA Y // ORs the value of Y with the value stored in the accumulator and places the result in the accumulator
ANA Y // ANDs the value of Y with the value stored in the accumulator and places the result in the accumulator
JMP Y // jumps to Y
JNG Y // if the value stored in the accumulator is negative, jumps to Y
JZR Y // if the value stored in the accumulator is zero, jumps to Y
CLA // zeroes the accumulator
INC // increments the value stored in the accumulator
DEC // decrements it, as above
ADA Y // adds the value of Y to the value of the accumulator
RED Y // reads the user's input and places it in Y
WRI Y // prints the value of Y
A sample program that can be executed on a CLA computer is as follows:
//simple mode of addressing
ORG 9 //stating where to start
9. RED 5
10. RED 6
11. LDA 5
12. SBA 6
13. JNG 17
14. WRI 5
15. JMP 20
17. WRI 6
1110 000 000 000 101
1110 000 000 000 110
0001 000 000 000 101
0100 000 000 000 110
1000 000 000 001 111
1111 000 000 000 101
0111 000 000 010 000
1111 000 000 000 110
0000 000 000 000 000
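As a rough sketch of how a line of the simple addressing mode could be turned into one of the words above: the opcode values below are only the ones that can be read off this binary listing (the remaining mnemonics are left out because their values are not shown), the operand is read as octal, and the M, I and X bits are left at zero. The names opcodeTable and assembleLine are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Opcode values inferred from the binary listing above; incomplete on purpose.
const std::map<std::string, unsigned>& opcodeTable() {
    static const std::map<std::string, unsigned> table = {
        {"HLT", 0x0}, {"LDA", 0x1}, {"SBA", 0x4}, {"JMP", 0x7},
        {"JNG", 0x8}, {"RED", 0xE}, {"WRI", 0xF},
    };
    return table;
}

// Assemble one line such as "RED 5" into a 16-bit word.
// Simple addressing only: M, I and X stay 0. The operand is parsed as octal.
// This sketch does not check whether a mnemonic requires an operand.
std::uint16_t assembleLine(const std::string& line) {
    std::istringstream in(line);
    std::string mnemonic;
    if (!(in >> mnemonic)) throw std::runtime_error("empty line");

    auto it = opcodeTable().find(mnemonic);
    if (it == opcodeTable().end())
        throw std::runtime_error("unknown mnemonic: " + mnemonic);

    unsigned address = 0;
    std::string operand;
    if (in >> operand) {
        std::size_t used = 0;
        // std::stoul throws std::invalid_argument if no octal digits are found.
        unsigned long value = std::stoul(operand, &used, 8);
        if (used != operand.size() || value > 0x1FF)
            throw std::runtime_error("bad address: " + operand);
        address = static_cast<unsigned>(value);
    }
    return static_cast<std::uint16_t>((it->second << 12) | address);
}
```

For example, assembleLine("JNG 17") gives 0x800F, which is the word 1000 000 000 001 111 in the listing, and assembleLine("JMP 20") gives 0x7010.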
//Symbolic mode of addressing
//Index-Indirect mode of addressing
0 1 1 == LDA IX 10 // loads the memory location that XR+10 points to
and so on (a couple of other addressing modes).
"Pretty much like Simpletron", but this one has more stuff, such as labels, different addressing modes (immediate/indirect modes, ...) and a lot more! (It's like assembly language to some extent!)
So that's what I'm planning to code! At first I thought it could be accomplished using ifs and loops (it is possible, I think), but using parsing would make it rather easy and also look professional, and debugging the app could be more convenient too.
So that's all.
Any help on parsing in C++ is highly appreciated.
Please, someone help me! :(
I'm not very familiar with the C++ STL, but basically you want to split up everything separated by a ' ' (space) character or a ':' character. A C++ compiler works the same way: it knows that a token or term is delimited by a space, an operator, or some other syntactic mark. Just look through the documentation for a string function that "tokenizes" or "splits" strings. You'll need to specify the delimiting characters.
With C-style strings you could easily just loop through this process:
Read until you find one of the delimiting characters, then create a string large enough for that token and copy it. The delimiting character itself becomes the next token if it has any meaning, and you repeat the process until you reach the end of the file. Then every meaningful part of your program is stored in memory, and you can process it syntactically as mentioned above.
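The loop described above could look roughly like this with std::string instead of C-style strings. It is only a sketch: the function name splitTokens and the default delimiter sets are assumptions, and the "LOOP:" label syntax in the usage note is hypothetical since the thread does not show how labels are written.

```cpp
#include <string>
#include <vector>

// Split `text` into tokens.
// Characters in `skip` only separate tokens and are dropped.
// Characters in `keep` separate tokens and are also emitted as
// one-character tokens, because they carry meaning of their own.
std::vector<std::string> splitTokens(const std::string& text,
                                     const std::string& skip = " \t",
                                     const std::string& keep = ":") {
    std::vector<std::string> tokens;
    std::string current;
    for (char c : text) {
        bool isSkip = skip.find(c) != std::string::npos;
        bool isKeep = keep.find(c) != std::string::npos;
        if (isSkip || isKeep) {
            // A delimiter ends whatever token was being collected.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
            if (isKeep) tokens.push_back(std::string(1, c));
        } else {
            current += c;
        }
    }
    if (!current.empty()) tokens.push_back(current);  // last token on the line
    return tokens;
}
```

Calling splitTokens("LOOP: LDA 5") yields the four tokens "LOOP", ":", "LDA" and "5", which can then be checked syntactically one line at a time.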
edit: Moved to the C++ forum
Many thanks, sean. OK, I'll do it this way and will tell you about it.
Again, thanks. I really appreciate your answer.