I need directions
Hello ladies and gentlemen.
I have been playing with the idea of writing a simple BASIC parser/interpreter, something that would somewhat like Dartmouth BASIC in terms of supported commands, functions, syntax and operators. I've previously had experience writing expression parsers for math(and logic) expressions written in infix notation. I've also implemented a few data structures along the way(singly/doubly linked list, queues, stacks, circular buffers etc.)
So I'm in not new to C++, I'm just clueless as to what I'm supposed to look at to get started with this. I know about recursive decent parsers, but I'm not sure if that would be suitable for this application. Could anyone give me some pointers?
Obviously I know there are a million different BASIC dialects out there, so I could just use any of these if I wanted to write something in BASIC, but this is mainly for my own entertainment/learning. So therefore I'm not looking for a parser generator either, I'm looking for information on how to go about this, on my own.(Possibly with help from this forum if I get stuck)
Not sure what exactly you want advice on.
The syntax of BASIC isn't terribly difficult to sort out in the majority of cases - IF- THEN - ELSE is probably the hairest part.
One would probably use a base-class that is extended to define the keywords and standard functions. You may need several of those.
You already understand how to parse/process expressions, which is another of the not-so-easy parts.
You can then build some sort of data structure representing the program, a map of line numbers perhaps, with either a vector or a list as the content of a line.
A hashmap (or perhaps a basic vector if your variable names are a-z + optional single digit, like many old versions of basic) to store variables, I guess. Variables would again be a base-class that is derivent to have different properties (float, integer, string).
For things like gosub, you'd just have a stack of references to the objects representing a line of code.
I'd get a good book on compiler/interpreter writing. This is a big topic, and parsing infix expressions is not much of a help when it comes to parsing a whole language.
Since you want an interpreter I wouldn't even bother with fancy parsers. Their learning curve, and specifications you have to write for your language - you might as well just roll-your-own.
BASIC only has a few basic keywords... Start with parsing statements - get familiar with their common characteristics... keyword, followed by such-and-such, whether optional or required, etc. etc.
As was stated, parsing expressions are probably the trickiest, also the funnest to do. If you observe the language's operator precedence... and implement all of its built-in functions such as trig, logarithms, etc.
Next are your stack handling for GOSUB / RETURN, FOR / NEXT. And maybe a way to clean up the stack if the user GOTOs out of the loop "illegally"...
Handling variable names... (as stated, some type of lookup table that's reasonably fast)...
There are a lot of bits and pieces to consider. But they aren't all that difficult if you dedicate a week or two.
Alright guys, thanks for the replies. nonoob and matsp seem to have the same general suggestions, as for CornedBee, what book(s) would you recommend on this subject?
As for parsing expressions, I have a full expression parser down with all the common trig functions, exp, log, log10 etc. But I do get that interpreting a language is more difficult than this.
Well, aside from calculating expressions, it is not a HUGE challenge [not that I have ever written a Basic interpreter], you will have to deal with flow-control, which in Basic is pretty simple, as there aren't that many variants:
gosub / return - you need a stack for this. Not sure if you need to pop the stack when someone does a goto out of the subroutine (not sure how you would detect it).
goto - just change the current "where we are in the code
for next // remember where the for-loop starts in a stack kind of fashion. Remember to pop the stack if the user does a goto past the relevant next statement.
If you support user defined function (def fn ...), then you may need to do something to evaluate one of those expressions too, but that will be a single line of code, so you don't really need to support a stack for that [as long as they are "simple" user defined functions a'la old-style basic], as you can just call back into the standard evaluator functionality to perform the "evaluate user function", just like evaluating any other line of code.