Thread: need help with revised lexing algorithm

  1. #1
    Registered User
    Join Date
    Sep 2013
    Posts
    11

    need help with revised lexing algorithm

    So I rewrote my lexical analyzer so that it's based on Stroustrup's calculator rather than using regexes. The only problem is handling multi-character operators like >>=, +=, and --. I can fix it, but my solution would just repeat a lot of code, and I would rather not do that. That's why I'm trying to let the switch cases fall through.

    Code:
    void Lexer::Lex()
    {
        istringstream Stream(this->Source);
        char Current = 0,
             Temp    = 0;
    
    
        while(Stream)
        {
            do
            {
                if(!Stream.get(Current)) return;
            } while(isspace(Current));
    
    
            switch(Current)
            {
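                case '<': case '>': case '*':
                    // Look two characters ahead for <<=, >>= and **=.
                    if(Stream.get(Temp))
                    {
                        if(Temp == Current && Stream.peek() == '=')
                        {
                            Stream.get(); // consume the '='
                            SymbolTable.PushBack("OPERATOR", this->ChToS(Current) + this->ChToS(Current) + "=", -1, -1);
                            break;
                        }

                        Stream.putback(Temp); // not a three-character operator
                    }
                    // fall through to the two-character checks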
    
    
                case '/': case '+': case '-':
                    if(Stream.peek() == Current)
                    {
                        Stream.get(); // consume the second character
                        SymbolTable.PushBack("OPERATOR", this->ChToS(Current) + this->ChToS(Current), -1, -1);
                        break;
                    }
    
    
                case '=': case '!': case '%': case '^': case '&': case '|':
                    if(Stream.peek() == '=')
                    {
                        Stream.get(); // consume the '='
                        SymbolTable.PushBack("OPERATOR", this->ChToS(Current) + "=", -1, -1);
                        break;
                    }
    
    
                case ';': case '(': case ')': case ',': case '[': case ']': case '{': case '}':
                    SymbolTable.PushBack("OPERATOR", this->ChToS(Current), -1, -1);
                    break;
    
    
                case '0': case '1': case '2': case '3': case '4':
                case '5': case '6': case '7': case '8': case '9':
                case '.':
                {
                    string NumberValue;
    
    
                    NumberValue = Current;
                    while(Stream.get(Current) && isdigit(Current)) NumberValue.push_back(Current);

                    Stream.putback(Current);
                    SymbolTable.PushBack("NUMBER", NumberValue, -1, -1);
                    break;
                }
    
    
                case '\'': case '\"':
                {
                    char Match = Current;
                    string QuoteValue;
                    QuoteValue = Current;
    
    
                    while(Stream.get(Current) && Current != Match) QuoteValue.push_back(Current);
    
    
                    QuoteValue.push_back(Current);
                    SymbolTable.PushBack("STRING", QuoteValue, -1, -1);
                    break; // without this the case falls through into default
                }
    
    
                default:
                {
                    string StringValue;

                    // Cast to unsigned char: passing a negative char to isalpha/isalnum is undefined.
                    if(isalpha(static_cast<unsigned char>(Current)))
                    {
                        StringValue = Current;

                        while(Stream.get(Current) && isalnum(static_cast<unsigned char>(Current))) StringValue.push_back(Current);

                        Stream.putback(Current);

                        if(this->IsKeyword(StringValue)) SymbolTable.PushBack("KEYWORD",    StringValue, -1, -1);
                        else                             SymbolTable.PushBack("IDENTIFIER", StringValue, -1, -1);
                    }
                }
            }
        }
    }
    In case you can't tell what I'm doing: I first test whether Current starts a three-character operator (>>=, <<=, or **=). If it does, the case breaks. Otherwise it should fall through and test for two-character operators made of the same character twice (i.e. ++, --, //, **). Then it tests for Current followed by = (i.e. +=, -=, ==, >=). Finally, failing all of those, it just logs Current into the symbol table as a single-character operator.
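
    For comparison, the same order of checks can be written as one standalone function. This is only a sketch under assumptions: ReadOperator and its three character sets are hypothetical names, not part of the Lexer class above. Checking for the doubled character first means a single peek() is always enough, so no putback is needed.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the posted Lexer class.
// First has already been read from In. Returns the longest operator that
// starts with First and consumes any extra characters it uses.
std::string ReadOperator(std::istream& In, char First)
{
    const std::string Doubles    = "<>*/+-";       // may be doubled: <<, >>, **, //, ++, --
    const std::string Triples    = "<>*";          // doubled form may take '=': <<=, >>=, **=
    const std::string Assignable = "<>*/+-=!%^&|"; // may be followed directly by '='

    std::string Op(1, First);

    // Doubled operator, optionally extended to a three-character one.
    if(Doubles.find(First) != std::string::npos && In.peek() == First)
    {
        Op += static_cast<char>(In.get());
        if(Triples.find(First) != std::string::npos && In.peek() == '=')
            Op += static_cast<char>(In.get());
        return Op;
    }

    // Single character followed by '='.
    if(Assignable.find(First) != std::string::npos && In.peek() == '=')
        Op += static_cast<char>(In.get());

    // Otherwise Op is still just the single character.
    return Op;
}
```

    Inside the switch, every operator case could then collapse into one call such as SymbolTable.PushBack("OPERATOR", ReadOperator(Stream, Current), -1, -1), assuming PushBack accepts a string there.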

  2. #2
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    You should mention what's not working.

    The way to avoid repeating code is to move the code that would be repeated into a function. I don't understand how you plan to solve your problem with switch case fall through (I only skimmed through the code), but it sounds like the wrong approach in any case.
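
    As a rough illustration of that advice, the piece that repeats in every case is "look at the next character and consume it if it is the one I want". A minimal sketch, assuming a hypothetical helper named Match (ReadGreater is likewise made up for the example):

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: consumes the next character only if it equals Expected.
bool Match(std::istream& In, char Expected)
{
    if(In.peek() != std::char_traits<char>::to_int_type(Expected)) return false;
    In.get();
    return true;
}

// Example use: '>' has already been read; handles >, >=, >> and >>=.
std::string ReadGreater(std::istream& In)
{
    std::string Op = ">";
    if(Match(In, '>')) Op += '>';
    if(Match(In, '=')) Op += '=';
    return Op;
}
```

    Each operator then needs a line or two of its own rather than a copy of the peek-and-consume logic.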

    A Lexer should speak to a parser, not a symbol table. A parser is the proper name for the component that takes in lexical tokens. A parser may generate a symbol table, but that's still a logically distinct object.
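
    One possible shape for that separation, loosely in the style of the Token_stream from Stroustrup's calculator, is a lexer that returns one token per call and a parser that pulls from it. Everything here (Token, TokenStream, Next, CountTokens) is a hypothetical name chosen for the sketch, and it only handles single-character operators:

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical token type; the kinds are illustrative only.
struct Token
{
    enum Kind { Number, Identifier, Operator, End };
    Kind        Type;
    std::string Text;
};

// Hypothetical lexer interface: the consumer pulls one token at a time.
class TokenStream
{
public:
    explicit TokenStream(const std::string& Source) : In(Source) {}

    Token Next()
    {
        char Current;
        do
        {
            if(!In.get(Current)) return Token{Token::End, ""};
        } while(std::isspace(static_cast<unsigned char>(Current)));

        std::string Text(1, Current);
        if(std::isdigit(static_cast<unsigned char>(Current)))
        {
            while(std::isdigit(In.peek())) Text += static_cast<char>(In.get());
            return Token{Token::Number, Text};
        }
        if(std::isalpha(static_cast<unsigned char>(Current)))
        {
            while(std::isalnum(In.peek())) Text += static_cast<char>(In.get());
            return Token{Token::Identifier, Text};
        }
        return Token{Token::Operator, Text}; // single-character operators only
    }

private:
    std::istringstream In;
};

// Stand-in for a parser: it only ever sees tokens, never characters.
int CountTokens(TokenStream& Tokens)
{
    int Count = 0;
    while(Tokens.Next().Type != Token::End) ++Count;
    return Count;
}
```

    A real parser would replace CountTokens and decide for itself whether to record anything in a symbol table.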
