need help with a lexer


  1. #1
    Registered User
    Join Date
    Sep 2013
    Posts
    11

    need help with a lexer

    Hey all! I'm not new to forums in general, but I am new to this one, so if I'm inadvertently breaking a rule, please let me know. Anyway, I need help with some code I'm writing and can't figure out. I'm writing a compiler for a language I'm designing called Jade, and I'm on the lexer step. I'm basing this on the Udacity tutorial that teaches you how to build a web browser (it's more about learning how to interpret HTML/JavaScript), and I'm using the purple dragon book for reference. Anyway, here's the code:
    Code:
    #include <iostream>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>
    #include <regex.h>
    
    using std::ostream;
    using std::cout;
    using std::cerr;
    using std::endl;
    using std::cin;
    using std::ifstream;
    using std::istringstream;
    using std::string;
    using std::vector;
    
    class Token
    {
         string Type, Name;
         int LineNo, LineCol;
    
         public:
              Token(string type, string name, int lineno, int linecol)
                   : Type(type), Name(name), LineNo(lineno), LineCol(linecol) {}
              Token() {}
    
              void SetType   (string type) { Type    = type;    }
              void SetName   (string name) { Name    = name;    }
              void SetLineNo (int lineno)  { LineNo  = lineno;  }
              void SetLineCol(int linecol) { LineCol = linecol; }
    
              string GetType   () { return Type;    }
              string GetName   () { return Name;    }
              int    GetLineNo () { return LineNo;  }
              int    GetLineCol() { return LineCol; }
    };
    
    ostream& operator<<(ostream &out, Token token)
    {
         out<<"("<< token.GetType() <<", '"<< token.GetName() <<"', "<< token.GetLineNo() <<", "<< token.GetLineCol() <<")";
         return out;
    }
    
    void  ReadInFile (ifstream&, vector<string>&);
    void  Lex        (vector<string>&, vector<Token>&);
    Token Lex        (string, int&, int&);
    
    int main(int argc, char *argv[])
    {
         ifstream File(argv[1]);
         vector<string> FileContents;
         vector<Token> TokenList;
    
         ReadInFile(File, FileContents);
    
         Lex(FileContents, TokenList);
    
         for(auto &Counter : TokenList)
              cout<< Counter << endl;
    }
    
    void ReadInFile(ifstream &File, vector<string> &FileContents)
    {
         int Counter = 0;
         string Line;
    
         while(getline(File, Line))
              FileContents.push_back(Line);
    }
    
    void Lex(vector<string> &FileContents, vector<Token> &TokenList)
    {
         int LineNo = 1, MBegin, MEnd;
    
         for(auto &Counter : FileContents)
         {
              (MBegin = 0) && (MEnd = Counter.size() - 1);
    
              while(true)
              {
                   TokenList.push_back(Lex(Counter, MBegin, MEnd));
    
                   if(Counter.at(MEnd) == '\n')
                   {
                        if(TokenList[TokenList.size() - 1].GetType() == "UNINITIALIZED")
                             TokenList.pop_back();
    
                        break;
                   }
    
                   Counter = Counter.substr(MEnd, Counter.size() - MEnd - 1);
                   TokenList[TokenList.size() - 1].SetLineNo(LineNo);
              }
    
              LineNo++;
         }
    }
    
    Token Lex(string Line, int &MBegin, int &MEnd)
    {
         regex_t Regex;
         regmatch_t Match;
    
         regcomp(&Regex, "\"[^\"]+\"", REG_EXTENDED);
         if(regexec(&Regex, Line.c_str(), 1, &Match, 0) == 0)
         {
                   (MBegin = Match.rm_so) && (MEnd = Match.rm_eo);
                   return Token("STRING", Line.substr(MBegin + 1, MEnd - MBegin - 2), -1, MBegin + 1);
         }
         regfree(&Regex);
    
         (MBegin = 0) && (MEnd = Line.size() - 1);
         return Token("UNINITIALIZED", "", -1, -1);
    }
    The problem is that whenever I run it, I get this:
    terminate called after throwing an instance of 'std::out_of_range'
    what(): basic_string::at
    Aborted (core dumped)

    I have asked on cplusplus.com, which I have found to be a great forum, but they were unable to help.

  2. #2
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    This kind of thing is just silly, even if it worked correctly:
    Code:
    (MBegin = 0) && (MEnd = Line.size() - 1);
    The expression (MBegin = 0) has a value of 0, i.e., false, so the short-circuit nature of && will skip the other expression/assignment.

    Just write it normally.
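    To make the short-circuit concrete, here is a minimal sketch. The names `Bounds`, `Chained`, and `Plain` are hypothetical and not part of the posted program. `Chained` repeats the `&&` idiom and reports whether the right-hand assignment ran; `Plain` is the same logic written as two ordinary statements.

```cpp
#include <string>

// Hypothetical stand-in for the MBegin/MEnd pair used in the thread.
struct Bounds
{
    int begin;
    int end;
    bool rightSideRan;
};

// The chained idiom: (begin = 0) evaluates to 0, which is false, so &&
// never evaluates the right-hand side and `end` keeps its old value.
Bounds Chained(const std::string &line, int oldEnd)
{
    Bounds b{ -1, oldEnd, false };
    b.rightSideRan = (b.begin = 0) && (b.end = static_cast<int>(line.size()) - 1);
    return b;
}

// The same intent written as two plain assignments.
Bounds Plain(const std::string &line)
{
    Bounds b{ 0, 0, true };
    b.begin = 0;
    b.end = static_cast<int>(line.size()) - 1;
    return b;
}
```

    With a five-character line, `Chained` leaves `end` at whatever was passed in as `oldEnd`, while `Plain` sets it to 4. In the posted code the old value is an uninitialized `int`, which is why the later `at(MEnd)` call can throw.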

    And what is regex.h?
    Why aren't you using the C++11 regex?
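    For comparison, a rough sketch of the same string-literal match using C++11 `<regex>`. This assumes a toolchain whose `<regex>` implementation is actually usable. The helper name `FindStringLiteral` and its parameters are made up for illustration, and the column convention (1-based position of the opening quote) only mirrors what the posted `Lex` appears to compute.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: looks for the first double-quoted string in `line`.
// On a match it stores the text between the quotes in `text`, stores the
// 1-based column of the opening quote in `column`, and returns true.
bool FindStringLiteral(const std::string &line, std::string &text, int &column)
{
    // Same pattern as the regcomp() call, with a capture group for the body.
    static const std::regex pattern("\"([^\"]+)\"");

    std::smatch match;
    if (!std::regex_search(line, match, pattern))
        return false;

    text   = match[1].str();
    column = static_cast<int>(match.position(0)) + 1;
    return true;
}
```

    There is no `regfree` to remember here, because `std::regex` and `std::smatch` release their resources in their destructors.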
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

  3. #3
    Registered User
    Join Date
    Sep 2013
    Posts
    11
    Ah, that makes sense. I was actually just testing the && to see if it would work and decided not to take it out, but I forgot that it short-circuits. regex.h is the C regex library. I would use <regex>, and have tried to, but since I'm on Linux using GCC I can't, because it's not supported yet. I know regex.h is outdated, so it's really just a placeholder until <regex> is supported or I find the time to get Boost installed and built.

    Edit: thanks, it worked! You have no idea how helpful that is. My lexer is finally done!
    Last edited by DTSCode; 09-12-2013 at 06:20 PM.

  4. #4
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    No problem. Easy one, actually.
    Get rid of both of those && contraptions and write what you actually mean.

  5. #5
    Registered User
    Join Date
    Oct 2006
    Posts
    2,540
    Quote: Originally posted by DTSCode
    but as im on linux using the gcc i cant, because its not supported yet
    What Linux distribution and GCC version do you have? Do you know you can build your own custom version of GCC?
    Code:
    namespace life
    {
        const bool change = true;
    }
