hardest part of parsing c++?

This is a discussion on hardest part of parsing c++? within the C++ Programming forums, part of the General Programming Boards category.

  1. #1
    Registered User
    Join Date
    Jan 2005
    Posts
    108

    hardest part of parsing c++?

    So for a university project I'm thinking of writing a c++ parser for analytical purposes.

    Now, there are open-source parsers that can do this (clang's and g++'s, though probably only the former can be reused easily), but since this is a parsing-themed project I can't use them.

    I asked around for a bit and someone said that the hardest thing about parsing c++ is NOT the #defines, #ifdefs and templates, but something else. I never got an answer back on what it was, so what do people think the hardest part of parsing c++ is?

  2. #2
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    4,261
    So for a university project I'm thinking of writing a c++ parser for analytical purposes.
    For the sake of your grades, do something else.

    Soma

  3. #3
    Registered User
    Join Date
    Jan 2005
    Posts
    108
    I forgot to mention, I have two semesters (a year) and the promise of a 100% grade for this. Well, if it does what I want it to do, that is. That, and I don't have to attend many classes, as opposed to taking some other courses.

    In short, it's like a super thesis. Would it still be worth it?

  4. #4
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    I'd say that would depend on your deadline...

    Parsing a name out of a string is one thing... parsing source code, especially with a couple of dozen keywords, operators, scopes, classes, templates, and over a thousand library functions, is not something you do in a few days... or even a month.

    Unless you pick some extremely elemental task such as counting the frequency of keywords or checking bracket matching, you're in for a big job and a long haul.

    Yes it would be a great project... but can you do it in the time allotted?

  5. #5
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    4,261
    Would it still be worth it?
    Unless they only care if it parses some given percent (like 83%), you are asking to fail.

    Edison Design Group, the GCC community, Microsoft, Sun, and I can name dozens more, have all failed to parse all of C++ correctly after trying for years.

    Soma

  6. #6
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    4,261
    Parsing a name out of a string is one thing... parsing source code, especially with a couple of dozen keywords, operators, scopes, classes, templates, and over a thousand library functions, is not something you do in a few days... or even a month.
    O_o

    You only need to parse the language; if you can parse the language, you can parse "over a thousand library functions" by definition.

    Soma

  7. #7
    Registered User
    Join Date
    Jan 2005
    Posts
    108
    Quote Originally Posted by phantomotap
    Unless they only care if it parses some given percent (like 83%), you are asking to fail.

    Edison Design Group, the GCC community, Microsoft, Sun, and I can name dozens more, have all failed to parse all of C++ correctly after trying for years.

    Soma
    Hmm, I see. Well, I doubt I'll have to parse absolutely everything, just enough to do code analysis on a code base of roughly 100k SLOC.

    From what you said though, it sounds like actually getting one working would be akin to finding gold... and there's actually stuff that GCC can't parse? That's crazy.

  8. #8
    Registered User
    Join Date
    Jun 2005
    Posts
    6,299
    It is actually a topic for debate whether gcc (and most other compilers) have trouble parsing the language, or can parse the language and then have trouble processing the parser's output in order to produce a result (for example, the compiled object). Where the problems lie depends on the architecture of the particular compiler, that is, on where it draws the boundaries between the front end (which is normally where a C++ grammar parser would be considered to reside), the middle end (which notionally receives some intermediate representation of the program from the front end and performs various transformations on it), and the back end (which squirts out the final compiled code).

    In terms of your project, you would be better off designing a small language that is complete and unambiguous, and writing a parser for that. Practically, this is called "limit your scope to something achievable in the time you have". Or pick another language (for example Pascal) for which writing a compiler is often considered to be less complex.
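
    As a rough illustration of what "a small language that is complete and unambiguous" can look like, here is a minimal recursive-descent sketch for integer arithmetic. Everything in it is an assumption made for the example: the grammar, the class name TinyParser, and the choice to evaluate while parsing instead of building a tree. Each grammar rule becomes one function, and one character of lookahead is always enough to decide what comes next. C++ does not have that property: a statement such as "a * b;" is a declaration if a names a type and an expression otherwise.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical toy grammar (not from the thread):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class TinyParser {
public:
    explicit TinyParser(const std::string& src) : src_(src), pos_(0) {}

    // Parses the whole input and returns its value; throws on a syntax error.
    long parse() {
        const long value = expr();
        skip_spaces();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    long expr() {
        long value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    long term() {
        long value = factor();
        for (;;) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                const long rhs = factor();
                if (rhs == 0) throw std::runtime_error("division by zero");
                value /= rhs;
            } else {
                return value;
            }
        }
    }

    long factor() {
        if (accept('(')) {
            const long value = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        skip_spaces();
        if (pos_ >= src_.size() || !is_digit(src_[pos_]))
            throw std::runtime_error("expected number or '('");
        long value = 0;
        while (pos_ < src_.size() && is_digit(src_[pos_])) {
            value = value * 10 + (src_[pos_] - '0');
            ++pos_;
        }
        return value;
    }

    // Consumes `c` (after optional whitespace) if it is the next character.
    bool accept(char c) {
        skip_spaces();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skip_spaces() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    static bool is_digit(char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    }

    std::string src_;   // private copy of the input text
    std::size_t pos_;   // index of the next unread character
};
```

    Usage would be along the lines of TinyParser("(1 + 2) * 3").parse(), with malformed input reported through std::runtime_error. A project-sized language would add a separate lexer and a syntax tree, but the one-function-per-rule shape stays the same.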
    Right 98% of the time, and don't care about the other 3%.

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,734
    Quote Originally Posted by grumpy
    In terms of your project, you would be better off designing a small language that is complete and unambiguous, and writing a parser for that. Practically, this is called "limit your scope to something achievable in the time you have".
    Agreed. I recall writing a static analysis tool in C++, including the parser, for a simplified version of C as a group project in university. Team of 6 students, 2 modules (out of a usual 5) worth of workload each, yet we barely completed it on time (with various bugs). That said, most of my team were rather new to C++, so they had to cross that hurdle too, which took a bit of time.
    C + C++ Compiler: MinGW port of GCC
    Version Control System: Bazaar

    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  10. #10
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,546
    > That, and I don't have to attend much classes as opposed to taking some other courses.
    > In short, it's like a super thesis. Would it still be worth it?
    Is there a large market for C++ grammar wizards in your area?
    Other courses might be more work (though I doubt it compared to what you propose), and you'll have more choices later on.

    Also, if you're not really interested in the parsing, but the analysis which follows, then consider this
    GCC-XML

    Writing another (bad) parser for C++ won't get you very far - there are enough C++ parsers in the world to be getting on with.
    But new and interesting analysis tools might be unique and useful.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
    I support http://www.ukip.org/ as the first necessary step to a free Europe.

  11. #11
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,304
    Pascal is a much easier language to write a parser for, especially if you're familiar with the language.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
