Originally posted by hamster_nz:
Compilers are very complex beasts, some of the most complex bits of software engineering, but don't let that discourage you! If I use the analogy of engineering, building a compiler could be thought of as building a structure like a bridge.

It is somewhat trivial to make a very simple bridge - to put a tree trunk over a small stream perhaps. It isn't going to be very useful to many people. It might just be useful to you. But that's OK.

But as you try to make bigger bridges that offer more utility to more people, with the strength and robustness to carry heavier workloads in safety, the cost and complexity ramp up quickly.

Soon it becomes too difficult for an unassisted person to build even a small bridge, and if you are not a bridgebuilder by trade you will need specialist advice, and possibly skills. Think of something that will allow you to safely drive a light vehicle over it. Such a bridge will need proper foundations and abutments to spread the load. It quickly becomes an engineering problem.

I'm not saying that you shouldn't try to build your own compiler - I would actively encourage it, and can point to projects like TCC (Tiny C Compiler) and LCC (see "LCC (compiler)" on Wikipedia) where people have done this with great results. You will learn some really interesting stuff, and deep knowledge can be very useful. But to write a useful compiler is a very large undertaking, and will take many years of skill and effort.

Where you will meet resistance is if you start to use forums to answer every question and line of enquiry, rather than to solve a very specific problem (e.g. asking "how can I build a symbol table" vs "my symbol table implementation doesn't work correctly because of ..."). In my experience people on forums like fixing things, not explaining things that are better explained elsewhere.

Reading a good book on compiler building will cover that topic and more - tokenizing source, lexical analysis, maintaining symbol tables, code generation, parsing, building directed acyclic graphs, efficient register allocation, dataflow analysis, data type representation, operator precedence, code optimization strategies...

When I was young, The Dragon Book (Compilers: Principles, Techniques, and Tools; see its Wikipedia article) was a must-read for the budding compiler writer. I don't know what the current recommended one is.

Thanks a lot for your thoughts and info, Sir! Like I said in reply to a previous comment, I need to get more disciplined (I'm young, so this may play a role), so reading and learning are surely on my TODO list.
I love the projects you mentioned, and these "small" and "portable" C compilers are the best compilers in my humble opinion! Yes, they don't do the best optimizations, but they are made by one person,
which acts as motivation for people like me because it shows how far you can get and how many amazing things you can create even as a single individual (of course these projects have had help from
other contributors over the years, but this is only natural and it's not bad. The exact opposite, actually!). On top of that, they are very small in size and they compile source code SUPER FAST!!! That is something
people seem to have forgotten about as they got used to the VERY SLOW compile times of the "industry standard" compilers. This motivates me! Of course, compilers like GCC are so slow
because of the optimizations they do, but there are projects like QBE which have shown us that we can get 60-70% of the performance of these compilers (or even 90-100% in some cases) with 10+ times faster
compile times!!! This is why I used to be obsessed with TCC (and it's not that I don't like it anymore, I just... relaxed a little bit, lol!) and its unbelievably fast compile times. I plan on reading its source code
and learning a thing or two! Outputting object code directly is a GigaChad achievement anyway, so it's my dream to create a compiler that does that and then only needs a linker to link the object files.
Of course, I would be happy with assembly too (especially at the beginning), but machine code is every compiler designer's dream!

I have searched for books and found one that I consider really good! It is called "Modern Compiler Design, 2nd edition" and it was published in 2012.
The one you're talking about is the most popular and well-known! However, as it is very old, I'm not going to read it, because, as I was told by
another user on the forum, compilers and interpreters are among the most well-studied fields in programming (and it makes sense), so new and better techniques
are discovered all the time. For that reason, I'll try to read a book that is as up to date as possible, or at least less than 15 years old. You might say that
it doesn't matter as I'm just a beginner, but I'll try to spend my time as wisely as I can (because it's not infinite or even guaranteed), so I'll have better chances
with a book that is more up to date. Of course, this is not as easy as "find the latest one", as there are other things to consider, like:
* How good is the book?
* How hard will it be to read for someone who hasn't read any books about compilers?
* What's the scope of the book? Remember, I want to learn how to create a full compiler that will be efficient and will output machine instructions (of course, I'll expect machine instructions if I'm reading a book).
So in general, finding a book is not an easy task. But like you said, there is no way that I'll think up and invent everything myself.
So not reading a book and not getting help would be shooting myself in the foot!

Also, thank you for your notes on the forum questions! I now see how asking small questions like the one I did is pointless. I'll focus on specific problems like you mentioned (and I love the example about the
"Symbol Table", as this is a problem that I'm thinking about A LOT and there are tons of ways to do it!).
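
To make one of those "tons of ways" concrete, here is a minimal sketch of what is probably the simplest approach: a stack of dictionaries, one per scope. This is only an illustration and not how TCC, LCC or any particular compiler does it. The class name `SymbolTable` and its methods are hypothetical names made up for this example, and a "type" is just a string here.

```python
class SymbolTable:
    """Scoped symbol table: a stack of dicts, innermost scope last.

    Hypothetical sketch for illustration; names and API are assumptions.
    """

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        # Called when the parser sees the start of a block or function body.
        self.scopes.append({})

    def exit_scope(self):
        # Called at the end of a block; the global scope is never popped.
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, type_name):
        # A name may be declared only once per scope.
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"'{name}' already declared in this scope")
        scope[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards; None means undeclared.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("x", "int")
table.enter_scope()
table.declare("x", "char")   # shadows the outer x
inner = table.lookup("x")    # resolved in the inner scope
table.exit_scope()
outer = table.lookup("x")    # the outer declaration is visible again
```

The design choice here is simplicity over speed: lookup walks every open scope, which is fine for a first compiler. Other layouts (a single hash table with per-name chains, for example) trade that simplicity for faster lookups.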