All Comments
TopTalkedBooks posted at August 19, 2017
For anyone interested in compiler writing and looking for a good resource to start, probably one of the best is the "Dragon Book":

http://www.amazon.com/Compilers-Principles-Techniques-Tools-...

I highly recommend it, but it's heavy stuff. There are probably simpler guides out there that just cover the basics.

TopTalkedBooks posted at August 19, 2017
There are many different styles and paths to learning 'Computer Science'.

But if what you are after is insight into how a computer works I found that I had my 'ah-ha' moment while learning C, Assembly (intel), and writing a compiler. I did have to have a slight basis in computer architecture, but that compiler project I worked on made everything click.

(side note on writing a compiler. Read, Decode, Execute. There are no short cuts around those series of steps).

If you are looking for a book I would recommend the 'Dragon Book'

http://www.amazon.com/Compilers-Principles-Techniques-Tools-...

I found a paper copy of the international version for cheap (like $10 US if I remember) that was amazing.

TopTalkedBooks posted at August 19, 2017
In 2006 a second edition of the dragon book was released: https://www.amazon.com/Compilers-Principles-Techniques-Tools...
TopTalkedBooks posted at August 20, 2017

You can automatically calculate first, follow, and predict sets using Calculate Predict, First, and Follow Sets from BNF (Backus Naur Form) Grammar Specification without having to download anything. It's a good way to verify answers or automate the tedium.

If you want to do it manually, the Dragon Book (2nd ed) covers it on pages 221-222.

TopTalkedBooks posted at August 20, 2017

Just as noted in my answer to another almost identical question of yours from earlier today: Don't do that.

As soon as a user enters a "#" or ";" into one of the text fields your csv file (or rather: what you call a CSV file, but actually isn't one at all) will get corrupted and crash your code once read in again (or at least result in malformed data).

Again: Do NOT do that.

Instead: stick with real CSV and a parser/writer written by a professional.

Generally speaking: Unless you have very good knowledge of Chomsky's hierarchy of formal languages and experience in writing language/file-format parsers stick with a field-tested library, instead of coding up your own.

Languages/Formats such as CSV look trivial at first glance but aren't by any means (as in type-2-language).

TopTalkedBooks posted at August 20, 2017

You want to know if the compiler produced "clean, concise and fast code".

"Clean" has little meaning here. Clean code is code which promotes readability and maintainability -- by human beings. Thus, this property relates to what the programmer sees, i.e. the source code. There is no notion of cleanliness for binary code produced by a compiler that will be looked at by the CPU only. If you wrote a nice set of classes to abstract your problem, then your code is as clean as it can get.

"Concise code" has two meanings. For source code, this is about saving the scarce programmer eye and brain resources, but, as I pointed out above, this does not apply to compiler output, since there is no human involved at that point. The other meaning is about code which is compact, thus having lower storage cost. This can have an impact on execution speed, because RAM is slow, and thus you really want the innermost loops of your code to fit in the CPU level 1 cache. The size of the functions produced by the compiler can be obtained with some developer tools; on systems which use GNU binutils, you can use the size command to get the total code and data sizes in an object file (a compiled .o), and objdump to get more information. In particular, objdump -x will give the size of each individual function.

"Fast" is something to be measured. If you want to know whether your code is fast or not, then benchmark it. If the code turns out to be too slow for your problem at hand (this does not happen often) and you have some compelling theoretical reason to believe that the hardware could do much better (e.g. because you estimated the number of involved operations, delved into the CPU manuals, and mastered all the memory bandwidth and cache issues), then (and only then) is it time to have a look at what the compiler did with your code. Barring these conditions, cleanliness of source code is a much more important issue.

All that being said, it can help quite a lot if you have a priori notions of what a compiler can do. This requires some training. I suggest that you have a look at the classic dragon book; but otherwise you will have to spend some time compiling some example code and looking at the assembly output. C++ is not the easiest language for that, you may want to begin with plain C. Ideally, once you know enough to be able to write your own compiler, then you know what a compiler can do, and you can guess what it will do on a given code.

TopTalkedBooks posted at August 20, 2017

The clasical dragon book explains very well how LR parsers work. There is also Parsing Techniques. A Practical Guide. where you can read about them, if I remember well. The article in wikipedia (at least the introduction) is not right. They were created by Donald Knuth, and he explains them in his The Art of Computer Programming Volume 5. If you understand spanish, there is a complete list of books here posted by me. Not all that books are in spanish, either.

Before to understand how they work, you must understand a few concepts like first, follows and lookahead. Also, I really recommend you to understand the concepts behind LL (descendant) parsers before trying to understand LR (ascendant) parsers.

There are a family of parsers LR, specially LR(K), SLR(K) and LALR(K), where K is how many lookahead they need to work. Yacc supports LALR(1) parsers, but you can make tweaks, not theory based, to make it works with more powerful kind of grammars.

About performance, it depends on the grammar being analyzed. They execute in linear time, but how many space they need depends on how many states do you build for the final parser.

TopTalkedBooks posted at August 20, 2017

How many contemporary algorithms are there to build a parser?

To start with one can look at the list of the common parser generators.
See: Comparison of parser generators and look under the heading Parsing algorithm.

ALL(*)  
Backtracking Bottom-up  
Backtracking LALR(1)  
Backtracking LALR(k)  
GLR  
LALR(1)  
LR(1)  
IELR(1)  
LALR(K)
LR(K)  
LL  
LL(1)
LL(*)  
LL(1), Backtracking, Shunting yard
LL(k) + syntactic and semantic predicates  
LL, Backtracking  
LR(0)  
SLR  
Recursive descent  
Recursive descent, Backtracking  
PEG parser interpreter, Packrat  
Packrat (modified)  
Packrat  
Packrat + Cut + Left Recursion  
Packrat (modified), mutating interpreter  
2-phase scannerless top-down backtracking + runtime support  
Packrat (modified to support left-recursion and resolve grammar ambiguity)  
Parsing Machine  
Earley  
Recursive descent + Pratt  
Packrat (modified, partial memoization)  
Hybrid recursive descent / operator precedence  
Scannerless GLR  
runtime-extensible GLR  
Scannerless, two phase  
Combinators  
Earley/combinators  
Earley/combinators, infinitary CFGs  
Scannerless GLR  
delta chain  

Besides parser generators, there are also other algorithms/means to parse. In particular Prolog has DCG and most people who have written their first parser from scratch without formal training typically start with recursive descent. Also Chart parser and Left corner parser.

In writing parsers the first question that I always ask myself is how can I make a grammar for the language at the highest type in the Chomsky hierarchy. Here lowest is Type-0 and highest is Type-3.

Almost 90% of the time it is a Type-2 grammar (context-free grammars), then for the easer task it is a Type-3 grammar (regular grammars). I have experimented with Type-1 grammars (context-sensitive grammars) and even Type-0 grammars (unrestricted grammars).

And what's the advantage/limitation of ANTLR's Adaptive LL(*) algorithm?

See the paper written by Terrence Parr the creator of Adaptive LL(*): Adaptive LL(*) Parsing: The Power of Dynamic Analysis

In practical terms Adaptive LL(*) lets you get from a grammar to a working parser faster because you do not have to understand as much parsing theory because Adaptive LL(*) is, shall I say, nimble enough to side step the mines you unknowingly place in the grammar. The price for this is that some of the mines you unknowingly place in the grammar can lead to inefficiencies in the runtime of the parser.

For most practical programming language purposes Adaptive LL(*) is enough. IIRC Adaptive LL(*) can NOT do Type-0 grammars (unrestricted grammars) which Prolog DCG can, but as I said, most people and most common programming task only need either type 2 or type 3.

Also most parser generators are for type 2, but that does not mean they can't do type 1 or possibly type 0. I cannot be more specific as I do not have practical experience with all of them.

Anytime you use a parsing tool or library there is a learning curve to learning how to use it and what it can and can not do.

If you are new to lexing/parsing and really want to understand it more then take a course and/or read Compilers: Principles, Techniques, and Tools (2nd Edition)

TopTalkedBooks posted at September 05, 2017

If you want to show it as a text, then VZ's answer is the way to go; however, if you want to show the result of an arithmetic operation, then you need to write a small parser, a good book would be: https://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811

TopTalkedBooks posted at September 07, 2017
You'll want the Dragon Book as a reference, not necessary to get started with, but to build the semantic tree on which you will need to pin concepts on. https://www.amazon.com/Compilers-Principles-Techniques-Tools...

If you aren't yet committed to any language, you can start building a parser with PyParsing. It's really easy. http://pyparsing.wikispaces.com/

If you want to take a quick (albeit expensive) class on it, Dave Beazley offers one: http://www.dabeaz.com/chicago/compiler.html

TopTalkedBooks posted at October 05, 2017
I strongly suggest you build something esoteric and fun first. This should be a "bare minimum" VM or interpreter. Here's one of mine that I wrote like a 8 years ago[1]: https://www.amazon.com/Compilers-Principles-Techniques-Tools...
TopTalkedBooks posted at October 23, 2017
I used the dragon book (2nd edition) - https://www.amazon.com/Compilers-Principles-Techniques-Tools...

That proof isn't a problem in the book; it's mentioned, with a hint for those inclined to derive it themselves, in the running text.

TopTalkedBooks posted at February 16, 2018

She's trying to hide her disappointment. I'm pretty sure this is the one she wanted .

TopTalkedBooks posted at February 16, 2018

Was expecting this dragon book

TopTalkedBooks posted at February 16, 2018

Should have gotten her this dragon book so she can get a headstart on writing compilers.

TopTalkedBooks posted at March 18, 2018

Yes, you can certainly use lex and yacc to build a compiler/translator for a programming or scripting language. There are GNU variants of these tools, called flex and bison. John Levine's lex & yacc was for many years the gold standard for books about these tools. That book may be out of print, but I expect that the successor book, Flex & Bison, is just as good. To dig deeper into building a compiler, start with Aho et al., Compilers: Principles, Techniques, and Tools, 2/e. (Again, my recommendation is based on the first edition of this book.)

Top Books
We collected top books from hacker news, stack overflow, Reddit, which are recommended by amazing people.