Compilers: Principles, Techniques, and Tools (2nd Edition)

Category: Hardware & DIY
Author: Alfred V. Aho
4.1
All Stack Overflow 42
This Year Stack Overflow 5
This Month Reddit 2

Comments

by 0xf3e   2019-08-24

For anyone who wants to learn more about compilers and loves reading books, the so called dragon book is highly recommended lecture on this topic: https://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811

by fbhc   2019-08-24

My compilers course in college used the Dragon Book, which is one of the more quintessential books on the subject.

​

But you might also consider Basics of Compiler Design which is a good and freely available resource.

​

I'd also suggest that you have familiarity with formal languages and automata, preferably through a Theory of Computation course (Sipser's Introduction to the Theory of Computation is a good resource). But these texts provide a brief primer.

by anonymous   2019-07-21

I've head Compilers: Principles, Techniques, and Tools is solid.

http://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811/ref=sr_1_1?s=books&ie=UTF8&qid=1328013630&sr=1-1

by lordyod   2019-07-21

What sort of stuff are you interested in?

I haven't taken CS109 but I took a very similar course at my community college. The way C++ implements object-oriented programming is different (and much more detailed) than some other languages so if you're only familiar with, say, Java, and you want to get into video game programming, or something like that, it might be worth taking it. I don't know a whole lot about the programming assignments or lectures or anything but it seems like the sort of thing someone should know if they want to get into that field.

CS104A will take you through (roughly) the first half of the dragon book, which is considered by many to be the standard textbook on compilers. You will construct a compiler for a C-like language which takes source code as input and outputs code that looks like assembly and compiles with GCC. The programming projects are significant - many students do not complete all of them because they involve a substantial amount of work. The lecture topics build on material taught in CS130, and if you haven't taken that course (I hadn't) you may find yourself on shaky ground when it comes to the material dealing with finite automata (this is a big part of the course). The class is in some ways about what make programming languages work, how computers take the code that we write and turn it into code a machine can run, and how doing that differently may create big differences in the languages used. For myself I found the class very rewarding because I am quite interested in this sort of machine-level view of computer science and engineering, and I find comparisons between different languages at this very low-level perspective to be very interesting. I imagine that many students would be bored to tears at this however. I assume you are aware of Mackey's reputation as an instructor, and I won't try to sugar coat it, it's difficult to sit through. That said, he is extremely knowledgeable on the subject and I feel like ignoring the opportunity to learn this sort of thing from him while you are here is a mistake.

If you want to dive head-first into what makes computers, you know, compute, I'd suggest you take 104a.

by nemec   2019-07-21

The good news is that .NET already performs Tokenization for you, so what you put on the command line (program.exe arg1 arg2) becomes a string array in your Main method - some of the work is already done for you.

The standard book for compilers (parsing is integral to compilers, so you will find a lot of overlap in concepts with CLI parsing) is The Dragon Book but it can be difficult for people who haven't spent time learning academic CS theory.

This series of blog posts is a bit of a more gentle introduction to parsing. It's in Java but the code is very, very similar to C# (the biggest differences are foreach loops, implements instead of : to implement an interface, and naming conventions for interfaces (Java doesn't add an I) and methods (Java camelCases method names instead of PascalCase).

A lot of parsing involves creating a kind of "state machine" that allows you to loop through Tokens (items in your args array) and decide, "what do I do next?"

by ezweave   2019-07-21

Before you make a radical change, I would consider what you find challenging or, perhaps boring about other Engineering disciplines. I can't speak to your school's CS program (at some University's it's under the college of Engineering, other's math, and some are ABET accredited and others are not), but "there be dragons" (especially if you use the classic text on Compiler Design in your Compiler class). It isn't really any easier, but it might be easier for you.

I've been working in Software (including my internship) for 20 years. I have a BS in CS, an MS in CS, and whatever credibility getting 1/3 of the way through a PhD gets me in this discussion (very little).

I disagree with the assertion that a CS minor is as useful as a major... there's just a great deal of interesting, useful topics you will not approach and, yeah, they're probably the more "mathy" ones. Most undergrads find the following courses/subjects difficult and often drop out of a CS program (sometimes to an IS degree, which still involves programming but is more applicable to most software jobs... you just don't have as deep of a background or understanding regarding the "meat" of CS) due to their rigor: * Automata * Algorithms * Principles of Computer Programming (this is where most people learn FP via LISP/Scheme or maybe even Haskell... I have been out of school for a while) * Statistics * Machine Learning * Compiler Design

If those terms are alien to you, do some digging and learn a little about them, because most ABET programs will require them. A minor won't expose you to too much of that... anecdotally, I minored in Math, Physics, and Economics, but wouldn't consider myself an expert in any. E.g. I wouldn't think I could jump into a career in any of those solely (unless it was teaching, not denigrating teachers, just being honest).

However, I've also worked with some talented developers who dropped out of school and some young kids who, while lacking depth or breadth, aren't half bad in certain tech stacks who only went to one of those "Coding Boot Camps." If you're smart enough, and a self-starter (like someone who contributes to OS projects, works their way through "The Art of Computer Programming" and the like) you don't necessarily need a piece of paper and the debt that comes with it.

Now that I've said that, Computer Science may still be your bag.

Even if you never use the knowledge you acquire in many of the courses, you will be at least aware of what's actually going on when you, say, invoke a given function. While the "Code Camp" kids are often eager learners, there's a great deal they don't know and... they're not always quick with abstraction and composition of logic (some of the most creative aspects of programming). You'll also (sorry if it comes off as cheesy) expand your mind. Pushing yourself to understand things like Automata and Discrete Mathematics, to name a few topics, will just make you smarter. Even if you end up working as a "full stack dev" at a Ruby/React/whatever shop, it will be useful and most of the devs that have some formal education, can more easily pick up new languages because they better understand the underlying concepts. In a world where Node has made JS an actual, viable language, for many projects (and the boost by FP-lite and full FP libraries like lodash, ramda, monet and such) being able to also debug and rewrite a block of C++ or Scala is a good thing.

What's great about CS is that you can do so many things with it. You can get a job as a web dev, or work more in data science. You can work on natural language processing or video games. You're basically learning "how and why" to paint, using art as an analogy for CS and programming. That's one of the biggest differences between other Engineering disciplines. You're not limited by materials or thermodynamics, you're only limited by your willingness to work and learn and your own imagination.

To be fair, each of the disciplines that have been discussed have their own hurdles. None of them are "easier" than the other. ME is not greater than CS. CE is not less than. They're all just different things. They will all have points where they "become difficult" and you have to have the fortitude to push on. I know lots of folks who jumped around from major to major only to end up with a degree in Communications from the Arts College and no career path. The only real ding in CS is that a lot of the academic knowledge is not directly applicable to a job. They don't always teach you about version control systems (like Git) or Software Engineering processes like Agile, Less, Rational, et cetera. (My program did a bit of all of that, but experience has taught me that not all do.) I would strongly encourage you to get aquatinted with the Node ecosystem and find some open source projects to contribute to, if you ever have down time in your studies. The practical knowledge you will gain will help you get a job.

by anonymous   2019-07-21

Your question is a little vague, however it sounds like you need to develop some kind of parser to read in the cfg files and translate it into some form of intermediate language or object graph, optimize it, and then output it to c++. Sounds to me like a job for a home-grown compiler.

If you aren't familar with the different phases of a compiler I would highly recommend you check out the infamous dragon book

http://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811/ref=sr_1_2?ie=UTF8&s=books&qid=1244657404&sr=8-2

Then again, if this is for an important project with a deadline you probably don't have a lot of time to spend in the world of compiler theory. Instead you might want to check out antlr. It is really useful for creating a lexar and parser for you based on grammar rules that you define from the syntax of the cfg files. You can use the antlr parser to translate the cfg files into an AST or some other form of object graph. At that point you are going to be responsible for manipulating, optimizing and outputting the c++ syntax to a new file.

I haven't read it yet but this is supposed to be an excellent book for novice and experienced antlr users

http://www.pragprog.com/titles/tpantlr/the-definitive-antlr-reference

plus there a plenty of antlr tutorials and examples online that I've used to help learn it. Hope that helps put you in the right direction.

by anonymous   2019-07-21

When you have to access local variables from other function calls, I think you had better redesign your code.

Theoretically, you can direct to modify the i of main() in fun() if you can completely understand how the compilers deal with the activation records of function calls on the run-time stack. You can read "Compilers: Principles, Techniques, and Tools" for details( http://www.amazon.com/Compilers-Principles-Techniques-Tools-Edition/dp/0321486811 )

enter image description here

The value of j depends on the run-time stack address between the int i; in the fun() and int i = 10; in main() . In this case, when fun() is called, their relative distance on stack is just 12. That is why the j is 12. So the *(p + j) = 20; actually changed the i of the main(). If you change your code by adding int a = 14; as following, you will find the value of j changed for the activation record on the run-time stack has been changed.

#include <stdio.h>

void fun()
{
  int i;
  int *p=&i;
  int j;
  for(j=0;*(p+j)!=10;j++);
  printf("%d",j);
  /* Stack Frame size is j int pointers. */
  *(p+j)=20;
}

main()
{
  int i=10;
  int a=14;  /* add this for example to change the relative address i in the main and i in the fun*/
  fun();
  printf("\n %d \n",i);
}
by anonymous   2019-07-21

There's no big difference between writing an ordinary C compiler and writing a shader compiler. The standard book on writing compilers is the so called "Dragon Book":

http://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811/ref=ntt_at_ep_dpi_1

by anonymous   2019-01-13

Damn, whenever you get to studying compilers this will be crazy to look back to.

Anyway you are probably not in the level of building a C compiler from scratch.

So think of it this way, you need to have a set of instructions and yes you will have to go through the program string and transform the string into MIPS instructions.

The most naive way to do it and it might be what your teacher expects unless it is a compiler course, would be to parse the program text line by line as it is expected to be a simple program and as you said have a lot of conditions for each type of expression that you will evaluate.

A tip: Save for / while loops as a label as soon as you read them and from there you need to check where is the endpoint of it for the jump when necessary (completed for/while conditions etc)

Now if it is a compiler project. I'd strongly recommend you read into this book:

Compilers: Principles, Techniques, and Tools

Because to build a real compiler you need to understand all the stages of a compiler and how does it work together... You would also need to know some Language Theory.

by Michael Stum   2019-01-13

Big List of Resources:

Legend:

  • ¶ Link to a PDF file
  • $ Link to a printed book
by anonymous   2019-01-13

The Abstract Syntax Tree of different languages are usually not compatible. This is the case because the AST represents the code that has been written in the respective language. That means you could traverse the tree and format it to code with identical syntax again (i.e. the whitespaces will differ, the rest is the same). However they cannot be compatible, because the languages have different constructs. For example Groovy has closures which is not the case for Java. You can usually find a mapping to different concepts that will be equivalent, but that's not the point of an AST.

The reason AST transformations are a part of Groovy whereas they are not part of Java is the same closures are part of Groovy but not of Java: different design decisions. Java was designed to be simple. Easy to get into and easy to read albeit often verbose. Groovy had a different focus. The syntax is more concise and things like domain specific languages are desired.

If you are more interested in the internals of compilers I recommend the "Dragon Book". It's as far as I know the standard you read in academics (I read it when I was studying).

by anonymous   2018-03-26
[Compilers: Principles, Techniques, and Tools](https://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811) by Aho, et al.
by anonymous   2018-03-19

Yes, you can certainly use lex and yacc to build a compiler/translator for a programming or scripting language. There are GNU variants of these tools, called flex and bison. John Levine's lex & yacc was for many years the gold standard for books about these tools. That book may be out of print, but I expect that the successor book, Flex & Bison, is just as good. To dig deeper into building a compiler, start with Aho et al., Compilers: Principles, Techniques, and Tools, 2/e. (Again, my recommendation is based on the first edition of this book.)

by Bluegorilla101   2018-02-16

Should have gotten her this dragon book so she can get a headstart on writing compilers.

by correctsbadalts   2018-02-16

Was expecting this dragon book

by OhYourFuckingGod   2018-02-16

She's trying to hide her disappointment. I'm pretty sure this is the one she wanted .

by dvt   2017-10-05
I strongly suggest you build something esoteric and fun first. This should be a "bare minimum" VM or interpreter. Here's one of mine that I wrote like a 8 years ago[1]: https://www.amazon.com/Compilers-Principles-Techniques-Tools...