Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages (Pragmatic Programmers)

Category: Programming
Author: Terence Parr
4.4

Comments

by anonymous   2019-07-21

"From the ground up" is quite a relative term, especially if you consider Python as the implementation language. I think what you are looking for is the implementation of a domain-specific language (DSL). Good starting points might be this book or this one. DSLs are a wide topic, so if you provide more details, we might be able to give better tips.

by anonymous   2019-07-21

This is a great book to help you get started: http://www.amazon.com/Language-Implementation-Patterns-Domain-Specific-Programming/dp/193435645X/

The stages of building a language are:

  1. Lexing. Lexing means reading the input and grouping its characters into categories of tokens. A token can be a series of digits such as 12376 or a text string such as 'Hello'. The lexer looks at the first character (and may also look ahead to the second character) to determine what kind of token it is. When it sees a digit, it proceeds to read the rest of the series of digits (by calling a subroutine); when it sees a quote, it proceeds to read a string. The result of the lexer is a token, which consists of a type (a number or a string in this example) and the text of the token. This is normally stored in a struct as Kind int and Text string, with constants declared to represent the kinds.

  2. The next building block is the parser. The parser sees the series of tokens, so it might see an Identifier and then, looking ahead, an =, at which point it branches off into an assignment. The parser builds a tree. In the case of an assignment, it builds a "node" of type "assign", then stores the identifier in the first child and the expression in the second child. All tree nodes are "operations", meaning that they do something. You will not have just a string or integer as a node; you will have "Add" or "Append" etc. as nodes (unless it is an expression, but expressions are contained by operations).

  3. The last part is execution. This is done by walking the tree and executing the nodes.
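
As a rough illustration of the lexing stage, here is a minimal sketch in Go. The Token struct with Kind int and Text string follows the description above; everything else (the Ident and Symbol kinds, the names Lex and readWhile, and the choice to treat any other character as a one-character symbol) is an assumption made for this example, not something taken from the book.

```go
package main

import "fmt"

// Token kinds, declared as constants. Number and String come from the
// description above; Ident and Symbol are assumptions added for this sketch.
const (
	Number = iota // a series of digits, e.g. 12376
	String        // quoted text, e.g. 'Hello'
	Ident         // a name, e.g. x
	Symbol        // any other single character, e.g. =
)

// Token pairs a kind with the text that was read.
type Token struct {
	Kind int
	Text string
}

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

func isLetter(c byte) bool {
	return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// readWhile returns the index of the first byte at or after start
// for which ok reports false (or len(src) if there is none).
func readWhile(src string, start int, ok func(byte) bool) int {
	end := start
	for end < len(src) && ok(src[end]) {
		end++
	}
	return end
}

// Lex looks at the first character of each token to decide how to read the rest.
func Lex(src string) []Token {
	var toks []Token
	for i := 0; i < len(src); {
		c := src[i]
		switch {
		case c == ' ' || c == '\t' || c == '\n':
			i++ // skip whitespace
		case isDigit(c):
			end := readWhile(src, i, isDigit)
			toks = append(toks, Token{Number, src[i:end]})
			i = end
		case isLetter(c):
			end := readWhile(src, i, func(b byte) bool { return isLetter(b) || isDigit(b) })
			toks = append(toks, Token{Ident, src[i:end]})
			i = end
		case c == '\'':
			// Read up to the closing quote; no escape sequences in this sketch.
			end := readWhile(src, i+1, func(b byte) bool { return b != '\'' })
			toks = append(toks, Token{String, src[i+1 : end]})
			i = end + 1 // step past the closing quote
		default:
			toks = append(toks, Token{Symbol, string(c)})
			i++
		}
	}
	return toks
}

func main() {
	for _, t := range Lex("x = 12376 + 'Hello'") {
		fmt.Printf("kind=%d text=%q\n", t.Kind, t.Text)
	}
}
```

A real lexer would also report errors (for example an unterminated string) and track line numbers; both are left out here to keep the sketch short.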

There is a lot of other machinery involved, such as memory, scope, and the lookahead machinery. This is explained in the book linked above.
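
To make the parsing and execution stages a little more concrete, here is a small Go sketch of a parser that uses one token of lookahead to recognize an assignment, plus a tree walker that executes the result. All of the names (Node, Parser, peek, Exec) and the "add" operation are hypothetical choices for this example. Memory and scope are reduced to a single flat map, which is a simplification, and the tokens are written by hand so the block stands on its own.

```go
package main

import (
	"fmt"
	"strconv"
)

// Token kinds for this sketch.
const (
	Number = iota
	Ident
	Symbol
)

// Token is the lexer output: a kind plus the text that was read.
type Token struct {
	Kind int
	Text string
}

// Node is a tree node: an operation name, optional text, and children.
type Node struct {
	Op       string // "assign", "add", "number" or "ident"
	Text     string
	Children []*Node
}

// Parser holds the token stream and the current position in it.
type Parser struct {
	toks []Token
	pos  int
}

// peek looks ahead n tokens without consuming anything.
func (p *Parser) peek(n int) Token {
	if p.pos+n < len(p.toks) {
		return p.toks[p.pos+n]
	}
	return Token{Kind: -1} // hypothetical end-of-input marker
}

// next consumes and returns the current token.
func (p *Parser) next() Token {
	t := p.peek(0)
	p.pos++
	return t
}

// statement branches into an assignment when it sees Ident followed by "=".
func (p *Parser) statement() *Node {
	if p.peek(0).Kind == Ident && p.peek(1).Kind == Symbol && p.peek(1).Text == "=" {
		name := p.next()
		p.next() // consume "="
		target := &Node{Op: "ident", Text: name.Text}
		return &Node{Op: "assign", Children: []*Node{target, p.expression()}}
	}
	return p.expression()
}

// expression parses operands joined by "+" into nested "add" nodes.
func (p *Parser) expression() *Node {
	left := p.operand()
	for p.peek(0).Kind == Symbol && p.peek(0).Text == "+" {
		p.next() // consume "+"
		left = &Node{Op: "add", Children: []*Node{left, p.operand()}}
	}
	return left
}

// operand parses a number or a name; error handling is omitted in this sketch.
func (p *Parser) operand() *Node {
	t := p.next()
	if t.Kind == Number {
		return &Node{Op: "number", Text: t.Text}
	}
	return &Node{Op: "ident", Text: t.Text}
}

// Exec walks the tree and executes each node; mem stands in for memory and scope.
func Exec(n *Node, mem map[string]int) int {
	switch n.Op {
	case "assign":
		v := Exec(n.Children[1], mem)
		mem[n.Children[0].Text] = v
		return v
	case "add":
		return Exec(n.Children[0], mem) + Exec(n.Children[1], mem)
	case "number":
		v, _ := strconv.Atoi(n.Text) // conversion errors are ignored in this sketch
		return v
	default: // "ident": look the name up in memory
		return mem[n.Text]
	}
}

func main() {
	// Tokens for: x = 2 + 3
	toks := []Token{{Ident, "x"}, {Symbol, "="}, {Number, "2"}, {Symbol, "+"}, {Number, "3"}}
	mem := map[string]int{}
	p := &Parser{toks: toks}
	Exec(p.statement(), mem)
	fmt.Println(mem["x"]) // prints 5
}
```

Nested scopes could be modeled by chaining several such maps, with each scope pointing to its parent, but that is beyond what this sketch tries to show.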