Tiger Compiler

Yudai Chen

A toy compiler that includes lexical analysis, parsing, AST generation, code generation, and GUI. It supports comments, basic data types / operations, and function calls.

Tiger language:

Please check the Tiger Language Manual.

1. Lexical Analysis

The main task of lexical analysis is to break the input into individual tokens, and keep the position of every token. To achieve a lexical analyzer, we use Lex, an automatic lexical analyzer generator, to translate regular expressions into a DFA.

There are two kinds of input words to deal with: one is the lexical tokens, including reserved words and punctuation symbols; the other is the special character sequences, such as string constants, comments, escape sequences and so on. To process the former kind, we write regular expression to specify the lexical tokens; while for the latter one, we use state machines to deal with them.

2. Syntax Analysis

The main task for syntax analysis is to parse the phrase structure of the program. To achieve a syntax parser, I use Yacc, a classic and widely used parser generator, to complete LR parsing for me.

3. Semantic Analysis

After the syntax analysis, we get an abstract tree. This tree will be transmitted to “Semant” module to do the semantic analysis whose main tasks are to construct environment tables and type check. In Semant module, we will call the Translate module to generate the intermediate code.

4. Trace Produce

After a few previous steps, we finally got a Intermediate Representation Tree. What’s more has to be done is to convert the Intermediate Representation Tree to assembly language or machine language. However, there are certain aspects of the Tree language that do not correspond exactly with machine languages, and some aspects of the Tree language interfere with compile-time optimization analyses.

To deal with this problem, we should perform an extra step, to translate the original Tree to another form, which is easy to translate to machine language.

5. Target Code Generation

I implemented the Maximal Munch algorithm to generate x86 instructions. For each of the abstract assembly-language types, I attached it with an instruction.

6. Performance Tests

For more information, please refer to the PDF attached to this page.

Yudai Chen
Yudai Chen
Master’s Student at Rice University, former ZJUer

I obtained my Bachelor of Engineering degree in Computer Science from Chu Kochen Honors College of Zhejiang University in July 2019. Currently I’m a master’s student majoring in Computer Science at Rice University. My interests include web development(mainly backend), distributed system, cloud computing, high performance computing, and infrastructure.