← Back to all projects

tac

Three Address Codes and Register Allocation from scratch

Motivation

Compilers are something that have always fascinated me. As I went deeper into my time as a student, the kinds of problems that exist in that world just always seemed so exciting and challenging. As such, I wanted to move past interpreter space and get into compilers and optimization more.

This project was my first dive into the basics of IR and Register Allocation, with a simple arithmetic language that supports mathematical expressions and jumping. The final code generated is NASM compatible ASM.

Implementation

Architecture Overview

The TAC compiler follows a traditional multi-stage compilation pipeline:

Source Code → Tokenizer → Token Stream → Parser → AST → TAC Generator → TAC Stream → Register Allocator → TAC Stream w/ Context → Assembly
                    

The Tokenizer and Parser are both nothing amazing, just thrown together so I could get down to the new stuff I hadn't touched yet:

1. Three Address Code Generation

The IR generator transforms the AST into Three Address Code (TAC) intermediate representation:

2. Register Allocation

The register allocator maps virtual temporaries to physical CPU registers using a graph coloring approach:

3. Assembly Generation

The assembler translates register-allocated TAC instructions to NASM-compatible x86-64 assembly:

Overall, this project was a great start in the world of compilers, bridging me further from AST land to an optimized binary. I'm looking forward to my next step towards this world ;)