Mao is implemented in Rust as a recursive descent parser with a unique twist: the syntax rules are dynamically generated based on a hash of the source code itself.
Core Architecture
The interpreter follows a traditional pipeline:
- Source Hashing: The interpreter computes a SHA-256 hash of the source code with whitespace stripped (so you can't cheat by reformatting) to create a deterministic seed
- Rule Generation: Using this seed, the interpreter generates random but consistent syntax rules for the current file
- Tokenizing: The tokenizer recognizes only the keywords that are valid under the generated rules, but also tracks identifiers that look like keywords from a different ruleset so it can give "nice" error messages
- Parsing: The parser enforces the dynamically generated grammar rules
- Execution: A simple tree-walking interpreter executes the validated AST
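To make the first pipeline step concrete, here is a minimal sketch of deriving a seed from whitespace-stripped source. It is not Mao's actual code: the function names are made up for illustration, and a hand-rolled 64-bit FNV-1a hash stands in for SHA-256 so the example needs no external crates.

```rust
/// Remove every whitespace character so that reformatting a file
/// cannot change the seed derived from it.
fn strip_whitespace(source: &str) -> String {
    source.chars().filter(|c| !c.is_whitespace()).collect()
}

/// 64-bit FNV-1a. ASSUMPTION: used only as a dependency-free stand-in
/// for the SHA-256 hash that the interpreter is described as using.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: the deterministic seed for one source file.
fn seed_from_source(source: &str) -> u64 {
    fnv1a_64(strip_whitespace(source).as_bytes())
}
```

With this, `seed_from_source("let x = 1;")` and `seed_from_source("let   x=1 ;\n")` return the same seed, while changing any non-whitespace character changes it.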
Dynamic Syntax Generation
The syntax randomization covers:
- Variable Declaration: 6 possible forms including `let`, `var`, `def`, `make`, `create`, and `new`
- Conditional Statements: 3 different keywords for if statements: `if`, `when`, `check`
- Boolean Values: 4 variations for true and false, including `:)` as true and `:(` as false
- Output Statements: Multiple possible print commands like `print`, `say`, `output`, `fmt.Println`
- Statement Terminators: Random choice between `;`, `.`, or the word `done`
- Parenthesis Rules: Sometimes required, sometimes optional for function calls and expressions
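As a rough illustration of how one seed could select a consistent ruleset, the sketch below treats the seed as a mixed-radix number and reads one digit per rule category. The keyword tables come from the list above; `SyntaxRules`, `rules_from_seed`, `classify_word`, the boolean parenthesis flag, and the digit-based selection are all assumptions for this example, not Mao's real implementation. It only covers the categories whose options are fully listed.

```rust
const DECLARATION_KEYWORDS: [&str; 6] = ["let", "var", "def", "make", "create", "new"];
const CONDITIONAL_KEYWORDS: [&str; 3] = ["if", "when", "check"];
const TERMINATORS: [&str; 3] = [";", ".", "done"];

/// Hypothetical ruleset chosen for one source file.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SyntaxRules {
    declaration: &'static str,
    conditional: &'static str,
    terminator: &'static str,
    parens_required: bool,
}

/// Pick one entry from a table using `digit` modulo the table length.
fn pick(table: &[&'static str], digit: u64) -> &'static str {
    table[(digit % table.len() as u64) as usize]
}

/// ASSUMPTION: each category reads its own "digit" of the seed
/// (radices 6, 3, 3, 2), so the same seed always yields the same rules.
fn rules_from_seed(seed: u64) -> SyntaxRules {
    SyntaxRules {
        declaration: pick(&DECLARATION_KEYWORDS, seed),
        conditional: pick(&CONDITIONAL_KEYWORDS, seed / 6),
        terminator: pick(&TERMINATORS, seed / 18),
        parens_required: (seed / 54) % 2 == 1,
    }
}

/// How a tokenizer might label a word under the active rules.
#[derive(Debug, PartialEq, Eq)]
enum WordKind {
    ActiveKeyword,
    /// A keyword from some other ruleset, worth a friendlier error.
    ForeignKeyword,
    Identifier,
}

fn is_listed(table: &[&str], word: &str) -> bool {
    table.iter().any(|keyword| *keyword == word)
}

fn classify_word(rules: &SyntaxRules, word: &str) -> WordKind {
    let active = [rules.declaration, rules.conditional, rules.terminator];
    if is_listed(&active, word) {
        WordKind::ActiveKeyword
    } else if is_listed(&DECLARATION_KEYWORDS, word)
        || is_listed(&CONDITIONAL_KEYWORDS, word)
        || is_listed(&TERMINATORS, word)
    {
        WordKind::ForeignKeyword
    } else {
        WordKind::Identifier
    }
}
```

For example, `rules_from_seed(107)` selects `new`, `check`, `done`, and required parentheses. Under those rules `classify_word` reports `when` as a `ForeignKeyword`, which is the kind of information a parser could use to say "this file uses `check`, not `when`" instead of a generic syntax error.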
Testing
Generating valid Mao is hard, very hard, and it gets even harder as your file grows. As such, I wrote a brute-force script that tries every syntax combination for a desired file, and even that fails most of the time.
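The brute-force idea can be sketched as a fixed-point search: render the same program under every candidate ruleset, hash each rendering, and keep only the ones whose hash selects the ruleset they were written in. This is a toy version under loud assumptions, not the actual script: it uses two rule categories, a one-statement program, FNV-1a instead of SHA-256, and hypothetical names throughout.

```rust
const DECLARATIONS: [&str; 6] = ["let", "var", "def", "make", "create", "new"];
const TERMINATORS: [&str; 3] = [";", ".", "done"];

/// Hypothetical, cut-down ruleset with just two categories.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Rules {
    declaration: &'static str,
    terminator: &'static str,
}

/// FNV-1a over the non-whitespace bytes (ASSUMPTION: stand-in for SHA-256).
fn seed_from_source(source: &str) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in source.bytes().filter(|b| !b.is_ascii_whitespace()) {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// ASSUMPTION: rules are read off the seed as mixed-radix digits.
fn rules_from_seed(seed: u64) -> Rules {
    Rules {
        declaration: DECLARATIONS[(seed % 6) as usize],
        terminator: TERMINATORS[((seed / 6) % 3) as usize],
    }
}

/// Render the toy program `<declaration> <name> = 1 <terminator>`.
fn render(rules: Rules, name: &str) -> String {
    format!("{} {} = 1 {}", rules.declaration, name, rules.terminator)
}

/// Try all 18 combinations and keep the self-consistent renderings.
/// The result is often empty, which matches the experience described above.
fn find_valid_programs(name: &str) -> Vec<String> {
    let mut valid = Vec::new();
    for &declaration in DECLARATIONS.iter() {
        for &terminator in TERMINATORS.iter() {
            let candidate = Rules { declaration, terminator };
            let source = render(candidate, name);
            // Valid only if the text hashes back to the rules it was written in.
            if rules_from_seed(seed_from_source(&source)) == candidate {
                valid.push(source);
            }
        }
    }
    valid
}
```

Each candidate has roughly a 1 in 18 chance of hashing back to itself here, so an empty result is common even for this tiny grammar. With more rule categories and a longer file the odds per candidate shrink further, and changing an identifier name or a string literal is effectively the only way to reroll.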
After a lot of trial and error, a few working Mao scripts were eventually created. Here is the longest valid Mao script in all its beauty:
```
if ( :( ) :
console.log("If/Else statements are working :D");
} then :
console.log("It's False");
}
```