Domain-Specific Languages (DSLs) - Project-Based Learning Path

Goal: Design, implement, and ship DSLs that make complex domains executable. You will learn how to model domain concepts, define syntax, parse and validate input, and execute or compile it into real outcomes.


Internal vs. External DSLs

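Internal DSLs live inside a host language (fluent APIs, method chaining) and inherit its tooling, but they are limited to the syntax the host allows. External DSLs define their own syntax and need a parser. The tradeoff is expressiveness vs. tooling cost: the freer the syntax, the more of the lexer, parser, error reporting, and editor support you have to build yourself.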

Internal (fluent):
query.select("name").from("users").where("age > 18")

External:
SELECT name FROM users WHERE age > 18

Domain Modeling and Language Design

Good DSLs reflect the domain vocabulary and constraints. You decide which concepts are first-class, which rules are implicit, and how strict the language should be.

Domain concepts -> Language constructs
Product, Bundle, Rule, Constraint

Lexing, Parsing, and ASTs

External DSLs require a lexer and parser. The AST is the canonical model for evaluation and code generation.

Input:  price > 100 AND in_stock
AST:
    AND
   /   \
 (>)   in_stock
 / \
price 100
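
A minimal sketch of how the tree above could be produced, assuming a deliberately tiny grammar (names, integers, `>` and `AND` only). The function names `tokenize` and `parse` and the nested-tuple AST shape are illustrative choices, not part of any library.

```python
import re

# One named group per token kind; whitespace is matched but discarded.
TOKEN_RE = re.compile(
    r"(?P<NUM>\d+)|(?P<AND>AND\b)|(?P<IDENT>[A-Za-z_]\w*)|(?P<GT>>)|(?P<WS>\s+)"
)

def tokenize(text):
    """Lexer: turn the input string into a list of (kind, value) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at column {pos + 1}")
        kind = m.lastgroup
        if kind != "WS":
            tokens.append((kind, int(m.group()) if kind == "NUM" else m.group()))
        pos = m.end()
    return tokens

def parse(tokens):
    """Parser: expr := comparison ('AND' comparison)*
               comparison := operand ('>' operand)?"""
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] not in ("NUM", "IDENT"):
            raise SyntaxError("expected a name or a number")
        value = tokens[pos][1]
        pos += 1
        return value

    def comparison():
        nonlocal pos
        left = operand()
        if pos < len(tokens) and tokens[pos][0] == "GT":
            pos += 1
            return (">", left, operand())
        return left

    node = comparison()
    while pos < len(tokens) and tokens[pos][0] == "AND":
        pos += 1
        node = ("AND", node, comparison())
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return node

print(parse(tokenize("price > 100 AND in_stock")))
# ('AND', ('>', 'price', 100), 'in_stock')
```

The nested tuple mirrors the diagram: the root is AND, its left child is the comparison, and its right child is the bare name.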

Evaluation and Code Generation

Some DSLs interpret ASTs directly; others compile to bytecode, SQL, or target language code.

DSL -> AST -> (Interpreter | SQL | Bytecode)
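
To make the pipeline concrete, here is a small sketch of two backends sharing one AST. It assumes the AST is a nested tuple of the form `(operator, left, right)`, with strings standing for field names and integers for literals; `evaluate` and `to_sql` are hypothetical names chosen for this example.

```python
def evaluate(node, row):
    """Interpreter backend: walk the tree against a dict of field values."""
    if isinstance(node, tuple):
        op, left, right = node
        if op == "AND":
            return bool(evaluate(left, row)) and bool(evaluate(right, row))
        if op == ">":
            return evaluate(left, row) > evaluate(right, row)
        raise ValueError(f"unknown operator {op!r}")
    if isinstance(node, str):
        return row[node]   # a field name: look it up in the data
    return node            # a literal number

def to_sql(node):
    """Code-generation backend: emit a SQL boolean expression as text."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({to_sql(left)} {op} {to_sql(right)})"
    return str(node)

ast = ("AND", (">", "price", 100), "in_stock")
print(evaluate(ast, {"price": 250, "in_stock": True}))  # True
print(to_sql(ast))  # ((price > 100) AND in_stock)
```

Both functions have the same recursive shape; only what they return at each node differs, which is why a clean AST makes adding a second backend cheap.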

Error UX and Tooling

The language is only useful if errors are actionable. Line/column reporting, suggestions, and examples are part of the product.
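
As an illustration of what "actionable" can mean, the sketch below renders an error with the offending line, a caret under the column, and a spelling suggestion. The keyword list and the `format_error` helper are assumptions made for this example; the suggestion uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

KNOWN_KEYWORDS = ["PRODUCT", "BUNDLE", "RULE"]  # hypothetical keyword set

def format_error(source, line, column, message, bad_word=None):
    """Render an error with the source line, a caret, and an optional suggestion."""
    text = source.splitlines()[line - 1]
    out = [
        f"error at line {line}, column {column}: {message}",
        f"  {text}",
        "  " + " " * (column - 1) + "^",
    ]
    if bad_word:
        close = difflib.get_close_matches(bad_word, KNOWN_KEYWORDS, n=1)
        if close:
            out.append(f"  did you mean {close[0]}?")
    return "\n".join(out)

src = 'PRODUCT laptop "MacBook"\nBUNDEL "Kit"'
print(format_error(src, 2, 1, "unknown keyword 'BUNDEL'", "BUNDEL"))
```

For this to work, the lexer has to record a line and column on every token, which is much easier to design in from the start than to retrofit.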


What You’ll Master

By completing these projects, you’ll understand:

  • How to design languages for specific problem domains
  • Lexing, parsing, and AST construction
  • Internal DSLs using fluent APIs and method chaining
  • Macro-based metaprogramming for compile-time DSLs
  • Parser combinators and parser generators
  • Code generation and interpretation
  • When to use DSLs vs. general-purpose code

Concept Summary Table

Concept Cluster    What You Need to Internalize
DSL Design         The domain vocabulary drives syntax and semantics.
Internal DSLs      Fluent APIs and constraints can express a language without parsing.
External DSLs      Custom syntax requires lexing, parsing, and ASTs.
Semantics          Evaluation rules, types, and validation make the DSL safe.
Code Generation    DSLs can emit SQL, bytecode, or host-language code.
Tooling UX         Error messages and debugging experience matter as much as syntax.

Deep Dive Reading by Concept

DSL Design and Patterns

Concept                       Book & Chapter
DSL patterns                  Domain Specific Languages Ch. 1-3 (Martin Fowler)
Language design trade-offs    Language Implementation Patterns Ch. 1 (Terence Parr)

Parsing and ASTs

Concept               Book & Chapter
Lexing and parsing    Crafting Interpreters Ch. 4-6 (Robert Nystrom)
Expression parsing    Language Implementation Patterns Ch. 5

Semantics and Execution

Concept                         Book & Chapter
Semantic analysis               Engineering a Compiler Ch. 4-5 (Cooper & Torczon)
Interpretation and execution    Crafting Interpreters Ch. 7

Metaprogramming DSLs

Concept          Book & Chapter
Macro systems    Metaprogramming Elixir Ch. 1-2 (Chris McCord)

Project 1: Fluent Query Builder (Internal DSL)

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Ruby, Kotlin, TypeScript
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: API Design / Fluent Interfaces
  • Software or Tool: ORM-like Query Builder
  • Main Book: “Domain Specific Languages” by Martin Fowler

What you’ll build: A chainable API for building database queries that reads like English: query.select("name", "email").from_table("users").where("age > 18").order_by("name").limit(10) (from is a reserved word in Python, hence from_table).

Why it teaches DSLs: This is the gentlest introduction to DSL thinking. You’ll learn that a DSL doesn’t require parsing—it can be embedded directly in your host language using method chaining. Every method returns self, enabling the fluent pattern.

Core challenges you’ll face:

  • Method chaining design (each method returns self) → maps to fluent interface pattern
  • State accumulation (tracking what’s been configured) → maps to builder pattern
  • Type safety (preventing invalid combinations) → maps to DSL constraints
  • SQL generation (converting builder state to string) → maps to code generation basics

Key Concepts:

  • Fluent Interface Pattern: “Domain Specific Languages” Chapter 35 - Martin Fowler
  • Builder Pattern: “Design Patterns” Chapter 3 - Gang of Four
  • Method Chaining: “Clean Code” Chapter 3 (Functions) - Robert C. Martin

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic Python, understanding of classes

Learning milestones:

  1. Basic chaining works → You understand fluent interfaces
  2. Multiple where clauses combine correctly → You understand state accumulation
  3. Invalid queries raise helpful errors → You understand DSL validation
  4. Generated SQL is correct and safe → You understand code generation

Real World Outcome

You have a small library that teams can import to build SQL queries safely and readably, with immediate feedback when the API is misused.

query = (Query()
    .select("id", "name")
    .from_table("users")
    .where("status = 'active'")
    .order_by("created_at", "DESC")
    .limit(20))

print(query.to_sql())
# SELECT id, name FROM users WHERE status = 'active' ORDER BY created_at DESC LIMIT 20
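
One possible shape for the `Query` class used above, offered as a starting sketch rather than a finished design. The attribute names are assumptions, and conditions are still raw strings here, so this version does nothing about SQL injection; treat that as the next thing to fix.

```python
class Query:
    """Accumulates clauses; each method returns self so calls can be chained."""

    def __init__(self):
        self._columns, self._table = [], None
        self._wheres, self._order, self._limit = [], None, None

    def select(self, *columns):
        self._columns.extend(columns)
        return self

    def from_table(self, table):
        self._table = table
        return self

    def where(self, condition):
        self._wheres.append(condition)  # multiple calls are AND-ed together
        return self

    def order_by(self, column, direction="ASC"):
        if direction not in ("ASC", "DESC"):
            raise ValueError("direction must be 'ASC' or 'DESC'")
        self._order = (column, direction)
        return self

    def limit(self, count):
        self._limit = int(count)
        return self

    def to_sql(self):
        if self._table is None:
            raise ValueError("from_table() must be called before to_sql()")
        parts = [f"SELECT {', '.join(self._columns) or '*'}", f"FROM {self._table}"]
        if self._wheres:
            parts.append("WHERE " + " AND ".join(self._wheres))
        if self._order:
            parts.append("ORDER BY {} {}".format(*self._order))
        if self._limit is not None:
            parts.append(f"LIMIT {self._limit}")
        return " ".join(parts)
```

Because `to_sql()` assembles clauses in a fixed order, the caller can invoke the chain methods in any order and still get canonical SQL. Whether to allow that freedom or to enforce call order is one of the design questions this project asks you to answer.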

The Core Question You’re Answering

“How can a fluent API feel like a language while still enforcing correct structure?”

Concepts You Must Understand First

Stop and research these before coding:

  1. Fluent interface patterns
    • How does method chaining preserve state?
    • Book Reference: Domain Specific Languages Ch. 35
  2. Builder pattern
    • How do you prevent invalid call sequences?
    • Book Reference: Design Patterns (GoF) Ch. 3
  3. SQL query structure
    • What is the canonical clause order?
    • Book Reference: SQL Antipatterns Ch. 1 (Bill Karwin)

Questions to Guide Your Design

  1. What is the minimal set of methods to build a useful query?
  2. How will you prevent .where() before .from_table()?
  3. How will you escape identifiers and values safely?
  4. What does to_sql() return when optional clauses are missing?
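
For question 3, one common approach is to keep values out of the SQL text entirely and validate identifiers against a whitelist pattern. The sketch below assumes a hypothetical `condition(column, op, value)` helper and uses `?` as the placeholder (the style used by Python's `sqlite3` module; other drivers use different markers).

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}

def condition(column, op, value):
    """Return (sql_fragment, params); the value never enters the SQL string."""
    if not IDENTIFIER.match(column):
        raise ValueError(f"invalid column name: {column!r}")
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return f"{column} {op} ?", [value]

sql, params = condition("age", ">", 18)
print(sql, params)  # age > ? [18]
```

A builder using this helper would collect the fragments and the parameter lists separately and hand both to the database driver, instead of returning one finished string.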

Thinking Exercise

Sketch a state diagram of allowed method sequences (select -> from -> where -> order -> limit).

The Interview Questions They’ll Ask

  1. What is the difference between an internal and external DSL?
  2. How does method chaining work under the hood?
  3. How would you enforce ordering constraints in a fluent API?
  4. What are the risks of building SQL via string concatenation?
  5. How do you make a DSL discoverable in an IDE?

Hints in Layers

Hint 1: Store clauses in a struct. Keep selects, from, wheres, order_by, and limit as fields.

Hint 2: Return a different builder type. Use separate types for different stages to prevent invalid calls.

Hint 3: Add a .debug() method. Print intermediate state so you can see how each call changes it.
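
Hint 2 can be sketched as follows. The class names `SelectStage` and `FromStage` are made up for this example; the point is only that `where()` does not exist until a table has been chosen, so the invalid sequence fails immediately (and an IDE will not even offer it in autocomplete).

```python
class SelectStage:
    """First stage: only from_table() is available."""

    def __init__(self, columns):
        self._columns = columns

    def from_table(self, table):
        return FromStage(self._columns, table)


class FromStage:
    """Second stage: where() and to_sql() exist only once a table is known."""

    def __init__(self, columns, table, wheres=()):
        self._columns, self._table, self._wheres = columns, table, tuple(wheres)

    def where(self, condition):
        # Return a new object instead of mutating, so partial queries can be reused.
        return FromStage(self._columns, self._table, self._wheres + (condition,))

    def to_sql(self):
        sql = f"SELECT {', '.join(self._columns)} FROM {self._table}"
        if self._wheres:
            sql += " WHERE " + " AND ".join(self._wheres)
        return sql


def select(*columns):
    return SelectStage(columns)


print(select("id").from_table("users").where("age > 18").to_sql())
# SELECT id FROM users WHERE age > 18
```

Calling `select("id").where(...)` raises `AttributeError` here. In a statically typed language the same structure turns that mistake into a compile-time error.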

Books That Will Help

Topic                Book                         Chapter
Fluent interfaces    Domain Specific Languages    Ch. 35
Builder pattern      Design Patterns              Ch. 3
API design           Clean Code                   Ch. 3
SQL structure        SQL Antipatterns             Ch. 1

Project 2: Configuration File Parser (External DSL - Simple)

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Python
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Parsing / Lexical Analysis
  • Software or Tool: Config Parser (like TOML/INI)
  • Main Book: “Crafting Interpreters” by Robert Nystrom (free online)

What you’ll build: A parser for a simple configuration format that supports sections, key-value pairs, comments, and basic types (strings, numbers, booleans, arrays).

# Server configuration
[server]
host = "localhost"
port = 8080
debug = true

[database]
connection_string = "postgres://localhost/mydb"
max_connections = 10
allowed_origins = ["http://localhost", "https://example.com"]

Why it teaches DSLs: This is your first “real” external DSL. You’ll implement the fundamental pipeline: Source Text → Lexer → Tokens → Parser → Data Structure. This exact pattern scales to any language.
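
The first stage of that pipeline might look like the sketch below. It is written in Python (one of the listed alternative languages) to keep it short; the token kinds and the `tokenize` name are assumptions for illustration, and the same structure carries over to a hand-written C lexer.

```python
import re

TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("STRING",  r'"[^"\n]*"'),
    ("NUMBER",  r"\d+"),
    ("BOOL",    r"\b(?:true|false)\b"),
    ("IDENT",   r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PUNCT",   r"[\[\]=,]"),
    ("NEWLINE", r"\n"),
    ("SKIP",    r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Yield (kind, text, line) tuples, dropping whitespace and comments."""
    line, pos = 1, 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"line {line}: unexpected character {source[pos]!r}")
        kind = m.lastgroup
        if kind == "NEWLINE":
            line += 1                      # comments are skipped, newlines still count
        elif kind not in ("SKIP", "COMMENT"):
            yield kind, m.group(), line
        pos = m.end()

for token in tokenize('# Server configuration\n[server]\nport = 8080\n'):
    print(token)
```

Note that comments are matched as their own token kind and then discarded, while newlines are counted before being discarded. That is what keeps line numbers accurate in later error messages.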

Core challenges you’ll face:

  • Tokenization (breaking text into meaningful chunks) → maps to lexical analysis
  • Handling whitespace and comments (ignoring irrelevant text) → maps to token filtering
  • Nested structures (arrays, sections) → maps to recursive descent parsing
  • Error messages (line numbers, helpful context) → maps to error recovery
  • Type coercion (strings to numbers/booleans) → maps to semantic analysis

Key Concepts:

  • Lexical Analysis: “Crafting Interpreters” Chapter 4 - Robert Nystrom
  • Recursive Descent Parsing: “Crafting Interpreters” Chapter 6 - Robert Nystrom
  • Finite State Machines for Lexing: “Engineering a Compiler” Chapter 2 - Cooper & Torczon
  • Error Handling in Parsers: “Language Implementation Patterns” Chapter 4 - Terence Parr

Difficulty: Beginner-Intermediate
Time estimate: Weekend
Prerequisites: String manipulation, basic data structures

Learning milestones:

  1. Lexer produces correct tokens → You understand tokenization
  2. Parser handles nested arrays → You understand recursive structures
  3. Error messages show line numbers → You understand error tracking
  4. Round-trip works (parse → serialize → parse) → You have a complete implementation

Real World Outcome

You can parse a real config file, inspect the typed values, and get precise errors when the file is malformed.

$ ./config_parser app.conf
Parsed configuration:
  [server]
    host: "localhost" (string)
    port: 8080 (integer)
    debug: true (boolean)

The Core Question You’re Answering

“How do I turn a human-readable config file into a reliable, typed data structure?”

Concepts You Must Understand First

Stop and research these before coding:

  1. Tokenization and whitespace handling
    • How do you skip comments without losing line numbers?
    • Book Reference: Crafting Interpreters Ch. 4
  2. Recursive descent parsing
    • How do you parse sections and arrays?
    • Book Reference: Crafting Interpreters Ch. 6
  3. Type coercion
    • How do you parse numbers vs. strings vs. booleans?
    • Book Reference: Engineering a Compiler Ch. 4
  4. Error recovery
    • How do you keep parsing after a failure?
    • Book Reference: Language Implementation Patterns Ch. 4

Questions to Guide Your Design

  1. What is the grammar for sections, keys, and values?
  2. How will you handle duplicate keys or conflicting types?
  3. How will you report errors with line/column and context?
  4. What is the data structure that represents the final config?

Thinking Exercise

Write the grammar for this snippet in EBNF:

[server]
host = "localhost"
port = 8080

The Interview Questions They’ll Ask

  1. What is the difference between lexing and parsing?
  2. How do you represent nested structures in a parser?
  3. Why is error reporting critical in config languages?
  4. How would you add support for inline arrays?
  5. What is the tradeoff between strict and permissive parsing?

Hints in Layers

Hint 1: Implement a token dump first. Make sure you can see the tokens for each line.

Hint 2: Parse a single section. Handle [section] and key = value before arrays.

Hint 3: Keep raw strings until type parsing. Parse value tokens into typed values at the end.
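
Hint 3 can be illustrated with a single conversion function applied after parsing. This is a sketch in Python with a hypothetical `coerce` helper; it assumes the lexer hands over the raw token text, including the surrounding quotes for strings.

```python
def coerce(raw):
    """Convert one raw value token into a typed Python value."""
    if raw in ("true", "false"):
        return raw == "true"
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        return raw[1:-1]                  # strip the surrounding quotes
    try:
        return int(raw)
    except ValueError:
        pass
    try:
        return float(raw)
    except ValueError:
        raise ValueError(f"cannot interpret value: {raw!r}") from None

print([coerce(t) for t in ['"localhost"', "8080", "true", "2.5"]])
# ['localhost', 8080, True, 2.5]
```

Keeping coercion in one place makes it easy to decide, in a single function, how strict the format is (for example, whether `"8080"` in quotes should ever be treated as a number).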

Books That Will Help

Topic                Book                                Chapter
Lexing               Crafting Interpreters               Ch. 4
Parsing              Crafting Interpreters               Ch. 6
Error handling       Language Implementation Patterns    Ch. 4
Semantic analysis    Engineering a Compiler              Ch. 4

Project 3: Filter Expression Language

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: TypeScript, Rust, Go
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Parsing / Expression Evaluation
  • Software or Tool: Query Language (like MongoDB queries)
  • Main Book: “Language Implementation Patterns” by Terence Parr

What you’ll build: A mini query language for filtering collections, similar to MongoDB queries or Elasticsearch Query DSL:

name == "John" AND (age >= 21 OR status == "verified") AND tags CONTAINS "premium"

This filter expression gets parsed into an AST, then evaluated against data.

Why it teaches DSLs: This project introduces operator precedence, boolean logic, and AST evaluation—the core of any expression-based language. You’ll build something actually useful: a filter that can be stored in a database and evaluated at runtime.

Core challenges you’ll face:

  • Operator precedence (AND binds tighter than OR) → maps to grammar design
  • Parentheses for grouping (overriding precedence) → maps to recursive parsing
  • AST construction (building a tree from flat tokens) → maps to parse tree design
  • Tree evaluation (walking the AST to compute result) → maps to interpreter pattern
  • Multiple operators (==, !=, >, >=, <, <=, CONTAINS, IN) → maps to extensible grammars

Key Concepts:

  • Abstract Syntax Trees: “Crafting Interpreters” Chapter 5 - Robert Nystrom
  • Operator Precedence Parsing: “Language Implementation Patterns” Chapter 5 - Terence Parr
  • Tree-Walking Interpreters: “Crafting Interpreters” Chapter 7 - Robert Nystrom
  • Visitor Pattern for ASTs: “Design Patterns” Chapter 5 - Gang of Four
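
The tree-walking and visitor ideas listed above can be sketched together. This example assumes three hypothetical node classes (`Field`, `Literal`, `Binary`) and uses name-based dispatch (`visit_<ClassName>`), a common Python idiom that stands in for the double-dispatch `accept`/`visit` pair of the classic pattern.

```python
from dataclasses import dataclass

@dataclass
class Field:
    name: str

@dataclass
class Literal:
    value: object

@dataclass
class Binary:
    op: str
    left: object
    right: object

class Evaluator:
    """Dispatches on the node's class name: one visit_* method per node type."""

    def __init__(self, record):
        self.record = record

    def visit(self, node):
        return getattr(self, "visit_" + type(node).__name__)(node)

    def visit_Field(self, node):
        if node.name not in self.record:
            raise KeyError(f"unknown field: {node.name}")
        return self.record[node.name]

    def visit_Literal(self, node):
        return node.value

    def visit_Binary(self, node):
        if node.op == "AND":   # short-circuits: right side is skipped if left is false
            return self.visit(node.left) and self.visit(node.right)
        if node.op == "OR":
            return self.visit(node.left) or self.visit(node.right)
        left, right = self.visit(node.left), self.visit(node.right)
        if node.op == "==":
            return left == right
        if node.op == ">=":
            return left >= right
        if node.op == "CONTAINS":
            return right in left
        raise ValueError(f"unsupported operator: {node.op}")

tree = Binary("AND",
              Binary("==", Field("name"), Literal("John")),
              Binary("CONTAINS", Field("tags"), Literal("premium")))
print(Evaluator({"name": "John", "tags": ["premium"]}).visit(tree))  # True
```

A second visitor class with the same method names (for example, one that pretty-prints instead of evaluating) can walk the same tree without any change to the node classes.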

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Recursion, tree data structures, Project 2 concepts

Learning milestones:

  1. Simple comparisons parse and evaluate → You understand basic parsing
  2. AND/OR with correct precedence → You understand operator precedence
  3. Parentheses override precedence → You understand recursive grouping
  4. Invalid expressions give helpful errors → You understand error handling

Real World Outcome

You can store filter strings in a database and evaluate them against data safely and predictably.

expr = 'status == "active" AND (role == "admin" OR permissions CONTAINS "write")'
ast = parse_filter(expr)
print(ast.pretty())
results = [u for u in users if evaluate(ast, u)]

The Core Question You’re Answering

“How do I parse and evaluate boolean expressions with correct precedence and grouping?”

Concepts You Must Understand First

Stop and research these before coding:

  1. Operator precedence parsing
    • How do AND/OR precedence rules affect the AST?
    • Book Reference: Language Implementation Patterns Ch. 5
  2. AST evaluation
    • How does a visitor walk a tree to compute a result?
    • Book Reference: Crafting Interpreters Ch. 7
  3. Extensible grammars
    • How will you add new operators later?
    • Book Reference: Crafting Interpreters Ch. 6
  4. Error messaging
    • What error context helps users fix expressions?
    • Book Reference: Language Implementation Patterns Ch. 4

Questions to Guide Your Design

  1. What is the grammar for comparisons and boolean logic?
  2. How will you represent operators in the AST?
  3. How will you handle unknown fields or invalid types?
  4. How will you add functions like contains(a, b)?

Thinking Exercise

Manually parse this into an AST:

age >= 21 AND status == "active" OR role == "admin"

The Interview Questions They’ll Ask

  1. How does precedence affect parse trees?
  2. What is a Pratt parser and when would you use it?
  3. How do you evaluate an AST efficiently?
  4. What is the difference between parsing and evaluation?
  5. How would you add a new operator to your language?

Hints in Layers

Hint 1: Start with comparisons only. Get field == value working before AND/OR.

Hint 2: Add precedence with a small parser table. Define binding powers for each operator.

Hint 3: Write a pretty-printer. If the tree looks right, evaluation will follow.
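
Hint 2 describes the core of a Pratt (precedence-climbing) parser. The sketch below is one way to write it, under several assumptions: the binding-power numbers are arbitrary as long as their order is right, the AST is a nested tuple, leaves are tagged `("field", name)` or `("lit", value)`, and only integer and double-quoted string literals are supported.

```python
import re

# Higher number = binds tighter. AND binds tighter than OR.
BINDING_POWER = {"OR": 1, "AND": 2, "==": 3, "!=": 3, ">=": 3, "<=": 3,
                 ">": 3, "<": 3, "CONTAINS": 3}
TOKEN_RE = re.compile(r'\s*("[^"]*"|\d+|==|!=|>=|<=|[<>()]|\w+)')

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_expr(tokens, min_power=0):
    """Parse one expression, consuming tokens from the front of the list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        left = parse_expr(tokens, 0)          # parentheses reset the precedence
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif token == ")" or token in BINDING_POWER:
        raise SyntaxError(f"unexpected token {token!r}")
    elif token.startswith('"'):
        left = ("lit", token[1:-1])
    elif token.isdigit():
        left = ("lit", int(token))
    else:
        left = ("field", token)
    # Keep absorbing operators that bind tighter than the caller's operator.
    while tokens and BINDING_POWER.get(tokens[0], 0) > min_power:
        op = tokens.pop(0)
        left = (op, left, parse_expr(tokens, BINDING_POWER[op]))
    return left

def parse_filter(text):
    tokens = tokenize(text)
    tree = parse_expr(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree

print(parse_filter("a == 1 OR b == 2 AND c == 3"))
```

Adding a new binary operator then means adding one entry to `BINDING_POWER` (and a token pattern if it is not a plain word), which is the "extensible grammar" property this project is after.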

Books That Will Help

Topic                  Book                                Chapter
Operator precedence    Language Implementation Patterns    Ch. 5
AST evaluation         Crafting Interpreters               Ch. 7
Parsing strategies     Crafting Interpreters               Ch. 6
Error recovery         Language Implementation Patterns    Ch. 4

Project 4: Product Catalog Rules DSL

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Ruby, Elixir, TypeScript
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Business Rules / Constraint Systems
  • Software or Tool: Product Rules Engine
  • Main Book: “Domain Specific Languages” by Martin Fowler

What you’ll build: A DSL specifically designed for your use case—defining product relationships, bundles, constraints, and pricing rules:

# Product definitions
PRODUCT laptop "MacBook Pro 16" {
    category: electronics
    base_price: 2499.00
    attributes: [color, memory, storage]
}

PRODUCT case "Laptop Sleeve" {
    category: accessories
    base_price: 49.00
}

# Bundle rules
BUNDLE "Work From Home Kit" {
    includes: [laptop, case, keyboard, mouse]
    discount: 15%
    constraint: laptop.memory >= 16GB
}

# Compatibility rules
RULE laptop_case_compatibility {
    when: cart CONTAINS laptop
    suggest: case WITH message "Protect your investment!"
    discount: case BY 20% IF purchased_together
}

# Constraint rules
RULE storage_memory_constraint {
    when: laptop.storage == "2TB"
    require: laptop.memory >= 32GB
    error: "2TB storage requires at least 32GB memory"
}

# Pricing rules
RULE bulk_discount {
    when: cart.quantity(category: accessories) >= 3
    apply: 10% OFF category: accessories
}

Why it teaches DSLs: This is where DSL design becomes real business value. You’ll create a language that non-programmers can read and modify. The challenge is designing syntax that’s both expressive and unambiguous.

Core challenges you’ll face:

  • Domain modeling (what concepts exist? how do they relate?) → maps to language design
  • Rule evaluation order (which rules fire first?) → maps to execution semantics
  • Constraint propagation (if A requires B, and B requires C…) → maps to inference engines
  • Conflict resolution (two rules contradict each other) → maps to rule priority systems
  • User-friendly errors (business users need to understand what’s wrong) → maps to error UX

Key Concepts:

  • Semantic Model Design: “Domain Specific Languages” Chapters 11-12 - Martin Fowler
  • Rule-Based Systems: “Artificial Intelligence: A Modern Approach” Chapter 7 - Russell & Norvig
  • Forward Chaining: Martin Fowler’s “Rules Engine” article (when to use rules)
  • Grammar Design: “Language Implementation Patterns” Chapter 5 - Terence Parr

Difficulty: Intermediate-Advanced
Time estimate: 2-3 weeks
Prerequisites: Projects 2-3 concepts, understanding of business rules

Learning milestones:

  1. Products and bundles parse correctly → You understand domain-specific syntax
  2. Rules trigger on cart changes → You understand event-driven evaluation
  3. Constraint violations show clear errors → You understand validation design
  4. Complex rule interactions work correctly → You understand rule engines

Real World Outcome

You can run an interactive engine that evaluates cart changes, applies discounts, and explains why a rule fired.

$ ./catalog_engine products.rules --interactive
> add laptop
Added: MacBook Pro 16 ($2499.00)
💡 Suggestion: Protect your investment! Add "Laptop Sleeve" for $49.00

The Core Question You’re Answering

“How do I design a readable business language that non-programmers can safely use?”

Concepts You Must Understand First

Stop and research these before coding:

  1. Domain modeling
    • What are the nouns and verbs of the business?
    • Book Reference: Domain Specific Languages Ch. 11-12
  2. Rule evaluation order
    • How do priorities or phases affect outcomes?
    • Book Reference: Artificial Intelligence: A Modern Approach Ch. 7
  3. Constraint checking
    • How do you detect and report conflicts?
    • Book Reference: Engineering a Compiler Ch. 4
  4. Error UX
    • How do you present rule errors to non-technical users?
    • Book Reference: Domain Specific Languages Ch. 4

Questions to Guide Your Design

  1. What syntax is easiest for business users to read?
  2. How will you model rules: events, conditions, actions?
  3. How will you handle conflicting discounts or suggestions?
  4. What data structures make rule evaluation fast?

Thinking Exercise

Design a rule that enforces “if storage is 2TB, memory must be >= 32GB” and decide how to report violations.

The Interview Questions They’ll Ask

  1. What makes a DSL “business friendly”?
  2. How do you decide rule execution order?
  3. How do you detect conflicts between rules?
  4. What is the difference between validation and evaluation?
  5. How do you keep DSLs maintainable as the domain grows?

Hints in Layers

Hint 1: Start with a tiny grammar. Support only PRODUCT, RULE, and a when clause with a single action.

Hint 2: Separate parsing from evaluation. Build a clean AST and a separate evaluator.

Hint 3: Add tracing. Keep a log of which rules fired and why.
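
Hints 2 and 3 can be sketched with rules that are already parsed into plain objects. The `Rule` class, the cart dictionary layout, and the `memory_gb` key are assumptions for this example; in the real project the `when` callables would be produced by your evaluator from the parsed DSL rather than written by hand.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    when: Callable[[dict], bool]   # condition over the cart state
    message: str                   # shown to the user when the rule fires

def run_rules(rules, cart):
    """Evaluate every rule once and return (messages, trace)."""
    messages, trace = [], []
    for rule in rules:
        fired = rule.when(cart)
        trace.append(f"{rule.name}: {'fired' if fired else 'skipped'}")
        if fired:
            messages.append(rule.message)
    return messages, trace

rules = [
    Rule("laptop_case_compatibility",
         # Assumed refinement: only suggest the case if it is not already in the cart.
         lambda cart: "laptop" in cart["items"] and "case" not in cart["items"],
         "Protect your investment!"),
    Rule("storage_memory_constraint",
         lambda cart: cart.get("storage") == "2TB" and cart.get("memory_gb", 0) < 32,
         "2TB storage requires at least 32GB memory"),
]

messages, trace = run_rules(rules, {"items": ["laptop"], "storage": "2TB", "memory_gb": 16})
print(messages)
print(trace)
```

The trace records skipped rules as well as fired ones, which is usually what a business user needs when asking why an expected discount or warning did not appear.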

Books That Will Help

Topic                  Book                                          Chapter
DSL design             Domain Specific Languages                     Ch. 11-12
Rule systems           Artificial Intelligence: A Modern Approach    Ch. 7
Semantic validation    Engineering a Compiler                        Ch. 4
Error UX               Domain Specific Languages                     Ch. 4

Project 5: Macro-Based HTML DSL (Compile-Time DSL)

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: Elixir
  • Alternative Programming Languages: Rust (proc macros), Lisp/Clojure, Nim
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Metaprogramming / Macros
  • Software or Tool: HTML Template DSL
  • Main Book: “Metaprogramming Elixir” by Chris McCord

What you’ll build: A macro-based DSL that generates HTML at compile time, similar to Phoenix’s HEEx or React’s JSX:

defmodule MyPage do
  use HtmlDsl

  def render(user) do
    html do
      head do
        title "Welcome, #{user.name}!"
      end
      body class: "dark-mode" do
        div id: "main" do
          h1 "Hello, #{user.name}!"

          if user.admin? do
            div class: "admin-panel" do
              a "Admin Dashboard", href: "/admin"
            end
          end

          ul do
            for item <- user.items do
              li item.name
            end
          end
        end
      end
    end
  end
end

Why it teaches DSLs: Macros transform code at compile time, so the DSL’s tag structure is expanded before the program runs and adds little or no runtime overhead beyond building the final string. You’ll learn how Elixir’s quote and unquote work, how the AST is represented, and how to manipulate code as data.

Core challenges you’ll face:

  • Understanding AST representation (code is data in Elixir) → maps to homoiconicity
  • Quoting and unquoting (capturing vs. injecting code) → maps to macro hygiene
  • Recursive macro expansion (nested tags) → maps to macro composition
  • Compile-time validation (catch errors before runtime) → maps to static analysis
  • Integration with host language (using if, for inside DSL) → maps to seamless embedding

Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Elixir basics, understanding of compile vs. runtime

Learning milestones:

  1. Simple tags generate correct HTML → You understand basic macros
  2. Nested tags work recursively → You understand macro composition
  3. Attributes render correctly → You understand keyword arguments in AST
  4. Host language constructs (if/for) work inside DSL → You’ve mastered integration

Real World Outcome

You can render HTML from pure Elixir DSL code, and invalid syntax fails at compile time with helpful errors.

iex> MyPage.render(%User{name: "Alice", admin?: true, items: []})
"<html><head><title>Welcome, Alice!</title></head><body class=\"dark-mode\"><div id=\"main\"><h1>Hello, Alice!</h1><div class=\"admin-panel\"><a href=\"/admin\">Admin Dashboard</a></div><ul></ul></div></body></html>"

The Core Question You’re Answering

“How can compile-time macros transform a DSL into zero-overhead, safe code?”

Concepts You Must Understand First

Stop and research these before coding:

  1. AST representation
    • How does Elixir represent code as data?
    • Book Reference: Metaprogramming Elixir Ch. 1
  2. Macro hygiene
    • How do you avoid variable capture?
    • Book Reference: Metaprogramming Elixir Ch. 2
  3. Quote and unquote
    • How do you build and inject AST nodes?
    • Book Reference: Metaprogramming Elixir Ch. 1
  4. Compile-time validation
    • How do you report attribute errors early?
    • Book Reference: Metaprogramming Elixir Ch. 3

Questions to Guide Your Design

  1. What is the AST shape for a tag with attributes and children?
  2. How will you allow host language constructs inside the DSL?
  3. How will you validate tag names and attributes?
  4. What does the final HTML string generation look like?

Thinking Exercise

Manually write the quoted AST for div class: "note", do: "Hello".

The Interview Questions They’ll Ask

  1. What is macro hygiene and why does it matter?
  2. How does a macro differ from a function?
  3. What are the benefits of compile-time DSLs?
  4. How do you validate DSL syntax during compilation?
  5. How do you embed host-language constructs in a DSL?

Hints in Layers

Hint 1: Start with a single tag. Implement div "text" before nested tags.

Hint 2: Reuse the host AST. Represent attributes as keyword lists and children as lists.

Hint 3: Add compile-time guards. Fail fast on unknown attributes.

Books That Will Help

Topic                      Book                         Chapter
Macro basics               Metaprogramming Elixir       Ch. 1-2
AST manipulation           Metaprogramming Elixir       Ch. 2
Compile-time validation    Metaprogramming Elixir       Ch. 3
DSL patterns               Domain Specific Languages    Ch. 27

Project 6: Template Engine with Custom Syntax

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, C, Zig
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Parsing / Code Generation
  • Software or Tool: Template Engine (like Jinja2/Handlebars)
  • Main Book: “Crafting Interpreters” by Robert Nystrom

What you’ll build: A full template engine with custom syntax, variable interpolation, control flow, and includes:

{# This is a comment #}
<html>
<head><title>{{ page.title }}</title></head>
<body>
  <h1>Hello, {{ user.name | uppercase }}!</h1>

  {% if user.is_authenticated %}
    <nav>
      {% for item in menu_items %}
        <a href="{{ item.url }}"
           {% if item.active %}class="active"{% endif %}>
          {{ item.label }}
        </a>
      {% endfor %}
    </nav>
  {% else %}
    <a href="/login">Sign In</a>
  {% endif %}

  {% include "footer.html" %}
</body>
</html>

Why it teaches DSLs: This combines everything: lexing, parsing, AST construction, and code generation. Template engines are one of the most practical DSLs you can build—every web framework has one.

Core challenges you’ll face:

  • Mixed-mode lexing (switching between text and code) → maps to lexer states/modes
  • Expression parsing (variable access, filters, function calls) → maps to expression grammars
  • Control flow compilation (if/for become code) → maps to bytecode/IR generation
  • Filter/pipe system (value | filter1 | filter2) → maps to function composition
  • Template inheritance (extends, blocks) → maps to symbol tables and scoping
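
The filter/pipe challenge above amounts to applying functions left to right. A minimal sketch in Python (for brevity; the project itself targets Rust), assuming a hypothetical `FILTERS` registry and a flat context dictionary without dotted property access:

```python
FILTERS = {
    "uppercase": str.upper,
    "trim": str.strip,
    "length": len,
}

def apply_filters(value, filter_names):
    """Apply filters left to right, as in {{ value | trim | uppercase }}."""
    for name in filter_names:
        if name not in FILTERS:
            raise KeyError(f"unknown filter: {name}")
        value = FILTERS[name](value)
    return value

def render_expression(source, context):
    """Evaluate 'name | filter | filter' against a flat context dict."""
    variable, *filter_names = [part.strip() for part in source.split("|")]
    return apply_filters(context[variable], filter_names)

print(render_expression("name | trim | uppercase", {"name": "  alice "}))  # ALICE
```

Splitting on `|` is only safe while expressions contain no string literals or other uses of that character; a real implementation would parse the pipe as an operator in the expression grammar instead.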

Key Concepts:

  • Lexer Modes: “Crafting Interpreters” Chapter 4 (extended) - Robert Nystrom
  • Expression Parsing: “Language Implementation Patterns” Chapter 5 - Terence Parr
  • Bytecode Compilation: “Crafting Interpreters” Chapters 14-15 - Robert Nystrom
  • Template Compilation: Study Jinja2’s Template Designer Documentation

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Strong parsing skills, Projects 2-3, systems programming basics

Learning milestones:

  1. Variable interpolation works → You understand basic expression evaluation
  2. Control flow (if/for) works → You understand compiling control structures
  3. Filters chain correctly → You understand function composition
  4. Compiled templates run fast → You understand bytecode benefits

Real World Outcome

You can render templates from data, or compile them to bytecode for fast repeated execution.

$ ./template_engine template.html data.json
<h1>My List</h1>
  <p>Apple</p>
  <p>Banana</p>
  <p>Cherry</p>

$ ./template_engine --compile template.html -o template.tplc
Compiled: template.html -> template.tplc (47 instructions)

The Core Question You’re Answering

“How do I parse mixed text/code templates and execute them efficiently?”

Concepts You Must Understand First

Stop and research these before coding:

  1. Lexer modes
    • How do you switch between text and code tokens?
    • Book Reference: Crafting Interpreters Ch. 4
  2. Expression grammars
    • How do you parse filters and property access?
    • Book Reference: Language Implementation Patterns Ch. 5
  3. Template compilation
    • How do you lower templates to a faster form?
    • Book Reference: Crafting Interpreters Ch. 14-15
  4. Scoping
    • How do loop variables and nested blocks resolve?
    • Book Reference: Engineering a Compiler Ch. 5

Questions to Guide Your Design

  1. What tokens separate text from code ({{ }}, {% %})?
  2. How will you represent templates in the AST?
  3. What does “compiled template” mean in your system?
  4. How will you handle missing variables?

Thinking Exercise

Take this snippet and list the token stream (text vs code):

Hello {{ user.name }}{% if user.admin %} (admin){% endif %}

The Interview Questions They’ll Ask

  1. How do lexer modes work?
  2. What is the difference between interpretation and compilation here?
  3. How would you cache compiled templates?
  4. How do you handle escaping and safety in templates?
  5. How do you test template engines?

Hints in Layers

Hint 1: Build a token stream with modes. Switch mode on {{, {%, and end tokens.

Hint 2: Start with rendering only. Interpret the AST before adding bytecode compilation.

Hint 3: Add bytecode as a second backend. Reuse the same AST, different backend.
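
Hint 1 can be sketched as a lexer with an explicit `mode` variable. This version is in Python for brevity, the token kinds `TEXT`, `EXPR`, and `STMT` are names chosen for the example, and comment tags (`{# ... #}`) are left out to keep it short.

```python
def lex_template(source):
    """Split a template into (kind, text) tokens by switching lexer modes."""
    closers = {"EXPR": "}}", "STMT": "%}"}
    tokens, pos, mode = [], 0, "TEXT"
    while pos < len(source):
        if mode == "TEXT":
            # Find the nearest opening delimiter, if any.
            starts = [i for i in (source.find("{{", pos), source.find("{%", pos)) if i != -1]
            end = min(starts) if starts else len(source)
            if end > pos:
                tokens.append(("TEXT", source[pos:end]))
            if end < len(source):
                mode = "EXPR" if source.startswith("{{", end) else "STMT"
                end += 2                       # skip the opening delimiter
            pos = end
        else:
            end = source.find(closers[mode], pos)
            if end == -1:
                raise SyntaxError(f"unclosed {mode} tag")
            tokens.append((mode, source[pos:end].strip()))
            pos, mode = end + 2, "TEXT"        # back to text mode after the closer
    if mode != "TEXT":
        raise SyntaxError(f"unclosed {mode} tag")
    return tokens

for token in lex_template("Hi {{ name }}!{% if admin %}*{% endif %}"):
    print(token)
```

In a fuller implementation the `EXPR` and `STMT` bodies would themselves be tokenized by a second, code-mode lexer rather than kept as raw strings.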

Books That Will Help

Topic                   Book                                Chapter
Lexer modes             Crafting Interpreters               Ch. 4
Expression parsing      Language Implementation Patterns    Ch. 5
Bytecode compilation    Crafting Interpreters               Ch. 14-15
Scoping rules           Engineering a Compiler              Ch. 5

Final Project: Complete Business Rules Engine

  • File: DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, Elixir, C++
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 5: Master
  • Knowledge Area: Language Design / Rule Systems
  • Software or Tool: Business Rules Engine
  • Main Book: “Language Implementation Patterns” by Terence Parr + “Domain Specific Languages” by Martin Fowler

What you’ll build: A production-grade business rules engine combining everything you’ve learned:

# Domain model
ENTITY Customer {
    id: UUID
    name: String
    tier: Enum(bronze, silver, gold, platinum)
    total_spent: Decimal
    registered_at: DateTime
    tags: List<String>
}

ENTITY Order {
    id: UUID
    customer: Customer
    items: List<OrderItem>
    total: Decimal
    created_at: DateTime
}

# Derived facts (computed at runtime)
FACT customer_lifetime_days(c: Customer) =
    days_between(c.registered_at, now())

FACT order_value_category(o: Order) =
    CASE
        WHEN o.total >= 1000 THEN "high"
        WHEN o.total >= 100 THEN "medium"
        ELSE "low"
    END

# Rules with priorities
RULESET pricing_rules {
    priority: 100  # Higher = runs first

    RULE loyal_customer_discount {
        WHEN customer_lifetime_days(order.customer) > 365
         AND order.customer.tier IN (gold, platinum)
        THEN
            APPLY discount(10%) TO order
            LOG "Applied loyal customer discount"
    }

    RULE bulk_order_discount {
        WHEN order.items.count() >= 10
        THEN
            APPLY discount(5%) TO order
    }

    # Rules can conflict - engine handles it
    RULE max_discount_cap {
        WHEN order.total_discount > 20%
        THEN
            SET order.total_discount = 20%
            LOG WARNING "Discount capped at 20%"
    }
}

RULESET fraud_detection {
    priority: 200  # Runs before pricing
    mode: short_circuit  # Stop on first match

    RULE suspicious_velocity {
        WHEN order.customer.orders_last_hour() > 5
        THEN
            FLAG order AS suspicious
            REQUIRE manual_review
            STOP  # Don't process more rules
    }
}

Why it teaches DSLs at the highest level: This project requires mastery of:

  • Complex grammar design
  • Type systems and semantic analysis
  • Efficient rule evaluation (Rete algorithm)
  • Conflict resolution strategies
  • Debugging and explanation facilities (“why did this rule fire?”)

Core challenges you’ll face:

  • Type system implementation (ensuring rules are type-safe) → maps to semantic analysis
  • Efficient pattern matching (Rete algorithm for rule networks) → maps to optimization
  • Conflict resolution (what happens when rules contradict?) → maps to priority systems
  • Explanation facility (“why did I get this result?”) → maps to debugging DSLs
  • Hot reloading (update rules without restart) → maps to runtime compilation
  • Performance at scale (thousands of rules, millions of facts) → maps to indexing/caching
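
To make the type-safety challenge concrete, here is a small Rust sketch of checking a rule condition before it ever runs. Everything in it is an assumption for illustration: the `Type` and `Expr` enums, the `type_of` function, and the idea of a flat schema keyed by dotted field paths such as `order.total` are not prescribed by the project.

```rust
use std::collections::HashMap;

/// The value types this sketch knows about (a subset of the DSL's types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Decimal,
    Str,
    Bool,
}

/// A tiny condition AST (hypothetical shape).
enum Expr {
    Field(String),             // e.g. order.total
    Num(f64),                  // decimal literal
    Text(String),              // string literal
    Gt(Box<Expr>, Box<Expr>),  // lhs > rhs
    And(Box<Expr>, Box<Expr>), // lhs AND rhs
}

/// Computes the type of an expression or reports the first type error.
fn type_of(expr: &Expr, schema: &HashMap<String, Type>) -> Result<Type, String> {
    match expr {
        Expr::Field(name) => schema
            .get(name)
            .copied()
            .ok_or_else(|| format!("unknown field `{name}`")),
        Expr::Num(_) => Ok(Type::Decimal),
        Expr::Text(_) => Ok(Type::Str),
        Expr::Gt(lhs, rhs) => {
            let (l, r) = (type_of(lhs, schema)?, type_of(rhs, schema)?);
            if l == Type::Decimal && r == Type::Decimal {
                Ok(Type::Bool)
            } else {
                Err(format!("`>` expects Decimal operands, found {l:?} and {r:?}"))
            }
        }
        Expr::And(lhs, rhs) => {
            let (l, r) = (type_of(lhs, schema)?, type_of(rhs, schema)?);
            if l == Type::Bool && r == Type::Bool {
                Ok(Type::Bool)
            } else {
                Err(format!("`AND` expects Bool operands, found {l:?} and {r:?}"))
            }
        }
    }
}
```

A rule's WHEN clause would be accepted only if `type_of` returns `Bool`. Comparing a string field such as `order.customer.name` against a number is rejected at load time instead of failing during rule evaluation.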

Key Concepts:

  • Rete Algorithm: “Production Matching for Large Learning Systems” - Robert Doorenbos (PhD thesis)
  • Type Systems: “Types and Programming Languages” Chapters 1-3 - Benjamin Pierce
  • Semantic Analysis: “Engineering a Compiler” Chapter 4 - Cooper & Torczon
  • Rule Engine Architecture: Business Rules Engine Comparison 2024

  • Difficulty: Expert (Master level)
  • Time estimate: 1-2 months
  • Prerequisites: All previous projects, strong CS fundamentals

Learning milestones:

  1. Rules parse and type-check → You understand semantic analysis
  2. Basic forward chaining works → You understand rule evaluation
  3. Rete algorithm improves performance → You understand optimization
  4. Explanation facility works → You understand rule tracing
  5. Hot reload doesn’t lose state → You understand incremental compilation

Real World Outcome

You can load a ruleset, insert facts, run inference, and get human-readable explanations of why a rule fired.

$ ./rules_engine --load business.rules --repl
> fact Customer { id: "c1", tier: gold, total_spent: 5000.00, registered_at: "2020-01-15" }
> fact Order { id: "o1", customer: @c1, total: 500.00 }
> run
Applied loyal_customer_discount: -$50.00
Final order total: $450.00

The Core Question You’re Answering

“How do I build a scalable, explainable rules language that can power real business decisions?”

Concepts You Must Understand First

Stop and research these before coding:

  1. Type systems
    • How do you validate rule inputs and outputs?
    • Book Reference: Types and Programming Languages Ch. 1-3
  2. Rete algorithm
    • How does a rule network speed up matching?
    • Book Reference: Doorenbos, Production Matching for Large Learning Systems
  3. Conflict resolution
    • How do you pick which rule wins?
    • Book Reference: Domain Specific Languages Ch. 14
  4. Explainability
    • How do you record “why” a rule fired?
    • Book Reference: Designing Data-Intensive Applications Ch. 12 (auditability)

Questions to Guide Your Design

  1. What is your fact model and how do you index it?
  2. How will you represent rules so they can be optimized?
  3. How will you detect and resolve conflicting rules?
  4. What data will you store for explainability?
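
As one possible starting point for the first question, the Rust sketch below keeps customer facts in a vector and maintains a secondary index by tier. The `FactStore` type, its methods, and the choice to index on `tier` are illustrative assumptions. A real engine would pick indexes based on which fields its rule conditions actually test.

```rust
use std::collections::HashMap;

/// Customer tiers, mirroring the Enum in the DSL example.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Tier {
    Bronze,
    Silver,
    Gold,
    Platinum,
}

/// A trimmed-down customer fact (only the fields this sketch needs).
struct Customer {
    id: String,
    tier: Tier,
}

/// Fact storage with one secondary index (hypothetical design).
#[derive(Default)]
struct FactStore {
    customers: Vec<Customer>,
    by_tier: HashMap<Tier, Vec<usize>>, // positions into `customers`
}

impl FactStore {
    /// Inserts a fact and updates the index in the same step.
    fn insert(&mut self, customer: Customer) {
        let position = self.customers.len();
        self.by_tier.entry(customer.tier).or_default().push(position);
        self.customers.push(customer);
    }

    /// Returns all customers with the given tier without scanning every fact.
    fn with_tier(&self, tier: Tier) -> Vec<&Customer> {
        self.by_tier
            .get(&tier)
            .map(|positions| positions.iter().map(|&p| &self.customers[p]).collect())
            .unwrap_or_default()
    }
}
```

A condition such as `order.customer.tier IN (gold, platinum)` could then be answered with two index lookups instead of a scan over every customer.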

Thinking Exercise

Define two conflicting rules (one applies a discount, another caps it). Decide how your engine resolves the conflict and how you explain the outcome.
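
One way to work through this exercise is shown in the Rust sketch below. It assumes a specific policy that the project does not mandate: discounts stack additively, and the cap is applied once after all discounts have been collected. `Resolution` and `resolve_discounts` are hypothetical names, and percentages are whole numbers to keep the arithmetic exact.

```rust
/// The outcome of stacking discounts against a cap, plus a readable explanation.
#[derive(Debug, PartialEq)]
struct Resolution {
    applied_percent: u32,
    explanation: Vec<String>,
}

/// Sums the requested discounts, then applies the cap as the final step.
/// `requested` holds (rule name, percent) pairs in firing order.
fn resolve_discounts(requested: &[(&str, u32)], cap_percent: u32) -> Resolution {
    let mut explanation = Vec::new();
    let mut total: u32 = 0;
    for &(rule, percent) in requested {
        total += percent;
        explanation.push(format!(
            "{rule} requested {percent}% (running total {total}%)"
        ));
    }
    if total > cap_percent {
        // The cap rule wins the conflict; record why the total changed.
        explanation.push(format!("cap rule reduced {total}% to {cap_percent}%"));
        total = cap_percent;
    }
    Resolution {
        applied_percent: total,
        explanation,
    }
}
```

With requests of 10% from `loyal_customer_discount`, 5% from `bulk_order_discount`, and 10% from a hypothetical third rule, a 20% cap yields an applied discount of 20% and four explanation lines: one per request plus one for the cap. Other policies, such as "highest single discount wins", are equally valid. What matters is that the choice is deterministic and visible in the explanation.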

The Interview Questions They’ll Ask

  1. What is the Rete algorithm and why is it useful?
  2. How do you ensure rules are type-safe?
  3. How do you handle conflicting rules?
  4. How do you explain a decision made by a rules engine?
  5. What are the tradeoffs between forward and backward chaining?

Hints in Layers

Hint 1: Start with a simple forward-chaining engine. Get correctness before performance.

Hint 2: Add rule priorities. Encode conflict resolution as a deterministic ordering.

Hint 3: Log rule traces. Keep a structured trace of conditions and actions.
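
The three hints can be combined in a naive forward-chaining loop, sketched below in Rust. This is deliberately not Rete: it re-checks every rule after each firing. All names (`Facts`, `Rule`, `TraceEntry`, `pricing_rules`, `run_rules`) are hypothetical. The per-rule priorities and the "each rule fires at most once" policy are also assumptions, since the DSL example assigns priorities per ruleset.

```rust
/// Working memory for a single order (far smaller than the DSL's entities).
#[derive(Debug)]
struct Facts {
    lifetime_days: u32,
    item_count: u32,
    discount_percent: u32,
}

/// A rule is a named condition/action pair with a priority.
struct Rule {
    name: &'static str,
    priority: i32,
    condition: fn(&Facts) -> bool,
    action: fn(&mut Facts),
}

/// One structured trace record per rule firing.
#[derive(Debug, PartialEq)]
struct TraceEntry {
    rule: &'static str,
    discount_before: u32,
    discount_after: u32,
}

/// Rules modeled loosely on the pricing_rules ruleset.
fn pricing_rules() -> Vec<Rule> {
    vec![
        Rule {
            name: "bulk_order_discount",
            priority: 10,
            condition: |f| f.item_count >= 10,
            action: |f| f.discount_percent += 5,
        },
        Rule {
            name: "max_discount_cap",
            priority: 100,
            condition: |f| f.discount_percent > 20,
            action: |f| f.discount_percent = 20,
        },
        Rule {
            name: "loyal_customer_discount",
            priority: 50,
            condition: |f| f.lifetime_days > 365,
            action: |f| f.discount_percent += 10,
        },
    ]
}

/// Fires the highest-priority matching rule, then re-evaluates, until nothing matches.
fn run_rules(rules: &mut [Rule], facts: &mut Facts) -> Vec<TraceEntry> {
    // Deterministic ordering: higher priority first.
    rules.sort_by(|a, b| b.priority.cmp(&a.priority));
    let mut fired = vec![false; rules.len()];
    let mut trace = Vec::new();
    loop {
        // Pick the first rule that has not fired yet and whose condition holds.
        let next = rules
            .iter()
            .enumerate()
            .find(|(i, rule)| !fired[*i] && (rule.condition)(facts));
        let Some((i, rule)) = next else { break };
        let discount_before = facts.discount_percent;
        (rule.action)(facts);
        fired[i] = true;
        trace.push(TraceEntry {
            rule: rule.name,
            discount_before,
            discount_after: facts.discount_percent,
        });
    }
    trace
}
```

Starting from a customer with 400 lifetime days, 12 items, and an existing 10% discount, the loop fires `loyal_customer_discount`, then `bulk_order_discount`, and only then `max_discount_cap`, because the cap's condition becomes true only after the earlier firings changed the facts. The returned trace is the raw material for an explanation facility.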

Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Type systems | Types and Programming Languages | Ch. 1-3 |
| Rete algorithm | Doorenbos thesis | Ch. 2-4 |
| Rule conflicts | Domain Specific Languages | Ch. 14 |
| Explainability | Designing Data-Intensive Applications | Ch. 12 |

If you are a complete beginner to DSLs, work through the projects in this order:

Start Here: Project 1 (Fluent Query Builder)

Why: Minimal parsing, focuses on API design. You’ll understand the mindset of DSL design—making code read like prose—without getting lost in lexer details.

Then: Project 2 (Config Parser)

Why: Introduces the fundamental lexer → parser → data structure pipeline. This pattern applies to every DSL you’ll ever build.

The Critical Jump: Project 3 (Filter Expressions)

Why: This is where you learn AST construction and evaluation. If you can build this, you can build almost any expression language.

Your Goal: Project 4 (Product Rules DSL)

Why: This is the project the path has been building toward. By this point you’ll have the skills to design a custom language for a product catalog domain.

Optional Deep Dives:

  • Project 5 if you want compile-time metaprogramming (Elixir/Rust macros)
  • Project 6 if you want to build a template engine
  • Final Project if you want to build something enterprise-grade

Project Comparison Table

| Project | Difficulty | Time | Depth of Understanding | Fun Factor | Business Potential |
|---|---|---|---|---|---|
| Fluent Query Builder | Beginner | Weekend | ⭐⭐ | ⭐⭐⭐ | Micro-SaaS |
| Config File Parser | Beginner-Intermediate | Weekend | ⭐⭐⭐ | ⭐⭐ | Service & Support |
| Filter Expression Language | Intermediate | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Open Core |
| Product Catalog Rules DSL | Intermediate-Advanced | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Service & Support |
| Macro-Based HTML DSL | Advanced | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Resume Gold |
| Template Engine | Advanced | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Open Core |
| Final: Business Rules Engine | Expert | 1 month+ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Industry Disruptor |

Essential Resources

Books (In Order of Relevance)

  1. “Crafting Interpreters” by Robert Nystrom - FREE at craftinginterpreters.com
    • The absolute best introduction to building languages
    • Covers lexing, parsing, interpreters, bytecode VMs
    • Two complete implementations (Java and C)
  2. “Domain Specific Languages” by Martin Fowler
    • The definitive guide to DSL patterns
    • Covers internal DSLs, external DSLs, language workbenches
    • Use Chapter 35 for fluent interfaces and Chapters 11-12 for semantic models
  3. “Language Implementation Patterns” by Terence Parr
    • Practical patterns for language implementation
    • Created by the author of ANTLR
    • Great for understanding different parsing strategies
  4. “Metaprogramming Elixir” by Chris McCord
    • If you want to explore macro-based DSLs
    • Written by the creator of Phoenix
    • Excellent for understanding compile-time DSLs


Sources