Project 5: Macro-Based HTML DSL (Compile-Time DSL)
Build a compile-time HTML DSL with macro expansion, AST transformation, and static validation.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 4: Expert |
| Time Estimate | 1-2 weeks |
| Main Programming Language | Elixir |
| Alternative Programming Languages | Rust (proc macros), Lisp/Clojure, Nim |
| Coolness Level | Level 5: Pure Magic |
| Business Potential | 1. The “Resume Gold” |
| Prerequisites | AST basics, compile-time vs runtime distinction, recursion |
| Key Topics | Macros, quote/unquote, hygiene, compile-time checks |
1. Learning Objectives
By completing this project, you will:
- Represent HTML-like DSL syntax as host-language AST.
- Expand nested macro forms into deterministic runtime output.
- Enforce compile-time validation for tag/attribute misuse.
- Preserve hygiene and avoid accidental variable capture.
- Understand zero-runtime-overhead DSL patterns.
2. All Theory Needed (Per-Concept Breakdown)
Compile-Time Metaprogramming and AST Transformation
Fundamentals
Compile-time DSLs operate before runtime execution. Instead of parsing strings at runtime, macros transform source forms into host-language AST during compilation. This yields stronger validation and lower runtime overhead. The key mental model is “code as data”: syntax is represented as tree structures that can be inspected and rewritten. In Elixir and Lisp-like systems, this is explicit and ergonomic. For DSL builders, compile-time expansion means mistakes can be caught earlier and output can be optimized.
Deep Dive into the concept
Compilation typically involves parsing source to AST, macro expansion, semantic checks, and bytecode/native output. Your DSL participates in the expansion stage. A macro receives AST nodes, rewrites them, and returns transformed AST. The transformation must be deterministic and hygienic.
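To make the "macro receives AST" step concrete, here is a minimal sketch that uses Elixir's `quote` to show the tree a macro would be handed for a small tag form. No DSL is defined yet; `div` and `h1` are never called because `quote` only builds the tree.

```elixir
# `quote` returns the AST of its block instead of running it.
ast =
  quote do
    div class: "note" do
      h1 "Hello"
    end
  end

# Every call is a three-element tuple: {name, metadata, arguments}.
# The keyword list and the do-block arrive as two separate arguments.
{:div, _meta, [[class: "note"], [do: {:h1, _, ["Hello"]}]]} = ast

# Macro.to_string/1 turns the tree back into readable source text.
IO.puts(Macro.to_string(ast))
```

The pattern match in the middle is the same kind of shape check a tag-rewriting macro would perform on its input.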
For HTML DSL design, define a canonical AST for tags:
- tag name,
- attribute list,
- child node list,
- text/interpolation nodes.
Macro entrypoint (html do ... end) captures nested forms and rewrites them into calls over this canonical structure or direct iodata-like output structures. The expansion strategy can be direct string-building AST or intermediate IR then renderer. Intermediate IR is easier to validate and test.
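One possible shape for the entrypoint is sketched below, assuming the IR is plain tuples (`{:tag, name, attrs, children}`, `{:text, string}`, `{:dynamic, value}`). The module names, the tiny tag list, and the tuple layout are all assumptions made for illustration, not a prescribed design.

```elixir
defmodule HtmlIrSketch do
  # Hypothetical tag set, kept tiny for the sketch.
  @tags [:html, :body, :div, :h1, :p]

  # Entry point: `html do ... end` becomes the root tag of the IR.
  defmacro html(do: block) do
    tag_node(:html, [], children(block))
  end

  # `tag do ... end`
  defp expand({tag, _meta, [[do: body]]}) when tag in @tags do
    tag_node(tag, [], children(body))
  end

  # `tag attr: value do ... end`
  defp expand({tag, _meta, [attrs, [do: body]]}) when tag in @tags and is_list(attrs) do
    tag_node(tag, attrs, children(body))
  end

  # `tag "text"`
  defp expand({tag, _meta, [text]}) when tag in @tags and is_binary(text) do
    tag_node(tag, [], [{:text, text}])
  end

  # Anything else stays host code and is wrapped as a dynamic node.
  defp expand(other), do: {:dynamic, other}

  # A do-block with several expressions is a :__block__ node.
  defp children({:__block__, _meta, exprs}), do: Enum.map(exprs, &expand/1)
  defp children(expr), do: [expand(expr)]

  # Builds the AST of the tuple {:tag, name, attrs, children}.
  defp tag_node(tag, attrs, kids) do
    quote do
      {:tag, unquote(Atom.to_string(tag)), unquote(attrs), unquote(kids)}
    end
  end
end

defmodule HtmlIrDemo do
  import HtmlIrSketch

  def page do
    html do
      body class: "dark" do
        h1 "Welcome"
      end
    end
  end
end

IO.inspect(HtmlIrDemo.page())
```

Because the macro rewrites tag forms before the compiler tries to resolve them, names such as `div` never collide with `Kernel.div/2` in this approach.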
Compile-time validation can check:
- allowed attribute names per tag (optional strict profile),
- required attributes (img requires alt in accessibility mode),
- invalid nesting policies (if enforced),
- malformed macro arguments.
Because checks run at compile time, feedback arrives before deployment. That is a major product advantage.
Macro expansion must preserve source location where possible for diagnostics. Use metadata from input AST to surface errors at author line numbers.
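As a rough sketch of a compile-time check that reports the author's line, the macro below reads the `:line` entry from the offending node's metadata and raises `CompileError`. The `strict` macro, the `@allowed` table, and the message wording are hypothetical; only `CompileError` and its `file`, `line`, and `description` fields come from Elixir itself.

```elixir
defmodule AttrCheckSketch do
  # Hypothetical strict profile: tag name -> allowed attribute names.
  @allowed %{body: [:class, :id], img: [:src, :alt]}

  # Accepts a single tag form such as `body class: "dark"`.
  defmacro strict(do: {tag, meta, [attrs]}) when is_atom(tag) and is_list(attrs) do
    caller = __CALLER__
    allowed = Map.get(@allowed, tag, [])

    Enum.each(attrs, fn {name, _value} ->
      if name not in allowed do
        # Point at the author's line, falling back to the macro call site.
        raise CompileError,
          file: caller.file,
          line: Keyword.get(meta, :line, caller.line),
          description: "unknown attribute '#{name}' for tag '#{tag}'"
      end
    end)

    quote do
      {:tag, unquote(Atom.to_string(tag)), unquote(attrs), []}
    end
  end
end

# Compile a snippet with a misspelled attribute and print the diagnostic.
bad_source = """
require AttrCheckSketch

AttrCheckSketch.strict do
  body clas: "dark"
end
"""

try do
  Code.eval_string(bad_source)
rescue
  error in CompileError -> IO.puts(Exception.message(error))
end
```

The error is raised while the snippet is being expanded, so no code from the bad template ever runs.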
Deterministic expansion is critical for reproducible builds. Same input module should emit semantically identical output regardless of compile order. Avoid hidden global state in macros.
Security is another benefit: by constraining the DSL output path, you reduce injection risks relative to runtime string concatenation. You can enforce escaping at transformation level.
Limitations: compile-time DSLs are host-language-dependent and can be harder for newcomers. Debugging macro-expanded code requires tooling (AST inspection) and discipline. Therefore include helper debug macros to inspect expanded forms during learning.
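A helper debug macro of the kind mentioned above could look like the following sketch. `ExpandDebug`, `TagMacros`, and `show_expansion` are hypothetical names; `Macro.expand_once/2` and `Macro.to_string/1` are the standard-library calls doing the work.

```elixir
defmodule ExpandDebug do
  # Expands `ast` one step in the caller's environment, prints the result
  # at compile time, and returns it so runtime behaviour is unchanged.
  defmacro show_expansion(ast) do
    expanded = Macro.expand_once(ast, __CALLER__)
    IO.puts("expands to: " <> Macro.to_string(expanded))
    expanded
  end
end

defmodule TagMacros do
  # A stand-in DSL macro, present only so there is something to inspect.
  defmacro h1(text) do
    quote do
      {:tag, "h1", [], [{:text, unquote(text)}]}
    end
  end
end

defmodule ExpandDebugDemo do
  require ExpandDebug
  import TagMacros

  def run do
    ExpandDebug.show_expansion(h1("Hello"))
  end
end

IO.inspect(ExpandDebugDemo.run())
```

The printed text appears while `ExpandDebugDemo` is compiling, which is a useful reminder of when macro code actually runs.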
This project teaches a distinct DSL strategy compared to parser-based external DSLs. Instead of lexers/parsers, you leverage host compiler phases.
How this fits into the projects
- Primary architecture in this project.
- Contrasts with runtime parsers in Project 2.
- Useful for compile-time validation concepts in Project 7.
Definitions & key terms
- Macro: compile-time AST transformation function.
- Homoiconicity: code represented in structures manipulable as data.
- Expansion: rewriting macro forms to lower-level AST.
- Hygiene: preventing accidental variable capture during expansion.
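- Iodata/IR: iodata is Elixir's nested list of binaries and bytes that can be written out without concatenating strings; the IR is the intermediate representation that is validated and then rendered into it.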
Mental model diagram
DSL source form --> host parser AST --> macro expansion --> validated IR/AST --> runtime output
How it works
- Parse host-language module to AST.
- Encounter DSL macro call.
- Rewrite nested tag forms into canonical IR.
- Validate IR at compile time.
- Emit runtime rendering AST.
Minimal concrete example
Source DSL form:
div class: "note" do
  h1 "Hello"
end
Expanded idea:
render_tag("div", [class: "note"], [render_tag("h1", [], ["Hello"])])
Common misconceptions
- “Macros are just faster functions.” -> functions run at runtime, macros transform syntax pre-runtime.
- “Compile-time means no need for runtime tests.” -> runtime data-dependent behavior still needs tests.
- “Hygiene is optional.” -> hygiene bugs are hard to diagnose and can be severe.
Check-your-understanding questions
- Why can macros provide better error timing than runtime parsers?
- What is the risk of non-hygienic macro expansion?
- What benefit would you expect from an intermediate IR over direct string generation?
Check-your-understanding answers
- They fail during compilation before execution paths are reached.
- Variable capture and shadowing causing incorrect behavior.
- Easier validation, testing, and alternate renderer backends.
Real-world applications
- Phoenix/HEEx-style templating.
- compile-time route declarations.
- framework-specific declarative UI layers.
Where you’ll apply it
- §3.2 compile-time validation requirements.
- §4.1 macro pipeline architecture.
- §6.2 expansion snapshot tests.
References
- Chris McCord, Metaprogramming Elixir.
- Elixir School metaprogramming docs.
Key insights
Compile-time DSLs shift correctness and performance benefits left, at the cost of macro complexity.
Summary
Macro DSLs are language engineering inside compiler phases, offering early feedback and efficient output generation.
Homework/Exercises to practice the concept
- Sketch canonical tag IR.
- Define three compile-time validation rules.
- Write one non-hygienic expansion scenario.
Solutions to the homework/exercises
- Tag(name, attrs, children) plus Text nodes.
- unknown attr, invalid arg shape, forbidden nesting.
- expansion introduces local variable named x conflicting with caller x.
Macro Hygiene, Validation, and Host-Language Integration
Fundamentals
Macro hygiene prevents transformed code from accidentally capturing or shadowing variables in caller scope. Integration means DSL blocks can still use host constructs (if, loops, interpolation) without breaking expansion semantics. Validation ensures misuse fails clearly at compile time. Together, these determine whether a macro DSL is practical or fragile.
Deep Dive into the concept
Hygiene issues appear when generated identifiers collide with caller names. Solutions include generated unique names and the explicit variable scoping controls provided by the host macro system. In Elixir-style macros, quoting/unquoting controls which parts are literal AST and which are evaluated/injected.
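The difference between a hygienic and a leaky expansion can be seen in a few lines. In this sketch (module and macro names are illustrative), both macros assign to a variable called `x`; only the one that opts out of hygiene with `var!/1` changes the caller's `x`.

```elixir
defmodule HygieneDemo do
  # Hygienic by default: this `x` lives in the macro's own context.
  defmacro hygienic_set do
    quote do
      x = :from_macro
      x
    end
  end

  # var!/1 deliberately breaks hygiene and rebinds the caller's `x`.
  defmacro leaky_set do
    quote do
      var!(x) = :from_macro
    end
  end
end

defmodule HygieneCaller do
  require HygieneDemo

  def safe do
    x = :caller
    HygieneDemo.hygienic_set()
    x
  end

  def leaky do
    x = :caller
    HygieneDemo.leaky_set()
    x
  end
end

IO.inspect(HygieneCaller.safe())
IO.inspect(HygieneCaller.leaky())
```

`safe/0` still returns `:caller`, while `leaky/0` returns `:from_macro`. The compiler may warn about the overwritten binding in `leaky/0`, which is itself a hint that something captured the variable.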
Host integration is usually the hardest part. Users expect to place conditionals and loops inside DSL blocks naturally. Your macro must recognize those constructs and either preserve them as AST or convert them into IR nodes that evaluate at runtime. Avoid evaluating runtime expressions during compilation unless values are known constants.
Validation strategy should be layered:
- Syntax-shape validation in macro entry (argument arity/form).
- Structural validation after expansion to IR.
- Optional domain validation (allowed tags/attrs) configurable by strictness mode.
Diagnostics should include original source location and suggested fix. Example: unknown tag attribute should list allowed alternatives if strict schema exists.
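A suggested fix can be computed with a string-similarity score. The sketch below uses `String.jaro_distance/2` from the standard library; the `HintSketch` module and the 0.8 cutoff are assumptions chosen for illustration.

```elixir
defmodule HintSketch do
  # Returns the closest allowed name as a string, or nil when nothing
  # is similar enough. The 0.8 threshold is an arbitrary assumption.
  def suggest(unknown, allowed) do
    unknown = to_string(unknown)

    allowed
    |> Enum.map(&to_string/1)
    |> Enum.map(&{String.jaro_distance(unknown, &1), &1})
    |> Enum.filter(fn {score, _name} -> score >= 0.8 end)
    |> Enum.max_by(fn {score, _name} -> score end, fn -> nil end)
    |> case do
      nil -> nil
      {_score, name} -> name
    end
  end
end

case HintSketch.suggest(:clas, [:class, :id]) do
  nil -> IO.puts("[hint] no close match")
  name -> IO.puts("[hint] did you mean '#{name}'?")
end
```

Keeping the suggestion logic in a plain function means it can be unit-tested without compiling any DSL code.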
Escape handling is essential. Decide where escaping occurs (at node rendering stage, not ad-hoc in macros). Keep deterministic encoding rules.
Testing macros requires expansion tests and runtime output tests. Expansion tests assert transformed AST/IR structure for fixed input forms. Runtime tests assert produced HTML string for deterministic input data.
A practical integration pattern:
- macro expands DSL forms to IR-builder calls.
- runtime renderer interprets IR with escaped interpolation.
This balances compile-time validation and runtime flexibility.
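The runtime half of that pattern might look like the sketch below, assuming a tuple-based IR (`{:tag, name, attrs, children}`, `{:text, string}`, `{:dynamic, value}`). The module name and IR layout are illustrative. Escaping lives in exactly one place, and output is assembled as iodata rather than by string concatenation.

```elixir
defmodule IrRenderSketch do
  # Renders an IR tree to a binary, escaping all text and dynamic values.
  def render(ir), do: ir |> to_iodata() |> IO.iodata_to_binary()

  defp to_iodata({:tag, name, attrs, children}) do
    ["<", name, Enum.map(attrs, &attr/1), ">",
     Enum.map(children, &to_iodata/1), "</", name, ">"]
  end

  defp to_iodata({:text, text}), do: escape(text)
  defp to_iodata({:dynamic, value}), do: escape(to_string(value))

  defp attr({key, value}) do
    [" ", Atom.to_string(key), "=\"", escape(to_string(value)), "\""]
  end

  # "&" must be replaced first so later entities are not double-escaped.
  defp escape(text) do
    text
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
  end
end

ir = {:tag, "p", [class: "x"], [{:text, "Hi "}, {:dynamic, "<script>"}]}
IO.puts(IrRenderSketch.render(ir))
```

A renderer that also accepts host control flow would need two more clauses, for example one that skips `nil` and one that flattens lists produced by `for`.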
Be conservative with compile-time execution. Calling arbitrary user functions during expansion can produce non-deterministic builds and side effects. Keep macro phase pure.
As DSL grows, provide lints: warn on deeply nested tags, duplicate attributes, or deprecated constructs. These are excellent compile-time UX improvements.
How this fits into the projects
- Core correctness and usability for this project.
- hygiene and validation mindset transfers to Project 7 semantic checks.
Definitions & key terms
- Hygiene: macro expansion preserving lexical correctness.
- Quote/Unquote: mechanisms for constructing/injecting AST fragments.
- Strict mode: validation profile with tighter constraints.
- Expansion snapshot: stored representation of transformed AST for tests.
- Interpolation: injecting runtime values into rendered output safely.
Mental model diagram
Caller scope vars
|
DSL macro form --> hygienic expansion --> IR --> runtime render with escaped interpolation
How it works
- Receive the DSL block as AST (macro arguments arrive already quoted).
- Pattern-match DSL nodes.
- Build hygienic expanded AST.
- Validate nodes/attrs.
- Render at runtime with escaping.
Minimal concrete example
Rule:
if user.admin? do
  div class: "admin" do ... end
end
Expansion preserves `if` as runtime host AST and only rewrites tag forms.
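One way to get this passthrough behaviour is to walk the block with `Macro.postwalk/2` and rewrite only the nodes that look like tags. The sketch below assumes a tuple IR and a hypothetical `markup` macro with a three-tag whitelist; every other node, including `if`, is returned untouched and therefore runs as ordinary host code.

```elixir
defmodule PassthroughSketch do
  # Hypothetical tag set for the sketch.
  @tags [:div, :p, :span]

  defmacro markup(do: block) do
    # Children are visited before parents, so nested tags are already
    # rewritten when the enclosing tag is reached.
    Macro.postwalk(block, &rewrite/1)
  end

  # `tag attr: value do ... end`
  defp rewrite({tag, _meta, [attrs, [do: body]]}) when tag in @tags and is_list(attrs) do
    tag_ir(tag, attrs, kids(body))
  end

  # `tag "text"`
  defp rewrite({tag, _meta, [text]}) when tag in @tags and is_binary(text) do
    tag_ir(tag, [], [{:text, text}])
  end

  # Everything else (if, for, variables, calls) is left as host AST.
  defp rewrite(other), do: other

  defp kids({:__block__, _meta, exprs}), do: exprs
  defp kids(expr), do: [expr]

  defp tag_ir(tag, attrs, children) do
    quote do
      {:tag, unquote(Atom.to_string(tag)), unquote(attrs), unquote(children)}
    end
  end
end

defmodule PassthroughDemo do
  import PassthroughSketch

  def banner(user) do
    markup do
      if user.admin? do
        div class: "admin" do
          p "Admin tools"
        end
      end
    end
  end
end

IO.inspect(PassthroughDemo.banner(%{admin?: true}))
IO.inspect(PassthroughDemo.banner(%{admin?: false}))
```

An `if` without an `else` evaluates to `nil` when the condition is false, so a renderer consuming this IR has to treat `nil` as "emit nothing".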
Common misconceptions
- “Macros should evaluate all embedded expressions at compile time.” -> breaks runtime semantics.
- “Escaping can be skipped for trusted templates.” -> future input paths often change trust boundaries.
- “Hygiene problems are rare.” -> they appear quickly in nested macros.
Check-your-understanding questions
- Why preserve host control-flow AST instead of evaluating in macro?
- What kind of tests catch hygiene issues?
- Where should escaping happen for consistency?
Check-your-understanding answers
- Because runtime data may not be known at compile time.
- Expansion snapshots with deliberately conflicting variable names.
- In renderer layer applied to interpolated dynamic content.
Real-world applications
- compile-time UI/template DSLs.
- safe email template builders.
- routing/query declaration macros.
Where you’ll apply it
- §5.10 phase 2 and 3.
- §6.2 expansion and runtime tests.
References
- Metaprogramming Elixir.
- Elixir docs on macros and hygiene.
Key insights
Macro DSL quality depends more on hygiene and validation discipline than on syntax cleverness.
Summary
Reliable macro DSLs preserve host-language semantics, validate early, and avoid scope corruption.
Homework/Exercises to practice the concept
- Design one strict-mode validation table for 5 tags.
- Create a hygiene failure fixture and expected fix.
- Define expansion snapshot format.
Solutions to the homework/exercises
- include tag -> allowed attrs mapping.
- detect collision and generate unique identifiers.
- serialize simplified AST nodes to deterministic text.
3. Project Specification
3.1 What You Will Build
A compile-time HTML DSL package with nested tags, attributes, interpolation, and host control-flow support.
Included:
- macro entrypoint for DSL blocks.
- compile-time validation and diagnostics.
- runtime rendering from expanded IR.
Excluded:
- full HTML5 validation matrix.
- CSS/JS minification pipeline.
3.2 Functional Requirements
- Support nested tag composition in DSL syntax.
- Support attributes (keyword-style, as in class: "note") and text children.
- Integrate host if/for constructs inside DSL blocks.
- Compile-time detect malformed DSL forms.
- Deterministically render output for fixed input.
3.3 Non-Functional Requirements
- Performance: no parser cost at runtime; render overhead near plain template function.
- Reliability: expansion and output deterministic for fixtures.
- Usability: compile-time errors include source locations and hints.
3.4 Example Usage / Output
Input DSL:
html do
  body class: "dark" do
    h1 "Welcome"
  end
end
Output:
<html><body class="dark"><h1>Welcome</h1></body></html>
3.5 Data Formats / Schemas / Protocols
IR Node variants:
- Tag(name, attrs, children)
- Text(content)
- Dynamic(expr)
3.6 Edge Cases
- duplicate attribute keys.
- unsupported attribute value types.
- invalid nesting in strict mode.
- unescaped interpolation risks.
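The first two edge cases can be checked with small pure functions that a validator calls during expansion. This is a sketch under the assumption that attributes arrive as a keyword list; the module name and the set of accepted literal types are illustrative choices.

```elixir
defmodule AttrEdgeSketch do
  # Returns the attribute names that occur more than once.
  def duplicate_keys(attrs) do
    attrs
    |> Keyword.keys()
    |> Enum.frequencies()
    |> Enum.filter(fn {_key, count} -> count > 1 end)
    |> Enum.map(fn {key, _count} -> key end)
  end

  # Literal value types this sketch accepts; a real profile may differ.
  def supported_value?(value) do
    is_binary(value) or is_number(value) or is_boolean(value)
  end
end

IO.inspect(AttrEdgeSketch.duplicate_keys([class: "a", id: "x", class: "b"]))
IO.inspect(AttrEdgeSketch.supported_value?(%{}))
```

Non-literal attribute values (variables, function calls) appear as AST tuples at compile time, so a type check like `supported_value?/1` would need to run at render time for those.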
3.7 Real World Outcome
Developers author HTML structure in readable DSL form with compile-time feedback and fast runtime rendering.
3.7.1 How to Run (Copy/Paste)
cd project_based_ideas/COMPILERS_RUNTIMES/DOMAIN_SPECIFIC_LANGUAGES_DSL_PROJECTS
make p05-test
./bin/p05-macro-demo --fixture fixtures/p05_golden.dsl
3.7.2 Golden Path Demo (Deterministic)
Fixture expands to stable IR hash and exact HTML output.
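A stable hash can be derived from a deterministic text form of the IR. The sketch below is one assumed approach, not the scheme behind the hash shown in the transcript: it serializes the IR with `inspect/2`, digests it with `:erlang.md5/1`, and keeps the first eight hex characters.

```elixir
defmodule SnapshotSketch do
  # Deterministic text form of the IR; limits are lifted so large
  # trees are never truncated with "...".
  def snapshot(ir), do: inspect(ir, limit: :infinity, printable_limit: :infinity)

  # Short hex digest of the snapshot text.
  def hash(ir) do
    ir
    |> snapshot()
    |> :erlang.md5()
    |> Base.encode16(case: :lower)
    |> binary_part(0, 8)
  end
end

ir = {:tag, "h1", [], [{:text, "Welcome"}]}
IO.puts("expansion_hash=" <> SnapshotSketch.hash(ir))
```

Storing the full snapshot text next to the hash makes test failures far easier to read than a bare hash mismatch.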
3.7.3 If CLI: exact terminal transcript
$ ./bin/p05-macro-demo --fixture fixtures/p05_golden.dsl
[ok] expansion_hash=31f9a844
[ok] html=<html><body class="dark"><h1>Welcome, Alice!</h1></body></html>
exit=0
$ ./bin/p05-macro-demo --fixture fixtures/p05_bad_attr.dsl
[error] CompileError 12:9 unknown attribute 'clas' for tag 'body'
[hint] did you mean 'class'?
exit=2
4. Solution Architecture
4.1 High-Level Design
DSL macro call -> AST capture -> Hygienic expansion -> IR validation -> runtime render
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Macro entry | capture and dispatch DSL forms | pattern matching by node shape |
| IR builder | canonical representation | tag/text/dynamic node variants |
| Validator | compile-time checks | strict mode toggle |
| Renderer | runtime HTML output | escaping + deterministic ordering |
4.3 Data Structures (No Full Code)
ValidationError { span, code, message, hint }
ExpandedTemplate { ir_nodes, metadata }
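In Elixir these two records map naturally onto an exception struct and a plain struct. The sketch below assumes `span` is a `{line, column}` pair and borrows its output layout from the transcript in §3.7.3; the module names and the `format/1` helper are hypothetical.

```elixir
defmodule DslValidationError do
  # `span` is assumed to be {line, column} taken from AST metadata.
  defexception [:span, :code, :message, :hint]

  # Formats the error in the two-line style of the CLI transcript.
  def format(%__MODULE__{span: {line, column}} = error) do
    hint = if error.hint, do: "\n[hint] #{error.hint}", else: ""
    "[error] CompileError #{line}:#{column} #{error.message}" <> hint
  end
end

defmodule ExpandedTemplate do
  # `metadata` could carry file name, strictness profile, and so on.
  defstruct ir_nodes: [], metadata: %{}
end

error =
  DslValidationError.exception(
    span: {12, 9},
    code: :unknown_attr,
    message: "unknown attribute 'clas' for tag 'body'",
    hint: "did you mean 'class'?"
  )

IO.puts(DslValidationError.format(error))
```

Using `defexception` means the same value can be raised from inside a macro or collected into a list when several errors should be reported together.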
4.4 Algorithm Overview
Key Algorithm: Nested Tag Expansion
- Match tag macro form.
- Normalize attrs and children.
- Recursively expand child forms.
- Emit Tag IR node.
- Validate IR tree.
- Generate runtime rendering AST.
Complexity Analysis
- Expansion: O(n) AST nodes.
- Rendering: O(n + text_size).
5. Implementation Guide
5.1 Development Environment Setup
mkdir -p bin fixtures tests
5.2 Project Structure
p05-macro-html-dsl/
├── src/
│ ├── macro_entry.*
│ ├── expander.*
│ ├── ir.*
│ ├── validator.*
│ └── renderer.*
├── fixtures/
└── tests/
5.3 The Core Question You’re Answering
“How can compile-time macros express a rich DSL while keeping runtime behavior predictable and safe?”
5.4 Concepts You Must Understand First
- Quote/unquote AST mechanics.
- Hygiene and scope isolation.
- Compile-time diagnostics design.
- Runtime escaping strategy.
5.5 Questions to Guide Your Design
- What AST shape should represent tags and children?
- Which validations are compile-time vs runtime?
- How do host control-flow nodes pass through expansion?
- How will you test expansion determinism?
5.6 Thinking Exercise
Take one nested DSL snippet and hand-write expected IR tree including dynamic interpolations.
5.7 The Interview Questions They’ll Ask
- Macro vs function differences in practice?
- What is hygiene and how do you enforce it?
- Why use intermediate IR for compile-time DSLs?
- How does compile-time validation improve developer UX?
- What tradeoffs make macro DSLs hard to maintain?
5.8 Hints in Layers
Hint 1: implement one tag and one text child first.
Hint 2: build IR printer for debugging expansions.
Hint 3: add strict validation after expansion, not during parsing.
Hint 4: separate escape logic from expansion logic.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Macro fundamentals | Metaprogramming Elixir | Ch. 1-2 |
| DSL design patterns | Domain Specific Languages | internal DSL sections |
| compiler pipeline intuition | Crafting Interpreters | front-end architecture |
5.10 Implementation Phases
Phase 1: Foundation (3-5 hours)
- macro entry and basic tag expansion.
- runtime renderer for static nodes.
Checkpoint: one-page fixture renders correctly.
Phase 2: Core Functionality (6-10 hours)
- host control-flow passthrough.
- compile-time validator and error formatting.
Checkpoint: strict-mode fixtures pass/fail correctly.
Phase 3: Polish & Edge Cases (4-6 hours)
- expansion snapshots.
- escaping and interpolation hardening.
Checkpoint: deterministic expansion hash tests pass.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Output strategy | direct strings / iodata-like IR | IR then render | easier validation + optimization |
| Validation timing | runtime only / compile-time + runtime | both | early feedback + runtime safety |
| Strictness | fixed / configurable | configurable strict profiles | supports learning + production posture |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Expansion tests | AST/IR correctness | nested tags, control-flow |
| Validation tests | compile-time errors | unknown attrs, malformed nodes |
| Render tests | output correctness | escaped interpolation, deterministic output |
6.2 Critical Test Cases
- nested tag expansion with attributes.
- if and for integration inside DSL block.
- compile-time error on unknown attribute.
- escaped interpolation prevents raw HTML injection.
6.3 Test Data
fixtures/p05_golden.dsl
fixtures/p05_bad_attr.dsl
fixtures/p05_control_flow.dsl
fixtures/p05_escape.dsl
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| non-hygienic vars | random runtime misbehavior | generated unique names + scoped expansion |
| mixing expansion and rendering | hard-to-debug macros | strict IR boundary |
| weak error spans | confusing compile errors | preserve metadata from source AST |
7.2 Debugging Strategies
- macro expansion inspect mode for fixture snippets.
- IR snapshot diffing in tests.
7.3 Performance Traps
Building large concatenated strings eagerly can increase allocations; structured output buffering is preferable.
8. Extensions & Challenges
8.1 Beginner Extensions
- add fragment nodes.
- add conditional class helper macro.
8.2 Intermediate Extensions
- strict accessibility mode (required alt/aria rules).
- precompile template cache for hot paths.
8.3 Advanced Extensions
- source-map style error mapping from expanded AST.
- compile-time lints for deeply nested structures.
9. Real-World Connections
9.1 Industry Applications
- compile-time HTML/template generation.
- declarative UI DSLs.
9.2 Related Open Source Projects
- Phoenix LiveView/HEEx: https://hexdocs.pm/phoenix_live_view/
- Rust proc-macro ecosystem: https://doc.rust-lang.org/reference/procedural-macros.html
9.3 Interview Relevance
- metaprogramming architecture.
- compile-time validation strategies.
10. Resources
10.1 Essential Reading
- McCord, Metaprogramming Elixir.
- Elixir docs on macros and AST.
10.2 Video Resources
- macro internals deep dives.
- talks on compile-time DSL ergonomics.
10.3 Tools & Documentation
- macro expansion inspection tools.
- language AST docs.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain compile-time vs runtime responsibilities.
- I can describe hygiene with an example.
- I can justify my validation split.
11.2 Implementation
- nested tag expansion works.
- compile-time diagnostics include spans/hints.
- deterministic expansion tests pass.
11.3 Growth
- I can propose one strict-mode extension.
- I can discuss macro DSL tradeoffs in interviews.
- I can compare this approach to parser-based DSLs.
12. Submission / Completion Criteria
Minimum Viable Completion:
- macro expansion for static nested tags.
- one compile-time validation error and one successful render path.
Full Completion:
- host control-flow support, strict validation, deterministic snapshots.
Excellence (Going Above & Beyond):
- accessibility lint mode and advanced source mapping.
13. Additional Content Rules (Applied)
13.1 Determinism
Use fixed fixtures; verify expansion and render hashes.
13.2 Outcome Completeness
Include both compile success and compile failure demos with exit codes.
13.3 Cross-Linking
Concepts complement parser-based approaches in Project 2 and scale semantics in Project 7.
13.4 No Placeholder Text
All sections are concrete and implementable.