Project 10: Procedural Macro for Trait Reflection
Goal: Build a derive macro that generates compile-time metadata for Rust structs, enabling runtime reflection capabilities that Rust intentionally omits. Master the art of code that writes code.
- Main Programming Language: Rust
- Coolness Level: Level 3: Genuinely Clever
- Difficulty: Level 4: Expert
- Knowledge Area: Metaprogramming / Compiler Plugins
- Estimated Time: 1 week
- Prerequisites: Solid Rust fundamentals, basic trait understanding, familiarity with Cargo workspace structure
Learning Objectives
By completing this project, you will be able to:
- Distinguish declarative from procedural macros and explain when each is appropriate for metaprogramming tasks
- Navigate the TokenStream API to parse and generate Rust code at compile time
- Use the syn crate to parse Rust source code into a structured Abstract Syntax Tree (AST)
- Use the quote crate to generate Rust code from templates with proper hygiene
- Implement a derive macro that hooks into the Rust compiler's macro expansion phase
- Handle edge cases including generics, lifetimes, visibility modifiers, and error reporting
- Debug procedural macros using cargo expand and compile-time error messages
- Understand why proc-macro crates require special configuration and the compilation model behind them
- Apply reflection patterns used by production crates like serde, diesel, and bevy_reflect
- Compare Rust's compile-time reflection with runtime reflection in languages like Java, C#, and Python
Deep Theoretical Foundation
Before writing any code, you must understand the landscape of Rust's macro system and why procedural macros exist as a separate mechanism from declarative macros.
The Reflection Problem: What Other Languages Have
In many languages, you can inspect types at runtime:
+------------------------------------------------------------------------+
| RUNTIME REFLECTION IN OTHER LANGUAGES |
+------------------------------------------------------------------------+
| |
| JAVA: |
| Class<?> clazz = obj.getClass(); |
| for (Field field : clazz.getDeclaredFields()) { |
| System.out.println(field.getName() + ": " + field.getType()); |
| } |
| |
| PYTHON: |
| for attr in dir(obj): |
| print(f"{attr}: {type(getattr(obj, attr))}") |
| |
| C#: |
| foreach (var prop in obj.GetType().GetProperties()) { |
| Console.WriteLine($"{prop.Name}: {prop.PropertyType}"); |
| } |
| |
| JAVASCRIPT: |
| for (const key in obj) { |
| console.log(`${key}: ${typeof obj[key]}`); |
| } |
| |
+------------------------------------------------------------------------+
Rust intentionally does NOT have runtime reflection. Why?
- Zero-cost abstractions: Reflection requires storing metadata for every type in the binary, whether or not the program ever uses it
- Compile-time safety: Rust resolves types, field access, and privacy statically; built-in runtime inspection would let code bypass those checks
- Performance: Pervasive type metadata and dynamic lookups impose costs even on programs that never reflect
- Monomorphization: Generic types are specialized at compile time; there's no single "type" at runtime
But sometimes you NEED reflection-like capabilities:
- Serialization libraries (serde) need to iterate over struct fields
- ORM libraries (diesel) need to map struct fields to database columns
- Game engines (bevy) need to expose components to editors
- Debug tools need to inspect arbitrary types
The solution: Generate the metadata at compile time using procedural macros.
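To make the idea concrete, here is a hand-written sketch of the kind of metadata a derive macro could emit. The `Reflect` trait and `FieldInfo` struct below are simplified stand-ins invented for this illustration (names and type names only); they are not the project's final definitions.

```rust
/// Simplified metadata record (illustrative stand-in).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
}

/// Simplified reflection trait (illustrative stand-in).
pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
}

#[allow(dead_code)]
struct User {
    name: String,
    age: u32,
}

// Without a macro, an impl like this has to be written by hand for every
// struct and kept in sync with its fields. A derive macro emits the same
// code automatically at compile time.
impl Reflect for User {
    fn fields() -> &'static [FieldInfo] {
        // Only literals appear here, so the array is promoted to `'static`.
        &[
            FieldInfo { name: "name", type_name: "String" },
            FieldInfo { name: "age", type_name: "u32" },
        ]
    }
}

fn main() {
    for field in User::fields() {
        println!("{}: {}", field.name, field.type_name);
    }
}
```

Only types that carry such an impl pay for the metadata, which is the opt-in property the rest of this project builds on.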
Rust's Macro System: Two Distinct Worlds
Rust has two macro systems that serve different purposes:
+------------------------------------------------------------------------+
| RUST MACRO SYSTEM OVERVIEW |
+------------------------------------------------------------------------+
| |
| DECLARATIVE MACROS (macro_rules!) |
| ================================ |
| - Pattern matching on token trees |
| - Defined inline in your code |
| - Limited to token substitution |
| - Cannot inspect or analyze code structure |
| - Examples: vec![], println!(), assert!() |
| |
| Example: |
| macro_rules! create_getter { |
| ($field:ident : $ty:ty) => { |
| fn $field(&self) -> &$ty { |
| &self.$field |
| } |
| }; |
| } |
| |
| Limitations: |
| - Cannot iterate over struct fields (don't know what fields exist) |
| - Cannot generate code based on field count or types |
| - Limited error reporting: only compile_error!, with no control over spans |
| |
+------------------------------------------------------------------------+
| |
| PROCEDURAL MACROS (proc_macro) |
| ============================== |
| - Full Rust code that runs at compile time |
| - Must live in a separate crate (proc-macro = true) |
| - Receives TokenStream, returns TokenStream |
| - Can parse code, inspect structure, generate arbitrary output |
| - Examples: #[derive(Debug)], #[tokio::main], serde's derives |
| |
| Three Types: |
| 1. derive macros: #[derive(MyMacro)] |
| 2. attribute macros: #[my_attribute] |
| 3. function-like: my_macro!(...) |
| |
+------------------------------------------------------------------------+
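The `create_getter!` macro from the box above can be exercised as follows. This is a small sketch in which the `User` struct is an assumed example type. Note that every field has to be passed to the macro by hand, because a declarative macro sees only the tokens it is given.

```rust
macro_rules! create_getter {
    ($field:ident : $ty:ty) => {
        fn $field(&self) -> &$ty {
            &self.$field
        }
    };
}

struct User {
    name: String,
    age: u32,
}

impl User {
    // The macro cannot discover these fields on its own;
    // each one is spelled out at the call site.
    create_getter!(name: String);
    create_getter!(age: u32);
}

fn main() {
    let user = User { name: String::from("Ada"), age: 36 };
    println!("{} is {}", user.name(), user.age());
}
```

A derive macro removes that repetition: it receives the whole struct definition and can loop over the fields itself.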
Why Procedural Macros Need Separate Crates
This is one of the most confusing aspects for newcomers. Why can't proc macros live alongside regular code?
+------------------------------------------------------------------------+
| THE PROC-MACRO COMPILATION MODEL |
+------------------------------------------------------------------------+
| |
| Normal Rust Compilation: |
| |
| [Source Code] ---> [Compiler] ---> [Machine Code/WASM] |
| |
| Proc-Macro Compilation (Two-Phase): |
| |
| Phase 1: Compile the proc-macro crate |
| ========================================= |
| [proc-macro src] --> [Compiler] --> [.dylib/.so/.dll] |
| | |
| v |
| Dynamically Loadable Library |
| (Runs inside the compiler!) |
| |
| Phase 2: Use the proc-macro in your code |
| ========================================== |
| [Your code with #[derive(...)]] |
| | |
| v |
| [Compiler loads .dylib] |
| | |
| v |
| [Macro expands your code] |
| | |
| v |
| [Expanded code compiles normally] |
| |
| Key Insight: |
| The proc-macro crate is compiled for the HOST machine (where you |
| run rustc), not the TARGET machine. It runs INSIDE the compiler. |
| |
| This is why you can't have proc-macros in the same crate as |
| regular code - they have different compilation targets! |
| |
+------------------------------------------------------------------------+
Token Streams: The Raw Material of Macros
At the lowest level, Rust code is a stream of tokens. A TokenStream is a sequence of TokenTrees:
+------------------------------------------------------------------------+
| TOKEN STREAM STRUCTURE |
+------------------------------------------------------------------------+
| |
| Source Code: |
| struct User { |
| name: String, |
| age: u32, |
| } |
| |
| TokenStream Representation: |
| +---------+ |
| | Ident | "struct" |
| +---------+ |
| | Ident | "User" |
| +---------+ |
| | Group | Delimited by { } |
| | +------+-------------------------------------------+ |
| | | Ident | "name" | |
| | +---------+ | |
| | | Punct | ":" | |
| | +---------+ | |
| | | Ident | "String" | |
| | +---------+ | |
| | | Punct | "," | |
| | +---------+ | |
| | | Ident | "age" | |
| | +---------+ | |
| | | Punct | ":" | |
| | +---------+ | |
| | | Ident | "u32" | |
| | +---------+ | |
| | | Punct | "," | |
| | +----------------------------------------------+ | |
| +---------+ |
| |
| Token Types: |
| - Ident: Identifiers (struct, User, name, String) |
| - Punct: Punctuation (:, ,, ;, +, etc.) |
| - Literal: Literals ("hello", 42, 3.14) |
| - Group: Delimited groups ({}, [], ()) |
| |
+------------------------------------------------------------------------+
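The nesting shown above can be modeled in a few lines of ordinary Rust. The `TokenTree` enum here is a toy stand-in written for this illustration; it mirrors the four variant names of the real `proc_macro::TokenTree` but none of its API.

```rust
/// Toy stand-in for `proc_macro::TokenTree` (illustrative only, not the real API).
#[allow(dead_code)]
#[derive(Debug)]
enum TokenTree {
    Ident(&'static str),
    Punct(char),
    Literal(&'static str),
    /// The opening delimiter plus the nested tokens it encloses.
    Group(char, Vec<TokenTree>),
}

/// Counts leaf tokens, descending into groups.
fn count_leaves(tokens: &[TokenTree]) -> usize {
    tokens
        .iter()
        .map(|t| match t {
            TokenTree::Group(_, inner) => count_leaves(inner),
            _ => 1,
        })
        .sum()
}

/// Models `struct User { name: String, age: u32, }`.
fn user_struct_tokens() -> Vec<TokenTree> {
    use TokenTree::*;
    vec![
        Ident("struct"),
        Ident("User"),
        Group(
            '{',
            vec![
                Ident("name"), Punct(':'), Ident("String"), Punct(','),
                Ident("age"), Punct(':'), Ident("u32"), Punct(','),
            ],
        ),
    ]
}

fn main() {
    let tokens = user_struct_tokens();
    println!("top-level trees: {}", tokens.len());
    println!("leaf tokens: {}", count_leaves(&tokens));
}
```

The struct body is a single tree at the top level, which is why a macro working on raw tokens must recurse into groups to find the fields.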
The syn Crate: Parsing Tokens into AST
Raw tokens are hard to work with. The syn crate parses them into a structured AST:
+------------------------------------------------------------------------+
| SYN CRATE: TOKEN STREAM TO AST |
+------------------------------------------------------------------------+
| |
| TokenStream (raw tokens) |
| | |
| v |
| syn::parse_macro_input!(input as DeriveInput) |
| | |
| v |
| DeriveInput { |
| attrs: Vec<Attribute>, // #[...] attributes |
| vis: Visibility, // pub, pub(crate), etc. |
| ident: Ident, // The type name |
| generics: Generics, // <'a, T, U> |
| data: Data { // struct, enum, or union |
| Struct(DataStruct) | |
| Enum(DataEnum) | |
| Union(DataUnion) |
| } |
| } |
| |
| DataStruct { |
| fields: Fields { |
| Named(FieldsNamed) | // struct Foo { x: i32, y: i32 } |
| Unnamed(FieldsUnnamed) | // struct Foo(i32, i32) |
| Unit // struct Foo; |
| } |
| } |
| |
| Field { |
| attrs: Vec<Attribute>, // #[serde(rename = "...")] |
| vis: Visibility, // pub, etc. |
| ident: Option<Ident>, // Field name (None for tuple structs) |
| colon_token: Option<Token![:]>, |
| ty: Type, // The field's type |
| } |
| |
+------------------------------------------------------------------------+
DeriveInput Structure Walkthrough
Let's trace through a concrete example:
#[derive(Reflect)]
#[reflect(debug)]
pub struct User<'a, T: Clone> {
#[reflect(skip)]
pub name: &'a str,
pub age: u32,
metadata: T,
}
+------------------------------------------------------------------------+
| DERIVEINPUT FOR USER STRUCT |
+------------------------------------------------------------------------+
| |
| DeriveInput { |
| attrs: [ |
| Attribute { |
| // syn 2.x: the path and arguments live inside `meta` |
| meta: Meta::List { path: reflect, tokens: (debug) } |
| } |
| ], |
| vis: Visibility::Public, |
| ident: Ident("User"), |
| generics: Generics { |
| params: [ |
| LifetimeParam { lifetime: 'a }, |
| TypeParam { ident: T, bounds: [Clone] } |
| ], |
| where_clause: None |
| }, |
| data: Data::Struct(DataStruct { |
| fields: Fields::Named(FieldsNamed { |
| named: [ |
| Field { |
| attrs: [Attribute { meta: Meta::List { path: reflect, .. } }], |
| vis: Visibility::Public, |
| ident: Some(Ident("name")), |
| ty: Type::Reference(&'a str) |
| }, |
| Field { |
| attrs: [], |
| vis: Visibility::Public, |
| ident: Some(Ident("age")), |
| ty: Type::Path(u32) |
| }, |
| Field { |
| attrs: [], |
| vis: Visibility::Inherited, |
| ident: Some(Ident("metadata")), |
| ty: Type::Path(T) |
| } |
| ] |
| }) |
| }) |
| } |
| |
+------------------------------------------------------------------------+
The quote Crate: Generating Rust from Templates
Once you've analyzed the input, you need to generate output code. The quote! macro makes this ergonomic:
+------------------------------------------------------------------------+
| QUOTE CRATE: AST TO TOKEN STREAM |
+------------------------------------------------------------------------+
| |
| quote! Syntax: |
| |
| let name = format_ident!("User"); |
| let field_name = format_ident!("age"); |
| let field_type = quote!(u32); |
| |
| let output = quote! { |
| impl Reflect for #name { |
| fn fields() -> &'static [FieldInfo] { |
| &[ |
| FieldInfo { |
| name: stringify!(#field_name), |
| type_name: stringify!(#field_type), |
| } |
| ] |
| } |
| } |
| }; |
| |
| Key Features: |
| - #variable interpolates a variable into the output |
| - #(#items)* repeats for each item in an iterator |
| - Regular Rust syntax is preserved as tokens |
| - format_ident! creates new identifiers |
| |
| Repetition Example: |
| |
| let field_names = vec!["name", "age", "email"]; |
| let output = quote! { |
| &[#(#field_names),*] // Produces: &["name", "age", "email"] |
| }; |
| |
| With Separator: |
| #(#items),* // comma-separated |
| #(#items);* // semicolon-separated |
| #(#items)* // no separator |
| |
+------------------------------------------------------------------------+
Macro Hygiene and Identifier Scoping
Macro hygiene prevents macros from accidentally capturing or shadowing user variables:
+------------------------------------------------------------------------+
| MACRO HYGIENE EXPLAINED |
+------------------------------------------------------------------------+
| |
| PROBLEM: Without hygiene, macros can cause name collisions |
| |
| // User code |
| let result = 5; |
| my_macro!(); // Macro also defines 'result' internally |
| println!("{}", result); // Which result? |
| |
| SOLUTION: Hygiene gives each macro invocation its own "scope" |
| |
| Macro's 'result' variable: result#42 (synthetic identifier) |
| User's 'result' variable: result#0 (original identifier) |
| |
| They don't collide because they have different "hygiene marks" |
| |
| In proc macros, you control hygiene: |
| |
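| // Span::call_site() - what quote! uses by default; the identifier |
| // resolves as if written at the macro call site (unhygienic) |
| let ident = Ident::new("foo", Span::call_site()); |
| |
| // Span::mixed_site() - macro_rules!-style hygiene: local variables |
| // stay private to the expansion, everything else resolves at call site |
| let ident = Ident::new("foo", Span::mixed_site()); |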
| |
| // Using quote! - identifiers inherit span from interpolated vars |
| let field_name = &field.ident; // Has span from source |
| quote! { self.#field_name } // Error messages point to source |
| |
+------------------------------------------------------------------------+
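The collision scenario described above can be observed directly with a declarative macro. This is a minimal sketch; `shadow_attempt!` and `demo` are names invented for the example.

```rust
macro_rules! shadow_attempt {
    () => {
        // This `result` belongs to the macro's own hygiene context,
        // so it does not shadow a `result` declared by the caller.
        let result = 99;
        let _ = result;
    };
}

fn demo() -> i32 {
    let result = 5;
    shadow_attempt!();
    // Still refers to the caller's binding.
    result
}

fn main() {
    println!("result = {}", demo());
}
```

A procedural macro that emitted `let result = 99;` with `Span::call_site()` identifiers would not get this protection, which is why generated local names in proc macros are usually given unusual prefixes or created with `Span::mixed_site()`.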
The Token Stream Transformation Pipeline
Here's the complete flow of a derive macro:
+------------------------------------------------------------------------+
| TOKEN STREAM TRANSFORMATION PIPELINE |
+------------------------------------------------------------------------+
| |
| 1. User writes code with derive attribute |
| +----------------------------------+ |
| | #[derive(Reflect)] | |
| | struct User { | |
| | name: String, | |
| | age: u32, | |
| | } | |
| +----------------------------------+ |
| | |
| v |
| 2. Compiler tokenizes the input |
| +----------------------------------+ |
| | TokenStream: | |
| | [Ident("struct"), Ident("User"), | |
| | Group{...}] | |
| +----------------------------------+ |
| | |
| v |
| 3. Compiler calls your proc-macro function |
| +----------------------------------+ |
| | #[proc_macro_derive(Reflect)] | |
| | pub fn reflect_derive( | |
| | input: TokenStream | |
| | ) -> TokenStream | |
| +----------------------------------+ |
| | |
| v |
| 4. You parse tokens with syn |
| +----------------------------------+ |
| | let input = parse_macro_input!( | |
| | input as DeriveInput | |
| | ); | |
| +----------------------------------+ |
| | |
| v |
| 5. You analyze the AST |
| +----------------------------------+ |
| | let name = &input.ident; | |
| | let fields = extract_fields(...);| |
| +----------------------------------+ |
| | |
| v |
| 6. You generate output with quote |
| +----------------------------------+ |
| | quote! { | |
| | impl Reflect for #name { | |
| | fn fields() -> ... { | |
| | &[#(#field_infos),*] | |
| | } | |
| | } | |
| | } | |
| +----------------------------------+ |
| | |
| v |
| 7. Compiler appends generated code to module |
| +----------------------------------+ |
| | struct User { ... } | <- Original code |
| | impl Reflect for User { | <- Generated code |
| | fn fields() -> ... { ... } | |
| | } | |
| +----------------------------------+ |
| |
+------------------------------------------------------------------------+
Proc-Macro Compilation Phases
+------------------------------------------------------------------------+
| PROC-MACRO COMPILATION PHASES |
+------------------------------------------------------------------------+
| |
| Build starts: cargo build |
| | |
| v |
| +----------------------------------+ |
| | Phase 1: Compile proc-macro crate| |
| +----------------------------------+ |
| | - Cargo sees proc-macro = true | |
| | - Compiles for HOST target | |
| | - Produces .dylib/.so/.dll | |
| | - Links against libproc_macro | |
| +----------------------------------+ |
| | |
| v |
| +----------------------------------+ |
| | Phase 2: Load proc-macro | |
| +----------------------------------+ |
| | - rustc loads the .dylib | |
| | - Runs in the compiler process | |
| | - Has access to proc_macro API | |
| +----------------------------------+ |
| | |
| v |
| +----------------------------------+ |
| | Phase 3: Expand macros | |
| +----------------------------------+ |
| | - For each #[derive(Reflect)] | |
| | - Call reflect_derive(tokens) | |
| | - Insert output tokens | |
| +----------------------------------+ |
| | |
| v |
| +----------------------------------+ |
| | Phase 4: Compile expanded code | |
| +----------------------------------+ |
| | - All macros now expanded | |
| | - Normal compilation continues | |
| | - Compiles for TARGET platform | |
| +----------------------------------+ |
| |
+------------------------------------------------------------------------+
Real-World Examples: How Production Crates Use Proc Macros
serde: Serialization/Deserialization
#[derive(Serialize, Deserialize)]
struct User {
name: String,
#[serde(rename = "user_age")]
age: u32,
#[serde(skip)]
internal_id: u64,
}
// serde generates:
// - impl Serialize for User { ... }
// - impl Deserialize for User { ... }
// - Respects attributes like rename, skip, default
diesel: Database ORM
#[derive(Queryable, Insertable)]
#[diesel(table_name = users)]
struct User {
id: i32,
name: String,
created_at: NaiveDateTime,
}
// diesel generates (simplified; the real impls are generic over backend and SQL type):
// - impl Queryable<users::SqlType, Pg> for User { ... }
// - impl Insertable<users::table> for User { ... }
// - Type-safe SQL query building
bevy_reflect: Game Engine Reflection
#[derive(Reflect)]
struct Transform {
translation: Vec3,
rotation: Quat,
scale: Vec3,
}
// bevy_reflect generates:
// - impl Reflect for Transform { ... }
// - Field access by name: transform.field("translation")
// - Used by the editor to inspect/modify components
Compile-Time Reflection vs Runtime Reflection
+------------------------------------------------------------------------+
| REFLECTION: COMPILE-TIME VS RUNTIME |
+------------------------------------------------------------------------+
| |
| RUNTIME REFLECTION (Java, C#, Python) |
| ===================================== |
| |
| Pros: |
| + Can inspect ANY type without preparation |
| + Works with dynamically loaded code |
| + Simple API: obj.getClass().getFields() |
| |
| Cons: |
| - Binary bloat: metadata stored for ALL types |
| - Runtime overhead: type lookup on every access |
| - Breaks encapsulation: can access private fields |
| - Not type-safe: returns Object, needs casting |
| |
| Memory overhead example (Java): |
| class Point { int x, y; } |
| Stores: class name, field names, field types, method signatures... |
| Overhead: paid for every loaded class, even ones never reflected on |
| |
+------------------------------------------------------------------------+
| |
| COMPILE-TIME REFLECTION (Rust with proc-macros) |
| ================================================ |
| |
| Pros: |
| + Only types with derive have metadata (opt-in) |
| + No cost for types that do not opt in; lookups read static data |
| + Type-safe: field types known at compile time |
| + Errors caught at compile time |
| |
| Cons: |
| - Must annotate types that need reflection |
| - Cannot reflect on external types without wrappers |
| - Compile time increases with complex macros |
| - More complex to implement |
| |
| Memory overhead (Rust with #[derive(Reflect)]): |
| Only types you opt-in have metadata |
| Metadata is static, stored in read-only section |
| Overhead: controlled, only what you generate |
| |
+------------------------------------------------------------------------+
| |
| TRADE-OFF SUMMARY: |
| |
| Runtime Reflection: Convenient but costly |
| Compile-Time Reflection: Explicit but zero-cost |
| |
| Rust's philosophy: Pay only for what you use |
| |
+------------------------------------------------------------------------+
Real World Outcome
You'll be able to print the fields of a struct without manually writing a Debug implementation or using external reflection libraries. This is perfect for building your own serialization, GUI inspectors, or debugging tools.
Example Usage:
#[derive(Reflect)]
struct User {
name: String,
age: u32,
email: String,
}
#[derive(Reflect)]
struct Product {
id: u64,
price: f32,
in_stock: bool,
}
fn main() {
println!("=== User Reflection ===");
for field in User::fields() {
println!(" Field: {}, Type: {}", field.name, field.type_name);
}
println!("\n=== Product Reflection ===");
for field in Product::fields() {
println!(" Field: {}, Type: {}", field.name, field.type_name);
}
}
Console Output:
$ cargo run
=== User Reflection ===
Field: name, Type: alloc::string::String
Field: age, Type: u32
Field: email, Type: alloc::string::String
=== Product Reflection ===
Field: id, Type: u64
Field: price, Type: f32
Field: in_stock, Type: bool
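The `alloc::string::String` in the output comes from `std::any::type_name`, which reports the path where a type is actually defined rather than the name written in the source. The sketch below contrasts it with `stringify!`, which simply echoes source text; `string_names` is a helper invented for this example. The standard library documents the exact `type_name` string as not guaranteed, so it should be treated as diagnostic output.

```rust
use std::any::type_name;

/// Returns the compiler-provided name and the source-text name for `String`.
fn string_names() -> (&'static str, &'static str) {
    (type_name::<String>(), stringify!(String))
}

fn main() {
    let (resolved, as_written) = string_names();
    println!("type_name: {resolved}");
    println!("stringify: {as_written}");
}
```

A derive macro can use either form: `type_name` gives fully resolved paths, while `stringify!` on the field's type tokens gives exactly what the user typed.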
Generated Code (via cargo expand):
When you use #[derive(Reflect)], the macro generates code like this:
// Original code
#[derive(Reflect)]
struct User {
name: String,
age: u32,
email: String,
}
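// What the macro generates, shown with values already evaluated for readability.
// (`cargo expand` prints the unevaluated expressions. The offsets assume
// #[repr(C)], because the default layout may reorder fields.)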
struct User {
name: String,
age: u32,
email: String,
}
impl Reflect for User {
fn fields() -> &'static [FieldInfo] {
&[
FieldInfo {
name: "name",
type_name: "alloc::string::String",
offset: 0usize,
},
FieldInfo {
name: "age",
type_name: "u32",
offset: 24usize,
},
FieldInfo {
name: "email",
type_name: "alloc::string::String",
offset: 32usize,
},
]
}
fn type_name() -> &'static str {
"User"
}
fn field_count() -> usize {
3usize
}
}
Advanced Usage - Building a Generic Inspector:
fn inspect<T: Reflect>(type_name: &str) {
println!("\n+=======================================+");
println!("| Type Inspector: {:<22} |", type_name);
println!("+=======================================+");
println!("| Field Count: {:<24} |", T::field_count());
println!("+=======================================+");
for (i, field) in T::fields().iter().enumerate() {
println!("| [{}] {:<32} |", i, field.name);
println!("| Type: {:<27} |", field.type_name);
println!("| Offset: {} bytes{:<17} |", field.offset, "");
if i < T::field_count() - 1 {
println!("+---------------------------------------+");
}
}
println!("+=======================================+");
}
fn main() {
inspect::<User>("User");
inspect::<Product>("Product");
}
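Output (on a 64-bit target; the User offsets assume #[repr(C)], since the default layout may reorder fields):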
+=======================================+
| Type Inspector: User |
+=======================================+
| Field Count: 3 |
+=======================================+
| [0] name |
| Type: alloc::string::String |
| Offset: 0 bytes |
+---------------------------------------+
| [1] age |
| Type: u32 |
| Offset: 24 bytes |
+---------------------------------------+
| [2] email |
| Type: alloc::string::String |
| Offset: 32 bytes |
+=======================================+
+=======================================+
| Type Inspector: Product |
+=======================================+
| Field Count: 3 |
+=======================================+
| [0] id |
| Type: u64 |
| Offset: 0 bytes |
+---------------------------------------+
| [1] price |
| Type: f32 |
| Offset: 8 bytes |
+---------------------------------------+
| [2] in_stock |
| Type: bool |
| Offset: 12 bytes |
+=======================================+
Verification via cargo expand:
$ cargo install cargo-expand
$ cargo expand --lib
# Shows the exact code generated by your procedural macro
# Compare this to what you expected to verify correctness
Complete Project Specification
Build a working #[derive(Reflect)] macro that:
- Generates a trait implementation that provides metadata about struct fields
- Works with named structs (e.g., struct Foo { x: i32 })
- Handles generics and lifetimes (e.g., struct Foo<'a, T> { data: &'a T })
- Provides field offsets using std::mem::offset_of! or manual calculation
- Supports a helper attribute #[reflect(skip)] to exclude fields (the derive must register it: #[proc_macro_derive(Reflect, attributes(reflect))])
- Generates helpful error messages for unsupported inputs (enums, unions)
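The offset requirement can be tried out before any macro exists. This sketch assumes a toolchain where `std::mem::offset_of!` is stable (Rust 1.77 or later) and adds `#[repr(C)]` so the field order is fixed; `user_offsets` is a helper invented for the example.

```rust
use std::mem::{offset_of, size_of};

#[allow(dead_code)]
#[repr(C)] // fixes declaration order; without it the compiler may reorder fields
struct User {
    name: String,
    age: u32,
    email: String,
}

/// Offsets of `name`, `age`, and `email`, in declaration order.
fn user_offsets() -> [usize; 3] {
    [
        offset_of!(User, name),
        offset_of!(User, age),
        offset_of!(User, email),
    ]
}

fn main() {
    let [name, age, email] = user_offsets();
    println!("name @ {name}, age @ {age}, email @ {email}");
    println!("size_of::<User>() = {}", size_of::<User>());
}
```

Because `offset_of!` asks the compiler for the real layout, the generated metadata stays correct even for structs without `#[repr(C)]`, where summing field sizes by hand would give wrong answers.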
Solution Architecture
Workspace Structure
reflect-macro/
+-- Cargo.toml # Workspace manifest
+-- reflect-core/ # Core trait and types (regular library)
| +-- Cargo.toml
| +-- src/
| +-- lib.rs # Reflect trait, FieldInfo struct
+-- reflect-derive/ # Proc-macro crate
| +-- Cargo.toml # Contains [lib] proc-macro = true
| +-- src/
| +-- lib.rs # The derive macro implementation
+-- tests/ # Integration tests
+-- reflect_tests.rs
Core Trait Definition (reflect-core/src/lib.rs)
/// Information about a single field in a reflected struct
#[derive(Debug, Clone, Copy)]
pub struct FieldInfo {
/// The name of the field as a string
pub name: &'static str,
/// The full type name of the field
pub type_name: &'static str,
/// The offset of the field in bytes from the struct's start
pub offset: usize,
}
/// Trait for types that support compile-time reflection
pub trait Reflect {
/// Returns information about all reflected fields
fn fields() -> &'static [FieldInfo];
/// Returns the name of the type
fn type_name() -> &'static str;
/// Returns the number of fields
fn field_count() -> usize {
Self::fields().len()
}
}
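One wrinkle in this signature is worth sketching. Returning `&'static [FieldInfo]` from a borrowed array literal only works when every element is a constant expression, and a call such as `std::any::type_name::<T>()` is evaluated at runtime, so it does not qualify. A possible workaround, shown here as a hand-written impl for an assumed `Product` struct, is to build the table once inside a `std::sync::OnceLock`. This is one option among several, not necessarily the design the project intends.

```rust
use std::mem::offset_of;
use std::sync::OnceLock;

#[derive(Debug, Clone, Copy)]
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
    pub offset: usize,
}

pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
    fn type_name() -> &'static str;
    fn field_count() -> usize {
        Self::fields().len()
    }
}

#[allow(dead_code)]
struct Product {
    id: u64,
    price: f32,
    in_stock: bool,
}

impl Reflect for Product {
    fn fields() -> &'static [FieldInfo] {
        // Initialized on first call, then shared for the rest of the program.
        static FIELDS: OnceLock<Vec<FieldInfo>> = OnceLock::new();
        FIELDS.get_or_init(|| {
            vec![
                FieldInfo {
                    name: "id",
                    type_name: std::any::type_name::<u64>(),
                    offset: offset_of!(Product, id),
                },
                FieldInfo {
                    name: "price",
                    type_name: std::any::type_name::<f32>(),
                    offset: offset_of!(Product, price),
                },
                FieldInfo {
                    name: "in_stock",
                    type_name: std::any::type_name::<bool>(),
                    offset: offset_of!(Product, in_stock),
                },
            ]
        })
    }

    fn type_name() -> &'static str {
        "Product"
    }
}

fn main() {
    for field in Product::fields() {
        println!("{} ({}) @ {}", field.name, field.type_name, field.offset);
    }
}
```

A function-local `static` is shared by every instantiation of a generic impl, so this pattern fits non-generic types. For generic structs the entries need to be constant expressions (for example `stringify!` output), or the trait could return an owned `Vec` instead.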
Proc-Macro Crate Configuration (reflect-derive/Cargo.toml)
[package]
name = "reflect-derive"
version = "0.1.0"
edition = "2021"
[lib]
proc-macro = true # CRITICAL: This makes it a proc-macro crate
[dependencies]
syn = { version = "2.0", features = ["full"] }
quote = "1.0"
proc-macro2 = "1.0"
TokenStream Input/Output Flow
+------------------------------------------------------------------------+
| REFLECT MACRO DATA FLOW |
+------------------------------------------------------------------------+
| |
| Input TokenStream (from #[derive(Reflect)]): |
| ============================================ |
| |
| struct User { |
| name: String, |
| age: u32, |
| } |
| |
| | |
| v |
| +----------------------------------+ |
| | Parse with syn | |
| +----------------------------------+ |
| | DeriveInput { | |
| | ident: "User" | |
| | fields: [ | |
| | { name: "name", ty: String } | |
| | { name: "age", ty: u32 } | |
| | ] | |
| | } | |
| +----------------------------------+ |
| | |
| v |
| +----------------------------------+ |
| | Generate with quote | |
| +----------------------------------+ |
| | |
| v |
| Output TokenStream (appended to module): |
| ======================================== |
| |
| impl Reflect for User { |
| fn fields() -> &'static [FieldInfo] { |
| &[ |
| FieldInfo { |
| name: "name", |
| type_name: "alloc::string::String", |
| offset: std::mem::offset_of!(User, name), |
| }, |
| FieldInfo { |
| name: "age", |
| type_name: "u32", |
| offset: std::mem::offset_of!(User, age), |
| }, |
| ] |
| } |
| |
| fn type_name() -> &'static str { |
| "User" |
| } |
| } |
| |
+------------------------------------------------------------------------+
Field Extraction from DeriveInput
use syn::{Data, Field, Fields};

/// Collects the named fields of a struct, or reports why the input is unsupported.
fn extract_fields(data: &Data) -> syn::Result<Vec<&Field>> {
    match data {
        Data::Struct(data_struct) => match &data_struct.fields {
            Fields::Named(fields_named) => Ok(fields_named.named.iter().collect()),
            Fields::Unnamed(_) => Err(syn::Error::new_spanned(
                // Borrow here: `fields` sits behind a shared reference and cannot be moved
                &data_struct.fields,
                "Reflect does not support tuple structs",
            )),
            // Unit structs have no fields
            Fields::Unit => Ok(vec![]),
        },
        Data::Enum(data_enum) => Err(syn::Error::new_spanned(
            &data_enum.enum_token,
            "Reflect does not support enums (yet)",
        )),
        Data::Union(data_union) => Err(syn::Error::new_spanned(
            &data_union.union_token,
            "Reflect does not support unions",
        )),
    }
}
Generated Trait Implementation Design
// For a struct like:
// struct User<'a, T: Clone> {
// name: &'a str,
// data: T,
// }
// Generate:
impl<'a, T: Clone> Reflect for User<'a, T> {
fn fields() -> &'static [FieldInfo] {
&[
FieldInfo {
name: "name",
type_name: std::any::type_name::<&'a str>(),
offset: std::mem::offset_of!(User<'a, T>, name), // ask the compiler; field order is not guaranteed
},
FieldInfo {
name: "data",
type_name: std::any::type_name::<T>(),
offset: std::mem::offset_of!(User<'a, T>, data), // summing field sizes ignores padding and reordering
},
]
}
fn type_name() -> &'static str {
std::any::type_name::<Self>()
}
}
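The shape of the generic impl header can be checked by hand before generating it. In this sketch the `Reflect` trait is cut down to two methods purely for illustration (`field_names` is not part of the project's trait); the point is where the parameters and bounds appear.

```rust
/// Cut-down trait used only for this illustration.
pub trait Reflect {
    fn field_names() -> &'static [&'static str];
    fn type_name() -> &'static str;
}

#[allow(dead_code)]
struct User<'a, T: Clone> {
    name: &'a str,
    data: T,
}

// impl<'a, T: Clone>  <- parameters WITH their bounds
// User<'a, T>         <- parameter names only
impl<'a, T: Clone> Reflect for User<'a, T> {
    fn field_names() -> &'static [&'static str] {
        // String literals are constant expressions, so this borrow is `'static`.
        &["name", "data"]
    }

    fn type_name() -> &'static str {
        // Resolved separately for each concrete `T`.
        std::any::type_name::<Self>()
    }
}

fn main() {
    println!("{:?}", <User<'static, u8> as Reflect>::field_names());
    println!("{}", <User<'static, u8> as Reflect>::type_name());
    println!("{}", <User<'static, String> as Reflect>::type_name());
}
```

The two positions in the header correspond to the first two values returned by syn's `Generics::split_for_impl`, which is why a derive macro interpolates them separately.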
Phased Implementation Guide
Phase 1: Set Up Proc-Macro Crate (Day 1)
Objectives:
- Create the workspace structure
- Configure the proc-macro crate correctly
- Verify the crate compiles and links
Tasks:
- Create the workspace:
mkdir reflect-macro && cd reflect-macro
cargo new reflect-core --lib
cargo new reflect-derive --lib
- Create the workspace Cargo.toml:
[workspace]
members = ["reflect-core", "reflect-derive"]
- Configure reflect-derive/Cargo.toml:
[package]
name = "reflect-derive"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
syn = { version = "2.0", features = ["full", "extra-traits"] }
quote = "1.0"
proc-macro2 = "1.0"
- Add a minimal derive macro:
// reflect-derive/src/lib.rs
use proc_macro::TokenStream;

#[proc_macro_derive(Reflect)]
pub fn reflect_derive(_input: TokenStream) -> TokenStream {
    // Just return empty for now
    TokenStream::new()
}
- Verify compilation:
cargo build
Verification:
- The crate compiles without errors
- Verbose output (cargo build -v) shows reflect-derive compiled with --crate-type proc-macro
Phase 2: Parse DeriveInput with syn (Day 2)
Objectives:
- Parse the input TokenStream into a structured DeriveInput
- Extract the struct name
- Print debug info during compilation (using eprintln!)
Tasks:
- Add parsing:
use proc_macro::TokenStream;
use syn::{parse_macro_input, DeriveInput};

#[proc_macro_derive(Reflect)]
pub fn reflect_derive(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);

    // Debug: print during compilation
    eprintln!("Deriving Reflect for: {}", input.ident);

    TokenStream::new()
}
- Create a test file:
// tests/basic.rs
use reflect_derive::Reflect;

#[derive(Reflect)]
struct User {
    name: String,
    age: u32,
}

fn main() {}
- Run the test to see debug output:
cargo test --test basic
# Should see: "Deriving Reflect for: User" in compiler output
Verification:
- You see the struct name printed during compilation
- No parsing errors occur
Phase 3: Extract Struct Fields (Day 3)
Objectives:
- Access the fields of a named struct
- Handle errors for unsupported types (enums, unions, tuple structs)
- Extract field names and types
Tasks:
- Add field extraction:
use syn::{Data, DeriveInput, Fields};

fn extract_named_fields(input: &DeriveInput) -> syn::Result<&syn::FieldsNamed> {
    match &input.data {
        Data::Struct(data_struct) => match &data_struct.fields {
            Fields::Named(fields) => Ok(fields),
            _ => Err(syn::Error::new_spanned(
                input,
                "Reflect only supports structs with named fields",
            )),
        },
        _ => Err(syn::Error::new_spanned(
            input,
            "Reflect only supports structs",
        )),
    }
}
- Iterate over fields:
#[proc_macro_derive(Reflect)]
pub fn reflect_derive(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);

    let fields = match extract_named_fields(&input) {
        Ok(f) => f,
        Err(e) => return e.to_compile_error().into(),
    };

    for field in &fields.named {
        // Named fields always carry an identifier
        let name = field.ident.as_ref().unwrap();
        let ty = &field.ty;
        // `{}` prints the type as source text; `{:?}` would dump the raw token structure
        eprintln!("  Field: {} : {}", name, quote::quote!(#ty));
    }

    TokenStream::new()
}
Verification:
- Field names and types are printed during compilation
- Trying to derive on an enum produces a helpful error
Phase 4: Generate Trait Impl with quote (Days 4-5)
Objectives:
- Generate the impl Reflect block
- Use quote's repetition syntax for fields
- Return valid TokenStream
Tasks:
- Define the core trait in reflect-core:
// reflect-core/src/lib.rs
#[derive(Debug, Clone, Copy)]
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
    pub offset: usize,
}

pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
    fn type_name() -> &'static str;
    fn field_count() -> usize {
        Self::fields().len()
    }
}
- Generate the implementation:
use proc_macro2::TokenStream as TokenStream2;
use quote::quote;

fn generate_impl(input: &DeriveInput) -> syn::Result<TokenStream2> {
    let name = &input.ident;
    let fields = extract_named_fields(input)?;

    // One `FieldInfo { ... }` expression per named field
    let field_infos = fields.named.iter().map(|f| {
        let field_name = f.ident.as_ref().unwrap();
        let field_ty = &f.ty;
        let name_str = field_name.to_string();
        quote! {
            reflect_core::FieldInfo {
                name: #name_str,
                type_name: std::any::type_name::<#field_ty>(),
                offset: std::mem::offset_of!(#name, #field_name),
            }
        }
    });

    let field_count = fields.named.len();
    let name_str = name.to_string();

    Ok(quote! {
        impl reflect_core::Reflect for #name {
            fn fields() -> &'static [reflect_core::FieldInfo] {
                // `type_name` is a runtime call, so a plain `&[...]` would not be
                // promoted to `'static`; build the table once on first use instead.
                static FIELDS: std::sync::OnceLock<Vec<reflect_core::FieldInfo>> =
                    std::sync::OnceLock::new();
                FIELDS.get_or_init(|| vec![#(#field_infos),*])
            }

            fn type_name() -> &'static str {
                #name_str
            }

            fn field_count() -> usize {
                #field_count
            }
        }
    })
}
- Complete the derive function:
#[proc_macro_derive(Reflect)]
pub fn reflect_derive(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);

    match generate_impl(&input) {
        Ok(output) => output.into(),
        Err(e) => e.to_compile_error().into(),
    }
}
Verification:
- cargo expand shows the generated impl
- The trait methods can be called at runtime
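To make the verification step concrete, here is a hand-written sketch of roughly what the derive needs to produce for a two-field struct, with no macro involved. FieldInfo and Reflect are repeated from the task above so the snippet stands alone. User and the summarize helper are hypothetical names used only for illustration, and the field type names are written as plain string literals for simplicity.

```rust
#[derive(Debug, Clone, Copy)]
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
    pub offset: usize,
}

pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
    fn type_name() -> &'static str;
    fn field_count() -> usize {
        Self::fields().len()
    }
}

// Hypothetical user struct; in the real project #[derive(Reflect)] would sit here.
pub struct User {
    pub name: String,
    pub age: u32,
}

// Hand-written stand-in for the macro output.
impl Reflect for User {
    fn fields() -> &'static [FieldInfo] {
        // A const item is evaluated at compile time, so the slice is 'static.
        const FIELDS: &[FieldInfo] = &[
            FieldInfo {
                name: "name",
                type_name: "String",
                offset: std::mem::offset_of!(User, name),
            },
            FieldInfo {
                name: "age",
                type_name: "u32",
                offset: std::mem::offset_of!(User, age),
            },
        ];
        FIELDS
    }

    fn type_name() -> &'static str {
        "User"
    }
}

// Hypothetical helper: renders "Type { field: type, ... }" from the metadata.
pub fn summarize<T: Reflect>() -> String {
    let parts: Vec<String> = T::fields()
        .iter()
        .map(|f| format!("{}: {}", f.name, f.type_name))
        .collect();
    format!("{} {{ {} }}", T::type_name(), parts.join(", "))
}
```

Calling summarize::<User>() walks the metadata without ever constructing a User, which is the property the Reflect trait's associated functions are designed around.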
Phase 5: Add Error Handling for Invalid Inputs (Day 6)
Objectives:
- Produce helpful compiler errors for enums and unions
- Handle tuple structs gracefully
- Verify error spans point to correct source locations
Tasks:
- Create comprehensive error handling:
use syn::{Data, DeriveInput, Fields};
fn extract_named_fields(input: &DeriveInput) -> syn::Result<&syn::FieldsNamed> {
    match &input.data {
        Data::Struct(data) => match &data.fields {
            Fields::Named(fields) => Ok(fields),
            // Span the error on the offending tokens so rustc underlines them.
            Fields::Unnamed(fields) => Err(syn::Error::new_spanned(
                fields,
                "Reflect does not support tuple structs. \
                 Use named fields: `struct Foo { field: Type }`",
            )),
            Fields::Unit => Err(syn::Error::new_spanned(
                &input.ident,
                "Reflect does not support unit structs. \
                 Add at least one field.",
            )),
        },
        Data::Enum(data) => Err(syn::Error::new_spanned(
            data.enum_token,
            "Reflect does not support enums. \
             Consider using #[derive(ReflectEnum)] instead.",
        )),
        Data::Union(data) => Err(syn::Error::new_spanned(
            data.union_token,
            "Reflect does not support unions due to safety concerns.",
        )),
    }
}
- Create compile-fail tests using trybuild:
// tests/compile_fail/enum.rs
use reflect_derive::Reflect;
#[derive(Reflect)]
enum Color {
    Red,
    Green,
    Blue,
}
fn main() {}
// tests/ui.rs
#[test]
fn compile_fail_tests() {
    let t = trybuild::TestCases::new();
    t.compile_fail("tests/compile_fail/*.rs");
}
Verification:
- cargo test passes with proper error message checks
- Error messages point to the correct token spans
Phase 6: Support Generics and Lifetimes (Day 7)
Objectives:
- Handle generic type parameters
- Handle lifetime parameters
- Propagate where clauses correctly
Tasks:
- Update the impl generation to include generics:
fn generate_impl(input: &DeriveInput) -> syn::Result<TokenStream2> {
    let name = &input.ident;
    let generics = &input.generics;
    // For Container<'a, T: Clone> this yields <'a, T: Clone> for the impl side,
    // <'a, T> for the type side, and the where clause (if any).
    let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
    let fields = extract_named_fields(input)?;
    // ... field processing ...
    Ok(quote! {
        impl #impl_generics reflect_core::Reflect for #name #ty_generics #where_clause {
            fn fields() -> &'static [reflect_core::FieldInfo] {
                // An inline const block (Rust 1.79+) may refer to the impl's generic
                // parameters and still produces a &'static slice.
                const { &[#(#field_infos),*] }
            }
            fn type_name() -> &'static str {
                std::any::type_name::<Self>()
            }
        }
    })
}
- Test with generic structs:
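use reflect_core::Reflect; // the trait, needed to call type_name()
use reflect_derive::Reflect; // the derive macro (macros and traits live in separate namespaces)
#[derive(Reflect)]
struct Container<'a, T: Clone> {
    data: &'a T,
    count: usize,
}
fn main() {
    println!("{}", Container::<i32>::type_name());
    // Prints something like "test::Container<i32>"; the path prefix depends on the crate name.
}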
Verification:
- Generic structs compile and work correctly
- Type names include generic parameters
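As a sketch of what the expansion could look like once generics are propagated, the impl below is written by hand for the Container struct from the test above. FieldInfo and Reflect are repeated (without field_count) so the snippet stands alone. Using Self inside offset_of! and an inline const block are assumptions about one workable approach on Rust 1.79 or later, not the only way to structure the output.

```rust
#[derive(Debug, Clone, Copy)]
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
    pub offset: usize,
}

pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
    fn type_name() -> &'static str;
}

pub struct Container<'a, T: Clone> {
    pub data: &'a T,
    pub count: usize,
}

// impl_generics = <'a, T: Clone>, ty_generics = <'a, T>, where_clause = none
impl<'a, T: Clone> Reflect for Container<'a, T> {
    fn fields() -> &'static [FieldInfo] {
        // An inline const block may mention Self and T, which a nested
        // `const` item inside this function could not.
        const {
            &[
                FieldInfo {
                    name: "data",
                    type_name: "&'a T",
                    offset: std::mem::offset_of!(Self, data),
                },
                FieldInfo {
                    name: "count",
                    type_name: "usize",
                    offset: std::mem::offset_of!(Self, count),
                },
            ]
        }
    }

    fn type_name() -> &'static str {
        // Resolved per instantiation, so the generic arguments show up in the name.
        std::any::type_name::<Self>()
    }
}
```

Because type_name() goes through std::any::type_name::<Self>(), each instantiation such as Container<i32> reports its own arguments, while the field list is shared in shape across all of them.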
Testing Strategy
Compile Tests
Test that valid inputs produce working code:
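// Put these tests in a crate that depends on reflect-derive and reflect-core
// (for example an integration test); a proc-macro crate cannot apply its own derive.
use reflect_core::Reflect;
use reflect_derive::Reflect;
#[cfg(test)]
mod tests {
    use super::*;
    #[derive(Reflect)]
    struct SimpleStruct {
        a: i32,
        b: String,
    }
    #[test]
    fn test_field_count() {
        assert_eq!(SimpleStruct::field_count(), 2);
    }
    #[test]
    fn test_field_names() {
        let fields = SimpleStruct::fields();
        assert_eq!(fields[0].name, "a");
        assert_eq!(fields[1].name, "b");
    }
    #[test]
    fn test_type_name() {
        assert!(SimpleStruct::type_name().contains("SimpleStruct"));
    }
}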
trybuild for Error Messages
Use the trybuild crate to test compile-time errors:
// tests/ui.rs
#[test]
fn ui_tests() {
let t = trybuild::TestCases::new();
t.compile_fail("tests/ui/fail/*.rs");
t.pass("tests/ui/pass/*.rs");
}
// tests/ui/fail/enum.rs
use reflect_derive::Reflect;
#[derive(Reflect)]
enum Foo { A, B }
fn main() {}
// tests/ui/fail/enum.stderr
error: Reflect does not support enums. Consider using #[derive(ReflectEnum)] instead.
--> tests/ui/fail/enum.rs:4:1
|
4 | enum Foo { A, B }
| ^^^^
cargo expand Verification
Always verify generated code with cargo expand:
# Install cargo-expand
cargo install cargo-expand
# View expanded code
cargo expand --package your-test-crate
# View specific test
cargo expand --test integration_test
Common Pitfalls
1. Forgetting proc-macro = true in Cargo.toml
Problem:
error: the `#[proc_macro_derive]` attribute is only usable with crates of the `proc-macro` crate type
Solution:
[lib]
proc-macro = true
2. Hygiene Issues with Generated Identifiers
Problem: Tokens produced by quote! carry Span::call_site() by default, so generated identifiers are not hygienic and can collide with names in the user's code.
// User's code
let offset = 42;
// Your macro generates:
let offset = std::mem::offset_of!(Foo, bar); // CONFLICT!
Solution: Use unique identifiers or fully qualified paths:
quote! {
// Use fully qualified paths
::std::mem::offset_of!(#name, #field)
// Or generate unique identifiers
let __reflect_offset = ...;
}
3. Handling Tuple Structs, Unit Structs, and Enums
Problem: Assuming every input is a struct with named fields.
// Tuple struct fields have no ident, so this unwrap() panics during expansion:
let name = field.ident.as_ref().unwrap();
Solution: Match on all variants:
match &data.fields {
Fields::Named(f) => process_named(f),
Fields::Unnamed(f) => process_unnamed(f),
Fields::Unit => process_unit(),
}
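The exhaustive match can be tried out without syn by using a toy model. The Fields enum below is a simplified stand-in written for this sketch (the real syn::Fields wraps parsed syntax nodes), and named_field_names is a hypothetical helper that returns an error instead of panicking for shapes it does not support.

```rust
// Simplified stand-in for syn::Fields, for illustration only.
pub enum Fields {
    Named(Vec<(String, String)>), // (field name, type) pairs
    Unnamed(Vec<String>),         // types only, as in a tuple struct
    Unit,
}

// Every variant is handled, so no input shape can reach an unwrap().
pub fn named_field_names(fields: &Fields) -> Result<Vec<&str>, String> {
    match fields {
        Fields::Named(named) => Ok(named.iter().map(|(name, _ty)| name.as_str()).collect()),
        Fields::Unnamed(types) => Err(format!(
            "tuple struct with {} field(s) is not supported",
            types.len()
        )),
        Fields::Unit => Err("unit struct has no fields".to_string()),
    }
}
```

In the real macro the Err branches would carry a syn::Error with a span, but the control flow is the same: unsupported shapes become diagnostics rather than panics inside the compiler.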
4. Generic Parameter Propagation
Problem: Forgetting to include generics in the impl.
// WRONG: Missing generics
impl Reflect for Foo<T> { ... }
// CORRECT: Include all generic parameters
impl<T: Clone> Reflect for Foo<T> { ... }
Solution: Use split_for_impl():
let (impl_generics, ty_generics, where_clause) = input.generics.split_for_impl();
quote! {
impl #impl_generics Reflect for #name #ty_generics #where_clause {
...
}
}
5. TokenStream vs TokenStream2
Problem: Mixing proc_macro::TokenStream with proc_macro2::TokenStream.
// WRONG: Can't use proc_macro2 in function signature
#[proc_macro_derive(Reflect)]
pub fn derive(input: proc_macro2::TokenStream) -> proc_macro2::TokenStream
Solution: Convert at the boundaries:
use proc_macro::TokenStream;
use proc_macro2::TokenStream as TokenStream2;
#[proc_macro_derive(Reflect)]
pub fn derive(input: TokenStream) -> TokenStream {
let input2: TokenStream2 = input.into();
// ... work with TokenStream2 internally ...
output.into() // Convert back
}
6. Missing quote Feature Dependencies
Problem: quote! macro not found or identifier interpolation fails.
Solution: Ensure correct dependencies:
[dependencies]
quote = "1.0"
proc-macro2 = "1.0"
syn = { version = "2.0", features = ["full", "extra-traits"] }
Extensions and Challenges
Extension 1: Attribute Macros for Field Customization
Add support for #[reflect(skip)] and #[reflect(rename = "...")]:
#[derive(Reflect)]
struct User {
name: String,
#[reflect(skip)]
password_hash: String,
#[reflect(rename = "user_age")]
age: u32,
}
Implementation hint:
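fn should_skip_field(field: &Field) -> bool {
    let mut skip = false;
    for attr in &field.attrs {
        if attr.path().is_ident("reflect") {
            // The callback must return syn::Result<()>, so the answer is
            // recorded in a captured flag instead of being returned.
            let _ = attr.parse_nested_meta(|meta| {
                if meta.path.is_ident("skip") {
                    skip = true;
                } else if meta.input.peek(syn::Token![=]) {
                    // Consume values such as rename = "..." so parsing can continue.
                    meta.value()?.parse::<syn::Expr>()?;
                }
                Ok(())
            });
        }
    }
    skip
}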
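One way to picture the target of this extension is to write the expected output by hand. The sketch below assumes that skip drops the field from the metadata entirely and that rename changes only the reported name; both behaviors are assumptions about the extension's design. FieldInfo is reduced to two members and offsets are left out for brevity.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
}

pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
}

pub struct User {
    pub name: String,
    pub password_hash: String, // would carry #[reflect(skip)]
    pub age: u32,              // would carry #[reflect(rename = "user_age")]
}

// Hand-written stand-in for what the extended derive might emit.
impl Reflect for User {
    fn fields() -> &'static [FieldInfo] {
        &[
            FieldInfo { name: "name", type_name: "String" },
            // password_hash is omitted because of skip.
            FieldInfo { name: "user_age", type_name: "u32" },
        ]
    }
}
```

Inside the macro this amounts to filtering the field iterator with should_skip_field and substituting the renamed string before interpolating into quote!.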
Extension 2: Derive for Enums
Extend the macro to support enums:
#[derive(Reflect)]
enum Message {
Text { content: String },
Image { url: String, width: u32, height: u32 },
Ping,
}
// Generated:
// - Reflect implementation
// - Variant enumeration
// - Field access per variant
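A possible shape for the generated enum support is sketched below by hand. VariantInfo and the ReflectEnum trait are hypothetical names chosen for this example and are not part of the Reflect trait defined earlier. The sketch covers variant enumeration and variant lookup only.

```rust
#[derive(Debug)]
pub struct VariantInfo {
    pub name: &'static str,
    pub field_names: &'static [&'static str],
}

// Hypothetical trait for enum metadata.
pub trait ReflectEnum {
    fn variants() -> &'static [VariantInfo];
    fn variant_name(&self) -> &'static str;
}

pub enum Message {
    Text { content: String },
    Image { url: String, width: u32, height: u32 },
    Ping,
}

// Hand-written stand-in for what a derive could emit.
impl ReflectEnum for Message {
    fn variants() -> &'static [VariantInfo] {
        &[
            VariantInfo { name: "Text", field_names: &["content"] },
            VariantInfo { name: "Image", field_names: &["url", "width", "height"] },
            VariantInfo { name: "Ping", field_names: &[] },
        ]
    }

    fn variant_name(&self) -> &'static str {
        // One arm per variant; `..` ignores the payload.
        match self {
            Message::Text { .. } => "Text",
            Message::Image { .. } => "Image",
            Message::Ping => "Ping",
        }
    }
}
```

In the macro, the match arms and the VariantInfo entries would both be produced by iterating over DataEnum's variants, so the two lists cannot drift apart.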
Extension 3: Custom Attributes on the Derive
Add container-level attributes:
#[derive(Reflect)]
#[reflect(debug, crate = "my_reflect")]
struct User { ... }
Extension 4: Runtime Field Access
Extend the trait to allow getting/setting fields by name:
pub trait ReflectMut: Reflect {
fn get_field(&self, name: &str) -> Option<&dyn std::any::Any>;
fn get_field_mut(&mut self, name: &str) -> Option<&mut dyn std::any::Any>;
}
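The body a derive would need to generate for these two methods is essentially a match on the field name. The sketch below writes it by hand for a two-field struct. It drops the Reflect supertrait so the snippet stands alone, and set_field is a hypothetical convenience helper added only to show the downcast step.

```rust
use std::any::Any;

pub trait ReflectMut {
    fn get_field(&self, name: &str) -> Option<&dyn Any>;
    fn get_field_mut(&mut self, name: &str) -> Option<&mut dyn Any>;
}

pub struct User {
    pub name: String,
    pub age: u32,
}

// Hand-written stand-in for generated code: one match arm per field.
impl ReflectMut for User {
    fn get_field(&self, name: &str) -> Option<&dyn Any> {
        match name {
            "name" => Some(&self.name),
            "age" => Some(&self.age),
            _ => None,
        }
    }

    fn get_field_mut(&mut self, name: &str) -> Option<&mut dyn Any> {
        match name {
            "name" => Some(&mut self.name),
            "age" => Some(&mut self.age),
            _ => None,
        }
    }
}

// Hypothetical helper: assigns only when the field exists and V is its exact type.
pub fn set_field<T: ReflectMut, V: Any>(target: &mut T, name: &str, value: V) -> bool {
    match target.get_field_mut(name).and_then(|f| f.downcast_mut::<V>()) {
        Some(slot) => {
            *slot = value;
            true
        }
        None => false,
    }
}
```

The downcast is what keeps this safe: a caller who passes the wrong type gets false back rather than corrupting the field.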
Extension 5: Nested Struct Reflection
Add support for recursively reflecting nested structs:
#[derive(Reflect)]
struct Outer {
inner: Inner, // Also has #[derive(Reflect)]
}
// Generate code that can traverse the hierarchy
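One possible design for traversal is to let each FieldInfo optionally point at the field list of its own type. The nested member and the field_paths helper below are hypothetical additions for this sketch. Note that a macro sees only tokens and cannot tell whether Inner implements Reflect, so a real implementation would need some marker (for example a field attribute) to decide when to fill in nested.

```rust
pub struct FieldInfo {
    pub name: &'static str,
    pub type_name: &'static str,
    // Hypothetical addition: Some(..) when the field's type is itself reflectable.
    pub nested: Option<fn() -> &'static [FieldInfo]>,
}

pub trait Reflect {
    fn fields() -> &'static [FieldInfo];
}

pub struct Inner {
    pub x: i32,
    pub y: i32,
}

pub struct Outer {
    pub id: u64,
    pub inner: Inner,
}

impl Reflect for Inner {
    fn fields() -> &'static [FieldInfo] {
        const FIELDS: &[FieldInfo] = &[
            FieldInfo { name: "x", type_name: "i32", nested: None },
            FieldInfo { name: "y", type_name: "i32", nested: None },
        ];
        FIELDS
    }
}

impl Reflect for Outer {
    fn fields() -> &'static [FieldInfo] {
        const FIELDS: &[FieldInfo] = &[
            FieldInfo { name: "id", type_name: "u64", nested: None },
            // Store a function pointer so the inner list is looked up lazily.
            FieldInfo { name: "inner", type_name: "Inner", nested: Some(<Inner as Reflect>::fields) },
        ];
        FIELDS
    }
}

// Collects dotted paths such as "inner.x" by following nested metadata.
pub fn field_paths(fields: &'static [FieldInfo], prefix: &str, out: &mut Vec<String>) {
    for f in fields {
        let path = if prefix.is_empty() {
            f.name.to_string()
        } else {
            format!("{}.{}", prefix, f.name)
        };
        match f.nested {
            Some(children) => field_paths(children(), &path, out),
            None => out.push(path),
        }
    }
}
```

Storing a function pointer rather than the slice itself keeps each impl independent of the others' constants and avoids any ordering concerns between them.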
The Interview Questions They'll Ask
1. "What is a procedural macro and how does it differ from a declarative macro (macro_rules!)?"
Answer: Declarative macros (macro_rules!) match patterns over token trees and substitute tokens; they cannot run arbitrary logic. Procedural macros are ordinary Rust functions that run at compile time: they receive a TokenStream, can parse it into an AST, perform arbitrary analysis, and emit arbitrary output. A derive macro can iterate over a struct's fields and branch on its shape, but it sees only syntax: no type information is available at expansion time. Proc macros must live in separate crates because they're compiled for the host machine and loaded into the compiler as dynamic libraries.
2. "Why do procedural macros need to live in their own crate?"
Answer: Proc macros are compiled for the HOST machine (where rustc runs) and dynamically loaded into the compiler process. Regular code is compiled for the TARGET machine. Since these can be different platforms (e.g., cross-compilation), they must be separate compilation units. The proc-macro crate produces a .dylib/.so/.dll that rustc loads at compile time to expand macros. This is fundamentally different from normal library crates.
3. "Explain the role of the syn and quote crates."
Answer: syn is a parsing library that converts a raw TokenStream into a structured AST (Abstract Syntax Tree). It provides types like DeriveInput, ItemFn, Expr that you can pattern match on and navigate programmatically. quote is the inverse - it converts Rust-like syntax back into a TokenStream. The quote! macro lets you write what looks like Rust code with interpolation (#variable) for inserting values. Together they form the parse-transform-generate pipeline: syn for input, your logic for transformation, quote for output.
4. "What is 'macro hygiene'?"
Answer: Macro hygiene prevents accidental name collisions between macro-generated code and user code. Each identifier carries a span that records the syntax context in which it was created, and name resolution uses that context to decide which bindings the identifier can see. Declarative macros are hygienic for local variables and labels by default. In proc macros you choose the behavior through the Span type: Span::call_site() resolves as if the identifier had been written at the invocation site (so it can see, and collide with, the caller's names), while Span::mixed_site() gives macro_rules!-style hygiene that keeps local variables private to the expansion. This prevents macros from accidentally capturing user variables, or vice versa.
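Hygiene for local variables is easiest to observe with a declarative macro. In this small sketch, add_secret and hygiene_demo are hypothetical names: the macro declares its own local called secret, yet the caller's variable with the same name is untouched.

```rust
// A declarative macro with its own local binding named `secret`.
macro_rules! add_secret {
    ($e:expr) => {{
        let secret = 100;
        $e + secret
    }};
}

pub fn hygiene_demo() -> i32 {
    let secret = 1;
    // The `secret` passed in still refers to the caller's binding (1),
    // not to the macro's internal one (100), so the result is 1 + 100.
    add_secret!(secret)
}
```

A proc macro that emitted the same tokens with Span::call_site() would not get this protection, which is why generated locals are usually given distinctive names or a mixed_site span.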
5. "How would you handle errors in a procedural macro?"
Answer: Use syn::Error with new_spanned to attach errors to specific source locations:
Err(syn::Error::new_spanned(tokens, "error message"))
Then convert to compiler error:
match result {
Ok(output) => output.into(),
Err(e) => e.to_compile_error().into(),
}
This produces error messages that point to the correct line in the user's source code.
6. "What is the difference between proc_macro::TokenStream and proc_macro2::TokenStream?"
Answer: proc_macro::TokenStream is the type provided by the compiler. It is only usable inside a proc-macro crate while the compiler is expanding a macro, so it cannot be exercised from unit tests or ordinary library code. proc_macro2::TokenStream comes from a third-party crate that mirrors the same API but works anywhere, including in tests and non-proc-macro code. Most proc macros use proc_macro at the entry point and convert to proc_macro2 internally because syn and quote work with proc_macro2. The conversion is cheap: .into() works in both directions.
Books That Will Help
| Topic | Book | Chapter/Section |
|---|---|---|
| Proc Macros Introduction | "Programming Rust" by Jim Blandy & Jason Orendorff | Ch. 20: Macros |
| Macro System Overview | "The Rust Programming Language" | Ch. 19: Advanced Features - Macros |
| syn Crate Patterns | syn crate documentation | Full documentation and examples |
| quote Crate Usage | quote crate documentation | Full documentation and examples |
| Real-World Examples | serde_derive source code | Study how serde implements derives |
| Deep Macro Internals | "Rust for Rustaceans" by Jon Gjengset | Ch. 7: Macros (advanced patterns) |
| Compiler Internals | The Rustonomicon | Advanced topics on unsafe and FFI |
AST Structure for a Simple Struct
+------------------------------------------------------------------------+
| AST STRUCTURE: struct User { name: String, age: u32 } |
+------------------------------------------------------------------------+
| |
| DeriveInput |
| +-- attrs: [] // No outer attributes |
| +-- vis: Visibility::Inherited // No pub keyword |
| +-- ident: Ident("User") // The struct name |
| +-- generics: Generics // Empty (no generics) |
| | +-- params: [] |
| | +-- where_clause: None |
| +-- data: Data::Struct(DataStruct) |
| +-- struct_token: Token![struct] |
| +-- fields: Fields::Named(FieldsNamed) |
| +-- brace_token: Brace |
| +-- named: Punctuated<Field> |
| +-- [0] Field |
| | +-- attrs: [] |
| | +-- vis: Visibility::Inherited |
| | +-- ident: Some(Ident("name")) |
| | +-- colon_token: Some(Token![:]) |
| | +-- ty: Type::Path(TypePath) |
| | +-- path: Path |
| | +-- segments: [PathSegment] |
| | +-- ident: Ident("String") |
| | +-- arguments: None |
| +-- [1] Field |
| +-- attrs: [] |
| +-- vis: Visibility::Inherited |
| +-- ident: Some(Ident("age")) |
| +-- colon_token: Some(Token![:]) |
| +-- ty: Type::Path(TypePath) |
| +-- path: Path |
| +-- segments: [PathSegment] |
| +-- ident: Ident("u32") |
| +-- arguments: None |
| |
+------------------------------------------------------------------------+
Generated Code Flow
+------------------------------------------------------------------------+
| GENERATED CODE FLOW |
+------------------------------------------------------------------------+
| |
| User Source File (before macro expansion): |
| ========================================== |
| |
| use reflect_derive::Reflect; |
| |
| #[derive(Reflect)] |
| struct User { |
| name: String, |
| age: u32, |
| } |
| |
| fn main() { |
| for field in User::fields() { |
| println!("{}", field.name); |
| } |
| } |
| |
| | |
| v |
| |
| User Source File (after macro expansion): |
| ========================================= |
| |
| use reflect_derive::Reflect; |
| |
| struct User { |
| name: String, |
| age: u32, |
| } |
| |
| impl ::reflect_core::Reflect for User { // GENERATED! |
| fn fields() -> &'static [::reflect_core::FieldInfo] { |
| &[ |
| ::reflect_core::FieldInfo { |
| name: "name", |
| type_name: "alloc::string::String", |
| offset: 0usize, |
| }, |
| ::reflect_core::FieldInfo { |
| name: "age", |
| type_name: "u32", |
| offset: 24usize, |
| }, |
| ] |
| } |
| |
| fn type_name() -> &'static str { |
| "User" |
| } |
| |
| fn field_count() -> usize { |
| 2usize |
| } |
| } |
| |
| fn main() { |
| for field in User::fields() { |
| println!("{}", field.name); |
| } |
| } |
| |
+------------------------------------------------------------------------+
Note: the literal type names and offsets in the expanded impl are illustrative. The generated code contains the expressions that produce them, and Rust does not guarantee field offsets for structs without #[repr(C)].
Summary
This project teaches you to:
- Master the proc-macro ecosystem - Understand why macros need separate crates and how they integrate with rustc
- Parse Rust code with syn - Navigate the AST to extract struct names, fields, types, and attributes
- Generate code with quote - Use template-based code generation with proper hygiene
- Handle edge cases - Produce helpful errors for unsupported inputs
- Support generics - Propagate type parameters and lifetimes correctly
- Test comprehensively - Use trybuild for compile-time error testing
By the end, you'll understand how production crates like serde, diesel, and bevy implement their derive macros - and you'll be able to build your own.
+------------------------------------------------------------------------+
| PROJECT COMPLETE WHEN: |
+------------------------------------------------------------------------+
| + #[derive(Reflect)] compiles for named structs |
| + Field names and types are accessible at runtime |
| + cargo expand shows correct generated code |
| + Enums/unions produce helpful compile errors |
| + Generic structs work correctly |
| + trybuild tests verify error messages |
| + You can explain the syn/quote/proc-macro2 relationship |
+------------------------------------------------------------------------+
Conclusion
Procedural macros are Rust's answer to the reflection capabilities found in other languages, but with a crucial difference: everything happens at compile time. This means zero runtime overhead, full type safety, and errors caught before your code ever runs.
By building your own #[derive(Reflect)] macro, you've learned:
- The compilation model - Why proc macros are dynamic libraries loaded by rustc
- Token stream processing - How Rust code is represented and transformed
- AST navigation - Using syn to understand code structure
- Code generation - Using quote to produce valid Rust code
- Error handling - Producing helpful, well-located compiler errors
- Generic handling - Propagating type parameters correctly
These skills directly apply to understanding and contributing to major Rust ecosystem crates. Every time you use #[derive(Serialize)], #[tokio::main], or #[derive(Component)], you now understand exactly what's happening under the hood.
Next Steps: Try Project 8 (Building a Custom Runtime) to see how macros combine with async for building executors.