Project 7: Implement a Self-Referential Struct with Pin
Build a self-referential struct safely using Pin, PhantomPinned, and documented unsafe invariants.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Master |
| Time Estimate | 2-3 weeks |
| Main Programming Language | Rust |
| Alternative Programming Languages | C++ (self-referential classes) |
| Coolness Level | Very High |
| Business Potential | Medium |
| Prerequisites | Lifetimes, unsafe, ownership, raw pointers |
| Key Topics | Pin, Unpin, self-referential safety, projection |
1. Learning Objectives
By completing this project, you will:
- Explain why self-referential structs are unsafe without pinning.
- Implement a safe pinned self-referential type.
- Document and uphold unsafe invariants.
- Understand projection and !Unpin behavior.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Pin and Unpin Semantics
Fundamentals
Pin<P> is a wrapper around a pointer type P (for example Pin<&mut T> or Pin<Box<T>>) that guarantees the value it points to will not be moved in memory. Most types are Unpin, meaning they can be safely moved even when pinned, so the guarantee only has force for !Unpin types. To create a self-referential struct, you must ensure it is !Unpin, typically by including PhantomPinned. Pinning is essential because a self-reference would become invalid if the value moved.
Deep Dive into the concept
Rust assumes that values can be moved freely unless a type opts out by being !Unpin. This assumption makes self-referential structs unsafe because any move would invalidate internal pointers. Pin provides a contract: once a value is pinned, it will not move unless it is Unpin. The standard pattern is to allocate the value on the heap (Box<T>), then pin it (Pin<Box<T>>). The heap allocation gives a stable address; the pinning ensures the value is never moved out of that address.
The key idea is that pinning is about address stability, not immutability. You can still mutate pinned values through Pin<&mut T> if you do so safely. But you cannot move the value out. This is why you add PhantomPinned to your struct: it prevents the compiler from auto-implementing Unpin, and thus prevents moves through safe APIs. This is a deliberate opt-out from Rust’s default move semantics.
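To make the distinction concrete, here is a minimal sketch showing that a pinned value can still be mutated in place while its address stays fixed. The Counter type and its methods are hypothetical names chosen for illustration; they are not part of the project's API.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical example type; `PhantomPinned` makes it `!Unpin`.
struct Counter {
    n: u32,
    _pin: PhantomPinned,
}

impl Counter {
    fn new() -> Pin<Box<Self>> {
        Box::pin(Counter { n: 0, _pin: PhantomPinned })
    }

    /// Mutates through `Pin<&mut Self>` without moving the value.
    fn bump(self: Pin<&mut Self>) {
        // SAFETY: we only update a field in place and never move out of `self`.
        unsafe { self.get_unchecked_mut().n += 1 };
    }

    fn value(self: Pin<&Self>) -> u32 {
        self.n
    }

    /// Address of the pinned value, used to observe that it does not change.
    fn addr(self: Pin<&Self>) -> usize {
        self.get_ref() as *const Self as usize
    }
}
```

Calling c.as_mut().bump() on a Pin<Box<Counter>> changes the count, while c.as_ref().addr() reports the same address before and after.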
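Pinning is most commonly used in async programming. A Future may contain self-referential state, so it must not move after it is first polled; Future::poll takes self: Pin<&mut Self> to encode that requirement in the type system. Your project mirrors this pattern with a simpler example: a struct that owns a String and stores a pointer to its contents. Without pinning, moving the struct changes the address of the String field itself, so any pointer to the field (or to other inline data) would dangle. A pointer into the String's heap buffer happens to survive a move of the struct, but it is invalidated by reallocation, which §2.2 covers.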
Projection is another subtlety. If you have a pinned struct and want to access its fields, you must ensure you do not move them. This is why the pin-project crate exists. For this project, you can implement manual projection with unsafe code, but you must document the invariants: you may obtain Pin<&mut Field> only if the field itself is pinned and will not move. This is complex; for the project, you can keep the API narrow and avoid exposing projections.
The safety story comes down to invariants. You must ensure that: (1) the internal pointer always points to the correct buffer, (2) the struct is never moved after initialization, and (3) the pointer is only read while the struct is alive. You can enforce (2) with Pin and !Unpin, and you can enforce (1) by initializing the pointer only after pinning.
How this fits into the projects
This concept is central to §3.1 and §4.1. It also connects to Project 10 (safe abstraction with unsafe internals) and Project 5 (stable storage alternatives).
Definitions & key terms
- Pin: Wrapper guaranteeing no moves.
- Unpin: Trait for types that can be moved even when pinned.
- PhantomPinned: Marker to prevent auto-Unpin.
- Projection: Accessing fields of pinned structs safely.
Mental model diagram (ASCII)
Pin<Box<Self>>
|
v
[heap allocation at stable address]
self.ptr -> self.data (stable)
How it works (step-by-step)
- Allocate struct on heap.
- Pin it so it cannot move.
- Initialize internal pointer to field.
- Access pointer safely via pinned references.
Minimal concrete example
let pinned = Box::pin(SelfRef::new("hello"));
Common misconceptions
- “Pin makes things immutable.” It only prevents moves.
- “Pin is only for async.” It is for address stability generally.
Check-your-understanding questions
- Why is PhantomPinned required?
- What does Pin<Box<T>> guarantee?
- Why must you initialize the self-reference after pinning?
Check-your-understanding answers
- It prevents auto-Unpin and thus prevents safe moves.
- The value is allocated on the heap and will not move.
- Because the address must be stable when you take the reference.
Real-world applications
- Async futures.
- Self-referential parser state machines.
Where you’ll apply it
- This project: §5.10 Phase 1-2, §7 pitfalls.
- Also used in: Project 10: Capstone.
References
- Rustonomicon “Pinning”.
- RFC 2349 (Pin).
Key insights
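Pinning is the standard mechanism for making address-based self-references sound in Rust; designs that store offsets or indices instead of pointers avoid the problem altogether.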
Summary
Pin + !Unpin gives you address stability, which makes self-references safe.
Homework/Exercises to practice the concept
- Write a struct with PhantomPinned and check that it is !Unpin.
- Attempt to move a pinned value and observe compiler errors.
Solutions to the homework/exercises
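- Add PhantomPinned and verify with a trait bound: a helper such as fn assert_unpin<T: Unpin>() {} must fail to compile when called with your type. (std::mem::needs_drop does not help here; it reports whether a type has drop glue, not whether it is Unpin.)
- The compiler rejects moves out of Pin<&mut T>. For example, std::mem::swap needs a &mut T, which Pin only hands out safely when T is Unpin.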
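One possible way to check the first exercise is sketched below. The names Plain, Pinned, assert_unpin, UnpinFallback, and Probe are all hypothetical. The compile-time check is the simple part; the boolean probe relies on inherent associated constants taking priority over trait constants, a trick that only works with concrete types and is shown here as an assumption-laden convenience, not a standard API.

```rust
use std::marker::{PhantomData, PhantomPinned};

#[allow(dead_code)]
struct Plain {
    n: u32,
}

#[allow(dead_code)]
struct Pinned {
    n: u32,
    _pin: PhantomPinned,
}

/// Compile-time check: only callable for `Unpin` types.
fn assert_unpin<T: Unpin>() {}

// Fallback: every type sees `IS_UNPIN = false` through this trait...
trait UnpinFallback {
    const IS_UNPIN: bool = false;
}
impl<T: ?Sized> UnpinFallback for T {}

// ...unless the inherent constant below applies, which takes priority.
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);
impl<T: ?Sized + Unpin> Probe<T> {
    const IS_UNPIN: bool = true;
}

fn plain_is_unpin() -> bool {
    assert_unpin::<Plain>(); // compiles: `Plain` is auto-`Unpin`
    // assert_unpin::<Pinned>(); // would not compile: `PhantomPinned` is `!Unpin`
    <Probe<Plain>>::IS_UNPIN
}

fn pinned_is_unpin() -> bool {
    <Probe<Pinned>>::IS_UNPIN
}
```

In a real crate, the commented-out line is the more useful check: put it in a compile-fail test so that accidentally removing PhantomPinned is caught.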
2.2 Unsafe Invariants for Self-References
Fundamentals
A self-referential struct stores a pointer to one of its own fields. The invariants are: the struct must not move after the pointer is created, the pointer must always point to valid data, and the pointer must not be used after the struct is dropped. Unsafe code is required to create the pointer, but the public API must enforce these invariants.
Deep Dive into the concept
Unsafe code is necessary because the compiler cannot express “this pointer points to my own field.” The safe interface must ensure that the pointer is only created once the address is stable, and that it is never used after drop. This usually means splitting construction into two phases: a constructor that allocates and pins the struct, and an init method that fills in the self-reference. The init method must be called exactly once, and the struct should not expose any API that would allow moving after init.
You must also define how mutation works. If the internal buffer is a String and you mutate it, its buffer may reallocate, invalidating the pointer. Therefore, your API should forbid mutation after initialization, or you must ensure that the string never reallocates (e.g., by reserving enough capacity up front). This is a crucial invariant that many learners miss: pinning prevents moves of the struct, but it does not prevent internal reallocations that move a field's heap buffer. So you must either freeze the data or design a custom stable buffer.
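The reallocation hazard can be observed without any pinning at all. The sketch below uses two hypothetical helper functions: one shows that a String's buffer address stays put while pushes fit within reserved capacity, the other shows that growing past capacity forces a larger allocation (after which the old buffer address must be treated as invalid).

```rust
/// Pushing within reserved capacity never reallocates,
/// so the buffer address is unchanged.
fn stable_within_capacity() -> bool {
    let mut s = String::with_capacity(64);
    s.push_str("hello");
    let before = s.as_ptr() as usize;
    s.push_str(", world"); // 12 bytes total, well within 64
    before == s.as_ptr() as usize
}

/// Growing past the current capacity forces a new, larger allocation.
/// Returns (capacity before, capacity after).
fn grow_past_capacity() -> (usize, usize) {
    let mut s = String::from("hello");
    let cap_before = s.capacity();
    s.push_str(&"x".repeat(cap_before + 1));
    (cap_before, s.capacity())
}
```

Any pointer captured before the second kind of push may dangle afterwards, whether or not the owning struct is pinned.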
Another subtle point is drop order. When the struct is dropped, the pointer becomes invalid. You must ensure it is never used during Drop. This means you should not implement Drop in a way that accesses the self-reference after fields are dropped. The simplest approach is to avoid custom Drop or ensure you access fields in a safe order.
Finally, you must consider projection. If you provide access to the inner String, you must ensure callers cannot move it out. Returning &str is safe, but returning String or &mut String could violate invariants. The safest API only exposes immutable views, or provides carefully constrained mutable methods that preserve capacity and avoid reallocation.
How this fits into the projects
This concept is applied in §3.2 (requirements), §5.10 (phases), and §7 (pitfalls). It also informs Project 10’s safe abstraction design.
Definitions & key terms
- Invariant: Condition that must always hold for safety.
- Self-reference: Pointer to a field within the same struct.
- Two-phase initialization: Pin first, then init pointers.
Mental model diagram (ASCII)
struct SelfRef {
data: String,
ptr: *const str,
}
Pin prevents move, but data can reallocate!
How it works (step-by-step)
- Allocate and pin the struct.
- Initialize pointer to data after pinning.
- Forbid operations that reallocate data.
- Provide safe accessors.
Minimal concrete example
let mut s = Box::pin(SelfRef::new("hello"));
SelfRef::init(Pin::as_mut(&mut s));
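The two calls above can be fleshed out as follows. This is a sketch under stated assumptions, not the reference solution: it keeps the field layout from the project spec, uses a null pointer to mean "not yet initialized", and has get return an Option so that calling it before init is harmless. Those last two choices are illustrative.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr;

pub struct SelfRef {
    data: String,
    ptr: *const str, // null until `init` runs (assumption of this sketch)
    _pin: PhantomPinned,
}

impl SelfRef {
    /// Phase 1: build the value. No self-reference exists yet,
    /// so moving it (for example into `Box::pin`) is still fine.
    pub fn new(text: &str) -> Self {
        SelfRef {
            data: text.to_owned(),
            // A null fat pointer of length 0 marks "not initialized".
            ptr: ptr::slice_from_raw_parts(ptr::null::<u8>(), 0) as *const str,
            _pin: PhantomPinned,
        }
    }

    /// Phase 2: record the self-reference once the value is pinned.
    /// Calling it again is a no-op.
    pub fn init(self: Pin<&mut Self>) {
        // SAFETY: we only write the `ptr` field; nothing is moved out of `self`.
        let this = unsafe { self.get_unchecked_mut() };
        if this.ptr.is_null() {
            this.ptr = this.data.as_str() as *const str;
        }
    }

    /// Reads through the stored pointer; `None` before `init`.
    pub fn get(self: Pin<&Self>) -> Option<&str> {
        let this = self.get_ref();
        if this.ptr.is_null() {
            return None;
        }
        // SAFETY: `ptr` was derived from `data` after pinning, `data` is never
        // mutated afterwards, and the result borrows from `self`.
        Some(unsafe { &*this.ptr })
    }
}
```

Because no method mutates data after init, the buffer cannot reallocate and the stored pointer stays valid for the life of the struct.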
Common misconceptions
- “Pin prevents all invalidation.” It only prevents moves of the struct, not inner reallocations.
- “Unsafe code can ignore invariants.” Unsafe code is safe only if invariants are upheld.
Check-your-understanding questions
- Why can String reallocation break self-references?
- What guarantees does pinning not provide?
- Why is two-phase init required?
Check-your-understanding answers
- The buffer may move, invalidating the pointer.
- Pinning does not prevent internal reallocation.
- The address must be stable before taking references.
Real-world applications
- Futures that store self-references.
- Intrusive linked lists.
Where you’ll apply it
- This project: §5.10 Phase 2, §7.1 pitfalls.
- Also used in: Project 10: Capstone.
References
- Rustonomicon “Self-Referential Structs”.
Key insights
Pinning is necessary but not sufficient; your invariants must also prevent internal moves.
Summary
Self-referential safety is about maintaining invariants, not just using Pin.
Homework/Exercises to practice the concept
- Build a self-referential struct that stores a slice into a Vec<u8>.
- Add a method that appends to the vec and observe invalidation risk.
Solutions to the homework/exercises
- Pin the struct and store a raw pointer.
- Appending may reallocate; forbid it or reserve capacity.
2.3 Pin Projection and API Design
Fundamentals
When you have Pin<&mut T>, you cannot move T but you can still access its fields. Pin projection is the process of obtaining pinned references to fields safely. This is subtle because moving a field out of a pinned struct breaks the pinning guarantee for that field and invalidates any pointer into it. Safe APIs should avoid exposing methods that allow moves of pinned fields.
Deep Dive into the concept
Pin projection is one of the trickiest parts of pinning. Suppose your struct has fields data: String and ptr: *const str. If you have Pin<&mut SelfRef>, you cannot take self.data by value because that would move the field out. Instead, you access it by reference: &self.data, or a Pin<&mut String> pointing at the field if the field itself must be treated as pinned. The pin-project crate automates this, but you can do it manually with unsafe code by using Pin::map_unchecked_mut. The invariant is: you must not move the field out of the struct, and you must preserve pinning guarantees.
For this project, the simplest solution is to avoid projection entirely. Provide methods like fn get(&self) -> &str and fn get_ptr(&self) -> *const str and avoid exposing mutable access. This keeps the unsafe boundary small. If you want to expose mutable access, you must ensure it does not reallocate. For example, you can expose fn push_str(self: Pin<&mut Self>, s: &str) only if you reserved sufficient capacity during initialization and the method checks that the new length fits. A plain &mut self receiver would not be reachable from safe code, because Pin only hands out &mut T when T is Unpin.
The API design lesson is that pinning changes what is safe to expose. Types that are !Unpin should often have very limited APIs. This is why many self-referential structures in Rust are wrapped in higher-level abstractions that hide the pinning details from users.
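For readers who do want to try manual projection, the sketch below shows the two usual cases on a hypothetical Outer/Inner pair (neither type is part of the project): a structurally pinned field projected with Pin::map_unchecked_mut, and an ordinary field handed out as a plain &mut. The safety comments state the invariants this sketch assumes.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Inner {
    ticks: u32,
    _pin: PhantomPinned,
}

impl Inner {
    fn tick(self: Pin<&mut Self>) {
        // SAFETY: in-place field update; nothing is moved.
        unsafe { self.get_unchecked_mut().ticks += 1 };
    }
}

struct Outer {
    inner: Inner,  // treated as pinned whenever `Outer` is pinned
    label: String, // never treated as pinned
}

impl Outer {
    fn new(label: &str) -> Pin<Box<Self>> {
        Box::pin(Outer {
            inner: Inner { ticks: 0, _pin: PhantomPinned },
            label: label.to_owned(),
        })
    }

    /// Structural projection: pinned struct -> pinned field.
    fn project_inner(self: Pin<&mut Self>) -> Pin<&mut Inner> {
        // SAFETY: `inner` is never moved out of `Outer`, and `Outer` has no
        // `Drop` impl or `Unpin` impl that could break that promise.
        unsafe { self.map_unchecked_mut(|outer| &mut outer.inner) }
    }

    /// Non-structural projection: pinned struct -> plain `&mut` field.
    fn project_label(self: Pin<&mut Self>) -> &mut String {
        // SAFETY: no code ever creates a `Pin<&mut String>` to `label`,
        // so moving or replacing the `String` cannot break a pin guarantee.
        unsafe { &mut self.get_unchecked_mut().label }
    }

    fn ticks(&self) -> u32 {
        self.inner.ticks
    }

    fn label_text(&self) -> &str {
        &self.label
    }
}
```

Each field must be consistently one kind or the other; offering both a pinned and an unpinned projection of the same field is unsound. That bookkeeping is what pin-project generates for you.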
How this fits into the projects
This concept is applied in §4.2 (component design) and §5.11 (implementation decisions). It also informs Project 10 where you design safe APIs around unsafe internals.
Definitions & key terms
- Projection: Accessing fields of a pinned value.
- Pin::map_unchecked_mut: Unsafe helper for projection.
- Pin invariants: Rules that must be upheld when accessing fields.
Mental model diagram (ASCII)
Pin<&mut SelfRef>
|-- safe: &self.data
|-- unsafe: move self.data
How it works (step-by-step)
- Keep the pinned struct in a stable location.
- Access fields by reference, never by value.
- If projection is needed, use safe wrappers or pin-project.
Minimal concrete example
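// Inside `impl SelfRef`. `get_ref` yields a reference with the Pin's own
// lifetime; going through Deref (`&self.data`) would borrow the local `self`
// and fail to compile.
fn get(self: Pin<&Self>) -> &str { &self.get_ref().data }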
Common misconceptions
- “Pin means you can’t mutate.” You can mutate safely without moving.
- “Projection is automatic.” It must be handled carefully.
Check-your-understanding questions
- Why is moving a field of a pinned struct unsafe?
- How does pin-project help?
Check-your-understanding answers
- Moving a field may move the struct or invalidate self-references.
- It generates safe projection code and enforces invariants.
Real-world applications
- Async runtime state machines.
- Intrusive data structures.
Where you’ll apply it
- This project: §5.11 decisions, §7 pitfalls.
- Also used in: Project 10: Capstone.
References
- pin-project crate docs.
Key insights
Pinning affects API design: you must limit what you expose to preserve invariants.
Summary
Projection is safe only if you never move pinned fields; design APIs accordingly.
Homework/Exercises to practice the concept
- Add a method that safely returns a &str from a pinned struct.
- Attempt to return String by value and explain why it fails.
Solutions to the homework/exercises
- Use Pin<&Self> and return &self.get_ref().data.
- Moving String out of the struct would violate pinning invariants and leave the stored pointer dangling.
3. Project Specification
3.1 What You Will Build
A crate self_ref that provides a SelfRef struct storing a String and a pointer to its internal slice, using Pin to guarantee safety. Includes CLI demo and tests.
3.2 Functional Requirements
- SelfRef::new returns a pinned instance.
- init sets up self-reference after pinning.
- get returns &str view of the data.
- CLI demo shows pointer stability.
3.3 Non-Functional Requirements
- Safety: No moves after pinning.
- Clarity: Unsafe invariants documented.
3.4 Example Usage / Output
$ cargo run --example pin_demo
created pinned self-ref
value: hello
pointer points to: hello
exit code: 0
3.5 Data Formats / Schemas / Protocols
SelfRef { data: String, ptr: *const str, _pin: PhantomPinned }.
3.6 Edge Cases
- Calling get before init (should be prevented).
- Mutating data after init (should be disallowed).
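One way to rule out the first edge case by construction is to fold initialization into the constructor, so callers never hold an uninitialized instance. The sketch below assumes that design; the points_at_own_data helper is a hypothetical addition used only for testing.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

pub struct SelfRef {
    data: String,
    ptr: *const str,
    _pin: PhantomPinned,
}

impl SelfRef {
    /// Allocates, pins, and initializes in one step, so `get` can never
    /// observe an unset pointer.
    pub fn new(text: &str) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            data: text.to_owned(),
            // Placeholder; overwritten below before the value escapes.
            ptr: "" as *const str,
            _pin: PhantomPinned,
        });
        let target: *const str = boxed.data.as_str();
        // SAFETY: we only overwrite the `ptr` field in place.
        unsafe { boxed.as_mut().get_unchecked_mut().ptr = target };
        boxed
    }

    /// No mutable access to `data` is exposed, so the buffer never reallocates.
    pub fn get(self: Pin<&Self>) -> &str {
        // SAFETY: `ptr` points into `data`, which lives as long as `self`
        // and is never mutated after construction.
        unsafe { &*self.get_ref().ptr }
    }

    /// Test helper (hypothetical): does `ptr` still match `data`?
    pub fn points_at_own_data(self: Pin<&Self>) -> bool {
        let this = self.get_ref();
        std::ptr::eq(this.ptr, this.data.as_str() as *const str)
    }
}
```

The second edge case is handled by omission: with no method that takes the data mutably, mutation after init is simply not expressible through the public API.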
3.7 Real World Outcome
Deterministic demo and failure case.
3.7.1 How to Run (Copy/Paste)
cargo run --example pin_demo
3.7.2 Golden Path Demo (Deterministic)
- Fixed input string “hello”.
3.7.3 CLI Transcript (Success)
$ cargo run --example pin_demo
created pinned self-ref
value: hello
pointer points to: hello
exit code: 0
3.7.4 Failure Demo (Invalid Use)
$ cargo run --example pin_demo -- --mutate
error: mutation after init is not allowed
exit code: 2
4. Solution Architecture
4.1 High-Level Design
SelfRef
|-- data: String
|-- ptr: *const str
|-- PhantomPinned
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| SelfRef | Self-referential type | !Unpin with PhantomPinned |
| init | Set pointer | must be called post-pin |
| get | Access data | no mutation |
4.3 Data Structures (No Full Code)
struct SelfRef {
data: String,
ptr: *const str,
_pin: PhantomPinned,
}
4.4 Algorithm Overview
Key Algorithm: Initialization
- Pin the struct.
- Take pointer to data.
- Store pointer.
Complexity Analysis:
- Time: O(1)
- Space: O(1) overhead
5. Implementation Guide
5.1 Development Environment Setup
cargo new self_ref
cd self_ref
5.2 Project Structure
self_ref/
├── src/
│ ├── lib.rs
│ └── self_ref.rs
└── examples/
└── pin_demo.rs
5.3 The Core Question You’re Answering
“How can a struct safely store a reference to its own data without allowing moves?”
5.4 Concepts You Must Understand First
- Pin and Unpin.
- Unsafe invariants for self-references.
- Projection rules.
5.5 Questions to Guide Your Design
- How will you prevent moves after initialization?
- How will you prevent data reallocation?
- What methods are safe to expose?
5.6 Thinking Exercise
Sketch the struct layout and mark which fields must never move.
5.7 The Interview Questions They’ll Ask
- “Why does Rust forbid self-referential structs by default?”
- “What does !Unpin mean?”
- “Why is pinning required for async?”
5.8 Hints in Layers
Hint 1: Use Pin<Box<SelfRef>>.
Hint 2: Add PhantomPinned.
Hint 3: Initialize pointer after pinning.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Pinning | Rustonomicon | Pin chapter |
| Async rationale | RFC 2349 | Pin rationale |
5.10 Implementation Phases
Phase 1: Struct + Pin (3-4 days)
- Implement SelfRef and ensure !Unpin.
Phase 2: Initialization (3 days)
- Implement init and pointer setup.
Phase 3: Demo + Tests (3 days)
- CLI demo and compile-fail tests.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Storage | Box vs stack | Box | stable address |
| Mutation | allowed vs forbidden | forbid | preserve pointer |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | pointer stability | compare addresses |
| Integration Tests | CLI demo | deterministic output |
| Compile-Fail Tests | move prevention | try to move pinned value |
6.2 Critical Test Cases
- Pointer equals data slice address.
- Attempted move fails to compile.
- Mutation after init is rejected.
6.3 Test Data
"hello"
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Init before pin | invalid pointer | pin first |
| Mutating data | dangling pointer | forbid mutation |
| Exposing &mut String | move risk | only expose &str |
7.2 Debugging Strategies
- Print pointer addresses before and after moves.
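A small sketch of this debugging technique follows, using a hypothetical Payload type. It prints the address of a pinned heap value, moves the Pin<Box<_>> handle into and out of a Vec, and prints the address again. Only the handle moves, so the two addresses should match.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Payload {
    _bytes: [u8; 16],
    _pin: PhantomPinned,
}

/// Address of the heap value behind the pinned box.
fn pinned_addr(p: &Pin<Box<Payload>>) -> usize {
    &**p as *const Payload as usize
}

/// Returns the pointee address before and after moving the handle.
fn handle_moves_keep_address() -> (usize, usize) {
    let pinned = Box::pin(Payload { _bytes: [0; 16], _pin: PhantomPinned });
    let before = pinned_addr(&pinned);
    println!("before: {before:#x}");

    // Moving the `Pin<Box<_>>` moves only the pointer, not the heap value.
    let mut shelf = Vec::new();
    shelf.push(pinned);
    let moved = shelf.pop().expect("just pushed");

    let after = pinned_addr(&moved);
    println!("after:  {after:#x}");
    (before, after)
}
```

If the two printed addresses ever differ in your own code, something has moved the value itself, and any stored self-reference should be treated as dangling.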
7.3 Performance Traps
- Overly complex projection.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add a method to replace data with same-length string.
8.2 Intermediate Extensions
- Use pin-project to implement safe projections.
8.3 Advanced Extensions
- Build a pinned state machine (mini-future).
9. Real-World Connections
9.1 Industry Applications
- Async runtimes.
- Intrusive linked lists.
9.2 Related Open Source Projects
- pin-project crate.
9.3 Interview Relevance
- Explain why pinning exists and how to use it safely.
10. Resources
10.1 Essential Reading
- Rustonomicon “Pin”.
- RFC 2349.
10.2 Video Resources
- RustConf talks on async/pin.
10.3 Tools & Documentation
- pin-project crate.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain why self-referential structs are unsafe by default.
- I can explain Pin vs Unpin.
11.2 Implementation
- Pointer stability tests pass.
- Compile-fail tests prevent moves.
11.3 Growth
- I can explain pinning in async contexts.
12. Submission / Completion Criteria
Minimum Viable Completion:
- SelfRef with pinning and safe accessors.
- CLI demo with deterministic output and failure case.
Full Completion:
- Safe projection or documented invariants.
Excellence (Going Above & Beyond):
- Build a pinned state machine with self-references.