Project 7: Implement a Self-Referential Struct with Pin

Build a self-referential struct safely using Pin, PhantomPinned, and documented unsafe invariants.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Master |
| Time Estimate | 2-3 weeks |
| Main Programming Language | Rust |
| Alternative Programming Languages | C++ (self-referential classes) |
| Coolness Level | Very High |
| Business Potential | Medium |
| Prerequisites | Lifetimes, unsafe, ownership, raw pointers |
| Key Topics | Pin, Unpin, self-referential safety, projection |

1. Learning Objectives

By completing this project, you will:

  1. Explain why self-referential structs are unsafe without pinning.
  2. Implement a safe pinned self-referential type.
  3. Document and uphold unsafe invariants.
  4. Understand projection and !Unpin behavior.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Pin and Unpin Semantics

Fundamentals

Pin<P> wraps a pointer type P (for example Pin<&mut T> or Pin<Box<T>>) and guarantees that the value it points to will not be moved in memory, unless that value's type is Unpin. Most types are Unpin, meaning pinning places no restriction on them and they can still be moved. A self-referential struct must therefore be !Unpin, typically by including a PhantomPinned field. Pinning is essential because a self-reference would become invalid if the value moved.

Deep Dive into the concept

Every Rust value can be moved by default: a move is a plain byte copy, and no code runs to fix up pointers afterwards. This makes self-referential structs unsafe because any move would invalidate internal pointers. Pin provides a contract: once a value is pinned, safe code cannot move it unless its type is Unpin. The standard pattern is to allocate the value on the heap (Box<T>), then pin it (Pin<Box<T>>), usually in one step with Box::pin. The heap allocation gives a stable address; the pinning ensures the value is never moved out of that address.
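To make the "stable address" claim concrete, here is a small sketch. The helper names addr_of_pinned and handle_move_keeps_address are made up for this illustration. It moves the Pin<Box<_>> handle into a new binding and checks that the heap value it points to did not change address.

```rust
use std::pin::Pin;

/// Address of the pinned heap value, as an integer so it can be compared later.
fn addr_of_pinned(p: &Pin<Box<u64>>) -> usize {
    &**p as *const u64 as usize
}

/// Moves the `Pin<Box<_>>` handle and reports whether the pointee stayed put.
fn handle_move_keeps_address() -> bool {
    let pinned: Pin<Box<u64>> = Box::pin(7);
    let before = addr_of_pinned(&pinned);

    // This moves the pointer (the handle), not the value on the heap.
    let moved_handle = pinned;

    let after = addr_of_pinned(&moved_handle);
    before == after
}
```

Moving the handle is always allowed; what Pin restricts is moving the value behind it.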

The key idea is that pinning is about address stability, not immutability. You can still mutate pinned values through Pin<&mut T> if you do so safely. But you cannot move the value out. This is why you add PhantomPinned to your struct: it prevents the compiler from auto-implementing Unpin, and thus prevents moves through safe APIs. This is a deliberate opt-out from Rust’s default move semantics.
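The difference between "cannot move" and "cannot mutate" can be shown with a tiny !Unpin type. Counter and bump are hypothetical names used only in this sketch; the point is that a pinned value can still be updated in place through Pin<&mut Self>.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Counter {
    hits: u32,
    _pin: PhantomPinned, // opts out of the automatic `Unpin` impl
}

impl Counter {
    fn new() -> Pin<Box<Counter>> {
        Box::pin(Counter { hits: 0, _pin: PhantomPinned })
    }

    /// Mutates the pinned value in place and returns the new count.
    fn bump(self: Pin<&mut Self>) -> u32 {
        // SAFETY: we only write to an integer field; nothing is moved out
        // of the pinned value.
        let this = unsafe { self.get_unchecked_mut() };
        this.hits += 1;
        this.hits
    }
}
```

A caller holds a Pin<Box<Counter>> and writes counter.as_mut().bump(); there is no safe way to obtain a plain &mut Counter from it.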

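Pinning is most commonly used in async programming. A Future may contain self-referential state, so it must not move after being polled. The compiler uses Pin to enforce that. Your project mirrors this pattern with a simpler example: a struct that owns a String and stores a pointer to its internal buffer. Without pinning, moving the struct would change the address of the String field (its pointer, length, and capacity), so any pointer to the field itself would dangle. A pointer into the String's heap buffer happens to survive a move of the struct, but it is still invalidated by reallocation or drop, and the same design with inline data (such as an array field) breaks on every move. The project therefore treats the whole struct as pinned.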
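The following sketch shows what a move actually changes. Holder and move_effects are illustrative names, not part of the project. The struct has one inline field and one heap-backed field, and the function compares their addresses before and after the struct is moved from the stack into a Box.

```rust
struct Holder {
    inline: [u8; 5], // stored inside the struct itself
    heap: String,    // header stored inline, bytes stored on the heap
}

/// Returns (inline_bytes_moved, heap_buffer_moved) after moving a `Holder`
/// from the stack into a `Box`.
fn move_effects() -> (bool, bool) {
    let on_stack = Holder { inline: *b"hello", heap: String::from("hello") };
    // Record addresses as integers so no dangling pointer is kept around.
    let inline_before = on_stack.inline.as_ptr() as usize;
    let heap_before = on_stack.heap.as_ptr() as usize;

    let boxed = Box::new(on_stack); // moves the whole struct to a new address

    let inline_after = boxed.inline.as_ptr() as usize;
    let heap_after = boxed.heap.as_ptr() as usize;
    (inline_before != inline_after, heap_before != heap_after)
}
```

A pointer to inline data dangles after the move, while the String's buffer stays where it was. Treating the whole struct as pinned covers both cases with one rule.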

Projection is another subtlety. If you have a pinned struct and want to access its fields, you must ensure you do not move them. This is why the pin-project crate exists. For this project, you can implement manual projection with unsafe code, but you must document the invariants: you may hand out Pin<&mut Field> only if the field is treated as pinned everywhere, meaning no code ever moves it out of the struct. This is complex; for the project, you can keep the API narrow and avoid exposing projections.

The safety story comes down to invariants. You must ensure that: (1) the internal pointer always points to the correct buffer, (2) the struct is never moved after initialization, and (3) the pointer is only read while the struct is alive. You can enforce (2) with Pin and !Unpin. You can enforce (1) by initializing the pointer only after pinning and by never reallocating the buffer afterwards. You can enforce (3) by only dereferencing the pointer inside methods that borrow the struct.

How this fits into the project

This concept is central to §3.1 and §4.1. It also connects to Project 10 (safe abstraction with unsafe internals) and Project 5 (stable storage alternatives).

Definitions & key terms

  • Pin: Pointer wrapper guaranteeing that the pointee is not moved (unless it is Unpin).
  • Unpin: Trait for types that can be moved even when pinned.
  • PhantomPinned: Marker to prevent auto-Unpin.
  • Projection: Accessing fields of pinned structs safely.

Mental model diagram (ASCII)

Pin<Box<Self>>
  |
  v
[heap allocation at stable address]
self.ptr -> self.data (stable)

How it works (step-by-step)

  1. Allocate struct on heap.
  2. Pin it so it cannot move.
  3. Initialize internal pointer to field.
  4. Access pointer safely via pinned references.
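The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's reference solution: the placeholder pointer to a static empty string and the via_ptr accessor are choices made here so that the pointer is always dereferenceable, even before init runs.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

pub struct SelfRef {
    data: String,
    ptr: *const str, // placeholder until `init`, then points into `data`
    _pin: PhantomPinned,
}

impl SelfRef {
    /// Phase 1: build the value. The pointer starts out aimed at a static
    /// empty string (an assumption of this sketch).
    pub fn new(text: &str) -> Self {
        SelfRef { data: text.to_owned(), ptr: "" as *const str, _pin: PhantomPinned }
    }

    /// Phase 2: record the self-reference once the value is pinned.
    pub fn init(self: Pin<&mut Self>) {
        // SAFETY: nothing is moved out; only the raw pointer field is overwritten.
        let this = unsafe { self.get_unchecked_mut() };
        this.ptr = this.data.as_str() as *const str;
    }

    /// Safe view that goes through the owning field.
    pub fn get(&self) -> &str {
        &self.data
    }

    /// View that goes through the stored pointer.
    pub fn via_ptr(&self) -> &str {
        // SAFETY: `ptr` is either the static placeholder or points into `data`,
        // which this API never mutates and which lives as long as `self`.
        unsafe { &*self.ptr }
    }
}
```

Usage follows the two-phase pattern: let mut s = Box::pin(SelfRef::new("hello")); then s.as_mut().init();. After that, get and via_ptr return the same bytes at the same address.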

Minimal concrete example

let pinned = Box::pin(SelfRef::new("hello"));

Common misconceptions

  • “Pin makes things immutable.” It only prevents moves.
  • “Pin is only for async.” It is for address stability generally.

Check-your-understanding questions

  1. Why is PhantomPinned required?
  2. What does Pin<Box<T>> guarantee?
  3. Why must you initialize the self-reference after pinning?

Check-your-understanding answers

  1. It prevents auto-Unpin and thus prevents safe moves.
  2. The value is allocated on the heap and will not move until it is dropped (unless T is Unpin).
  3. Because the address must be stable when you take the reference.

Real-world applications

  • Async futures.
  • Self-referential parser state machines.

Where you’ll apply it

  • §3.1 (what you will build) and §4.1 (high-level design).

References

  • std::pin module documentation.
  • RFC 2349 (Pin).

Key insights

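Pinning is the standard way to give a self-referential type a stable address in Rust; the main alternative is to avoid self-pointers altogether, for example by storing offsets instead.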

Summary

Pin + !Unpin gives you address stability, which makes self-references safe.

Homework/Exercises to practice the concept

  1. Write a struct with PhantomPinned and check that it is !Unpin.
  2. Attempt to move a pinned value and observe compiler errors.

Solutions to the homework/exercises

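  1. Add PhantomPinned and verify with a trait bound: a helper such as fn assert_unpin<T: Unpin>() {} must fail to compile when called with your type.
  2. The compiler rejects attempts to get &mut T out of Pin<&mut T> when T is !Unpin, so std::mem::swap and std::mem::replace cannot be applied to the pinned value in safe code.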
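Both exercises can be tried with a few lines. In this sketch Plain, Stuck, assert_unpin, and swap_unpin are hypothetical names; the lines that are expected to fail are left commented out so the rest still builds.

```rust
use std::marker::PhantomPinned;

/// Compiles only when `T: Unpin`.
fn assert_unpin<T: Unpin>() {}

struct Plain {
    n: u32,
}

#[allow(dead_code)]
struct Stuck {
    n: u32,
    _pin: PhantomPinned,
}

/// Exercise 2 for an `Unpin` type: swapping two pinned values is allowed.
fn swap_unpin() -> (u32, u32) {
    assert_unpin::<Plain>();
    // assert_unpin::<Stuck>(); // rejected: `PhantomPinned` is not `Unpin`

    let mut a = Box::pin(Plain { n: 1 });
    let mut b = Box::pin(Plain { n: 2 });
    // `Pin::get_mut` exists only for `Unpin` targets, so this line would not
    // compile if `a` and `b` held `Stuck` values.
    std::mem::swap(a.as_mut().get_mut(), b.as_mut().get_mut());
    (a.n, b.n)
}
```

Uncommenting the Stuck line, or changing the boxes to hold Stuck, should produce the compiler errors the exercise asks you to observe.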

2.2 Unsafe Invariants for Self-References

Fundamentals

A self-referential struct stores a pointer to one of its own fields. The invariants are: the struct must not move after the pointer is created, the pointer must always point to valid data, and the pointer must not be used after the struct is dropped. Unsafe code is required to create the pointer, but the public API must enforce these invariants.

Deep Dive into the concept

Unsafe code is necessary because the compiler cannot express “this pointer points to my own field.” The safe interface must ensure that the pointer is only created once the address is stable, and that it is never used after drop. This usually means splitting construction into two phases: a constructor that allocates and pins the struct, and an init method that fills in the self-reference. The init method must be called exactly once, and the struct should not expose any API that would allow moving after init.

You also must define how mutation works. If the internal buffer is a String, and you mutate it, its buffer may reallocate, invalidating the pointer. Therefore, your API should forbid mutation after initialization, or you must ensure that the string never reallocates (e.g., by reserving enough capacity up front). This is a crucial invariant that many learners miss: pinning prevents moves of the struct, but it does not prevent internal reallocations that change a field’s address. So you must either freeze the data or design a custom stable buffer.
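The "reserve capacity up front" option can be checked directly. The sketch below uses two illustrative helpers (fits_without_realloc and push_within_capacity_keeps_address) and relies on String's documented behavior that pushes within the current capacity do not reallocate.

```rust
/// True when appending `extra` cannot trigger a reallocation.
fn fits_without_realloc(s: &String, extra: &str) -> bool {
    s.len() + extra.len() <= s.capacity()
}

/// Appends within reserved capacity and reports whether the buffer stayed put.
fn push_within_capacity_keeps_address() -> bool {
    let mut s = String::with_capacity(16);
    s.push_str("hello");
    let before = s.as_ptr();

    // 5 + 7 = 12 bytes, which fits in the 16 reserved above.
    assert!(fits_without_realloc(&s, ", world"));
    s.push_str(", world");

    before == s.as_ptr()
}
```

Note that a stored *const str also records the length, so a method that appends in place would still need to refresh the pointer afterwards even though the address is unchanged.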

Another subtle point is drop order. Inside Drop::drop all fields are still alive, so the pointer is still valid there; it becomes dangling once data is dropped, which happens after drop returns. A custom Drop must therefore not free or replace data (for example with mem::take) and then dereference ptr. Drop::drop also receives &mut self even for pinned values, so its body must treat self as pinned. The simplest approach is to avoid a custom Drop altogether.

Finally, you must consider projection. If you provide access to the inner String, you must ensure callers cannot move it out. Returning &str is safe, but returning String or &mut String could violate invariants. The safest API only exposes immutable views, or provides carefully constrained mutable methods that preserve capacity and avoid reallocation.

How this fits into the project

This concept is applied in §3.2 (requirements), §5.10 (phases), and §7 (pitfalls). It also informs Project 10’s safe abstraction design.

Definitions & key terms

  • Invariant: Condition that must always hold for safety.
  • Self-reference: Pointer to a field within the same struct.
  • Two-phase initialization: Pin first, then init pointers.

Mental model diagram (ASCII)

struct SelfRef {
  data: String,
  ptr: *const str,
}
Pin prevents move, but data can reallocate!

How it works (step-by-step)

  1. Allocate and pin the struct.
  2. Initialize pointer to data after pinning.
  3. Forbid operations that reallocate data.
  4. Provide safe accessors.

Minimal concrete example

let mut s = Box::pin(SelfRef::new("hello"));
SelfRef::init(Pin::as_mut(&mut s));

Common misconceptions

  • “Pin prevents all invalidation.” It only prevents moves of the struct, not inner reallocations.
  • “Unsafe code can ignore invariants.” Unsafe code is safe only if invariants are upheld.

Check-your-understanding questions

  1. Why can String reallocation break self-references?
  2. What guarantees does pinning not provide?
  3. Why is two-phase init required?

Check-your-understanding answers

  1. The buffer may move, invalidating the pointer.
  2. Pinning does not prevent internal reallocation.
  3. The address must be stable before taking references.

Real-world applications

  • Futures that store self-references.
  • Intrusive linked lists.

Where you’ll apply it

  • §3.2 (functional requirements), §5.10 (implementation phases), and §7 (common pitfalls).

References

  • std::pin module documentation (self-referential struct example).

Key insights

Pinning is necessary but not sufficient; your invariants must also prevent internal reallocation.

Summary

Self-referential safety is about maintaining invariants, not just using Pin.

Homework/Exercises to practice the concept

  1. Build a self-referential struct that stores a slice into a Vec<u8>.
  2. Add a method that appends to the vec and observe invalidation risk.

Solutions to the homework/exercises

  1. Pin the struct and store a raw pointer.
  2. Appending may reallocate; forbid it or reserve capacity.
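One possible shape for both exercises is sketched below. Window, view, and try_push are hypothetical names, and refusing the push when the vector is full is just one way to "forbid it or reserve capacity".

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Window {
    bytes: Vec<u8>,
    view: *const [u8], // points at bytes[start..start + len]
    _pin: PhantomPinned,
}

impl Window {
    /// Pins the struct, then records a raw slice into its own buffer.
    fn new(bytes: Vec<u8>, start: usize, len: usize) -> Option<Pin<Box<Window>>> {
        let end = start.checked_add(len)?;
        if end > bytes.len() {
            return None;
        }
        let placeholder: &'static [u8] = &[];
        let mut boxed = Box::pin(Window {
            bytes,
            view: placeholder as *const [u8],
            _pin: PhantomPinned,
        });
        // SAFETY: nothing is moved out; only the raw pointer field is written.
        let this = unsafe { boxed.as_mut().get_unchecked_mut() };
        this.view = &this.bytes[start..end] as *const [u8];
        Some(boxed)
    }

    fn view(&self) -> &[u8] {
        // SAFETY: `view` points into `bytes`, whose buffer is never reallocated
        // (see `try_push`) and which lives as long as `self`.
        unsafe { &*self.view }
    }

    /// Appends only when no reallocation is needed; otherwise returns the byte.
    fn try_push(self: Pin<&mut Self>, byte: u8) -> Result<(), u8> {
        // SAFETY: the Vec is modified in place and never moved or replaced.
        let this = unsafe { self.get_unchecked_mut() };
        if this.bytes.len() == this.bytes.capacity() {
            return Err(byte); // pushing now would reallocate and dangle `view`
        }
        this.bytes.push(byte);
        Ok(())
    }
}
```

The capacity check in try_push is the invariant that pinning alone does not provide.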

2.3 Pin Projection and API Design

Fundamentals

When you have Pin<&mut T>, you cannot move T but you can still access its fields. Pin projection is the process of obtaining pinned references to fields safely. This is subtle because a field that other pointers refer to must stay at its address, just like the struct that contains it. Safe APIs should avoid exposing methods that allow moves of pinned fields.

Deep Dive into the concept

Pin projection is one of the trickiest parts of pinning. Suppose your struct has fields data: String and ptr: *const str. If you have Pin<&mut SelfRef>, you cannot just take self.data by value because that would move the field out. Instead, you access it by reference: &self.data for reading, and for mutable access either Pin<&mut String> if the field itself must be treated as pinned, or &mut String if it is not. The pin-project crate automates this, but you can do it manually with unsafe code by using Pin::map_unchecked_mut (for pinned fields) or Pin::get_unchecked_mut (for unpinned ones). The invariant is: you must not move the field out of the struct, and you must preserve pinning guarantees.

For this project, the simplest solution is to avoid projection entirely. Provide methods like fn get(&self) -> &str and fn get_ptr(&self) -> *const str and avoid exposing mutable access. This keeps the unsafe boundary small. If you want to expose mutable access, you must ensure it does not reallocate. For example, you can expose fn push_str(self: Pin<&mut Self>, s: &str) only if you reserved sufficient capacity during initialization, the method checks that the new length fits, and it refreshes ptr afterwards (a *const str records the length as well as the address).
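The two kinds of manual projection look like this in practice. Outer, Inner, project_inner, and project_label are hypothetical names for this sketch; the pattern itself (map_unchecked_mut for a field that stays pinned, get_unchecked_mut for one that does not) follows the std::pin documentation.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Inner {
    value: u32,
    _pin: PhantomPinned,
}

impl Inner {
    fn set(self: Pin<&mut Self>, value: u32) {
        // SAFETY: writing a `u32` in place moves nothing.
        unsafe { self.get_unchecked_mut().value = value };
    }
}

struct Outer {
    inner: Inner,  // treated as pinned whenever `Outer` is pinned
    label: String, // never treated as pinned
}

impl Outer {
    /// Pinned projection: hands out `Pin<&mut Inner>`.
    fn project_inner(self: Pin<&mut Self>) -> Pin<&mut Inner> {
        // SAFETY: `inner` is never moved out of `Outer`, and `Outer` has no
        // `Drop` impl or `repr(packed)` that could move it.
        unsafe { self.map_unchecked_mut(|outer| &mut outer.inner) }
    }

    /// Unpinned projection: hands out a plain `&mut String`.
    fn project_label(self: Pin<&mut Self>) -> &mut String {
        // SAFETY: nothing points into `label`, and no code ever creates a
        // `Pin<&mut String>` to it, so moving or replacing it is harmless.
        unsafe { &mut self.get_unchecked_mut().label }
    }
}
```

Each field must be projected in exactly one of these two ways; offering both for the same field is what makes manual projection unsound.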

The API design lesson is that pinning changes what is safe to expose. Types that are !Unpin should often have very limited APIs. This is why many self-referential structures in Rust are wrapped in higher-level abstractions that hide the pinning details from users.

How this fits into the project

This concept is applied in §4.2 (component design) and §5.11 (implementation decisions). It also informs Project 10 where you design safe APIs around unsafe internals.

Definitions & key terms

  • Projection: Accessing fields of a pinned value.
  • Pin::map_unchecked_mut: Unsafe helper for projection.
  • Pin invariants: Rules that must be upheld when accessing fields.

Mental model diagram (ASCII)

Pin<&mut SelfRef>
  |-- safe: &self.data
  |-- unsafe: move self.data

How it works (step-by-step)

  1. Keep the pinned struct in a stable location.
  2. Access fields by reference, never by value.
  3. If projection is needed, use safe wrappers or pin-project.

Minimal concrete example

fn get(self: Pin<&Self>) -> &str { self.get_ref().data.as_str() }

Common misconceptions

  • “Pin means you can’t mutate.” You can mutate safely without moving.
  • “Projection is automatic.” It must be handled carefully.

Check-your-understanding questions

  1. Why is moving a field of a pinned struct unsafe?
  2. How does pin-project help?

Check-your-understanding answers

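  1. Other pointers, including the struct's own self-reference, may point at the field; moving it would leave them dangling and would break the Pin guarantee for that field.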
  2. It generates safe projection code and enforces invariants.

Real-world applications

  • Async runtime state machines.
  • Intrusive data structures.

Where you’ll apply it

  • §4.2 (key components) and §5.11 (key implementation decisions).

References

  • pin-project crate docs.

Key insights

Pinning affects API design: you must limit what you expose to preserve invariants.

Summary

Projection is safe only if you never move pinned fields; design APIs accordingly.

Homework/Exercises to practice the concept

  1. Add a method that safely returns a &str from a pinned struct.
  2. Attempt to return String by value and explain why it fails.

Solutions to the homework/exercises

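  1. Take self: Pin<&Self> (or plain &self) and return self.get_ref().data.as_str().
  2. Moving the String out from behind a shared or pinned reference does not compile, and it would invalidate ptr if it did; return a clone if the caller needs ownership.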
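A short sketch of both answers, using a stand-in type called Note (hypothetical). Pin::get_ref is used so that the returned &str borrows from the pinned value rather than from the local Pin handle.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Note {
    data: String,
    _pin: PhantomPinned,
}

impl Note {
    /// Exercise 1: an immutable view from a pinned receiver.
    fn get(self: Pin<&Self>) -> &str {
        // `get_ref` turns `Pin<&'a Self>` into `&'a Self`, so the result
        // lives as long as the pinned value is borrowed.
        self.get_ref().data.as_str()
    }

    /// Exercise 2: ownership requires a copy, never a move.
    fn to_owned_string(self: Pin<&Self>) -> String {
        // Writing `self.get_ref().data` here would be rejected, because a
        // `String` cannot be moved out from behind a shared reference.
        self.get_ref().data.clone()
    }
}
```

With let n = Box::pin(Note { .. }), the call is n.as_ref().get().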

3. Project Specification

3.1 What You Will Build

A crate self_ref that provides a SelfRef struct storing a String and a pointer to its internal slice, using Pin to guarantee safety. Includes CLI demo and tests.

3.2 Functional Requirements

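  1. SelfRef::new builds the value, and the caller pins it with Box::pin as in the examples in §2 (alternatively, expose a constructor that returns Pin<Box<SelfRef>> directly).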
  2. init sets up self-reference after pinning.
  3. get returns &str view of the data.
  4. CLI demo shows pointer stability.

3.3 Non-Functional Requirements

  • Safety: No moves after pinning.
  • Clarity: Unsafe invariants documented.

3.4 Example Usage / Output

$ cargo run --example pin_demo
created pinned self-ref
value: hello
pointer points to: hello
exit code: 0

3.5 Data Formats / Schemas / Protocols

  • SelfRef { data: String, ptr: *const str, _pin: PhantomPinned }.

3.6 Edge Cases

  • Calling get before init (should be prevented).
  • Mutating data after init (should be disallowed).
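One way to make the first edge case unrepresentable is to store the pointer as an Option, so an uninitialized value reports an error instead of exposing a dangling pointer. This is a sketch under assumptions: Guarded and SelfRefError are hypothetical names, and Option<NonNull<str>> replaces the *const str field from §3.5. The second edge case is handled by simply not offering any mutating method.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

#[derive(Debug, PartialEq)]
pub enum SelfRefError {
    NotInitialized,
    AlreadyInitialized,
}

pub struct Guarded {
    data: String,
    ptr: Option<NonNull<str>>, // `None` until `init` has run
    _pin: PhantomPinned,
}

impl Guarded {
    pub fn new(text: &str) -> Pin<Box<Self>> {
        Box::pin(Guarded { data: text.to_owned(), ptr: None, _pin: PhantomPinned })
    }

    /// Sets the self-reference; a second call is reported as an error.
    pub fn init(self: Pin<&mut Self>) -> Result<(), SelfRefError> {
        // SAFETY: nothing is moved out; only the pointer field is written.
        let this = unsafe { self.get_unchecked_mut() };
        if this.ptr.is_some() {
            return Err(SelfRefError::AlreadyInitialized);
        }
        this.ptr = Some(NonNull::from(this.data.as_str()));
        Ok(())
    }

    /// Reads through the stored pointer, or reports that `init` is missing.
    pub fn get(&self) -> Result<&str, SelfRefError> {
        match self.ptr {
            // SAFETY: set by `init` to point into `data`, which is never
            // mutated afterwards and lives as long as `self`.
            Some(p) => Ok(unsafe { p.as_ref() }),
            None => Err(SelfRefError::NotInitialized),
        }
    }
}
```

A stricter design would run init inside the constructor so the uninitialized state never escapes at all.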

3.7 Real World Outcome

Deterministic demo and failure case.

3.7.1 How to Run (Copy/Paste)

cargo run --example pin_demo

3.7.2 Golden Path Demo (Deterministic)

  • Fixed input string “hello”.

3.7.3 CLI Transcript (Success)

$ cargo run --example pin_demo
created pinned self-ref
value: hello
pointer points to: hello
exit code: 0

3.7.4 Failure Demo (Invalid Use)

$ cargo run --example pin_demo -- --mutate
error: mutation after init is not allowed
exit code: 2
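The two transcripts can be driven by one function that returns the lines to print and the exit code, which keeps the demo easy to test. Everything below is a hypothetical sketch of what examples/pin_demo.rs might contain: Demo, make, and run are made-up names, and the "exit code" lines in the transcripts are assumed to be shell annotations rather than program output.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Demo {
    data: String,
    ptr: *const str,
    _pin: PhantomPinned,
}

/// Pins a `Demo` and points `ptr` at its own `data`.
fn make(text: &str) -> Pin<Box<Demo>> {
    let mut d = Box::pin(Demo {
        data: text.to_owned(),
        ptr: "" as *const str,
        _pin: PhantomPinned,
    });
    // SAFETY: only the raw pointer field is overwritten; nothing is moved out.
    let this = unsafe { d.as_mut().get_unchecked_mut() };
    this.ptr = this.data.as_str() as *const str;
    d
}

/// Returns the output lines and the process exit code for the given arguments.
pub fn run(args: &[&str]) -> (Vec<String>, i32) {
    if args.iter().any(|a| *a == "--mutate") {
        return (vec!["error: mutation after init is not allowed".to_string()], 2);
    }
    let d = make("hello");
    // SAFETY: `ptr` was set in `make`, and `data` has not been touched since.
    let seen = unsafe { &*d.ptr };
    let lines = vec![
        "created pinned self-ref".to_string(),
        format!("value: {}", d.data),
        format!("pointer points to: {}", seen),
    ];
    (lines, 0)
}
```

A main function would collect std::env::args, print each returned line, and pass the code to std::process::exit.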

4. Solution Architecture

4.1 High-Level Design

SelfRef
  |-- data: String
  |-- ptr: *const str
  |-- PhantomPinned

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| SelfRef | Self-referential type | !Unpin with PhantomPinned |
| init | Set pointer | must be called post-pin |
| get | Access data | no mutation |

4.3 Data Structures (No Full Code)

struct SelfRef {
    data: String,
    ptr: *const str,
    _pin: PhantomPinned,
}

4.4 Algorithm Overview

Key Algorithm: Initialization

  1. Pin the struct.
  2. Take pointer to data.
  3. Store pointer.

Complexity Analysis:

  • Time: O(1)
  • Space: O(1) overhead

5. Implementation Guide

5.1 Development Environment Setup

cargo new --lib self_ref
cd self_ref

5.2 Project Structure

self_ref/
├── src/
│   ├── lib.rs
│   └── self_ref.rs
└── examples/
    └── pin_demo.rs

5.3 The Core Question You’re Answering

“How can a struct safely store a reference to its own data without allowing moves?”

5.4 Concepts You Must Understand First

  1. Pin and Unpin.
  2. Unsafe invariants for self-references.
  3. Projection rules.

5.5 Questions to Guide Your Design

  1. How will you prevent moves after initialization?
  2. How will you prevent data reallocation?
  3. What methods are safe to expose?

5.6 Thinking Exercise

Sketch the struct layout and mark which fields must never move.

5.7 The Interview Questions They’ll Ask

  1. “Why does Rust forbid self-referential structs by default?”
  2. “What does !Unpin mean?”
  3. “Why is pinning required for async?”

5.8 Hints in Layers

  • Hint 1: Use Pin<Box<SelfRef>>.
  • Hint 2: Add PhantomPinned.
  • Hint 3: Initialize pointer after pinning.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Pinning | Rustonomicon | Pin chapter |
| Async rationale | RFC 2349 | Pin rationale |

5.10 Implementation Phases

Phase 1: Struct + Pin (3-4 days)

  • Implement SelfRef and ensure !Unpin.

Phase 2: Initialization (3 days)

  • Implement init and pointer setup.

Phase 3: Demo + Tests (3 days)

  • CLI demo and compile-fail tests.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Storage | Box vs stack | Box | stable address |
| Mutation | allowed vs forbidden | forbid | preserve pointer |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | pointer stability | compare addresses |
| Integration Tests | CLI demo | deterministic output |
| Compile-Fail Tests | move prevention | try to move pinned value |

6.2 Critical Test Cases

  1. Pointer equals data slice address.
  2. Attempted move fails to compile.
  3. Mutation after init is rejected.

6.3 Test Data

"hello"

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---|---|---|
| Init before pin | invalid pointer | pin first |
| Mutating data | dangling pointer | forbid mutation |
| Exposing &mut String | move risk | only expose &str |

7.2 Debugging Strategies

  • Print pointer addresses (with {:p}) before and after moving the Pin<Box<SelfRef>> handle; the address of the pointee should not change.

7.3 Performance Traps

  • Overly complex projection.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a method to replace data with same-length string.

8.2 Intermediate Extensions

  • Use pin-project to implement safe projections.

8.3 Advanced Extensions

  • Build a pinned state machine (mini-future).

9. Real-World Connections

9.1 Industry Applications

  • Async runtimes.
  • Intrusive linked lists.
  • pin-project crate.

9.2 Interview Relevance

  • Explain why pinning exists and how to use it safely.

10. Resources

10.1 Essential Reading

  • std::pin module documentation.
  • RFC 2349.

10.2 Video Resources

  • RustConf talks on async/pin.

10.3 Tools & Documentation

  • pin-project crate.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain why self-referential structs are unsafe by default.
  • I can explain Pin vs Unpin.

11.2 Implementation

  • Pointer stability tests pass.
  • Compile-fail tests prevent moves.

11.3 Growth

  • I can explain pinning in async contexts.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • SelfRef with pinning and safe accessors.
  • CLI demo with deterministic output and failure case.

Full Completion:

  • Safe projection or documented invariants.

Excellence (Going Above & Beyond):

  • Build a pinned state machine with self-references.