Project 7: Preference Memory & Privacy Controls
Build a consent-aware preference memory system with redaction, retention, and audit trails.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2 |
| Time Estimate | Weekend |
| Main Programming Language | Python |
| Alternative Programming Languages | TypeScript, Go |
| Coolness Level | Level 2 |
| Business Potential | Level 3 |
| Prerequisites | Basic data modeling, policy rules |
| Key Topics | Consent, sensitivity, retention policies |
1. Learning Objectives
By completing this project, you will:
- Define sensitivity tiers for preference memory.
- Enforce consent gates for storage and retrieval.
- Implement redaction and expiration policies.
- Generate privacy audit reports.
2. All Theory Needed (Per-Concept Breakdown)
Preference Memory Governance and Privacy
Fundamentals
Preference memory stores user-specific constraints (tone, format, scheduling choices). These memories are useful but high-risk because they often contain sensitive or personal information. A privacy-first memory system must enforce consent, classify sensitivity, redact personal data, and apply retention limits. Without governance, preference memory can become a compliance liability.
Deep Dive into the Concept
Preference memory is different from other memory types because it is tied to a person and can directly impact user trust. Preferences are often implicit, inferred from behavior rather than explicitly stated. This creates a risk: storing inferred preferences without consent can violate expectations or regulations. Therefore, a robust system should distinguish explicit preferences (“I want short answers”) from inferred preferences (“User seems to like short answers”). Inferred preferences should be stored with lower confidence and often require explicit confirmation before use.
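A minimal sketch of the explicit-vs-inferred distinction, assuming a dataclass-based model; the names `Provenance` and `usable` are illustrative, not a prescribed API:

```python
# Sketch: explicit vs inferred preferences with a confirmation gate.
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    EXPLICIT = "explicit"   # user stated it directly
    INFERRED = "inferred"   # derived from observed behavior

@dataclass
class Preference:
    text: str
    provenance: Provenance
    confidence: float        # inferred preferences start lower
    confirmed: bool = False  # inferred prefs need confirmation before use

def usable(pref: Preference) -> bool:
    """Inferred preferences require explicit confirmation before use."""
    if pref.provenance is Provenance.INFERRED:
        return pref.confirmed
    return True

print(usable(Preference("short answers", Provenance.INFERRED, 0.6)))  # False
```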
Sensitivity tiers are critical. Low sensitivity preferences might include formatting or tone. Medium sensitivity might include working hours or project priorities. High sensitivity includes identifiers, health, or financial data. Each tier requires different handling. High sensitivity should require explicit consent to store and retrieve; medium sensitivity might require consent for retrieval only; low sensitivity could be stored by default. These rules must be encoded in a policy engine, not in ad-hoc code, so they can be audited and updated.
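One way to encode the tier rules above as data rather than ad-hoc code; the `POLICY` table and function names are illustrative assumptions:

```python
# Sketch: tier rules as an auditable, updatable policy table.
POLICY = {
    "low":    {"consent_to_store": False, "consent_to_retrieve": False},
    "medium": {"consent_to_store": False, "consent_to_retrieve": True},
    "high":   {"consent_to_store": True,  "consent_to_retrieve": True},
}

def may_store(sensitivity: str, consent: bool) -> bool:
    rule = POLICY[sensitivity]
    return consent or not rule["consent_to_store"]

def may_retrieve(sensitivity: str, consent: bool) -> bool:
    rule = POLICY[sensitivity]
    return consent or not rule["consent_to_retrieve"]

assert may_store("low", consent=False)            # stored by default
assert not may_retrieve("medium", consent=False)  # retrieval gated
assert not may_store("high", consent=False)       # blocked, as in §3.4
```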
Redaction is the next layer. Even with consent, some data should not be stored in raw form. Phone numbers, emails, or personal identifiers can be masked or tokenized, preserving utility without exposing raw data. Redaction rules must be deterministic and testable. A common pattern is to store a redacted version for retrieval and optionally keep the raw version encrypted with restricted access.
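A deterministic redaction sketch using masking; the regex patterns are simplified placeholders, not production-grade PII detectors:

```python
# Sketch: deterministic, testable masking of common identifier shapes.
import re

PATTERNS = {
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each matched identifier with a labeled mask token."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact("User phone number is 555-1234"))
# -> "User phone number is [PHONE]"
```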
Retention policies prevent preference memory from becoming stale. Preferences change; if a user requests verbose answers once, that might not be permanent. Preferences should include expiration windows and “last confirmed” timestamps. When a preference is retrieved, the system should verify that it is still valid. If it is expired, the system should either ignore it or ask for confirmation. This protects against outdated preferences and supports user trust.
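A freshness check sketched from the “last confirmed” timestamp idea; the 90-day default window is an assumption for illustration:

```python
# Sketch: a preference is usable only within its confirmation window.
from datetime import date, timedelta

DEFAULT_TTL = timedelta(days=90)  # illustrative default window

def is_fresh(last_confirmed: date, today: date,
             ttl: timedelta = DEFAULT_TTL) -> bool:
    return today - last_confirmed <= ttl

# Expired preferences are ignored or re-confirmed with the user.
print(is_fresh(date(2025, 1, 1), today=date(2025, 6, 1)))  # False: window elapsed
```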
Finally, governance requires auditability. You should be able to answer: what preferences are stored, which were inferred vs explicit, and when consent was given. This is why audit reports are a core deliverable. The audit pipeline is not just for compliance; it is also a diagnostic tool that reveals unsafe or outdated memory entries.
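A sketch of answering the audit questions above in one report; field names mirror the record schema in §3.5, plus a hypothetical `provenance` field:

```python
# Sketch: summarize what is stored, inferred vs explicit, and consent.
from collections import Counter

def audit(preferences: list[dict]) -> dict:
    return {
        "total": len(preferences),
        "by_sensitivity": dict(Counter(p["sensitivity"] for p in preferences)),
        "inferred": sum(1 for p in preferences
                        if p.get("provenance") == "inferred"),
        "missing_consent": sum(1 for p in preferences if not p.get("consent")),
    }

prefs = [
    {"sensitivity": "low", "consent": True, "provenance": "explicit"},
    {"sensitivity": "high", "consent": False, "provenance": "inferred"},
]
print(audit(prefs))
# {'total': 2, 'by_sensitivity': {'low': 1, 'high': 1},
#  'inferred': 1, 'missing_consent': 1}
```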
From a systems perspective, this concept must be treated as a first-class interface between data and behavior. That means you need explicit invariants (what must always be true), observability (how you know it is true), and failure signatures (how it breaks when it is not). In practice, engineers often skip this and rely on ad-hoc fixes, which creates hidden coupling between the memory subsystem and the rest of the agent stack. A better approach is to model the concept as a pipeline stage with clear inputs, outputs, and preconditions: if inputs violate the contract, the stage should fail fast rather than silently corrupt memory. This is especially important because memory errors are long-lived and compound over time.
You should also define operational metrics that reveal drift early. Examples include the percentage of memory entries that lack required metadata, the ratio of retrieved memories that are later unused by the model, or the fraction of queries that trigger a fallback route because the primary memory store is empty. These metrics are not just for dashboards; they are design constraints that force you to keep the system testable and predictable.
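The first of those drift metrics, sketched as a function under the assumption that `sensitivity`, `consent`, and `expires` are the required metadata fields:

```python
# Sketch: fraction of memory entries lacking required metadata.
REQUIRED = {"sensitivity", "consent", "expires"}  # assumed required fields

def missing_metadata_rate(entries: list[dict]) -> float:
    if not entries:
        return 0.0
    missing = sum(1 for e in entries if not REQUIRED <= e.keys())
    return missing / len(entries)

print(missing_metadata_rate([
    {"sensitivity": "low"},                                          # missing
    {"sensitivity": "low", "consent": True, "expires": "2026-03-01"},
]))  # 0.5
```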
Another critical dimension is lifecycle management. The concept may work well at small scale but degrade as the memory grows. This is where policies and thresholds matter: you need rules for promotion, demotion, merging, or deletion that prevent the memory from becoming a landfill. The policy should be deterministic and versioned. When it changes, you should be able to replay historical inputs and measure the delta in outputs. This is the same discipline used in data engineering for schema changes and backfills, and it applies equally to memory systems.
Finally, remember that memory is an interface to user trust. If the memory system is noisy, the agent feels unreliable; if it is overly strict, the agent feels forgetful. The best designs expose these trade-offs explicitly, so you can tune them according to product goals rather than guessing in the dark.
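A sketch of the deterministic, versioned retention policy with replay described above; version numbers and TTL values are illustrative assumptions:

```python
# Sketch: versioned retention policy, replayable against old inputs.
POLICY_VERSIONS = {
    1: {"low": 365, "medium": 180, "high": 90},  # TTLs in days
    2: {"low": 365, "medium": 90, "high": 30},
}

def ttl_days(version: int, sensitivity: str) -> int:
    return POLICY_VERSIONS[version][sensitivity]

def replay_delta(entries: list[dict], old: int, new: int) -> int:
    """Count entries whose retention outcome changes between versions."""
    return sum(1 for e in entries
               if ttl_days(old, e["sensitivity"]) != ttl_days(new, e["sensitivity"]))

print(replay_delta([{"sensitivity": "medium"}, {"sensitivity": "low"}],
                   old=1, new=2))  # 1
```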
How This Fits Into the Projects
This concept is central to Project 7 and is used by Projects 9 and 10 for safety and memory management.
Definitions & key terms
- Consent: Explicit permission to store or use a preference.
- Sensitivity tier: Classification of preference risk level.
- Redaction: Masking sensitive values.
- Retention: Rules that expire memory over time.
Mental model diagram (ASCII)
Preference -> Sensitivity Tier -> Consent Gate -> Store/Redact -> Retrieve
How It Works (Step-by-Step)
- Classify preference sensitivity.
- Check consent rules for storage.
- Apply redaction if needed.
- Store with expiration and confidence.
- Enforce consent and freshness at retrieval (a sketch tying these steps together follows).
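A compact sketch of the five steps above as one pipeline; function and field names are illustrative, and the redaction step is stubbed (see the redaction sketch in §2):

```python
from datetime import date

def add_preference(store: list, text: str, sensitivity: str,
                   consent: bool, expires: date) -> str:
    # Step 2: consent gate for storage (high tier requires consent).
    if sensitivity == "high" and not consent:
        raise PermissionError("consent required")
    # Step 3: redaction stub; substitute a real redact(text) here.
    redacted = text
    # Step 4: store with expiration metadata.
    pref_id = f"PRF-{len(store) + 1:04d}"
    store.append({"id": pref_id, "text": redacted, "sensitivity": sensitivity,
                  "consent": consent, "expires": expires})
    return pref_id

def retrieve(store: list, today: date) -> list:
    # Step 5: enforce consent and freshness at retrieval;
    # low tier is retrievable by default, medium/high need consent.
    return [p for p in store
            if p["expires"] > today
            and (p["sensitivity"] == "low" or p["consent"])]
```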
Minimal Concrete Example
preference:
  text: "User prefers short answers"
  sensitivity: low
  consent: true
  expires: 2026-03-01
Common Misconceptions
- “Preferences are always safe to store.” (False: many are sensitive.)
- “Consent is a one-time event.” (False: preferences can change.)
Check-Your-Understanding Questions
- Why do inferred preferences need confirmation?
- What is the role of redaction?
- How do retention policies protect users?
Check-Your-Understanding Answers
- Because inferred preferences can be wrong, and storing unconfirmed inferences can violate user expectations or regulations.
- It prevents storing raw personal identifiers.
- They prevent outdated or unwanted preferences from persisting.
Real-World Applications
- Personalized assistants with privacy controls.
- Enterprise systems with compliance requirements.
Where You’ll Apply It
- In this project: §5.4 Concepts You Must Understand First and §6 Testing Strategy.
- Also used in: Project 9, Project 10.
References
- A-MemGuard (memory safety) - https://arxiv.org/abs/2504.19413
Key Insights
Preference memory must be governed like user data, not like generic logs.
Summary
Consent, sensitivity tiers, and retention rules make preference memory safe and trustworthy.
Homework/Exercises to Practice the Concept
- Define three sensitivity tiers and rules for each.
- Draft a consent workflow for inferred preferences.
Solutions to the Homework/Exercises
- Low: store by default. Medium: require consent to retrieve. High: require consent to store and retrieve.
- Ask user to confirm inferred preference before retrieval.
3. Project Specification
3.1 What You Will Build
A preference memory system that:
- Stores preferences with consent metadata
- Applies redaction rules
- Enforces expiration policies
- Produces audit reports
3.2 Functional Requirements
- Consent Check: Block storage without consent for high sensitivity.
- Redaction: Mask PII patterns.
- Retention: Expire preferences after a window.
- Audit Report: Summarize stored preferences by sensitivity.
3.3 Non-Functional Requirements
- Performance: Audit report in < 1 second for 10k preferences.
- Reliability: Deterministic retrieval with fixed rules.
- Usability: Clear error messages for consent violations.
3.4 Example Usage / Output
$ pref add --text "User prefers markdown summaries" --consent true --sensitivity low
[OK] preference_id=PRF-0012
$ pref add --text "User phone number is 555-1234" --consent false --sensitivity high
[BLOCKED] consent required
3.5 Data Formats / Schemas / Protocols
{
  "id": "PRF-0012",
  "text": "User prefers markdown summaries",
  "sensitivity": "low",
  "consent": true,
  "expires": "2026-03-01"
}
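A sketch of validating records against this schema before storage, failing fast on contract violations as §2 recommends; the `validate` helper and its required-field table are illustrative:

```python
import json
from datetime import date

REQUIRED_FIELDS = {"id": str, "text": str, "sensitivity": str,
                   "consent": bool, "expires": str}
SENSITIVITIES = {"low", "medium", "high"}

def validate(record: dict) -> None:
    """Raise on any record that violates the §3.5 schema."""
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    if record["sensitivity"] not in SENSITIVITIES:
        raise ValueError("unknown sensitivity tier")
    date.fromisoformat(record["expires"])  # raises on malformed dates

validate(json.loads('''{"id": "PRF-0012",
  "text": "User prefers markdown summaries",
  "sensitivity": "low", "consent": true, "expires": "2026-03-01"}'''))
```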
3.6 Edge Cases
- Missing consent metadata
- Conflicting preferences
- Expired preferences
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
$ pref add --text "User prefers markdown summaries" --consent true --sensitivity low
$ pref audit
3.7.2 Golden Path Demo (Deterministic)
$ pref audit
Total: 24
High sensitivity: 3 (all consented)
Expired: 2
exit_code=0
3.7.3 Failure Demo (Deterministic)
$ pref add --text "User SSN is 123-45-6789" --consent false --sensitivity high
[BLOCKED] consent required
exit_code=2
4. Solution Architecture
4.1 High-Level Design
Preference Input -> Policy Engine -> Redactor -> Store -> Audit
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Policy Engine | Consent checks | Tier rules |
| Redactor | Mask PII | Regex patterns |
| Store | Persist preferences | Expiration fields |
| Auditor | Reports | Sensitivity breakdown |
4.3 Data Structures (No Full Code)
Preference:
  id: string
  text: string
  sensitivity: enum
  consent: bool
  expires: date
4.4 Algorithm Overview
- Validate consent and sensitivity.
- Apply redaction rules.
- Store preference with expiration.
- Retrieve only if consent and fresh.
5. Implementation Guide
5.1 Development Environment Setup
- Define sensitivity tiers
- Prepare redaction rules
5.2 Project Structure
project-root/
├── src/
│ ├── policy/
│ ├── redact/
│ ├── store/
│ └── audit/
5.3 The Core Question You’re Answering
“How do I store preferences without violating privacy?”
5.4 Concepts You Must Understand First
- Consent enforcement
- Retention policies
5.5 Questions to Guide Your Design
- Which preferences require explicit consent?
- How long should preferences live?
5.6 Thinking Exercise
Classify five preferences into sensitivity tiers.
5.7 The Interview Questions They’ll Ask
- “Why are preferences high risk?”
- “How do you enforce consent?”
- “What is redaction?”
- “How do you handle expired preferences?”
- “What is the role of audit reports?”
5.8 Hints in Layers
Hint 1: Define strict tiers.
Hint 2: Require consent for high sensitivity.
Hint 3: Add redaction rules.
Hint 4: Add audit reporting.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Architecture | “Clean Architecture” | Ch. 12 |
| Agent systems | “AI Engineering” | Ch. 6 |
5.10 Implementation Phases
Phase 1: Foundation
- Policy and redaction rules
Phase 2: Core
- Storage and retrieval
Phase 3: Polish
- Audit reporting and expiration
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Consent | Store vs retrieve gate | Both | Reduces privacy risk |
| Redaction | Mask vs encrypt | Mask | Simpler retrieval |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | Redaction | PII masking |
| Integration | Consent enforcement | Blocked writes |
| Edge | Expired preferences | Retrieval denied |
6.2 Critical Test Cases
- High sensitivity without consent is blocked.
- Expired preferences are excluded.
- Audit report counts sensitivity tiers correctly. (The sketch below expresses the first two cases as pytest tests.)
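A sketch of these cases as pytest tests, written against the hypothetical `add_preference`/`retrieve` functions from the pipeline sketch in §2; the `prefs` module name is an assumption:

```python
import pytest
from datetime import date
from prefs import add_preference, retrieve  # hypothetical module name

def test_high_sensitivity_without_consent_is_blocked():
    with pytest.raises(PermissionError):
        add_preference([], "User SSN is 123-45-6789", "high",
                       consent=False, expires=date(2026, 3, 1))

def test_expired_preferences_are_excluded():
    store = []
    add_preference(store, "User prefers short answers", "low",
                   consent=True, expires=date(2020, 1, 1))
    assert retrieve(store, today=date(2026, 1, 1)) == []
```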
6.3 Test Data
preference: "User phone number is 555-1234"
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| No consent check | Privacy leak | Add policy engine |
| No expiration | Stale preferences | Add retention |
| Weak redaction | PII stored | Improve rules |
7.2 Debugging Strategies
- Run audit reports regularly.
- Inspect redaction output.
7.3 Performance Traps
- Excessive regex checks per write (a single-pass alternative is sketched below).
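One way to avoid per-write regex cost: compile once at import time and combine patterns into a single alternation pass. The patterns are simplified examples:

```python
import re

# Compiled once at module load; named groups label the mask token.
COMBINED = re.compile(
    r"(?P<SSN>\b\d{3}-\d{2}-\d{4}\b)|(?P<PHONE>\b\d{3}-\d{4}\b)"
)

def redact_fast(text: str) -> str:
    """Single pass over the text instead of one pass per pattern."""
    return COMBINED.sub(lambda m: f"[{m.lastgroup}]", text)

print(redact_fast("SSN 123-45-6789, phone 555-1234"))
# -> "SSN [SSN], phone [PHONE]"
```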
8. Extensions & Challenges
8.1 Beginner Extensions
- Add preference categories
8.2 Intermediate Extensions
- Add consent expiration
8.3 Advanced Extensions
- Add encrypted storage for high sensitivity
9. Real-World Connections
9.1 Industry Applications
- Personal assistants with GDPR-style controls
9.2 Related Open Source Projects
- A-MemGuard concepts
9.3 Interview Relevance
- Privacy governance is a common concern in agent systems.
10. Resources
10.1 Essential Reading
- A-MemGuard paper
10.2 Video Resources
- Privacy and AI governance talks
10.3 Tools & Documentation
- PII detection libraries
10.4 Related Projects in This Series
- Project 9 and Project 10, which use this project's concepts for safety and memory management.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain consent and retention rules.
11.2 Implementation
- Consent gates and redaction work.
11.3 Growth
- I can justify sensitivity tiers.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Consent and redaction implemented
Full Completion:
- Retention and audit reports
Excellence (Going Above & Beyond):
- Encryption and advanced governance