Project 7: Preference Memory & Privacy Controls
Build a consent-aware preference memory system with redaction, retention, and audit trails.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2 |
| Time Estimate | Weekend |
| Main Programming Language | Python |
| Alternative Programming Languages | TypeScript, Go |
| Coolness Level | Level 2 |
| Business Potential | Level 3 |
| Prerequisites | Basic data modeling, policy rules |
| Key Topics | Consent, sensitivity, retention policies |
1. Learning Objectives
By completing this project, you will:
- Define sensitivity tiers for preference memory.
- Enforce consent gates for storage and retrieval.
- Implement redaction and expiration policies.
- Generate privacy audit reports.
2. All Theory Needed (Per-Concept Breakdown)
Preference Memory Governance and Privacy
Fundamentals
Preference memory stores user-specific constraints (tone, format, scheduling choices). These memories are useful but high-risk because they often contain sensitive or personal information. A privacy-first memory system must enforce consent, classify sensitivity, redact personal data, and apply retention limits. Without governance, preference memory can become a compliance liability.
Deep Dive into the Concept
Preference memory is different from other memory types because it is tied to a person and can directly impact user trust. Preferences are often implicit, inferred from behavior rather than explicitly stated. This creates a risk: storing inferred preferences without consent can violate expectations or regulations. Therefore, a robust system should distinguish explicit preferences (“I want short answers”) from inferred preferences (“User seems to like short answers”). Inferred preferences should be stored with lower confidence and often require explicit confirmation before use.
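A minimal sketch of the explicit-vs-inferred distinction, assuming a dataclass-based model; the names `Provenance` and `usable` are illustrative, not a prescribed API:

```python
# Sketch: explicit vs inferred preferences with a confirmation gate.
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    EXPLICIT = "explicit"   # user stated it directly
    INFERRED = "inferred"   # derived from observed behavior

@dataclass
class Preference:
    text: str
    provenance: Provenance
    confidence: float        # inferred preferences start lower
    confirmed: bool = False  # inferred prefs need confirmation before use

def usable(pref: Preference) -> bool:
    """Inferred preferences require explicit confirmation before use."""
    if pref.provenance is Provenance.INFERRED:
        return pref.confirmed
    return True

print(usable(Preference("short answers", Provenance.INFERRED, 0.6)))  # False
```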
Sensitivity tiers are critical. Low sensitivity preferences might include formatting or tone. Medium sensitivity might include working hours or project priorities. High sensitivity includes identifiers, health, or financial data. Each tier requires different handling. High sensitivity should require explicit consent to store and retrieve; medium sensitivity might require consent for retrieval only; low sensitivity could be stored by default. These rules must be encoded in a policy engine, not in ad-hoc code, so they can be audited and updated.
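One way to encode the tier rules above as data rather than ad-hoc code; the `POLICY` table and function names are illustrative assumptions:

```python
# Sketch: tier rules as an auditable, updatable policy table.
POLICY = {
    "low":    {"consent_to_store": False, "consent_to_retrieve": False},
    "medium": {"consent_to_store": False, "consent_to_retrieve": True},
    "high":   {"consent_to_store": True,  "consent_to_retrieve": True},
}

def may_store(sensitivity: str, consent: bool) -> bool:
    rule = POLICY[sensitivity]
    return consent or not rule["consent_to_store"]

def may_retrieve(sensitivity: str, consent: bool) -> bool:
    rule = POLICY[sensitivity]
    return consent or not rule["consent_to_retrieve"]

assert may_store("low", consent=False)            # stored by default
assert not may_retrieve("medium", consent=False)  # retrieval gated
assert not may_store("high", consent=False)       # blocked, as in §3.4
```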
Redaction is the next layer. Even with consent, some data should not be stored in raw form. Phone numbers, emails, or personal identifiers can be masked or tokenized, preserving utility without exposing raw data. Redaction rules must be deterministic and testable. A common pattern is to store a redacted version for retrieval and optionally keep the raw version encrypted with restricted access.
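A deterministic redaction sketch using masking; the regex patterns are simplified placeholders, not production-grade PII detectors:

```python
# Sketch: deterministic, testable masking of common identifier shapes.
import re

PATTERNS = {
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each matched identifier with a labeled mask token."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact("User phone number is 555-1234"))
# -> "User phone number is [PHONE]"
```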
Retention policies prevent preference memory from becoming stale. Preferences change; if a user requests verbose answers once, that might not be permanent. Preferences should include expiration windows and “last confirmed” timestamps. When a preference is retrieved, the system should verify that it is still valid. If it is expired, the system should either ignore it or ask for confirmation. This protects against outdated preferences and supports user trust.
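A freshness check sketched from the “last confirmed” timestamp idea; the 90-day default window is an assumption for illustration:

```python
# Sketch: a preference is usable only within its confirmation window.
from datetime import date, timedelta

DEFAULT_TTL = timedelta(days=90)  # illustrative default window

def is_fresh(last_confirmed: date, today: date,
             ttl: timedelta = DEFAULT_TTL) -> bool:
    return today - last_confirmed <= ttl

# Expired preferences are ignored or re-confirmed with the user.
print(is_fresh(date(2025, 1, 1), today=date(2025, 6, 1)))  # False: window elapsed
```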
Finally, governance requires auditability. You should be able to answer: what preferences are stored, which were inferred vs explicit, and when consent was given. This is why audit reports are a core deliverable. The audit pipeline is not just for compliance; it is also a diagnostic tool that reveals unsafe or outdated memory entries.
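A sketch of answering the audit questions above in one report; field names mirror the record schema in §3.5, plus a hypothetical `provenance` field:

```python
# Sketch: summarize what is stored, inferred vs explicit, and consent.
from collections import Counter

def audit(preferences: list[dict]) -> dict:
    return {
        "total": len(preferences),
        "by_sensitivity": dict(Counter(p["sensitivity"] for p in preferences)),
        "inferred": sum(1 for p in preferences
                        if p.get("provenance") == "inferred"),
        "missing_consent": sum(1 for p in preferences if not p.get("consent")),
    }

prefs = [
    {"sensitivity": "low", "consent": True, "provenance": "explicit"},
    {"sensitivity": "high", "consent": False, "provenance": "inferred"},
]
print(audit(prefs))
# {'total': 2, 'by_sensitivity': {'low': 1, 'high': 1},
#  'inferred': 1, 'missing_consent': 1}
```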
From a systems perspective, this concept must be treated as a first-class interface between data and behavior. That means you need explicit invariants (what must always be true), observability (how you know it is true), and failure signatures (how it breaks when it is not). In practice, engineers often skip this and rely on ad-hoc fixes, which creates hidden coupling between the memory subsystem and the rest of the agent stack. A better approach is to model the concept as a pipeline stage with clear inputs, outputs, and preconditions: if inputs violate the contract, the stage should fail fast rather than silently corrupt memory. This is especially important because memory errors are long-lived and compound over time.
You should also define operational metrics that reveal drift early. Examples include the percentage of memory entries that lack required metadata, the ratio of retrieved memories that are later unused by the model, or the fraction of queries that trigger a fallback route because the primary memory store is empty. These metrics are not just for dashboards; they are design constraints that force you to keep the system testable and predictable.
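The first of those drift metrics, sketched as a function under the assumption that `sensitivity`, `consent`, and `expires` are the required metadata fields:

```python
# Sketch: fraction of memory entries lacking required metadata.
REQUIRED = {"sensitivity", "consent", "expires"}  # assumed required fields

def missing_metadata_rate(entries: list[dict]) -> float:
    if not entries:
        return 0.0
    missing = sum(1 for e in entries if not REQUIRED <= e.keys())
    return missing / len(entries)

print(missing_metadata_rate([
    {"sensitivity": "low"},                                          # missing
    {"sensitivity": "low", "consent": True, "expires": "2026-03-01"},
]))  # 0.5
```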
Another critical dimension is lifecycle management. The concept may work well at small scale but degrade as the memory grows. This is where policies and thresholds matter: you need rules for promotion, demotion, merging, or deletion that prevent the memory from becoming a landfill. The policy should be deterministic and versioned. When it changes, you should be able to replay historical inputs and measure the delta in outputs. This is the same discipline used in data engineering for schema changes and backfills, and it applies equally to memory systems.
Finally, remember that memory is an interface to user trust. If the memory system is noisy, the agent feels unreliable; if it is overly strict, the agent feels forgetful. The best designs expose these trade-offs explicitly, so you can tune them according to product goals rather than guessing in the dark.
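A sketch of the deterministic, versioned retention policy with replay described above; version numbers and TTL values are illustrative assumptions:

```python
# Sketch: versioned retention policy, replayable against old inputs.
POLICY_VERSIONS = {
    1: {"low": 365, "medium": 180, "high": 90},  # TTLs in days
    2: {"low": 365, "medium": 90, "high": 30},
}

def ttl_days(version: int, sensitivity: str) -> int:
    return POLICY_VERSIONS[version][sensitivity]

def replay_delta(entries: list[dict], old: int, new: int) -> int:
    """Count entries whose retention outcome changes between versions."""
    return sum(1 for e in entries
               if ttl_days(old, e["sensitivity"]) != ttl_days(new, e["sensitivity"]))

print(replay_delta([{"sensitivity": "medium"}, {"sensitivity": "low"}],
                   old=1, new=2))  # 1
```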
How This Fits Into the Projects
This concept is central to Project 7 and is used by Projects 9 and 10 for safety and memory management.
Definitions & key terms
- Consent: Explicit permission to store or use a preference.
- Sensitivity tier: Classification of preference risk level.
- Redaction: Masking sensitive values.
- Retention: Rules that expire memory over time.
Mental model diagram (ASCII)
Preference -> Sensitivity Tier -> Consent Gate -> Store/Redact -> Retrieve
How It Works (Step-by-Step)
- Classify preference sensitivity.
- Check consent rules for storage.
- Apply redaction if needed.
- Store with expiration and confidence.
- Enforce consent and freshness at retrieval (a sketch tying these steps together follows).
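A compact sketch of the five steps above as one pipeline; function and field names are illustrative, and the redaction step is stubbed (see the redaction sketch in §2):

```python
from datetime import date

def add_preference(store: list, text: str, sensitivity: str,
                   consent: bool, expires: date) -> str:
    # Step 2: consent gate for storage (high tier requires consent).
    if sensitivity == "high" and not consent:
        raise PermissionError("consent required")
    # Step 3: redaction stub; substitute a real redact(text) here.
    redacted = text
    # Step 4: store with expiration metadata.
    pref_id = f"PRF-{len(store) + 1:04d}"
    store.append({"id": pref_id, "text": redacted, "sensitivity": sensitivity,
                  "consent": consent, "expires": expires})
    return pref_id

def retrieve(store: list, today: date) -> list:
    # Step 5: enforce consent and freshness at retrieval;
    # low tier is retrievable by default, medium/high need consent.
    return [p for p in store
            if p["expires"] > today
            and (p["sensitivity"] == "low" or p["consent"])]
```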
Minimal Concrete Example
preference:
  text: "User prefers short answers"
  sensitivity: low
  consent: true
  expires: 2026-03-01
Common Misconceptions
- “Preferences are always safe to store.” (False: many are sensitive.)
- “Consent is a one-time event.” (False: preferences can change.)
Check-Your-Understanding Questions
- Why do inferred preferences need confirmation?
- What is the role of redaction?
- How do retention policies protect users?
Check-Your-Understanding Answers
- Because inferred preferences can be wrong, and storing unconfirmed inferences can violate user expectations or regulations.
- It prevents storing raw personal identifiers.
- They prevent outdated or unwanted preferences from persisting.
Real-World Applications
- Personalized assistants with privacy controls.
- Enterprise systems with compliance requirements.
Where You’ll Apply It
- In this project: §5.4 Concepts You Must Understand First and §6 Testing Strategy.
- Also used in: Project 9, Project 10.
References
- A-MemGuard (memory safety) - https://arxiv.org/abs/2504.19413
Key Insights
Preference memory must be governed like user data, not like generic logs.
Summary
Consent, sensitivity tiers, and retention rules make preference memory safe and trustworthy.
Homework/Exercises to Practice the Concept
- Define three sensitivity tiers and rules for each.
- Draft a consent workflow for inferred preferences.
Solutions to the Homework/Exercises
- Low: store by default. Medium: require consent to retrieve. High: require consent to store and retrieve.
- Ask user to confirm inferred preference before retrieval.
3. Project Specification
3.1 What You Will Build
A preference memory system that:
- Stores preferences with consent metadata
- Applies redaction rules
- Enforces expiration policies
- Produces audit reports
3.2 Functional Requirements
- Consent Check: Block storage without consent for high sensitivity.
- Redaction: Mask PII patterns.
- Retention: Expire preferences after a window.
- Audit Report: Summarize stored preferences by sensitivity.
3.3 Non-Functional Requirements
- Performance: Audit report in < 1 second for 10k preferences.
- Reliability: Deterministic retrieval with fixed rules.
- Usability: Clear error messages for consent violations.
3.4 Example Usage / Output
$ pref add --text "User prefers markdown summaries" --consent true --sensitivity low
[OK] preference_id=PRF-0012
$ pref add --text "User phone number is 555-1234" --consent false --sensitivity high
[BLOCKED] consent required
3.5 Data Formats / Schemas / Protocols
{
  "id": "PRF-0012",
  "text": "User prefers markdown summaries",
  "sensitivity": "low",
  "consent": true,
  "expires": "2026-03-01"
}
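A sketch of validating records against this schema before storage, failing fast on contract violations as §2 recommends; the `validate` helper and its required-field table are illustrative:

```python
import json
from datetime import date

REQUIRED_FIELDS = {"id": str, "text": str, "sensitivity": str,
                   "consent": bool, "expires": str}
SENSITIVITIES = {"low", "medium", "high"}

def validate(record: dict) -> None:
    """Raise on any record that violates the §3.5 schema."""
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    if record["sensitivity"] not in SENSITIVITIES:
        raise ValueError("unknown sensitivity tier")
    date.fromisoformat(record["expires"])  # raises on malformed dates

validate(json.loads('''{"id": "PRF-0012",
  "text": "User prefers markdown summaries",
  "sensitivity": "low", "consent": true, "expires": "2026-03-01"}'''))
```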
3.6 Edge Cases
- Missing consent metadata
- Conflicting preferences
- Expired preferences
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
$ pref add --text "User prefers markdown summaries" --consent true --sensitivity low
$ pref audit
3.7.2 Golden Path Demo (Deterministic)
$ pref audit
Total: 24
High sensitivity: 3 (all consented)
Expired: 2
exit_code=0
3.7.3 Failure Demo (Deterministic)
$ pref add --text "User SSN is 123-45-6789" --consent false --sensitivity high
[BLOCKED] consent required
exit_code=2
4. Solution Architecture
4.1 High-Level Design
Preference Input -> Policy Engine -> Redactor -> Store -> Audit
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Policy Engine | Consent checks | Tier rules |
| Redactor | Mask PII | Regex patterns |
| Store | Persist preferences | Expiration fields |
| Auditor | Reports | Sensitivity breakdown |
4.3 Data Structures (No Full Code)
Preference:
  id: string
  text: string
  sensitivity: enum
  consent: bool
  expires: date
4.4 Algorithm Overview
- Validate consent and sensitivity.
- Apply redaction rules.
- Store preference with expiration.
- Retrieve only if consent and fresh.
5. Implementation Guide
5.1 Development Environment Setup
- Define sensitivity tiers
- Prepare redaction rules
5.2 Project Structure
project-root/
├── src/
│ ├── policy/
│ ├── redact/
│ ├── store/
│ └── audit/
5.3 The Core Question You’re Answering
“How do I store preferences without violating privacy?”
5.4 Concepts You Must Understand First
- Consent enforcement
- Retention policies
5.5 Questions to Guide Your Design
- Which preferences require explicit consent?
- How long should preferences live?
5.6 Thinking Exercise
Classify five preferences into sensitivity tiers.
5.7 The Interview Questions They’ll Ask
- “Why are preferences high risk?”
- “How do you enforce consent?”
- “What is redaction?”
- “How do you handle expired preferences?”
- “What is the role of audit reports?”
5.8 Hints in Layers
Hint 1: Define strict tiers.
Hint 2: Require consent for high sensitivity.
Hint 3: Add redaction rules.
Hint 4: Add audit reporting.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Architecture | “Clean Architecture” | Ch. 12 |
| Agent systems | “AI Engineering” | Ch. 6 |
5.10 Implementation Phases
Phase 1: Foundation
- Policy and redaction rules
Phase 2: Core
- Storage and retrieval
Phase 3: Polish
- Audit reporting and expiration
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Consent | Store vs retrieve gate | Both | Reduces privacy risk |
| Redaction | Mask vs encrypt | Mask | Simpler retrieval |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | Redaction | PII masking |
| Integration | Consent enforcement | Blocked writes |
| Edge | Expired preferences | Retrieval denied |
6.2 Critical Test Cases
- High sensitivity without consent is blocked.
- Expired preferences are excluded.
- Audit report counts sensitivity tiers correctly. (The sketch below expresses the first two cases as pytest tests.)
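A sketch of these cases as pytest tests, written against the hypothetical `add_preference`/`retrieve` functions from the pipeline sketch in §2; the `prefs` module name is an assumption:

```python
import pytest
from datetime import date
from prefs import add_preference, retrieve  # hypothetical module name

def test_high_sensitivity_without_consent_is_blocked():
    with pytest.raises(PermissionError):
        add_preference([], "User SSN is 123-45-6789", "high",
                       consent=False, expires=date(2026, 3, 1))

def test_expired_preferences_are_excluded():
    store = []
    add_preference(store, "User prefers short answers", "low",
                   consent=True, expires=date(2020, 1, 1))
    assert retrieve(store, today=date(2026, 1, 1)) == []
```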
6.3 Test Data
preference: "User phone number is 555-1234"
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| No consent check | Privacy leak | Add policy engine |
| No expiration | Stale preferences | Add retention |
| Weak redaction | PII stored | Improve rules |
7.2 Debugging Strategies
- Run audit reports regularly.
- Inspect redaction output.
7.3 Performance Traps
- Excessive regex checks per write (a single-pass alternative is sketched below).
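One way to avoid per-write regex cost: compile once at import time and combine patterns into a single alternation pass. The patterns are simplified examples:

```python
import re

# Compiled once at module load; named groups label the mask token.
COMBINED = re.compile(
    r"(?P<SSN>\b\d{3}-\d{2}-\d{4}\b)|(?P<PHONE>\b\d{3}-\d{4}\b)"
)

def redact_fast(text: str) -> str:
    """Single pass over the text instead of one pass per pattern."""
    return COMBINED.sub(lambda m: f"[{m.lastgroup}]", text)

print(redact_fast("SSN 123-45-6789, phone 555-1234"))
# -> "SSN [SSN], phone [PHONE]"
```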
8. Extensions & Challenges
8.1 Beginner Extensions
- Add preference categories
8.2 Intermediate Extensions
- Add consent expiration
8.3 Advanced Extensions
- Add encrypted storage for high sensitivity
9. Real-World Connections
9.1 Industry Applications
- Personal assistants with GDPR-style controls
9.2 Related Open Source Projects
- A-MemGuard concepts
9.3 Interview Relevance
- Privacy governance is a common concern in agent systems.
10. Resources
10.1 Essential Reading
- A-MemGuard paper
10.2 Video Resources
- Privacy and AI governance talks
10.3 Tools & Documentation
- PII detection libraries
10.4 Related Projects in This Series
- Project 9 and Project 10, which use this project's concepts for safety and memory management.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain consent and retention rules.
11.2 Implementation
- Consent gates and redaction work.
11.3 Growth
- I can justify sensitivity tiers.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Consent and redaction implemented
Full Completion:
- Retention and audit reports
Excellence (Going Above & Beyond):
- Encryption and advanced governance