Project 7: Transaction Isolation and End-to-End TinyDB Integration
Integrate storage, indexing, WAL, and SQL into a transaction-capable engine that behaves correctly under concurrent interleavings.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 3-4 weeks |
| Main Programming Language | C (Alternatives: Rust, Go) |
| Alternative Programming Languages | Rust, Go |
| Coolness Level | Level 5: Pure Magic |
| Business Potential | Level 5: Venture Candidate |
| Prerequisites | Projects 1-6 |
| Key Topics | Isolation levels, conflict handling, end-to-end invariants |
1. Learning Objectives
- Implement transaction boundaries and isolation modes.
- Demonstrate anomaly behavior by level (read committed vs serializable-like mode).
- Integrate conflict handling with WAL and SQL execution.
- Build deterministic multi-session test harness.
2. All Theory Needed (Per-Concept Breakdown)
Isolation Semantics in Practice
Fundamentals
Isolation levels define what anomalies are possible when transactions overlap. Stronger isolation limits anomalies but may increase blocking, aborts, or retries.
Deep Dive into the concept
Concurrency correctness is about interleavings, not single-thread logic. Lost update, write skew, and phantom anomalies arise when independent operations combine into invalid global state. Isolation policy should be explicit and testable.
A practical educational model uses two modes: read committed and serializable-like mode with conflict detection/retry. Read committed allows higher concurrency but can permit write skew/lost update unless additional checks exist. Serializable-like mode may abort and require client retries to preserve equivalence to serial execution.
Integration complexity is critical. SQL execution chooses access paths and returns tuple candidates; transaction manager applies visibility/conflict rules; WAL ensures durability of commit decisions. If any subsystem diverges, correctness breaks. For example, a transaction may commit logically but lose state if WAL flush ordering is wrong; or visibility may expose uncommitted state if lock/version checks are incomplete.
Deterministic anomaly harnesses are necessary. Random load does not reliably trigger edge anomalies. Use step-barrier scripts controlling session order. Validate expected final state hash for each scenario and isolation level. This makes correctness claims reproducible.
How this fit on projects
- Primary capstone integration project.
Definitions & key terms
- Lost update: concurrent write overwrite without detection.
- Write skew: concurrent writes violate cross-row invariant.
- Serialization failure: abort raised to preserve serializable order.
Mental model diagram
Session A Session B
read X=5 read X=5
write X=6 write X=6
commit commit
Without conflict controls -> lost update
With serializable controls -> one abort/retry
How it works (step-by-step, with invariants and failure modes)
- Begin transaction with isolation mode.
- Execute SQL operations and track read/write sets.
- Detect conflicts at lock/commit time.
- Commit durable changes via WAL protocol or abort and retry.
Failure modes:
- Missing predicate conflicts in serializable mode.
- Retry-unaware clients causing user-visible failures.
Minimal concrete example
Scenario: two sessions reserve one of two doctors on-call rows.
Both observe at least one available and each disables different row.
Without serializable control -> invariant violation (none on call).
Common misconceptions
- “Transactions always behave as if serial.” -> Depends on isolation level.
- “MVCC means no lock contention.” -> Write conflicts still occur.
Check-your-understanding questions
- Why can write skew occur under snapshot-like semantics?
- Why are retries part of serializable operation?
- Why must anomaly tests be deterministic?
Check-your-understanding answers
- Conflicting writes may target different rows while violating global constraint.
- Aborts preserve correctness; clients must retry.
- Determinism proves behavior and enables regression detection.
Real-world applications
- PostgreSQL transaction isolation behavior.
Where you’ll apply it
- Full TinyDB integration in this project.
References
Key insights
Correct concurrency behavior must be demonstrated, not assumed.
Summary
Isolation work is complete only when anomaly scripts produce expected outcomes per mode.
Homework/Exercises to practice the concept
- Build two-session lost-update script.
- Build write-skew script with cross-row invariant.
- Add retry policy to client wrapper.
Solutions to the homework/exercises
- Use barriered read-read-write-write sequence.
- Validate final invariant hash after both commits.
- Retry full transaction on serialization error.
3. Project Specification
3.1 What You Will Build
Integrated TinyDB runtime with transaction modes, anomaly harness, and deterministic correctness reports.
3.2 Functional Requirements
- Begin/commit/rollback transaction APIs.
- Isolation mode selection per session.
- Conflict detection and retriable errors.
- Deterministic multi-session test scripts.
- End-to-end crash + concurrency validation.
3.3 Non-Functional Requirements
- Performance: Contention metrics available per scenario.
- Reliability: Recovery + isolation interactions validated.
- Usability: Clear retriable/non-retriable error categories.
3.4 Example Usage / Output
run anomaly scripts under two isolation modes and compare final state
3.5 Data Formats / Schemas / Protocols
Transaction table schema, lock/version metadata, conflict event log format.
3.6 Edge Cases
Deadlock cycles, starvation under contention, crash during commit path.
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
make tinydb_multi_session
./tinydb_multi_session --scenario lost-update --isolation read-committed
./tinydb_multi_session --scenario lost-update --isolation serializable
3.7.2 Golden Path Demo (Deterministic)
Fixed interleavings and fixed seed per scenario.
3.7.3 If CLI: exact terminal transcript
$ ./tinydb_multi_session --scenario write-skew --isolation read-committed
result=ANOMALY_OBSERVED invariant=violated
$ ./tinydb_multi_session --scenario write-skew --isolation serializable
result=SERIALIZATION_FAILURE_HANDLED retries=1 invariant=preserved
4. Solution Architecture
4.1 High-Level Design
SQL Session -> Txn Manager -> Conflict Detector -> WAL Commit Path -> Storage
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Txn manager | lifecycle and isolation mode | retriable error model |
| Conflict detector | read/write/predicate conflict checks | deterministic policies |
| Session harness | scripted interleavings | reproducible scenario runner |
| Validator | invariant + state-hash checks | pass/fail oracle |
4.4 Data Structures (No Full Code)
TxnState { txn_id, mode, read_set, write_set, status }
ConflictEvent { txn_a, txn_b, conflict_type, resolution }
4.4 Algorithm Overview
Execute barriered steps, detect conflicts, then commit/abort with WAL durability guarantees.
5. Implementation Guide
5.1 Development Environment Setup
make tinydb_multi_session
5.2 Project Structure
project/
├── src/txn/
├── src/conflict/
├── src/session_harness/
└── tests/anomalies/
5.3 The Core Question You’re Answering
Can my integrated engine preserve invariants under concurrent interleavings and crash conditions?
5.4 Concepts You Must Understand First
- Isolation anomaly taxonomy
- Conflict resolution and retries
- WAL + visibility integration
5.5 Questions to Guide Your Design
- Which conflicts are detected eagerly vs at commit?
- How do clients distinguish retriable from fatal errors?
5.6 Thinking Exercise
Design one deterministic schedule that must fail under weak isolation and pass under strong isolation.
5.7 The Interview Questions They’ll Ask
- How do you prove your isolation behavior is correct?
- Why are retries required for serializable safety?
- What anomalies appear at read committed?
- How do WAL and isolation interact?
5.8 Hints in Layers
- Build anomaly harness before tuning lock granularity.
- Log conflict graph events for debugging.
- Start with coarse locking, then refine.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Concurrency concepts | Operating System Concepts | concurrency chapters |
| Systems reasoning | Operating Systems: Three Easy Pieces | concurrency chapters |
5.10 Implementation Phases
- Phase 1: transaction API + basic conflicts
- Phase 2: anomaly harness + deterministic scripts
- Phase 3: serializable retries + integration hardening
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Conflict model | lock-only/MVCC-like/hybrid | hybrid starter | practical scope |
| Retry policy | caller-managed/auto retry wrapper | caller-managed + helper | explicit control |
6. Testing Strategy
6.1 Test Categories
- Scenario tests: known anomalies
- Integration tests: SQL + WAL + isolation
- Crash+concurrency combined tests
6.2 Critical Test Cases
- Lost update scenario by isolation mode.
- Write skew scenario by isolation mode.
- Crash during commit in contended workload.
6.3 Test Data
Fixed account/booking/inventory datasets with invariant assertions.
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing predicate conflicts | write skew survives | add range/predicate checks |
| No retry handling | serializable aborts bubble to users | retry wrapper patterns |
7.2 Debugging Strategies
- Emit conflict trace with txn IDs and step numbers.
- Re-run deterministic schedule with verbose state snapshots.
7.3 Performance Traps
Overly coarse locks can collapse throughput under read-heavy workloads.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add lock wait-time metrics.
8.2 Intermediate Extensions
- Add deadlock detection graph and cycle diagnostics.
8.3 Advanced Extensions
- Add MVCC snapshot visibility prototype.
9. Real-World Connections
9.1 Industry Applications
Transaction isolation behavior defines real financial, booking, and inventory correctness.
9.2 Related Open Source Projects
- PostgreSQL isolation and SSI documentation.
9.3 Interview Relevance
Strong signal for distributed systems/database reliability roles.
10. Resources
10.1 Essential Reading
- PostgreSQL transaction isolation docs.
10.2 Video Resources
- Talks on anomaly testing and serializable isolation.
10.3 Tools & Documentation
- Deterministic schedulers, trace loggers, invariant checkers.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain anomaly behavior by isolation level.
- I can explain when retries are required.
11.2 Implementation
- Deterministic anomaly scenarios pass expected matrix.
- End-to-end crash + concurrency tests are stable.
11.3 Growth
- I can defend isolation tradeoffs for a real workload.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Transaction API + one isolation mode + anomaly harness
Full Completion:
- Two isolation modes with expected anomaly outcomes
Excellence:
- Rich conflict diagnostics and advanced retry/locking strategy