Project 10: The Secure Enclave (Capstone)

Goal: Understand and implement zero trust controls that validate identity, device health, and policy on every access decision.

Core Zero Trust Principle: “Trust nothing, verify everything, assume breach.” This capstone project integrates all nine previous projects into a cohesive, production-ready Zero Trust environment that demonstrates defense in depth, continuous verification, and complete security observability.

Identity and Device Trust

Zero Trust assumes the network is hostile. Identity and device health are the primary signals for access decisions.

Policy Decision and Enforcement

Access decisions must be made centrally by a Policy Decision Point (PDP) and enforced close to the resource by a Policy Enforcement Point (PEP). Policies need to be explicit, auditable, and context-aware.
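
A minimal sketch of the PDP/PEP split, assuming a hypothetical in-process `decide` function standing in for the central decision point and an `enforce` wrapper standing in for an enforcement point. The `AccessRequest` fields, the `POLICIES` table, and the fail-closed behaviour are illustrative assumptions, not the P02 design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user: str
    role: str
    device_healthy: bool
    resource: str
    action: str


# Hypothetical explicit policy table: (role, resource) -> allowed actions.
POLICIES = {
    ("senior_engineer", "postgres-prod"): {"read"},
    ("sre", "postgres-prod"): {"read", "write"},
}


def decide(req):
    """PDP: return (allowed, reason) so every decision can be audited."""
    if not req.device_healthy:
        return False, "device_unhealthy"
    if req.action not in POLICIES.get((req.role, req.resource), set()):
        return False, "no_matching_policy"
    return True, "policy_match"


def enforce(req, pdp=decide):
    """PEP: ask the PDP, and fail closed if it cannot answer."""
    try:
        allowed, reason = pdp(req)
    except Exception:
        allowed, reason = False, "pdp_unavailable"
    return {"user": req.user, "resource": req.resource,
            "allowed": allowed, "reason": reason}
```

Returning a reason alongside the boolean keeps each decision traceable in logs, and the `except` branch means an unreachable PDP results in a deny rather than an implicit allow.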

Continuous Verification

Trust is not static. Monitor behavior, re-evaluate sessions, and revoke access when signals change.

Concept Summary Table

Concept Cluster	What You Need to Internalize
Identity	Strong identity, SSO, mTLS, workload IDs.
Device trust	Posture, attestation, health checks.
Policy	PDP/PEP architecture, least privilege.
Telemetry	Continuous monitoring and risk scoring.
Segmentation	ZTNA tunnels and micro-segmentation.

Deep Dive Reading by Concept

Concept	Book & Chapter
Zero Trust	Zero Trust Networks by Gilman & Barth - core chapters
Policy design	NIST SP 800-207 - architecture sections
mTLS	RFC 8446 (TLS 1.3) - handshake sections
Attestation	TPM 2.0 Library - attestation overview

Project Overview

Attribute	Value
Difficulty	Level 5: Master
Time Estimate	2 Weeks (80-120 hours)
Primary Language	Go, Rust, Python (Mixed)
Alternative Languages	N/A - Uses components from P01-P09
Prerequisites	All previous projects (P01-P09) completed or understood
Main Book	“Zero Trust Networks” by Gilman & Barth
Software/Tool	Docker, Kubernetes (optional), Terraform (optional)
Knowledge Area	Security Architecture / System Integration

What You’re Building: A complete “Secure Enclave” - an integrated Zero Trust environment that protects a set of internal applications. This is not a new component; it’s the orchestration of all previous projects into a unified security architecture where every request is authenticated, every access is authorized, every connection is encrypted, and every action is logged.

Why It Matters: Individual security components provide little protection if they don’t work together. Real-world breaches often happen in the gaps between security tools. This capstone teaches you to think like a security architect - designing systems where every layer reinforces the others and where compromise of one component doesn’t mean compromise of all.

Learning Objectives

By completing this project, you will be able to:

Architect complete Zero Trust environments - Design systems where all five pillars (Identity, Device, Network, Application, Data) work together
Integrate disparate security components - Connect identity proxies, policy engines, mTLS meshes, and access brokers into a unified system
Implement defense in depth - Create overlapping security controls where failure of one doesn’t compromise the whole
Design for security observability - Build unified logging, alerting, and incident response capabilities
Apply the CISA Zero Trust Maturity Model - Measure and improve your architecture against industry standards
Conduct security architecture reviews - Evaluate systems for gaps, weaknesses, and improvement opportunities
Build incident response readiness - Create runbooks and tooling for responding to security events

Deep Theoretical Foundation

The Integration Challenge

Individual security components are necessary but not sufficient. Consider this scenario:

+--------------------------------------------------------------------------+
|              The Gap Between Components: Where Breaches Happen            |
+--------------------------------------------------------------------------+

SCENARIO: Attacker compromises a developer's laptop via phishing

WITHOUT INTEGRATION:
--------------------

  [ Laptop Compromised ]
        |
        v
  [ Identity Proxy (P01) ] - "Valid JWT from douglas@corp.com" - ALLOWS
        |
        | (But is the device healthy? No one checks!)
        v
  [ Internal App ] - "X-ZT-Identity: douglas@corp.com" - TRUSTS
        |
        | (But is this normal behavior for douglas? No one asks!)
        v
  [ Database ] - Attacker exfiltrates data
        |
        v
  [ Audit Logs ] - "douglas accessed 10,000 records"
                   (Noticed 3 weeks later during audit)


WITH FULL INTEGRATION (This Capstone):
--------------------------------------

  [ Laptop Compromised ]
        |
        v
  [ ZTNA Tunnel (P09) ] - Attempts connection
        |
        v
  [ Device Trust (P05) ] - "Device posture: MALWARE_DETECTED" - BLOCKS
        |
        X---> Connection terminated, alert sent
        |
  [ If malware evades detection... ]
        |
        v
  [ Identity Proxy (P01) ] - "Valid JWT from douglas@corp.com"
        |
        v
  [ Policy Engine (P02) ] - "Context: New device, unusual time, VPN IP"
        |                    "Risk score: HIGH"
        |                    "Action: REQUIRE_STEP_UP_AUTH"
        |
        X---> User prompted for additional verification
        |
  [ If attacker has MFA... ]
        |
        v
  [ Continuous Auth (P06) ] - "Behavior anomaly: 10x normal query rate"
        |                      "Action: TERMINATE_SESSION"
        |
        X---> Session killed, SOC alerted, incident created
        |
  [ Even if data accessed... ]
        |
        v
  [ JIT Access (P08) ] - Credential was scoped to specific tables
        |                Credential expires after 30 minutes
        |
        v
  [ mTLS Mesh (P04) ] - All connections logged with identity
        |
        v
  [ Unified SIEM ] - Correlation: "Compromised device + stolen creds"
                     Auto-response: Revoke all douglas's access

+--------------------------------------------------------------------------+

Key Insight: Defense in depth means that an attacker must defeat EVERY control, not just one. Integration is what makes this possible.
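
The risk-scoring step in the scenario above can be sketched as follows. This is an illustration under stated assumptions: the signal names, weights, and thresholds are hypothetical (the diagram's "VPN IP" is modelled as `unfamiliar_ip`), and a real policy engine would tune them from observed data.

```python
# Hypothetical weights and thresholds; real values would be tuned from data.
SIGNAL_WEIGHTS = {
    "new_device": 30,
    "unusual_time": 15,
    "unfamiliar_ip": 25,
    "query_rate_anomaly": 60,
}
STEP_UP_THRESHOLD = 50
TERMINATE_THRESHOLD = 100


def risk_score(signals):
    """Sum the weights of the distinct signals present, capped at 100."""
    return min(100, sum(SIGNAL_WEIGHTS.get(s, 0) for s in set(signals)))


def action_for(score):
    """Map a score to the strongest action whose threshold it reaches."""
    if score >= TERMINATE_THRESHOLD:
        return "TERMINATE_SESSION"
    if score >= STEP_UP_THRESHOLD:
        return "REQUIRE_STEP_UP_AUTH"
    return "ALLOW"
```

With these assumed weights, the three context signals from the diagram push the session into step-up authentication, and adding the query-rate anomaly raises the score far enough to terminate it.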

The Five Pillars in Practice

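Each previous project maps to one or more pillars of the CISA Zero Trust Maturity Model (Identity, Device, Network, Application, Data) or to the cross-cutting Visibility capability. In the capstone, you’ll see how they interconnect: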

+--------------------------------------------------------------------------+
|              CISA Zero Trust Pillars - Project Mapping                    |
+--------------------------------------------------------------------------+

  +------------------+      +------------------+      +------------------+
  |     IDENTITY     |      |      DEVICE      |      |     NETWORK      |
  |   (Who are you?) | <==> | (Is device safe?)| <==> |  (How connected?)|
  +------------------+      +------------------+      +------------------+
         |                         |                         |
    +----+----+               +----+----+               +----+----+
    |         |               |         |               |         |
  [P01]     [P06]           [P05]     [P05]           [P03]     [P04]
  Identity  Continuous      Device    Health          Micro-    mTLS
  Proxy     Auth            Trust     Attestation     Segment   Mesh
    |         |               |         |               |         |
    +----+----+               +----+----+               +----+----+
         |                         |                         |
         v                         v                         v
  +------------------+      +------------------+      +------------------+
  |   APPLICATION    |      |      DATA        |      |   VISIBILITY     |
  | (What can access)| <==> |(What's protected)| <==> | (What happened?) |
  +------------------+      +------------------+      +------------------+
         |                         |                         |
    +----+----+               +----+----+               +----+----+
    |         |               |         |               |         |
  [P02]     [P08]           [P07]     [P09]           [ALL]     [P10]
  Policy    JIT             SDP       ZTNA            Audit     SIEM
  Engine    Access          Control   Tunnel          Logs      Correlation
    |         |               |         |               |         |
    +----+----+               +----+----+               +----+----+
         |                         |                         |
         +-------------------------+-------------------------+
                                   |
                                   v
                      +------------------------+
                      |   INTEGRATED CONTROL   |
                      |        PLANE           |
                      |                        |
                      |  Policy as Code        |
                      |  Unified Identity      |
                      |  Centralized Logging   |
                      |  Automated Response    |
                      +------------------------+

+--------------------------------------------------------------------------+

The Control Plane Integration Pattern

All Zero Trust components share a common control plane. This is the “brain” that coordinates policy across all enforcement points:

+--------------------------------------------------------------------------+
|              Unified Control Plane Architecture                           |
+--------------------------------------------------------------------------+

                          +-------------------------+
                          |    CONTROL PLANE        |
                          |                         |
                          |  +-------------------+  |
                          |  | Policy Repository |  |
                          |  | (Git-versioned)   |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | Policy Engine     |  |
                          |  | (P02)             |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | Identity Provider |  |
                          |  | (SSO/OIDC)        |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | Device Registry   |  |
                          |  | (P05 metadata)    |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | SIEM / Logging    |  |
                          |  | (Unified)         |  |
                          |  +-------------------+  |
                          +-------------------------+
                                     |
         +---------------------------+---------------------------+
         |                           |                           |
         v                           v                           v
+------------------+      +------------------+      +------------------+
|   DATA PLANE     |      |   DATA PLANE     |      |   DATA PLANE     |
|   (P01 Proxy)    |      |   (P04 mTLS)     |      |   (P09 Tunnel)   |
|                  |      |                  |      |                  |
| - Enforce policy |      | - Enforce mTLS   |      | - Enforce access |
| - Log access     |      | - Log connections|      | - Log tunnel use |
| - Query PDP      |      | - Verify certs   |      | - Route traffic  |
+------------------+      +------------------+      +------------------+
         |                           |                           |
         v                           v                           v
    [Protected         [Service-to-Service    [Remote User
     Application]        Communication]         Access]

+--------------------------------------------------------------------------+
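
One way the control plane might distribute policy to the data plane is a versioned bundle that each enforcement point pulls and compares. The sketch below assumes hypothetical `ControlPlane` and `EnforcementPoint` classes and a content hash as the version; it is not how P02 must work, only an illustration of the pattern.

```python
import hashlib
import json


def bundle_version(policies):
    """Derive a short, deterministic version from the policy content."""
    canonical = json.dumps(policies, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()[:12]


class ControlPlane:
    def __init__(self, policies):
        self.publish(policies)

    def publish(self, policies):
        self.policies = policies
        self.version = bundle_version(policies)


class EnforcementPoint:
    def __init__(self, name):
        self.name = name
        self.version = None
        self.policies = {}

    def sync(self, control_plane):
        """Pull the bundle if it changed; return True when an update happened."""
        if self.version == control_plane.version:
            return False
        self.policies = dict(control_plane.policies)
        self.version = control_plane.version
        return True
```

Because the version is derived from content, comparing versions across the proxy, mesh, and tunnel enforcement points is a cheap way to confirm they are all enforcing the same policy set.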

Real World Outcome

Deliverables:

Zero trust component with policy config
Telemetry output for decisions

Validation checklist:

Identity/device checks gate access
Policy enforcement is consistent
Logs provide decision traceability

After completing this capstone, you will have a fully operational Zero Trust environment. Here’s what it looks like in practice:

Demo Scenario: Developer Accessing Production

# Terminal 1: The Secure Enclave is running
$ docker-compose -f secure-enclave.yml up
[P01-proxy]      | Listening on :8443 (mTLS enabled)
[P02-policy]     | Policy engine ready, 47 policies loaded
[P04-mesh]       | mTLS mesh controller running
[P05-device]     | Device trust service ready
[P06-auth]       | Continuous authentication monitor active
[P08-jit]        | JIT access broker running
[P09-tunnel]     | ZTNA gateway accepting connections
[siem]           | Unified logging ingesting from all components
[alerting]       | Alert manager connected to Slack/PagerDuty

# Terminal 2: Developer's machine with ZTNA client
$ ztna-client connect --config corp.yaml
[INFO] Device health check: PASSED (macOS 14.2, FileVault enabled, no malware)
[INFO] mTLS certificate loaded: spiffe://corp.com/user/douglas/device/macbook-42
[INFO] Connected to gateway.corp.com
[INFO] Ready to access: jira.internal, gitlab.internal, postgres-prod.internal

# Terminal 3: Developer requests database access
$ jit request --db postgres-prod --tables orders --reason "PROD-1234"
[INFO] Request submitted: req-abc123
[INFO] Auto-approval policy matched: "senior_engineer_readonly_prod"
[INFO] Credentials generated (TTL: 30 minutes)

Username: jit_douglas_20241227_x7y8z9
Password: Kj8mN2pQ5rT7vX9yB3dF6gH8jL0nM1oP2qR3sT4u
Expires:  2024-12-27T15:30:00Z

$ psql "postgresql://jit_douglas_20241227_x7y8z9:Kj8m...@postgres-prod.internal/app"
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384)
app=> SELECT * FROM orders WHERE id = 12345;
 id    | status  | created_at
-------+---------+------------
 12345 | shipped | 2024-12-26
(1 row)

# SIEM shows the full trace
$ curl https://siem.corp.com/api/trace/douglas/today | jq
{
  "user": "douglas@corp.com",
  "device": "macbook-42",
  "events": [
    {
      "time": "14:58:00Z",
      "component": "P05-device",
      "event": "device_health_check",
      "result": "PASS",
      "details": {"os": "macOS 14.2", "encryption": true, "firewall": true}
    },
    {
      "time": "14:58:01Z",
      "component": "P09-tunnel",
      "event": "tunnel_established",
      "client_ip": "203.0.113.42",
      "gateway": "gateway-us-west-2"
    },
    {
      "time": "14:59:00Z",
      "component": "P08-jit",
      "event": "access_requested",
      "database": "postgres-prod",
      "permissions": ["SELECT"],
      "tables": ["orders"]
    },
    {
      "time": "14:59:01Z",
      "component": "P02-policy",
      "event": "policy_evaluated",
      "policy": "senior_engineer_readonly_prod",
      "decision": "ALLOW"
    },
    {
      "time": "14:59:02Z",
      "component": "P08-jit",
      "event": "credential_created",
      "temp_user": "jit_douglas_20241227_x7y8z9",
      "ttl_minutes": 30
    },
    {
      "time": "14:59:30Z",
      "component": "P04-mesh",
      "event": "connection_established",
      "source": "macbook-42",
      "destination": "postgres-prod",
      "mtls_verified": true
    },
    {
      "time": "14:59:31Z",
      "component": "postgres",
      "event": "query_executed",
      "user": "jit_douglas_20241227_x7y8z9",
      "query": "SELECT * FROM orders WHERE id = 12345",
      "rows_returned": 1
    }
  ]
}
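
A trace like the one above is only useful if it is complete. The following sketch checks that the security-relevant events appear in the expected order; the `REQUIRED_ORDER` list is an assumption derived from this demo trace, and `first_gap` is a hypothetical helper rather than part of any SIEM API.

```python
# Assumed required sequence, taken from the demo trace above.
REQUIRED_ORDER = [
    "device_health_check",
    "tunnel_established",
    "policy_evaluated",
    "credential_created",
    "connection_established",
    "query_executed",
]


def first_gap(events):
    """Return the first required event missing from the ordered trace, or None."""
    names = [e["event"] for e in events]
    position = 0
    for required in REQUIRED_ORDER:
        try:
            # Search only after the previous match so order is enforced.
            position = names.index(required, position) + 1
        except ValueError:
            return required
    return None
```

A query that appears without a preceding `policy_evaluated` event, for example, would be reported as a gap, which is the kind of finding an integration test for the enclave could assert on.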

The Core Question You’re Answering

“How do all the pieces of Zero Trust architecture fit together to create a truly secure environment where every access is verified, every connection is encrypted, and every action is audited?”

This capstone integrates Projects P01-P09 because individually, each component solves only one piece of the Zero Trust puzzle:

P01 (Identity-Aware Proxy) verifies WHO is making requests, but doesn’t know if their device is compromised
P02 (Policy Engine) makes authorization decisions, but needs identity and context from other components
P03 (Micro-segmentation) prevents lateral movement, but doesn’t understand application-layer identity
P04 (mTLS Mesh) encrypts connections, but relies on the PKI and identity systems to work correctly
P05 (Device Trust) validates endpoints, but must feed into policy decisions to be actionable
P06 (Continuous Authentication) detects anomalies, but needs integration to actually block access
P07 (SDP Controller) hides services, but must coordinate with identity and policy systems
P08 (JIT Access) eliminates standing privileges, but must be governed by policy and logged for audit
P09 (ZTNA Tunnel) provides secure remote access, but is the integration point for all other controls

The capstone is where you learn that security architecture is not about components - it’s about how they work together. A chain of security controls is only as strong as the integration between its links.

Concepts You Must Understand First

Before implementing this capstone, ensure you deeply understand these integration concepts:

1. System Integration Patterns

Understanding how distributed systems communicate and share state:

+--------------------------------------------------------------------------+
|              Integration Pattern: Event-Driven Security                   |
+--------------------------------------------------------------------------+

PATTERN: Event Bus for Security Signals
---------------------------------------

All components publish security events to a central bus:

  [P05: Device Trust]         [P06: Continuous Auth]
         |                            |
         | "device_compromised"       | "behavior_anomaly"
         v                            v
  +------------------------------------------+
  |           SECURITY EVENT BUS             |
  |  (Kafka, Redis Streams, or NATS)         |
  +------------------------------------------+
         |                            |
         | Subscribe                  | Subscribe
         v                            v
  [P01: Identity Proxy]         [P08: JIT Broker]
         |                            |
         | Block sessions             | Revoke credentials
         | for device                 | for user

This pattern enables REACTIVE security:
- Device compromised -> All access revoked within seconds
- Behavior anomaly -> Step-up auth required immediately
- Policy change -> All PEPs pick up the new version within seconds

+--------------------------------------------------------------------------+
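
The publish/subscribe pattern above can be sketched with an in-memory bus. This is a simplified stand-in for Kafka, Redis Streams, or NATS: delivery is synchronous, and the `IdentityProxy` and `JitBroker` classes and the credential name are hypothetical placeholders for the P01 and P08 components.

```python
from collections import defaultdict


class SecurityEventBus:
    """In-memory stand-in for a real message bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        # Deliver synchronously; a real bus would deliver asynchronously.
        for handler in list(self._handlers.get(topic, [])):
            handler(event)


class IdentityProxy:
    def __init__(self):
        self.blocked_devices = set()

    def on_device_compromised(self, event):
        self.blocked_devices.add(event["device"])


class JitBroker:
    def __init__(self, active):
        self.active = active  # user -> list of credential names
        self.revoked = []

    def on_behavior_anomaly(self, event):
        self.revoked.extend(self.active.pop(event["user"], []))


bus = SecurityEventBus()
proxy = IdentityProxy()
broker = JitBroker({"douglas@corp.com": ["jit_douglas_example"]})
bus.subscribe("device_compromised", proxy.on_device_compromised)
bus.subscribe("behavior_anomaly", broker.on_behavior_anomaly)

bus.publish("device_compromised", {"device": "macbook-42"})
bus.publish("behavior_anomaly", {"user": "douglas@corp.com"})
```

The publishers never reference the subscribers directly, so a new reaction (for example, the ZTNA gateway dropping tunnels) can be added with one more `subscribe` call.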

2. End-to-End Security Architecture

The complete request lifecycle through all security layers:

+--------------------------------------------------------------------------+
|              End-to-End Request Flow                                      |
+--------------------------------------------------------------------------+

  USER REQUEST
       |
       v
  +--[Layer 1: Device]--+
  | P05 Health Check    |
  | - OS patched?       |
  | - Firewall enabled? |
  | - Malware scan?     |
  +---------------------+
       | PASS
       v
  +--[Layer 2: Tunnel]--+
  | P09 ZTNA Client     |
  | - mTLS to gateway   |
  | - Domain routing    |
  +---------------------+
       | CONNECTED
       v
  +--[Layer 3: Identity]+
  | P01 Identity Proxy  |
  | - JWT validation    |
  | - Header injection  |
  +---------------------+
       | AUTHENTICATED
       v
  +--[Layer 4: Policy]--+
  | P02 Policy Engine   |
  | - ABAC evaluation   |
  | - Risk assessment   |
  +---------------------+
       | AUTHORIZED
       v
  +--[Layer 5: Network]-+
  | P03 Micro-segment   |
  | - eBPF/iptables     |
  | P04 mTLS Mesh       |
  | - Service identity  |
  +---------------------+
       | ENCRYPTED
       v
  +--[Layer 6: Access]--+
  | P08 JIT Credentials |
  | - Ephemeral access  |
  | - Scoped permissions|
  +---------------------+
       | PROVISIONED
       v
  +--[Layer 7: Observe]-+
  | P06 Continuous Auth |
  | - Behavior monitor  |
  | - Anomaly detection |
  +---------------------+
       | MONITORED
       v
  PROTECTED RESOURCE

Every layer can DENY. Every layer LOGS.

+--------------------------------------------------------------------------+
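
The property that every layer can deny and every layer logs can be sketched as an ordered chain of checks. Only the first four layers are modelled here; the request fields and the risk threshold are hypothetical, and each check function stands in for a call to the real component.

```python
def device_ok(req):
    return bool(req.get("os_patched") and req.get("firewall_enabled"))


def tunnel_ok(req):
    return req.get("mtls") is True


def identity_ok(req):
    return bool(req.get("jwt_valid"))


def policy_ok(req):
    # Hypothetical threshold; a missing score is treated as maximum risk.
    return req.get("risk", 100) < 50


LAYERS = [
    ("device", device_ok),
    ("tunnel", tunnel_ok),
    ("identity", identity_ok),
    ("policy", policy_ok),
]


def evaluate(req, layers=LAYERS):
    """Run layers in order, logging each result and stopping at the first deny."""
    audit = []
    for name, check in layers:
        passed = check(req)
        audit.append({"layer": name, "result": "PASS" if passed else "DENY"})
        if not passed:
            return False, audit
    return True, audit
```

The audit list records which layers ran before a deny, which makes it possible to tell from the log alone where a request was stopped.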

3. Defense in Depth

Overlapping controls that assume each layer might fail:

+--------------------------------------------------------------------------+
|              Defense in Depth: Overlapping Controls                       |
+--------------------------------------------------------------------------+

PRINCIPLE: If one control fails, others catch the breach

EXAMPLE: Compromised User Credential

  Control 1: Device Trust (P05)
  +------------------------------------------+
  | Expected: Block unmanaged/unhealthy device|
  | Bypassed: Attacker uses managed device    |
  +------------------------------------------+
           | (Control failed)
           v
  Control 2: Continuous Auth (P06)
  +------------------------------------------+
  | Expected: Detect behavior anomaly         |
  | Catches: Unusual access patterns flagged  |
  +------------------------------------------+
           | (Anomaly detected!)
           v
  Control 3: Policy Engine (P02)
  +------------------------------------------+
  | Response: Risk score elevated             |
  | Action: Require MFA re-verification       |
  +------------------------------------------+
           | (Attacker blocked at MFA)
           v
  BREACH PREVENTED

Even if Control 2 missed the anomaly:

  Control 4: JIT Access (P08)
  +------------------------------------------+
  | Limit: Credential only valid 30 minutes   |
  | Scope: Only SELECT on specific tables     |
  +------------------------------------------+
           | (Blast radius limited)
           v
  Control 5: Audit Logging
  +------------------------------------------+
  | Record: All queries with full context     |
  | Alert: SIEM correlation triggers alert    |
  +------------------------------------------+
           | (Incident detected)
           v
  Control 6: Automated Response
  +------------------------------------------+
  | Action: Revoke all user credentials       |
  | Action: Terminate all active sessions     |
  | Action: Alert SOC team                    |
  +------------------------------------------+

+--------------------------------------------------------------------------+
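
Control 4 limits blast radius through scope and expiry. A minimal sketch of that check follows, assuming a hypothetical `JitCredential` record and helper names; a real broker would also create and drop the database role, which is omitted here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class JitCredential:
    username: str
    tables: frozenset
    permissions: frozenset
    expires_at: datetime


def issue(user, tables, permissions=("SELECT",), ttl_minutes=30, now=None):
    """Create a scoped, short-lived credential record."""
    now = now or datetime.now(timezone.utc)
    return JitCredential(
        username=f"jit_{user}_{now:%Y%m%d}",
        tables=frozenset(tables),
        permissions=frozenset(permissions),
        expires_at=now + timedelta(minutes=ttl_minutes),
    )


def permits(cred, table, permission, now=None):
    """Allow only unexpired access to an in-scope table and permission."""
    now = now or datetime.now(timezone.utc)
    return (now < cred.expires_at
            and table in cred.tables
            and permission in cred.permissions)
```

Even if the earlier controls miss an attacker, a stolen credential of this shape can only read the listed tables and stops working once the TTL elapses.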

4. Security Observability

Unified visibility across all security components:

+--------------------------------------------------------------------------+
|              Security Observability Stack                                 |
+--------------------------------------------------------------------------+

                    +----------------------+
                    |    DASHBOARDS        |
                    |    (Grafana)         |
                    +----------------------+
                              |
                    +----------------------+
                    |    ALERTING          |
                    |    (AlertManager)    |
                    +----------------------+
                              |
                    +----------------------+
                    |    CORRELATION       |
                    |    (SIEM Rules)      |
                    +----------------------+
                              |
                    +----------------------+
                    |    LOG AGGREGATION   |
                    |    (Elasticsearch/   |
                    |     Loki/Splunk)     |
                    +----------------------+
                              ^
         +--------------------+--------------------+
         |                    |                    |
    P01 Logs             P02 Logs            P03-P09 Logs
    - Access             - Decisions          - All other
    - Deny               - Policy hits          components
    - Identity           - Risk scores

KEY METRICS TO OBSERVE:
-----------------------
1. Access Patterns
   - Requests per user/device/time
   - Denied vs allowed ratio
   - New device/location access

2. Policy Effectiveness
   - Policies triggered
   - Auto-approve vs manual approval
   - Time to approval

3. Security Events
   - Device health failures
   - Authentication anomalies
   - Revoked credentials

4. System Health
   - Component latency
   - Error rates
   - Certificate expiration

+--------------------------------------------------------------------------+
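
As an example of the "denied vs allowed ratio" metric, the sketch below aggregates decision records per user. The record shape (`user` and `decision` keys) is an assumption; actual field names depend on how each component logs.

```python
from collections import Counter


def deny_ratio_by_user(records):
    """Return {user: fraction of that user's requests that were denied}."""
    totals, denies = Counter(), Counter()
    for record in records:
        totals[record["user"]] += 1
        if record["decision"] == "DENY":
            denies[record["user"]] += 1
    return {user: denies[user] / totals[user] for user in totals}
```

A sudden rise in one user's deny ratio is a cheap signal worth alerting on, since it often indicates probing or a misconfigured client.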

5. Incident Response Readiness

Pre-built playbooks for common security scenarios:

+--------------------------------------------------------------------------+
|              Incident Response Playbooks                                  |
+--------------------------------------------------------------------------+

PLAYBOOK 1: Compromised User Account
-------------------------------------
Trigger: P06 detects impossible travel or credential stuffing

Steps:
1. [AUTOMATIC] Terminate all active sessions for user
2. [AUTOMATIC] Revoke all JIT credentials for user
3. [AUTOMATIC] Block user at Identity Proxy
4. [ALERT] Notify SOC and user's manager
5. [MANUAL] Investigate activity in SIEM
6. [MANUAL] Reset user credentials if the compromise is confirmed
7. [MANUAL] Re-enable access with step-up verification


PLAYBOOK 2: Compromised Device
------------------------------
Trigger: P05 reports malware detected or failed health check

Steps:
1. [AUTOMATIC] Block device at ZTNA gateway
2. [AUTOMATIC] Terminate tunnel connection
3. [AUTOMATIC] Revoke device certificate
4. [ALERT] Notify user and IT security
5. [MANUAL] Quarantine device for forensics
6. [MANUAL] Reimage device
7. [MANUAL] Re-enroll with new certificate


PLAYBOOK 3: Suspicious Database Access
---------------------------------------
Trigger: P08 detects unusual query patterns

Steps:
1. [AUTOMATIC] Revoke JIT credential immediately
2. [AUTOMATIC] Log complete query history
3. [ALERT] Notify DBA and SOC
4. [MANUAL] Review data accessed
5. [MANUAL] Assess data breach notification requirements
6. [MANUAL] Update policies to prevent recurrence


PLAYBOOK 4: Certificate Compromise
----------------------------------
Trigger: P04 detects certificate misuse or CA compromise

Steps:
1. [AUTOMATIC] Revoke compromised certificates
2. [AUTOMATIC] Force re-authentication for all services
3. [AUTOMATIC] Issue new certificates to legitimate entities
4. [ALERT] Notify security team
5. [MANUAL] Investigate root cause
6. [MANUAL] Rotate CA if necessary

+--------------------------------------------------------------------------+
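
Playbooks like these can be expressed as data so the automatic steps run immediately and the rest become tickets. The sketch below encodes Playbook 2 under assumptions: the step identifiers, the `run_playbook` helper, and the handler signature are all hypothetical.

```python
PLAYBOOK_COMPROMISED_DEVICE = [
    ("AUTOMATIC", "block_device_at_gateway"),
    ("AUTOMATIC", "terminate_tunnel"),
    ("AUTOMATIC", "revoke_device_certificate"),
    ("ALERT", "notify_user_and_it_security"),
    ("MANUAL", "quarantine_device"),
    ("MANUAL", "reimage_device"),
    ("MANUAL", "re_enroll_device"),
]


def run_playbook(steps, handlers, context):
    """Run AUTOMATIC steps now; return the rest as tickets for humans."""
    done, tickets = [], []
    for mode, name in steps:
        if mode == "AUTOMATIC":
            handlers[name](context)  # a KeyError here means a missing handler
            done.append(name)
        else:
            tickets.append((mode, name))
    return done, tickets


# Hypothetical handlers that only record what they were asked to do.
calls = []
handlers = {
    name: (lambda ctx, n=name: calls.append((n, ctx["device"])))
    for mode, name in PLAYBOOK_COMPROMISED_DEVICE
    if mode == "AUTOMATIC"
}
done, tickets = run_playbook(PLAYBOOK_COMPROMISED_DEVICE, handlers,
                             {"device": "macbook-42"})
```

Keeping the playbook as a list makes it reviewable in version control alongside policy, and a missing handler fails loudly instead of silently skipping a containment step.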

6. Zero Trust Maturity Model (CISA)

The CISA Zero Trust Maturity Model provides a framework for measuring your implementation:

+--------------------------------------------------------------------------+
|              CISA Zero Trust Maturity Model - Assessment                  |
+--------------------------------------------------------------------------+

MATURITY LEVELS:
----------------
1. Traditional - Perimeter-based, minimal ZT
2. Initial     - Starting ZT journey, some components
3. Advanced    - Integrated ZT, automation beginning
4. Optimal     - Full ZT, continuous improvement

PILLAR ASSESSMENT:

IDENTITY
--------
Traditional: Passwords only
Initial:     SSO + MFA for some apps
Advanced:    Risk-based MFA, continuous verification [P01, P06]
Optimal:     Passwordless, phishing-resistant MFA

Your implementation: [P01 Identity Proxy + P06 Continuous Auth]


DEVICE
------
Traditional: No device validation
Initial:     Inventory, basic MDM
Advanced:    Real-time posture, conditional access [P05]
Optimal:     Automated remediation, hardware attestation

Your implementation: [P05 Device Trust]


NETWORK
-------
Traditional: Perimeter firewall only
Initial:     Network segmentation
Advanced:    Micro-segmentation, encrypted traffic [P03, P04]
Optimal:     Software-defined, identity-aware networking

Your implementation: [P03 Micro-segmentation + P04 mTLS + P07 SDP]


APPLICATION
-----------
Traditional: VPN for remote access
Initial:     ZTNA for some apps
Advanced:    Per-app access, continuous authorization [P02, P09]
Optimal:     Dynamic policy, threat intelligence integrated

Your implementation: [P02 Policy Engine + P08 JIT + P09 ZTNA]


DATA
----
Traditional: Perimeter-based data protection
Initial:     Classification, DLP basics
Advanced:    Automatic classification, encryption everywhere
Optimal:     Dynamic data policies, user behavior analytics

Your implementation: [All projects contribute + SIEM]


VISIBILITY
----------
Traditional: Siloed logs
Initial:     Centralized logging
Advanced:    Correlation, automated alerting [P10 Capstone]
Optimal:     ML-based threat detection, automated response

Your implementation: [Unified SIEM + Alerting + Playbooks]

+--------------------------------------------------------------------------+
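
A self-assessment against these levels can be kept as data and queried for the weakest areas. In the sketch below the level names follow the list above, while the sample assessment values and the `weakest_pillars` helper are hypothetical.

```python
LEVELS = ["Traditional", "Initial", "Advanced", "Optimal"]


def weakest_pillars(assessment):
    """Return the pillars at the lowest assessed maturity level, sorted by name."""
    rank = {pillar: LEVELS.index(level) for pillar, level in assessment.items()}
    lowest = min(rank.values())
    return sorted(p for p, r in rank.items() if r == lowest)


# Hypothetical self-assessment; replace with your own findings.
assessment = {
    "Identity": "Advanced",
    "Device": "Advanced",
    "Network": "Advanced",
    "Application": "Advanced",
    "Data": "Initial",
    "Visibility": "Advanced",
}
```

Calling `weakest_pillars(assessment)` on this sample points at the Data pillar, which is where the next round of improvement effort would go.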

Questions to Guide Your Design

Before implementing, work through these design questions:

Integration Questions

Event Flow: How will security events propagate between components? (Event bus? API calls? Shared database?)
Identity Consistency: How will user identity be represented consistently across all components? (SPIFFE ID? JWT claims? Custom format?)
Policy Distribution: How will policy changes propagate to all enforcement points? (Push? Pull? Event-driven?)
Certificate Management: How will certificates be issued, rotated, and revoked across the mesh? (Central CA? Distributed?)
State Sharing: What state needs to be shared between components? (Session state? Device state? Risk scores?)
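
For the identity consistency question, one option is to derive every component's view of the caller from a single identifier. The sketch below parses a SPIFFE-style ID of the shape shown in the demo output (`spiffe://corp.com/user/douglas/device/macbook-42`); the key/value path layout is an assumption taken from that example, not something the SPIFFE specification requires.

```python
from urllib.parse import urlsplit


def parse_identity(spiffe_id):
    """Split a SPIFFE-style ID into trust domain plus key/value path pairs."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) % 2:
        raise ValueError(f"path is not key/value pairs: {parts.path!r}")
    # Assumed layout: /key1/value1/key2/value2
    attributes = dict(zip(segments[0::2], segments[1::2]))
    return {"trust_domain": parts.netloc, **attributes}
```

If the proxy, policy engine, and SIEM all call the same parser, a user and device are named identically in every log line, which simplifies correlation.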

Testing Questions

Component Testing: How will you verify each component works in isolation?
Integration Testing: How will you test the interactions between components?
Security Testing: How will you verify the security controls actually work? (Penetration testing scenarios?)
Failure Testing: What happens when each component fails? (Chaos engineering?)
Performance Testing: What’s the latency overhead of the full security stack?

Operations Questions

Deployment: How will you deploy and update components without breaking security? (Blue-green? Canary?)
Monitoring: What dashboards and alerts do you need? (Per-component? Cross-component?)
Incident Response: What playbooks do you need for common scenarios?
Compliance: How will you demonstrate compliance with SOC2/ISO27001/etc.?
Disaster Recovery: How will you recover if a critical component fails?

Thinking Exercise: Architecture Design

Before writing any code, complete this design exercise on paper or whiteboard:

Exercise: Design the Complete Architecture

Task: Draw a complete architecture diagram showing:

All Components: Where P01-P09 components are deployed
Data Flows: How requests flow through the system
Control Flows: How policy decisions are made and enforced
Event Flows: How security events propagate
Trust Boundaries: Where trust is established and verified

Guiding Questions:

Where is the user? (Remote? Office? Mobile?)
How do they authenticate? (SSO? MFA? Certificate?)
How does their request reach the internal application?
What checks happen at each step?
What gets logged at each step?
What happens if any check fails?
What happens if the user’s device is compromised mid-session?

Deliverable: A diagram similar to this (but with YOUR understanding):

+--------------------------------------------------------------------------+
|              YOUR Secure Enclave Architecture                             |
+--------------------------------------------------------------------------+

  [Remote User]                    [Office User]
       |                                 |
       v                                 v
  +----+----+                      +-----+-----+
  | ZTNA    |                      | Direct    |
  | Client  |                      | Network   |
  | (P09)   |                      |           |
  +----+----+                      +-----+-----+
       |                                 |
       | mTLS                            | mTLS
       v                                 v
  +------------------------------------------------+
  |            SECURITY GATEWAY TIER                |
  |                                                 |
  |  +----------+  +----------+  +----------+      |
  |  | Identity |  | Policy   |  | Device   |      |
  |  | Proxy    |  | Engine   |  | Trust    |      |
  |  | (P01)    |  | (P02)    |  | (P05)    |      |
  |  +----------+  +----------+  +----------+      |
  |       |              |              |          |
  |       +-------+------+------+-------+          |
  |               |                                |
  +---------------|--------------------------------+
                  | Authorized & Verified
                  v
  +------------------------------------------------+
  |            SERVICE TIER                         |
  |                                                 |
  |  +----------+  +----------+  +----------+      |
  |  | App A    |  | App B    |  | Database |      |
  |  |          |  |          |  | (via JIT)|      |
  |  +----------+  +----------+  +----------+      |
  |       ^              ^              ^          |
  |       |  mTLS (P04)  |              |          |
  |       +-------+------+------+-------+          |
  |               |                                |
  |  +-------------------------------------------+ |
  |  | Micro-segmentation (P03) - eBPF/iptables | |
  |  +-------------------------------------------+ |
  |                                                 |
  +------------------------------------------------+
                  |
                  v
  +------------------------------------------------+
  |            OBSERVABILITY TIER                   |
  |                                                 |
  |  +----------+  +----------+  +----------+      |
  |  | Log      |  | SIEM     |  | Alert    |      |
  |  | Collector|  | Correlate|  | Manager  |      |
  |  +----------+  +----------+  +----------+      |
  |                                                 |
  +------------------------------------------------+

+--------------------------------------------------------------------------+

Hints in Layers: Progressive Implementation Guidance

Hint Layer 1: Start with the Control Plane

Before integrating data plane components, establish the shared control plane:

Unified Identity: All components use the same identity representation
- Define your identity format (SPIFFE recommended)
- All components validate the same JWT issuer
- User identity flows through every request
Shared Policy Repository: All policies in one place
- Git repository for policy-as-code
- Policies reference the same identity format
- Version control enables rollback
Centralized Logging: All components log to the same destination
- Structured JSON logs with consistent schema
- Common fields: timestamp, component, user, device, action, result
- Correlation ID traces requests across components
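
The shared log schema above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the field names come from the "common fields" bullet, while the function names (`new_correlation_id`, `make_log_record`, `to_json_line`) and the sample values are assumptions made for the example.

```python
import json
import uuid
from datetime import datetime, timezone

# Field names taken from the "common fields" list; helper names are hypothetical.
COMMON_FIELDS = ("timestamp", "component", "user", "device", "action", "result")


def new_correlation_id() -> str:
    """Create an ID once at the edge and pass it to every downstream component."""
    return "req-" + uuid.uuid4().hex[:12]


def make_log_record(component, user, device, action, result, correlation_id):
    """Build one log record that follows the shared schema."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "component": component,
        "user": user,
        "device": device,
        "action": action,
        "result": result,
        "correlation_id": correlation_id,
    }


def to_json_line(record: dict) -> str:
    """Serialize to one JSON line, rejecting records that miss a common field."""
    missing = [f for f in COMMON_FIELDS if f not in record]
    if missing:
        raise ValueError(f"log record missing fields: {missing}")
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    cid = new_correlation_id()
    # Two components log the same request; the shared ID ties the lines together.
    for component, result in (("P01-identity-proxy", "AUTHENTICATED"),
                              ("P02-policy-engine", "DENY")):
        print(to_json_line(make_log_record(
            component, "douglas@corp.com", "macbook-42", "GET /admin", result, cid)))
```

Generating the correlation ID once at the first component and forwarding it (for example in a request header) is what lets the SIEM join log lines from different components into one request trace.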

Hint Layer 2: Integration by Event

Connect components through events rather than direct API calls:

Event Schema: Define a common event format

{
  "event_type": "access_denied",
  "timestamp": "2024-12-27T15:00:00Z",
  "component": "P01-identity-proxy",
  "correlation_id": "req-abc123",
  "subject": {
    "user": "douglas@corp.com",
    "device": "macbook-42"
  },
  "object": {
    "resource": "jira.internal",
    "action": "GET /admin"
  },
  "result": {
    "decision": "DENY",
    "reason": "insufficient_permissions"
  }
}

Event Bus: Use Redis Streams, Kafka, or NATS
- Components publish events to topics
- Other components subscribe and react
- Enables loose coupling
Reaction Rules: Define what happens when events occur
- device_compromised -> Revoke all sessions for device
- user_anomaly -> Elevate risk score, require step-up
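
The publish/subscribe pattern and the two reaction rules above might look like the following sketch. It uses an in-memory bus as a stand-in for Redis Streams, Kafka, or NATS; `InMemoryBus`, `SessionManager`, and the risk increment of 25 are hypothetical names and values chosen for illustration.

```python
from collections import defaultdict


class InMemoryBus:
    """Toy stand-in for Redis Streams, Kafka, or NATS (hypothetical helper)."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event_type -> subscribed callables

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        # Deliver to every subscriber of this event type; the publisher never
        # calls the reacting component directly.
        for handler in self._handlers.get(event["event_type"], []):
            handler(event)


class SessionManager:
    """Hypothetical component that owns sessions and per-user risk scores."""

    def __init__(self):
        self.sessions = {}            # session_id -> device name
        self.risk = defaultdict(int)  # user -> risk score
        self.step_up_required = set()

    def on_device_compromised(self, event):
        # Rule: device_compromised -> revoke all sessions for the device.
        device = event["subject"]["device"]
        self.sessions = {sid: d for sid, d in self.sessions.items() if d != device}

    def on_user_anomaly(self, event):
        # Rule: user_anomaly -> elevate risk score, require step-up.
        user = event["subject"]["user"]
        self.risk[user] += 25  # arbitrary increment chosen for the sketch
        self.step_up_required.add(user)


if __name__ == "__main__":
    bus, mgr = InMemoryBus(), SessionManager()
    bus.subscribe("device_compromised", mgr.on_device_compromised)
    bus.subscribe("user_anomaly", mgr.on_user_anomaly)
    mgr.sessions = {"s1": "macbook-42", "s2": "macbook-42", "s3": "thinkpad-7"}
    bus.publish({"event_type": "device_compromised",
                 "subject": {"user": "douglas@corp.com", "device": "macbook-42"}})
    print(mgr.sessions)
```

Because the publisher only knows the event type, a new reaction (say, paging the on-call engineer) can be added by subscribing another handler without touching the component that detected the compromise.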

Hint Layer 3: Test the Attack Scenarios

Validate your integration by simulating attacks:

Stolen Credential Attack
- Generate valid JWT for “attacker”
- Attempt access from new device -> P05 should block
- Attempt access at unusual time -> P06 should flag
- Attempt access to unauthorized resource -> P02 should deny
Compromised Device Attack
- Simulate malware detection on device
- Verify device is blocked at ZTNA gateway
- Verify all sessions for device are terminated
- Verify credentials are revoked
Lateral Movement Attack
- Gain access to one service
- Attempt to reach other services -> P03 should block
- Attempt to scan internal network -> P03 should block
- Verify all attempts are logged
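
Before wiring up the real components, it can help to write the stolen-credential expectations as executable checks against a stub. The sketch below is such a stub, under stated assumptions: the lookup tables, the `Request` type, the `evaluate` function, and the 08:00-18:59 "usual hours" window are all hypothetical. Only the expected outcomes (P05 blocks, P06 flags, P02 denies) come from the scenario list above.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str
    device: str
    hour_utc: int
    resource: str


# Hypothetical fixtures standing in for the P05, P06, and P02 data sources.
KNOWN_DEVICES = {"douglas@corp.com": {"macbook-42"}}
USUAL_HOURS = range(8, 19)
ALLOWED_RESOURCES = {"douglas@corp.com": {"jira.internal"}}


def evaluate(req: Request):
    """Return (decision, findings); each finding names the layer that raised it."""
    findings = []
    if req.device not in KNOWN_DEVICES.get(req.user, set()):
        findings.append("P05:unknown_device")         # P05 should block
    if req.hour_utc not in USUAL_HOURS:
        findings.append("P06:unusual_time")           # P06 should flag
    if req.resource not in ALLOWED_RESOURCES.get(req.user, set()):
        findings.append("P02:unauthorized_resource")  # P02 should deny
    blocking = [f for f in findings if not f.startswith("P06")]
    if blocking:
        return "DENY", findings
    return ("FLAG" if findings else "ALLOW"), findings


if __name__ == "__main__":
    # The attacker holds a valid token for the user but uses a new device at 03:00.
    print(evaluate(Request("douglas@corp.com", "attacker-laptop", 3, "jira.internal")))
```

The same assertions can later be pointed at the integrated environment: the stub is replaced by real requests, and the findings are read back from the central log.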

Hint Layer 4: Build the Observability Stack

You can’t secure what you can’t see:

Metrics Collection
- Prometheus for time-series metrics
- Each component exports metrics
- Custom dashboards in Grafana
Log Aggregation
- Loki, Elasticsearch, or Splunk
- All components write to same destination
- Retention policy for compliance
Alerting Rules
- Define thresholds and patterns
- denied_requests > 10/minute -> Alert
- same_user_different_locations -> Alert
- Integrate with PagerDuty/Slack

Hint Layer 5: Document and Automate Response

The final step is automating incident response:

Runbook Documentation
- For each alert type, document response steps
- Include commands to run, people to notify
- Test runbooks regularly
Automated Response
- Some responses should be automatic
- Device compromised -> Auto-block (no human needed)
- Credential suspected -> Auto-revoke, then investigate
Post-Incident Review
- After each incident, review what happened
- Update policies to prevent recurrence
- Share learnings across team

Project Specification

Functional Requirements

ID	Requirement	Acceptance Criteria
FR-1	Unified authentication	All access flows through P01 with consistent identity
FR-2	Policy-driven authorization	P02 evaluates all access requests against policy
FR-3	Device trust enforcement	P05 blocks unhealthy devices at entry point
FR-4	Encrypted service mesh	All service-to-service uses mTLS (P04)
FR-5	Remote access via ZTNA	P09 provides secure remote access
FR-6	Just-in-time database access	P08 provides ephemeral credentials
FR-7	Continuous monitoring	P06 detects and responds to anomalies
FR-8	Unified logging	All components log to central SIEM
FR-9	Automated alerting	Security events trigger appropriate alerts
FR-10	Incident response automation	Playbooks execute automatically or with guidance

Non-Functional Requirements

ID	Requirement	Target
NFR-1	End-to-end latency	< 100ms added by security stack
NFR-2	Availability	99.9% for security control plane
NFR-3	Recovery time	< 5 minutes for any single component failure
NFR-4	Log retention	90 days hot, 1 year cold storage
NFR-5	Alert response time	Critical: < 5 minutes, High: < 1 hour
NFR-6	Policy update propagation	< 30 seconds to all enforcement points
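
To make NFR-2 and NFR-3 concrete, it helps to convert the availability percentage into a downtime budget. The arithmetic below is a small sketch: the function names are hypothetical, and the 30-day and 365-day periods are assumptions, since the table does not state a measurement window.

```python
def downtime_budget_minutes(availability_pct: float, period_days: int) -> float:
    """Minutes of allowed downtime for a given availability over a period."""
    return period_days * 24 * 60 * (100 - availability_pct) / 100


def max_full_outages(budget_minutes: float, recovery_minutes: float = 5) -> int:
    """How many worst-case recoveries (NFR-3: 5 minutes) fit inside the budget."""
    return int(budget_minutes // recovery_minutes)


if __name__ == "__main__":
    monthly = downtime_budget_minutes(99.9, 30)
    yearly = downtime_budget_minutes(99.9, 365)
    print(round(monthly, 1), "minutes per 30 days")
    print(round(yearly / 60, 2), "hours per 365 days")
    print(max_full_outages(monthly), "five-minute recoveries per 30 days")
```

At 99.9%, the budget is about 43.2 minutes per 30 days, so only eight worst-case five-minute recoveries fit in a month. That is a useful sanity check on whether the two targets are compatible for your deployment.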

Component Checklist

Before considering the capstone complete, verify:

Phased Implementation Guide

Phase 1: Foundation (Days 1-3)

Goal: Establish the shared control plane infrastructure.

Deliverables:

Unified logging stack (Loki or Elasticsearch)
Event bus (Redis Streams or NATS)
Identity format defined (SPIFFE or custom)
Policy repository initialized (Git)

Steps:

Deploy logging infrastructure
Configure all components to log in consistent format
Set up event bus
Define identity and event schemas
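
If you choose SPIFFE for the identity format, every component needs to parse and validate the same ID shape: `spiffe://<trust-domain>/<path>`. The sketch below is a simplified check, not a full implementation of the SPIFFE ID specification; the function name `parse_spiffe_id` and the example trust domain `corp.example` are assumptions.

```python
import re
from urllib.parse import urlsplit

# Simplified trust-domain check: lowercase letters, digits, dots, dashes,
# underscores. This also rejects userinfo ("@") and ports (":").
_TRUST_DOMAIN = re.compile(r"^[a-z0-9._-]+$")


def parse_spiffe_id(value: str):
    """Return (trust_domain, path) or raise ValueError for a malformed ID."""
    parts = urlsplit(value)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be spiffe")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment are not allowed")
    if not _TRUST_DOMAIN.match(parts.netloc):
        raise ValueError("invalid trust domain")
    return parts.netloc, parts.path


def same_trust_domain(spiffe_id: str, expected_domain: str) -> bool:
    """Components can use this to reject identities issued by another domain."""
    try:
        return parse_spiffe_id(spiffe_id)[0] == expected_domain
    except ValueError:
        return False


if __name__ == "__main__":
    print(parse_spiffe_id("spiffe://corp.example/ns/prod/sa/identity-proxy"))
```

Putting this check in one shared library, rather than reimplementing it per component, is one way to keep the identity representation consistent across P01-P09.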

Phase 2: Core Integration (Days 4-7)

Goal: Connect the core security components.

Deliverables:

P01 + P02 integration (proxy consults policy engine)
P05 + P09 integration (device trust gates tunnel access)
P04 mesh running between services

Steps:

Deploy P01, P02, P05, P09 in connected configuration
Verify request flow through all components
Test policy enforcement at each layer
Verify mTLS between all services

Phase 3: Access Management (Days 8-10)

Goal: Add JIT access and continuous authentication.

Deliverables:

P08 JIT broker integrated with policy engine
P06 continuous auth monitoring all sessions
Automatic session termination on anomaly

Steps:

Deploy P08 JIT broker
Integrate P08 with P02 for approval policies
Deploy P06 continuous authentication
Configure P06 to terminate sessions and emit events

Phase 4: Observability (Days 11-12)

Goal: Build the security observability stack.

Deliverables:

SIEM correlation rules
Security dashboards
Alert definitions
Integration with notification systems

Steps:

Define SIEM correlation rules for attack scenarios
Build Grafana dashboards for security metrics
Configure Alertmanager with notification channels
Test alert flow end-to-end

Phase 5: Incident Response (Days 13-14)

Goal: Automate incident response.

Deliverables:

3+ incident response playbooks
Automated response for critical events
Documentation for manual procedures
Tested recovery procedures

Steps:

Document incident response playbooks
Implement automated responses (event -> action)
Conduct tabletop exercises
Create runbooks for recovery scenarios

Testing Strategy

Security Test Scenarios

Scenario	Expected Result	Components Tested
Valid user, healthy device	Access granted, logged	P01, P02, P05, P09
Valid user, unhealthy device	Blocked at tunnel, logged	P05, P09, SIEM
Invalid JWT	401 Unauthorized	P01
Unauthorized resource access	403 Forbidden	P01, P02
Lateral movement attempt	Blocked by micro-seg	P03
Behavior anomaly detected	Session terminated	P06
JIT credential expired	Access denied	P08, Database
Certificate revoked	Connection rejected	P04

Integration Test Scenarios

Scenario	Steps	Validation
End-to-end access	User -> ZTNA -> Proxy -> App	Response received, all logs present
Policy update propagation	Update policy in Git	All PEPs enforce within 30s
Device compromise response	Emit device_compromised event	All sessions terminated within 60s
Credential rotation	Trigger JIT rotation	Old creds fail, new creds work
Component failure recovery	Kill policy engine	Access denied (fail-closed), recovery < 5min
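
The "fail-closed" expectation in the last row means the enforcement point must treat an unreachable policy engine as a denial. A minimal sketch of that behavior follows; `enforce`, `healthy_pdp`, and `dead_pdp` are hypothetical names, and a real PEP would also apply a request timeout.

```python
def enforce(query_pdp, request) -> str:
    """PEP-side wrapper: anything other than an explicit ALLOW becomes DENY."""
    try:
        decision = query_pdp(request)
    except Exception:
        # Policy engine unreachable, timed out, or crashed: fail closed.
        return "DENY"
    return "ALLOW" if decision == "ALLOW" else "DENY"


def healthy_pdp(request):
    """Hypothetical policy engine that allows only one resource."""
    return "ALLOW" if request["resource"] == "jira.internal" else "DENY"


def dead_pdp(request):
    """Simulates the 'kill policy engine' step of the test."""
    raise ConnectionError("policy engine unreachable")


if __name__ == "__main__":
    req = {"user": "douglas@corp.com", "resource": "jira.internal"}
    print(enforce(healthy_pdp, req))  # engine up: decision comes from policy
    print(enforce(dead_pdp, req))     # engine down: denied regardless of policy
```

Note that a malformed or unexpected response is also mapped to DENY, so a partially broken policy engine cannot widen access by accident.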

Performance Test Scenarios

Metric	Target	Test Method
Authentication latency	< 50ms	Load test P01 with 1000 RPS
Policy evaluation latency	< 20ms	Load test P02 with 1000 RPS
Tunnel throughput	> 100 Mbps	iperf through P09
Log ingestion rate	> 10,000 events/sec	Flood log collector
Alert latency	< 5 seconds	Measure event to alert time
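
Latency targets like these are usually checked against a percentile of the load-test samples rather than the mean. The table does not say which percentile applies, so the sketch below assumes p95; the function names and the sample data are hypothetical.

```python
def percentile(samples, pct: int):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct/100 * n) in integer math
    return ordered[rank - 1]


def meets_target(samples_ms, target_ms: float, pct: int = 95) -> bool:
    """True when the chosen percentile is strictly below the target."""
    return percentile(samples_ms, pct) < target_ms


if __name__ == "__main__":
    # Hypothetical P01 authentication latencies in milliseconds.
    auth_ms = [12, 15, 14, 18, 22, 13, 16, 48, 17, 19]
    print(percentile(auth_ms, 95), meets_target(auth_ms, 50))
```

Whichever percentile you pick, record it next to the target in the test report so that results from different runs stay comparable.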

Self-Assessment Checklist

Architecture Understanding

I can explain how all five pillars of the CISA Zero Trust Maturity Model (identity, devices, networks, applications and workloads, data) are addressed
I can trace a request through every security layer
I can describe how compromise of one component is mitigated
I understand the control plane vs data plane distinction
I can assess our maturity against the CISA Zero Trust Maturity Model

Integration Verification

All components share a consistent identity format
Events propagate correctly between components
Policy changes affect all enforcement points
Logs from all components are in the SIEM
Alerts fire correctly for security events

Security Verification

Unhealthy devices cannot access resources
Unauthorized access attempts are blocked and logged
Compromised sessions are terminated automatically
Lateral movement is prevented by micro-segmentation
All connections are encrypted with mTLS

Operations Readiness

Dashboards show security posture at a glance
Alerts are configured for all critical scenarios
Incident response playbooks are documented
Recovery procedures are tested
Compliance evidence can be generated

Books That Will Help

Book	Focus	Chapters
“Zero Trust Networks” by Gilman & Barth	Complete ZT architecture	All chapters
“Security in Computing” by Pfleeger	Security fundamentals	Ch. 1-5, Ch. 10
“The Practice of Network Security Monitoring” by Bejtlich	Security observability	Ch. 1-8
“Incident Response & Computer Forensics” by Luttgens	IR procedures	Ch. 1-5
“Practical Cloud Security” by Dotson	Cloud security architecture	Ch. 1-4
“Site Reliability Engineering” by Google	Operational excellence	Ch. 12-14

Interview Questions You Can Now Answer

“Describe how you would architect a Zero Trust environment for a 500-person company.”
“How do you ensure defense in depth when any single security control might fail?”
“Walk me through what happens when a user’s device is compromised mid-session.”
“How would you measure the maturity of a Zero Trust implementation?”
“What’s the difference between a security event and a security incident, and how do you handle each?”
“How do you balance security with user experience in a Zero Trust environment?”
“Describe your approach to security observability and incident response automation.”

Conclusion

This capstone represents the culmination of your Zero Trust learning journey. By integrating all nine previous projects, you’ve learned that security is not about individual components - it’s about how they work together to create overlapping, reinforcing controls.

You now understand:

How to architect complete Zero Trust environments
Why integration is more important than any single technology
How defense in depth prevents single points of failure
Why observability is essential for security
How to automate incident response for rapid containment

Most importantly, you can now think like a security architect - seeing the whole system, understanding how attacks flow, and designing defenses that work together to protect what matters.

This guide was expanded from ZERO_TRUST_ARCHITECTURE_DEEP_DIVE.md. For the complete learning path, see the project index.