Project 10: The Secure Enclave (Capstone)

Core Zero Trust Principle: “Trust nothing, verify everything, assume breach.” This capstone project integrates all nine previous projects into a cohesive, production-ready Zero Trust environment that demonstrates defense in depth, continuous verification, and complete security observability.


Project Overview

Attribute               Value
---------               -----
Difficulty              Level 5: Master
Time Estimate           2 Weeks (80-120 hours)
Primary Language        Go, Rust, Python (Mixed)
Alternative Languages   N/A - Uses components from P01-P09
Prerequisites           All previous projects (P01-P09) completed or understood
Main Book               “Zero Trust Networks” by Gilman & Barth
Software/Tool           Docker, Kubernetes (optional), Terraform (optional)
Knowledge Area          Security Architecture / System Integration

What You’re Building: A complete “Secure Enclave” - an integrated Zero Trust environment that protects a set of internal applications. This is not a new component; it’s the orchestration of all previous projects into a unified security architecture where every request is authenticated, every access is authorized, every connection is encrypted, and every action is logged.

Why It Matters: Individual security components provide little protection if they don’t work together. Real-world breaches often happen in the gaps between security tools. This capstone teaches you to think like a security architect - designing systems where every layer reinforces the others, where compromise of one component doesn’t mean compromise of all.


Learning Objectives

By completing this project, you will be able to:

  1. Architect complete Zero Trust environments - Design systems where all five pillars (Identity, Device, Network, Application, Data) work together
  2. Integrate disparate security components - Connect identity proxies, policy engines, mTLS meshes, and access brokers into a unified system
  3. Implement defense in depth - Create overlapping security controls where failure of one doesn’t compromise the whole
  4. Design for security observability - Build unified logging, alerting, and incident response capabilities
  5. Apply the CISA Zero Trust Maturity Model - Measure and improve your architecture against industry standards
  6. Conduct security architecture reviews - Evaluate systems for gaps, weaknesses, and improvement opportunities
  7. Build incident response readiness - Create runbooks and tooling for responding to security events

Deep Theoretical Foundation

The Integration Challenge

Individual security components are necessary but not sufficient. Consider this scenario:

+--------------------------------------------------------------------------+
|              The Gap Between Components: Where Breaches Happen            |
+--------------------------------------------------------------------------+

SCENARIO: Attacker compromises a developer's laptop via phishing

WITHOUT INTEGRATION:
--------------------

  [ Laptop Compromised ]
        |
        v
  [ Identity Proxy (P01) ] - "Valid JWT from douglas@corp.com" - ALLOWS
        |
        | (But is the device healthy? No one checks!)
        v
  [ Internal App ] - "X-ZT-Identity: douglas@corp.com" - TRUSTS
        |
        | (But is this normal behavior for douglas? No one asks!)
        v
  [ Database ] - Attacker exfiltrates data
        |
        v
  [ Audit Logs ] - "douglas accessed 10,000 records"
                   (Noticed 3 weeks later during audit)


WITH FULL INTEGRATION (This Capstone):
--------------------------------------

  [ Laptop Compromised ]
        |
        v
  [ ZTNA Tunnel (P09) ] - Attempts connection
        |
        v
  [ Device Trust (P05) ] - "Device posture: MALWARE_DETECTED" - BLOCKS
        |
        X---> Connection terminated, alert sent
        |
  [ If malware evades detection... ]
        |
        v
  [ Identity Proxy (P01) ] - "Valid JWT from douglas@corp.com"
        |
        v
  [ Policy Engine (P02) ] - "Context: New device, unusual time, VPN IP"
        |                    "Risk score: HIGH"
        |                    "Action: REQUIRE_STEP_UP_AUTH"
        |
        X---> User prompted for additional verification
        |
  [ If attacker has MFA... ]
        |
        v
  [ Continuous Auth (P06) ] - "Behavior anomaly: 10x normal query rate"
        |                      "Action: TERMINATE_SESSION"
        |
        X---> Session killed, SOC alerted, incident created
        |
  [ Even if data accessed... ]
        |
        v
  [ JIT Access (P08) ] - Credential is scoped to specific tables
        |                Credential expires 30 minutes after issuance
        |
        v
  [ mTLS Mesh (P04) ] - All connections logged with identity
        |
        v
  [ Unified SIEM ] - Correlation: "Compromised device + stolen creds"
                     Auto-response: Revoke all douglas's access

+--------------------------------------------------------------------------+

Key Insight: Defense in depth means that an attacker must defeat EVERY control, not just one. Integration is what makes this possible.
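The risk-based branch in the scenario (new device, unusual time, VPN IP leading to step-up authentication) can be sketched as a small scoring function. This is only an illustration: the weights, the threshold, and the names `RequestContext`, `score_request`, and `decide` are assumptions made for this example, not part of P02.

```python
from dataclasses import dataclass

# Hypothetical signal weights and threshold, chosen only for illustration.
WEIGHTS = {"new_device": 40, "unusual_time": 25, "vpn_ip": 20}
STEP_UP_THRESHOLD = 60


@dataclass(frozen=True)
class RequestContext:
    new_device: bool = False
    unusual_time: bool = False
    vpn_ip: bool = False


def score_request(ctx: RequestContext) -> int:
    """Add up the weight of every risk signal that is present."""
    return sum(weight for name, weight in WEIGHTS.items() if getattr(ctx, name))


def decide(ctx: RequestContext) -> str:
    """Turn the score into an action for the enforcement point."""
    if score_request(ctx) >= STEP_UP_THRESHOLD:
        return "REQUIRE_STEP_UP_AUTH"
    return "ALLOW"
```

A real policy engine would load weights and thresholds from versioned policy rather than constants, so that a change is reviewable and can be rolled back.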

The Five Pillars in Practice

Each previous project maps to one or more pillars of the CISA Zero Trust Maturity Model (Visibility, shown as a sixth box, is a cross-cutting capability in that model rather than a pillar). In the capstone, you’ll see how they interconnect:

+--------------------------------------------------------------------------+
|              CISA Zero Trust Pillars - Project Mapping                    |
+--------------------------------------------------------------------------+

  +------------------+      +------------------+      +------------------+
  |     IDENTITY     |      |      DEVICE      |      |     NETWORK      |
  |   (Who are you?) | <==> | (Is device safe?)| <==> |  (How connected?)|
  +------------------+      +------------------+      +------------------+
         |                         |                         |
    +----+----+               +----+----+               +----+----+
    |         |               |         |               |         |
  [P01]     [P06]           [P05]     [P05]           [P03]     [P04]
  Identity  Continuous      Device    Health          Micro-    mTLS
  Proxy     Auth            Trust     Attestation     Segment   Mesh
    |         |               |         |               |         |
    +----+----+               +----+----+               +----+----+
         |                         |                         |
         v                         v                         v
  +------------------+      +------------------+      +------------------+
  |   APPLICATION    |      |      DATA        |      |   VISIBILITY     |
  | (What can access)| <==> |(What's protected)| <==> | (What happened?) |
  +------------------+      +------------------+      +------------------+
         |                         |                         |
    +----+----+               +----+----+               +----+----+
    |         |               |         |               |         |
  [P02]     [P08]           [P07]     [P09]           [ALL]     [P10]
  Policy    JIT             SDP       ZTNA            Audit     SIEM
  Engine    Access          Control   Tunnel          Logs      Correlation
    |         |               |         |               |         |
    +----+----+               +----+----+               +----+----+
         |                         |                         |
         +-------------------------+-------------------------+
                                   |
                                   v
                      +------------------------+
                      |   INTEGRATED CONTROL   |
                      |        PLANE           |
                      |                        |
                      |  Policy as Code        |
                      |  Unified Identity      |
                      |  Centralized Logging   |
                      |  Automated Response    |
                      +------------------------+

+--------------------------------------------------------------------------+

The Control Plane Integration Pattern

All Zero Trust components share a common control plane. This is the “brain” that coordinates policy across all enforcement points:

+--------------------------------------------------------------------------+
|              Unified Control Plane Architecture                           |
+--------------------------------------------------------------------------+

                          +-------------------------+
                          |    CONTROL PLANE        |
                          |                         |
                          |  +-------------------+  |
                          |  | Policy Repository |  |
                          |  | (Git-versioned)   |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | Policy Engine     |  |
                          |  | (P02)             |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | Identity Provider |  |
                          |  | (SSO/OIDC)        |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | Device Registry   |  |
                          |  | (P05 metadata)    |  |
                          |  +-------------------+  |
                          |           |             |
                          |  +-------------------+  |
                          |  | SIEM / Logging    |  |
                          |  | (Unified)         |  |
                          |  +-------------------+  |
                          +-------------------------+
                                     |
         +---------------------------+---------------------------+
         |                           |                           |
         v                           v                           v
+------------------+      +------------------+      +------------------+
|   DATA PLANE     |      |   DATA PLANE     |      |   DATA PLANE     |
|   (P01 Proxy)    |      |   (P04 mTLS)     |      |   (P09 Tunnel)   |
|                  |      |                  |      |                  |
| - Enforce policy |      | - Enforce mTLS   |      | - Enforce access |
| - Log access     |      | - Log connections|      | - Log tunnel use |
| - Query PDP      |      | - Verify certs   |      | - Route traffic  |
+------------------+      +------------------+      +------------------+
         |                           |                           |
         v                           v                           v
    [Protected         [Service-to-Service    [Remote User
     Application]        Communication]         Access]

+--------------------------------------------------------------------------+
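The split between a control plane that decides and data plane components that enforce and log can be sketched in a few lines. This is a minimal in-process sketch, assuming a role-per-resource rule model; the class names `PolicyDecisionPoint` and `EnforcementPoint` and their methods are hypothetical, and a real deployment would query the PDP over the network.

```python
from typing import Dict, List, Set


class PolicyDecisionPoint:
    """Control plane: stores rules and answers authorization queries."""

    def __init__(self) -> None:
        self._allowed_roles: Dict[str, Set[str]] = {}  # resource -> roles

    def grant(self, resource: str, role: str) -> None:
        self._allowed_roles.setdefault(resource, set()).add(role)

    def decide(self, role: str, resource: str) -> str:
        # Default deny: no explicit rule means no access.
        if role in self._allowed_roles.get(resource, set()):
            return "ALLOW"
        return "DENY"


class EnforcementPoint:
    """Data plane: queries the PDP, enforces the answer, logs the outcome."""

    def __init__(self, name: str, pdp: PolicyDecisionPoint, log: List[dict]) -> None:
        self.name = name
        self.pdp = pdp
        self.log = log

    def handle(self, role: str, resource: str) -> bool:
        decision = self.pdp.decide(role, resource)
        self.log.append({"component": self.name, "role": role,
                         "resource": resource, "decision": decision})
        return decision == "ALLOW"
```

Because every enforcement point shares one PDP and one log, a rule change or an audit query only has to happen in one place.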

Real World Outcome

After completing this capstone, you will have a fully operational Zero Trust environment. Here’s what it looks like in practice:

Demo Scenario: Developer Accessing Production

# Terminal 1: The Secure Enclave is running
$ docker-compose -f secure-enclave.yml up
[P01-proxy]      | Listening on :8443 (mTLS enabled)
[P02-policy]     | Policy engine ready, 47 policies loaded
[P04-mesh]       | mTLS mesh controller running
[P05-device]     | Device trust service ready
[P06-auth]       | Continuous authentication monitor active
[P08-jit]        | JIT access broker running
[P09-tunnel]     | ZTNA gateway accepting connections
[siem]           | Unified logging ingesting from all components
[alerting]       | Alert manager connected to Slack/PagerDuty

# Terminal 2: Developer's machine with ZTNA client
$ ztna-client connect --config corp.yaml
[INFO] Device health check: PASSED (macOS 14.2, FileVault enabled, no malware)
[INFO] mTLS certificate loaded: spiffe://corp.com/user/douglas/device/macbook-42
[INFO] Connected to gateway.corp.com
[INFO] Ready to access: jira.internal, gitlab.internal, postgres-prod.internal

# Terminal 3: Developer requests database access
$ jit request --db postgres-prod --tables orders --reason "PROD-1234"
[INFO] Request submitted: req-abc123
[INFO] Auto-approval policy matched: "senior_engineer_readonly_prod"
[INFO] Credentials generated (TTL: 30 minutes)

Username: jit_douglas_20241227_x7y8z9
Password: Kj8mN2pQ5rT7vX9yB3dF6gH8jL0nM1oP2qR3sT4u
Expires:  2024-12-27T15:30:00Z

$ psql "postgresql://jit_douglas_20241227_x7y8z9:Kj8m...@postgres-prod.internal/app"
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384)
app=> SELECT * FROM orders WHERE id = 12345;
 id    | status  | created_at
-------+---------+------------
 12345 | shipped | 2024-12-26
(1 row)

# SIEM shows the full trace
$ curl https://siem.corp.com/api/trace/douglas/today | jq
{
  "user": "douglas@corp.com",
  "device": "macbook-42",
  "events": [
    {
      "time": "14:58:00Z",
      "component": "P05-device",
      "event": "device_health_check",
      "result": "PASS",
      "details": {"os": "macOS 14.2", "encryption": true, "firewall": true}
    },
    {
      "time": "14:58:01Z",
      "component": "P09-tunnel",
      "event": "tunnel_established",
      "client_ip": "203.0.113.42",
      "gateway": "gateway-us-west-2"
    },
    {
      "time": "14:59:00Z",
      "component": "P08-jit",
      "event": "access_requested",
      "database": "postgres-prod",
      "permissions": ["SELECT"],
      "tables": ["orders"]
    },
    {
      "time": "14:59:01Z",
      "component": "P02-policy",
      "event": "policy_evaluated",
      "policy": "senior_engineer_readonly_prod",
      "decision": "ALLOW"
    },
    {
      "time": "14:59:02Z",
      "component": "P08-jit",
      "event": "credential_created",
      "temp_user": "jit_douglas_20241227_x7y8z9",
      "ttl_minutes": 30
    },
    {
      "time": "14:59:30Z",
      "component": "P04-mesh",
      "event": "connection_established",
      "source": "macbook-42",
      "destination": "postgres-prod",
      "mtls_verified": true
    },
    {
      "time": "14:59:31Z",
      "component": "postgres",
      "event": "query_executed",
      "user": "jit_douglas_20241227_x7y8z9",
      "query": "SELECT * FROM orders WHERE id = 12345",
      "rows_returned": 1
    }
  ]
}
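One detail in the trace above is worth noting: the final query runs as the temporary database user, not as douglas@corp.com, so the SIEM has to link the two. A rough sketch of that correlation follows. It assumes each raw event carries a `user` field and a full ISO-8601 UTC timestamp, which the output above does not show, and `build_trace` is a hypothetical name.

```python
from typing import Dict, List


def build_trace(user: str, events: List[Dict]) -> Dict:
    """Collect one user's events from all components in time order."""
    # First pass: learn which temporary DB usernames were issued to this user.
    aliases = {user}
    for event in events:
        if event.get("event") == "credential_created" and event.get("user") == user:
            aliases.add(event["temp_user"])

    # Second pass: keep events recorded under any of the user's identities.
    mine = [event for event in events if event.get("user") in aliases]

    # ISO-8601 UTC timestamps in one format sort correctly as strings.
    return {"user": user, "events": sorted(mine, key=lambda e: e["time"])}
```

Without the alias step, the database activity would be invisible in a per-user trace, which is exactly the kind of gap between components this capstone is meant to close.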

The Core Question You’re Answering

“How do all the pieces of Zero Trust architecture fit together to create a truly secure environment where every access is verified, every connection is encrypted, and every action is audited?”

This capstone integrates Projects P01-P09 because individually, each component solves only one piece of the Zero Trust puzzle:

  • P01 (Identity-Aware Proxy) verifies WHO is making requests, but doesn’t know if their device is compromised
  • P02 (Policy Engine) makes authorization decisions, but needs identity and context from other components
  • P03 (Micro-segmentation) prevents lateral movement, but doesn’t understand application-layer identity
  • P04 (mTLS Mesh) encrypts connections, but relies on the PKI and identity systems to work correctly
  • P05 (Device Trust) validates endpoints, but must feed into policy decisions to be actionable
  • P06 (Continuous Authentication) detects anomalies, but needs integration to actually block access
  • P07 (SDP Controller) hides services, but must coordinate with identity and policy systems
  • P08 (JIT Access) eliminates standing privileges, but must be governed by policy and logged for audit
  • P09 (ZTNA Tunnel) provides secure remote access, but as the integration point for all other controls it is only as trustworthy as the checks wired into it

The capstone is where you learn that security architecture is not about components - it’s about how they work together. A chain of security controls is only as strong as the integration between its links.


Concepts You Must Understand First

Before implementing this capstone, ensure you deeply understand these integration concepts:

1. System Integration Patterns

Understanding how distributed systems communicate and share state:

+--------------------------------------------------------------------------+
|              Integration Pattern: Event-Driven Security                   |
+--------------------------------------------------------------------------+

PATTERN: Event Bus for Security Signals
---------------------------------------

All components publish security events to a central bus:

  [P05: Device Trust]         [P06: Continuous Auth]
         |                            |
         | "device_compromised"       | "behavior_anomaly"
         v                            v
  +------------------------------------------+
  |           SECURITY EVENT BUS             |
  |  (Kafka, Redis Streams, or NATS)         |
  +------------------------------------------+
         |                            |
         | Subscribe                  | Subscribe
         v                            v
  [P01: Identity Proxy]         [P08: JIT Broker]
         |                            |
         | Block sessions             | Revoke credentials
         | for device                 | for user

This pattern enables REACTIVE security:
- Device compromised -> All access revoked within seconds
- Behavior anomaly -> Step-up auth required immediately
- Policy change -> All PEPs pick up the new policy within seconds

+--------------------------------------------------------------------------+
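The publish/subscribe flow in the diagram can be sketched with an in-process bus. This is a synchronous stand-in for Kafka, Redis Streams, or NATS, meant only to show the shape of the pattern; `SecurityEventBus`, `IdentityProxy`, and `JitBroker` are hypothetical names, and the event payload fields are assumptions.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Set

Handler = Callable[[Dict], None]


class SecurityEventBus:
    """Synchronous in-process stand-in for a real message broker."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Dict) -> None:
        for handler in list(self._subscribers.get(topic, [])):
            handler(event)


class IdentityProxy:
    """Hypothetical P01 stand-in: refuses sessions from blocked devices."""

    def __init__(self, bus: SecurityEventBus) -> None:
        self.blocked_devices: Set[str] = set()
        bus.subscribe("device_compromised", self._on_device_compromised)

    def _on_device_compromised(self, event: Dict) -> None:
        self.blocked_devices.add(event["device"])

    def allows(self, device: str) -> bool:
        return device not in self.blocked_devices


class JitBroker:
    """Hypothetical P08 stand-in: revokes credentials for flagged users."""

    def __init__(self, bus: SecurityEventBus) -> None:
        self.revoked_users: Set[str] = set()
        bus.subscribe("behavior_anomaly", self._on_anomaly)

    def _on_anomaly(self, event: Dict) -> None:
        self.revoked_users.add(event["user"])
```

With a real broker the handlers run asynchronously, so components should tolerate duplicate and out-of-order events; adding to a set, as here, is naturally idempotent.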

2. End-to-End Security Architecture

The complete request lifecycle through all security layers:

+--------------------------------------------------------------------------+
|              End-to-End Request Flow                                      |
+--------------------------------------------------------------------------+

  USER REQUEST
       |
       v
  +--[Layer 1: Device]--+
  | P05 Health Check    |
  | - OS patched?       |
  | - Firewall enabled? |
  | - Malware scan?     |
  +---------------------+
       | PASS
       v
  +--[Layer 2: Tunnel]--+
  | P09 ZTNA Client     |
  | - mTLS to gateway   |
  | - Domain routing    |
  +---------------------+
       | CONNECTED
       v
  +--[Layer 3: Identity]+
  | P01 Identity Proxy  |
  | - JWT validation    |
  | - Header injection  |
  +---------------------+
       | AUTHENTICATED
       v
  +--[Layer 4: Policy]--+
  | P02 Policy Engine   |
  | - ABAC evaluation   |
  | - Risk assessment   |
  +---------------------+
       | AUTHORIZED
       v
  +--[Layer 5: Network]-+
  | P03 Micro-segment   |
  | - eBPF/iptables     |
  | P04 mTLS Mesh       |
  | - Service identity  |
  +---------------------+
       | ENCRYPTED
       v
  +--[Layer 6: Access]--+
  | P08 JIT Credentials |
  | - Ephemeral access  |
  | - Scoped permissions|
  +---------------------+
       | PROVISIONED
       v
  +--[Layer 7: Observe]-+
  | P06 Continuous Auth |
  | - Behavior monitor  |
  | - Anomaly detection |
  +---------------------+
       | MONITORED
       v
  PROTECTED RESOURCE

Every layer can DENY. Every layer LOGS.

+--------------------------------------------------------------------------+
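The rule that every layer can deny and every layer logs can be sketched as an ordered chain of checks that stops at the first failure. The sketch below is illustrative only: it covers four of the seven layers, and the request fields (`device_healthy`, `mtls_ok`, `jwt_valid`, `authorized`) are assumed names standing in for the real checks from P05, P09, P01, and P02.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict], bool]

# Ordered as in the flow above; each check fails closed on missing data.
LAYERS: List[Tuple[str, Check]] = [
    ("device", lambda req: req.get("device_healthy", False)),
    ("tunnel", lambda req: req.get("mtls_ok", False)),
    ("identity", lambda req: req.get("jwt_valid", False)),
    ("policy", lambda req: req.get("authorized", False)),
]


def evaluate(request: Dict, audit_log: List[Dict]) -> bool:
    """Run each layer in order, logging every result; stop at the first DENY."""
    for name, check in LAYERS:
        passed = bool(check(request))
        audit_log.append({"layer": name, "result": "PASS" if passed else "DENY"})
        if not passed:
            return False
    return True
```

Logging before returning matters: a denied request should leave as complete an audit trail as an allowed one.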

3. Defense in Depth

Overlapping controls that assume each layer might fail:

+--------------------------------------------------------------------------+
|              Defense in Depth: Overlapping Controls                       |
+--------------------------------------------------------------------------+

PRINCIPLE: If one control fails, others catch the breach

EXAMPLE: Compromised User Credential

  Control 1: Device Trust (P05)
  +------------------------------------------+
  | Expected: Block unmanaged/unhealthy device|
  | Bypassed: Attacker uses managed device    |
  +------------------------------------------+
           | (Control failed)
           v
  Control 2: Continuous Auth (P06)
  +------------------------------------------+
  | Expected: Detect behavior anomaly         |
  | Catches: Unusual access patterns flagged  |
  +------------------------------------------+
           | (Anomaly detected!)
           v
  Control 3: Policy Engine (P02)
  +------------------------------------------+
  | Response: Risk score elevated             |
  | Action: Require MFA re-verification       |
  +------------------------------------------+
           | (Attacker blocked at MFA)
           v
  BREACH PREVENTED

Even if Control 2 missed the anomaly:

  Control 4: JIT Access (P08)
  +------------------------------------------+
  | Limit: Credential only valid 30 minutes   |
  | Scope: Only SELECT on specific tables     |
  +------------------------------------------+
           | (Blast radius limited)
           v
  Control 5: Audit Logging
  +------------------------------------------+
  | Record: All queries with full context     |
  | Alert: SIEM correlation triggers alert    |
  +------------------------------------------+
           | (Incident detected)
           v
  Control 6: Automated Response
  +------------------------------------------+
  | Action: Revoke all user credentials       |
  | Action: Terminate all active sessions     |
  | Action: Alert SOC team                    |
  +------------------------------------------+

+--------------------------------------------------------------------------+
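Control 4 limits blast radius through two properties of the credential: a short lifetime and a narrow scope. A minimal sketch of those two checks follows, assuming a hypothetical `JitCredential` type and `issue` helper. In practice the database would enforce the grant and expiry itself; this only models the logic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class JitCredential:
    username: str
    tables: FrozenSet[str]
    operations: FrozenSet[str]
    expires_at: datetime

    def permits(self, operation: str, table: str, now: datetime) -> bool:
        """Allow only unexpired, in-scope operations."""
        if now >= self.expires_at:
            return False
        return operation in self.operations and table in self.tables


def issue(username: str, tables: Iterable[str], issued_at: datetime,
          ttl_minutes: int = 30) -> JitCredential:
    """Issue a read-only credential that expires ttl_minutes after issuance."""
    return JitCredential(
        username=username,
        tables=frozenset(tables),
        operations=frozenset({"SELECT"}),
        expires_at=issued_at + timedelta(minutes=ttl_minutes),
    )
```

Passing `now` explicitly instead of reading the clock inside `permits` keeps the expiry logic easy to test.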

4. Security Observability

Unified visibility across all security components:

+--------------------------------------------------------------------------+
|              Security Observability Stack                                 |
+--------------------------------------------------------------------------+

                    +----------------------+
                    |    DASHBOARDS        |
                    |    (Grafana)         |
                    +----------------------+
                              |
                    +----------------------+
                    |    ALERTING          |
                    |    (AlertManager)    |
                    +----------------------+
                              |
                    +----------------------+
                    |    CORRELATION       |
                    |    (SIEM Rules)      |
                    +----------------------+
                              |
                    +----------------------+
                    |    LOG AGGREGATION   |
                    |    (Elasticsearch/   |
                    |     Loki/Splunk)     |
                    +----------------------+
                              ^
         +--------------------+--------------------+
         |                    |                    |
    P01 Logs             P02 Logs            P04-P09 Logs
    - Access             - Decisions          - All other
    - Deny               - Policy hits          components
    - Identity           - Risk scores

KEY METRICS TO OBSERVE:
-----------------------
1. Access Patterns
   - Requests per user/device/time
   - Denied vs allowed ratio
   - New device/location access

2. Policy Effectiveness
   - Policies triggered
   - Auto-approve vs manual approval
   - Time to approval

3. Security Events
   - Device health failures
   - Authentication anomalies
   - Revoked credentials

4. System Health
   - Component latency
   - Error rates
   - Certificate expiration

+--------------------------------------------------------------------------+
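As one example of the access-pattern metrics above, the denied versus allowed ratio per user can be computed directly from decision logs. The sketch assumes each log event is a dict with `user` and `decision` fields; that schema and the function name `deny_ratio_by_user` are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def deny_ratio_by_user(events: Iterable[dict]) -> Dict[str, float]:
    """Return denied / (allowed + denied) for each user seen in the events."""
    allowed: Counter = Counter()
    denied: Counter = Counter()
    for event in events:
        if event["decision"] == "DENY":
            denied[event["user"]] += 1
        elif event["decision"] == "ALLOW":
            allowed[event["user"]] += 1
    users = set(allowed) | set(denied)
    return {u: denied[u] / (allowed[u] + denied[u]) for u in users}
```

A user whose ratio jumps well above their own baseline is a reasonable candidate for an alert rule, though the right threshold depends on the environment.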

5. Incident Response Readiness

Pre-built playbooks for common security scenarios:

+--------------------------------------------------------------------------+
|              Incident Response Playbooks                                  |
+--------------------------------------------------------------------------+

PLAYBOOK 1: Compromised User Account
-------------------------------------
Trigger: P06 detects impossible travel or credential stuffing

Steps:
1. [AUTOMATIC] Terminate all active sessions for user
2. [AUTOMATIC] Revoke all JIT credentials for user
3. [AUTOMATIC] Block user at Identity Proxy
4. [ALERT] Notify SOC and user's manager
5. [MANUAL] Investigate activity in SIEM
6. [MANUAL] Reset user credentials if the compromise is confirmed
7. [MANUAL] Re-enable access with step-up verification


PLAYBOOK 2: Compromised Device
------------------------------
Trigger: P05 reports malware detected or failed health check

Steps:
1. [AUTOMATIC] Block device at ZTNA gateway
2. [AUTOMATIC] Terminate tunnel connection
3. [AUTOMATIC] Revoke device certificate
4. [ALERT] Notify user and IT security
5. [MANUAL] Quarantine device for forensics
6. [MANUAL] Reimage device
7. [MANUAL] Re-enroll with new certificate


PLAYBOOK 3: Suspicious Database Access
---------------------------------------
Trigger: P08 detects unusual query patterns

Steps:
1. [AUTOMATIC] Revoke JIT credential immediately
2. [AUTOMATIC] Log complete query history
3. [ALERT] Notify DBA and SOC
4. [MANUAL] Review data accessed
5. [MANUAL] Assess data breach notification requirements
6. [MANUAL] Update policies to prevent recurrence


PLAYBOOK 4: Certificate Compromise
----------------------------------
Trigger: P04 detects certificate misuse or CA compromise

Steps:
1. [AUTOMATIC] Revoke compromised certificates
2. [AUTOMATIC] Force re-authentication for all services
3. [AUTOMATIC] Issue new certificates to legitimate entities
4. [ALERT] Notify security team
5. [MANUAL] Investigate root cause
6. [MANUAL] Rotate CA if necessary

+--------------------------------------------------------------------------+
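The playbooks above share one structure: automatic steps run immediately, alerts are sent, and manual steps are handed to people. One way to encode that, sketched here with Playbook 2 as data, is shown below. The `Step` type and `run_playbook` function are hypothetical, and the `execute` and `notify` callbacks stand in for real integrations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    kind: str          # "AUTOMATIC", "ALERT", or "MANUAL"
    description: str


COMPROMISED_DEVICE = [
    Step("AUTOMATIC", "Block device at ZTNA gateway"),
    Step("AUTOMATIC", "Terminate tunnel connection"),
    Step("AUTOMATIC", "Revoke device certificate"),
    Step("ALERT", "Notify user and IT security"),
    Step("MANUAL", "Quarantine device for forensics"),
    Step("MANUAL", "Reimage device"),
    Step("MANUAL", "Re-enroll with new certificate"),
]


def run_playbook(steps: List[Step],
                 execute: Callable[[str], None],
                 notify: Callable[[str], None]) -> List[str]:
    """Run automatic steps and alerts now; return manual steps for humans."""
    manual: List[str] = []
    for step in steps:
        if step.kind == "AUTOMATIC":
            execute(step.description)
        elif step.kind == "ALERT":
            notify(step.description)
        else:
            manual.append(step.description)
    return manual
```

Keeping playbooks as data means they can live in the same Git repository as policy and be reviewed the same way.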

6. Zero Trust Maturity Model (CISA)

The CISA Zero Trust Maturity Model provides a framework for measuring your implementation:

+--------------------------------------------------------------------------+
|              CISA Zero Trust Maturity Model - Assessment                  |
+--------------------------------------------------------------------------+

MATURITY LEVELS:
----------------
1. Traditional - Perimeter-based, minimal ZT
2. Initial     - Starting ZT journey, some components
3. Advanced    - Integrated ZT, automation beginning
4. Optimal     - Full ZT, continuous improvement

PILLAR ASSESSMENT:

IDENTITY
--------
Traditional: Passwords only
Initial:     SSO + MFA for some apps
Advanced:    Risk-based MFA, continuous verification [P01, P06]
Optimal:     Passwordless, phishing-resistant MFA

Your implementation: [P01 Identity Proxy + P06 Continuous Auth]


DEVICE
------
Traditional: No device validation
Initial:     Inventory, basic MDM
Advanced:    Real-time posture, conditional access [P05]
Optimal:     Automated remediation, hardware attestation

Your implementation: [P05 Device Trust]


NETWORK
-------
Traditional: Perimeter firewall only
Initial:     Network segmentation
Advanced:    Micro-segmentation, encrypted traffic [P03, P04]
Optimal:     Software-defined, identity-aware networking

Your implementation: [P03 Micro-segmentation + P04 mTLS + P07 SDP]


APPLICATION
-----------
Traditional: VPN for remote access
Initial:     ZTNA for some apps
Advanced:    Per-app access, continuous authorization [P02, P09]
Optimal:     Dynamic policy, threat intelligence integrated

Your implementation: [P02 Policy Engine + P08 JIT + P09 ZTNA]


DATA
----
Traditional: Perimeter-based data protection
Initial:     Classification, DLP basics
Advanced:    Automatic classification, encryption everywhere
Optimal:     Dynamic data policies, user behavior analytics

Your implementation: [All projects contribute + SIEM]


VISIBILITY
----------
Traditional: Siloed logs
Initial:     Centralized logging
Advanced:    Correlation, automated alerting [P10 Capstone]
Optimal:     ML-based threat detection, automated response

Your implementation: [Unified SIEM + Alerting + Playbooks]

+--------------------------------------------------------------------------+
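A self-assessment against these levels can be recorded as data so the weakest area is easy to find. The sketch below uses the four level names listed above; the function name `weakest_pillars` is hypothetical, and any assessment values passed to it are the reader's own judgment, not results from this project.

```python
from typing import Dict, List

# Maturity levels in ascending order, as listed in the model above.
LEVELS = ["Traditional", "Initial", "Advanced", "Optimal"]


def weakest_pillars(assessment: Dict[str, str]) -> List[str]:
    """Return the pillar(s) at the lowest maturity level, sorted by name."""
    rank = {name: index for index, name in enumerate(LEVELS)}
    lowest = min(rank[level] for level in assessment.values())
    return sorted(p for p, level in assessment.items() if rank[level] == lowest)
```

For example, `weakest_pillars({"Identity": "Advanced", "Data": "Initial"})` returns `["Data"]`, pointing at where the next improvement effort should go.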

Questions to Guide Your Design

Before implementing, work through these design questions:

Integration Questions

  1. Event Flow: How will security events propagate between components? (Event bus? API calls? Shared database?)

  2. Identity Consistency: How will user identity be represented consistently across all components? (SPIFFE ID? JWT claims? Custom format?)

  3. Policy Distribution: How will policy changes propagate to all enforcement points? (Push? Pull? Event-driven?)

  4. Certificate Management: How will certificates be issued, rotated, and revoked across the mesh? (Central CA? Distributed?)

  5. State Sharing: What state needs to be shared between components? (Session state? Device state? Risk scores?)
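For the identity consistency question, one option is to treat a SPIFFE ID such as the one in the demo (spiffe://corp.com/user/douglas/device/macbook-42) as the single identity string and have every component parse it the same way. The sketch below assumes the path alternates key and value segments, which is this document's example layout and not something SPIFFE requires; `parse_identity` is a hypothetical helper.

```python
from typing import Dict
from urllib.parse import urlsplit


def parse_identity(spiffe_id: str) -> Dict[str, str]:
    """Split a SPIFFE ID into its trust domain and key/value path fields."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")

    segments = [s for s in parts.path.split("/") if s]
    if len(segments) % 2 != 0:
        # Assumed convention: the path is a sequence of key/value pairs.
        raise ValueError(f"path is not key/value pairs: {parts.path!r}")

    fields = dict(zip(segments[0::2], segments[1::2]))
    return {"trust_domain": parts.netloc, **fields}
```

If every component logs the parsed `user` and `device` fields under the same names, cross-component correlation in the SIEM becomes a simple join.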

Testing Questions

  1. Component Testing: How will you verify each component works in isolation?

  2. Integration Testing: How will you test the interactions between components?

  3. Security Testing: How will you verify the security controls actually work? (Penetration testing scenarios?)

  4. Failure Testing: What happens when each component fails? (Chaos engineering?)

  5. Performance Testing: What’s the latency overhead of the full security stack?

Operations Questions

  1. Deployment: How will you deploy and update components without breaking security? (Blue-green? Canary?)

  2. Monitoring: What dashboards and alerts do you need? (Per-component? Cross-component?)

  3. Incident Response: What playbooks do you need for common scenarios?

  4. Compliance: How will you demonstrate compliance with SOC2/ISO27001/etc.?

  5. Disaster Recovery: How will you recover if a critical component fails?


Thinking Exercise: Architecture Design

Before writing any code, complete this design exercise on paper or whiteboard:

Exercise: Design the Complete Architecture

Task: Draw a complete architecture diagram showing:

  1. All Components: Where P01-P09 components are deployed
  2. Data Flows: How requests flow through the system
  3. Control Flows: How policy decisions are made and enforced
  4. Event Flows: How security events propagate
  5. Trust Boundaries: Where trust is established and verified

Guiding Questions:

  1. Where is the user? (Remote? Office? Mobile?)
  2. How do they authenticate? (SSO? MFA? Certificate?)
  3. How does their request reach the internal application?
  4. What checks happen at each step?
  5. What gets logged at each step?
  6. What happens if any check fails?
  7. What happens if the user’s device is compromised mid-session?

Deliverable: A diagram similar to this (but with YOUR understanding):

+--------------------------------------------------------------------------+
|              YOUR Secure Enclave Architecture                             |
+--------------------------------------------------------------------------+

  [Remote User]                    [Office User]
       |                                 |
       v                                 v
  +----+----+                      +-----+-----+
  | ZTNA    |                      | Direct    |
  | Client  |                      | Network   |
  | (P09)   |                      |           |
  +----+----+                      +-----+-----+
       |                                 |
       | mTLS                            | mTLS
       v                                 v
  +------------------------------------------------+
  |            SECURITY GATEWAY TIER                |
  |                                                 |
  |  +----------+  +----------+  +----------+      |
  |  | Identity |  | Policy   |  | Device   |      |
  |  | Proxy    |  | Engine   |  | Trust    |      |
  |  | (P01)    |  | (P02)    |  | (P05)    |      |
  |  +----------+  +----------+  +----------+      |
  |       |              |              |          |
  |       +-------+------+------+-------+          |
  |               |                                |
  +---------------|--------------------------------+
                  | Authorized & Verified
                  v
  +------------------------------------------------+
  |            SERVICE TIER                         |
  |                                                 |
  |  +----------+  +----------+  +----------+      |
  |  | App A    |  | App B    |  | Database |      |
  |  |          |  |          |  | (via JIT)|      |
  |  +----------+  +----------+  +----------+      |
  |       ^              ^              ^          |
  |       |  mTLS (P04)  |              |          |
  |       +-------+------+------+-------+          |
  |               |                                |
  |  +-------------------------------------------+ |
  |  | Micro-segmentation (P03) - eBPF/iptables | |
  |  +-------------------------------------------+ |
  |                                                 |
  +------------------------------------------------+
                  |
                  v
  +------------------------------------------------+
  |            OBSERVABILITY TIER                   |
  |                                                 |
  |  +----------+  +----------+  +----------+      |
  |  | Log      |  | SIEM     |  | Alert    |      |
  |  | Collector|  | Correlate|  | Manager  |      |
  |  +----------+  +----------+  +----------+      |
  |                                                 |
  +------------------------------------------------+

+--------------------------------------------------------------------------+

Hints in Layers: Progressive Implementation Guidance

Hint Layer 1: Start with the Control Plane

Before integrating data plane components, establish the shared control plane:

  1. Unified Identity: All components use the same identity representation
    • Define your identity format (SPIFFE recommended)
    • All components validate the same JWT issuer
    • User identity flows through every request
  2. Shared Policy Repository: All policies in one place
    • Git repository for policy-as-code
    • Policies reference the same identity format
    • Version control enables rollback
  3. Centralized Logging: All components log to the same destination
    • Structured JSON logs with consistent schema
    • Common fields: timestamp, component, user, device, action, result
    • Correlation ID traces requests across components
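A minimal sketch of the shared log schema, assuming plain JSON lines and the common fields listed above. The helper names (`new_correlation_id`, `make_log_record`) are hypothetical, not part of any earlier project; the point is only that every component emits the same keys and reuses the correlation ID it received.

```python
import json
import uuid
from datetime import datetime, timezone

# Common fields from the list above; correlation_id is added on top of them.
REQUIRED_FIELDS = ("timestamp", "component", "user", "device", "action", "result")


def new_correlation_id():
    """Create an ID at the entry point; downstream components reuse it unchanged."""
    return f"req-{uuid.uuid4().hex[:12]}"


def make_log_record(component, user, device, action, result, correlation_id):
    """Build one structured log line as a JSON string with a fixed set of keys."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "component": component,
        "correlation_id": correlation_id,
        "user": user,
        "device": device,
        "action": action,
        "result": result,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    cid = new_correlation_id()
    # Two components logging the same request share one correlation ID.
    print(make_log_record("P01-identity-proxy", "douglas@corp.com",
                          "macbook-42", "GET /admin", "FORWARDED", cid))
    print(make_log_record("P02-policy-engine", "douglas@corp.com",
                          "macbook-42", "GET /admin", "DENY", cid))
```

Searching the log store for one `correlation_id` should then return the full path of a request across components.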

Hint Layer 2: Integration by Event

Connect components through events rather than direct API calls:

  1. Event Schema: Define a common event format
    {
      "event_type": "access_denied",
      "timestamp": "2024-12-27T15:00:00Z",
      "component": "P01-identity-proxy",
      "correlation_id": "req-abc123",
      "subject": {
        "user": "douglas@corp.com",
        "device": "macbook-42"
      },
      "object": {
        "resource": "jira.internal",
        "action": "GET /admin"
      },
      "result": {
        "decision": "DENY",
        "reason": "insufficient_permissions"
      }
    }
    
  2. Event Bus: Use Redis Streams, Kafka, or NATS
    • Components publish events to topics
    • Other components subscribe and react
    • Enables loose coupling
  3. Reaction Rules: Define what happens when events occur
    • device_compromised -> Revoke all sessions for device
    • user_anomaly -> Elevate risk score, require step-up
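The two reaction rules above can be sketched with an in-process dispatcher. This is an illustration under stated assumptions: `EventBus` is a stand-in for Redis Streams, Kafka, or NATS rather than a client for any of them, `SessionStore` is a hypothetical piece of shared state, and the risk increment of 10 is arbitrary.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for a real event bus (illustration only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        """Call every handler registered for the event type; return how many ran."""
        handlers = list(self._handlers.get(event["event_type"], []))
        for handler in handlers:
            handler(event)
        return len(handlers)


class SessionStore:
    """Hypothetical session and risk state used by the reaction rules."""

    def __init__(self):
        self.sessions = {}                   # session_id -> (user, device)
        self.risk_score = defaultdict(int)   # user -> score
        self.step_up_required = set()        # users who must re-authenticate

    def open_session(self, session_id, user, device):
        self.sessions[session_id] = (user, device)

    def revoke_device(self, device):
        """Remove every session bound to the device; return the revoked IDs."""
        revoked = [sid for sid, (_, dev) in self.sessions.items() if dev == device]
        for sid in revoked:
            del self.sessions[sid]
        return revoked


def wire_reactions(bus, store):
    """Register the two reaction rules from the list above."""

    def on_device_compromised(event):
        store.revoke_device(event["subject"]["device"])

    def on_user_anomaly(event):
        user = event["subject"]["user"]
        store.risk_score[user] += 10         # increment size is an assumption
        store.step_up_required.add(user)

    bus.subscribe("device_compromised", on_device_compromised)
    bus.subscribe("user_anomaly", on_user_anomaly)


if __name__ == "__main__":
    bus, store = EventBus(), SessionStore()
    wire_reactions(bus, store)
    store.open_session("s1", "douglas@corp.com", "macbook-42")
    store.open_session("s2", "douglas@corp.com", "phone-7")
    bus.publish({"event_type": "device_compromised",
                 "subject": {"user": "douglas@corp.com", "device": "macbook-42"}})
    print(sorted(store.sessions))
```

Because the publisher never calls the session store directly, a new reaction can be added by subscribing another handler without touching the component that emitted the event.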

Hint Layer 3: Test the Attack Scenarios

Validate your integration by simulating attacks:

  1. Stolen Credential Attack
    • Generate valid JWT for “attacker”
    • Attempt access from new device -> P05 should block
    • Attempt access at unusual time -> P06 should flag
    • Attempt access to unauthorized resource -> P02 should deny
  2. Compromised Device Attack
    • Simulate malware detection on device
    • Verify device is blocked at ZTNA gateway
    • Verify all sessions for device are terminated
    • Verify credentials are revoked
  3. Lateral Movement Attack
    • Gain access to one service
    • Attempt to reach other services -> P03 should block
    • Attempt to scan internal network -> P03 should block
    • Verify all attempts are logged
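One way to rehearse the stolen-credential scenario before wiring up the real components is a toy model of the layered checks. Everything below is hypothetical: the `Request` type, the lookup tables, and the layer order are assumptions for illustration, and the P06 time-of-day check is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user: str
    token_valid: bool
    device: str
    resource: str


# Illustrative data only; real deployments would query P05 and P02.
KNOWN_DEVICES = {"douglas@corp.com": {"macbook-42"}}
ALLOWED_RESOURCES = {"douglas@corp.com": {"jira.internal"}}


def check_identity(req):
    return req.token_valid


def check_device(req):
    return req.device in KNOWN_DEVICES.get(req.user, set())


def check_policy(req):
    return req.resource in ALLOWED_RESOURCES.get(req.user, set())


# Each layer is evaluated in order; the first failure decides the outcome.
LAYERS = [("P01", check_identity), ("P05", check_device), ("P02", check_policy)]


def evaluate(req):
    """Return ("ALLOW", None), or ("DENY", layer) naming the first layer that failed."""
    for name, check in LAYERS:
        if not check(req):
            return ("DENY", name)
    return ("ALLOW", None)


if __name__ == "__main__":
    # Attacker holds a valid token but connects from an unknown device.
    stolen = Request("douglas@corp.com", True, "attacker-laptop", "jira.internal")
    print(evaluate(stolen))
```

The useful property to test is that a valid JWT alone is never sufficient: each scenario in the list should be stopped by a layer other than the one the attacker has already defeated.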

Hint Layer 4: Build the Observability Stack

You can’t secure what you can’t see:

  1. Metrics Collection
    • Prometheus for time-series metrics
    • Each component exports metrics
    • Custom dashboards in Grafana
  2. Log Aggregation
    • Loki, Elasticsearch, or Splunk
    • All components write to same destination
    • Retention policy for compliance
  3. Alerting Rules
    • Define thresholds and patterns
    • denied_requests > 10/minute -> Alert
    • same_user_different_locations -> Alert
    • Integrate with PagerDuty/Slack
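The `denied_requests > 10/minute` rule can be read as a sliding-window counter. The sketch below assumes that interpretation (more than 10 denials within any trailing 60 seconds); the class name and the choice to pass timestamps in explicitly are illustrative, and a production setup would more likely express this as a rule in the metrics or SIEM system.

```python
from collections import deque


class DeniedRateAlert:
    """Fire when more than `threshold` denials fall inside the trailing window."""

    def __init__(self, threshold=10, window_seconds=60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_denial(self, now):
        """Record one denial at time `now` (seconds); return True if the alert fires."""
        self._timestamps.append(now)
        # Drop denials that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


if __name__ == "__main__":
    alert = DeniedRateAlert()
    for second in range(12):
        fired = alert.record_denial(float(second))
        print(second, fired)
```

Passing `now` as an argument keeps the rule deterministic under test; a caller in a live system could supply `time.monotonic()`.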

Hint Layer 5: Document and Automate Response

The final step is automating incident response:

  1. Runbook Documentation
    • For each alert type, document response steps
    • Include commands to run, people to notify
    • Test runbooks regularly
  2. Automated Response
    • Some responses should be automatic
    • Device compromised -> Auto-block (no human needed)
    • Credential suspected -> Auto-revoke, then investigate
  3. Post-Incident Review
    • After each incident, review what happened
    • Update policies to prevent recurrence
    • Share learnings across team

Project Specification

Functional Requirements

| ID | Requirement | Acceptance Criteria |
|----|-------------|---------------------|
| FR-1 | Unified authentication | All access flows through P01 with consistent identity |
| FR-2 | Policy-driven authorization | P02 evaluates all access requests against policy |
| FR-3 | Device trust enforcement | P05 blocks unhealthy devices at entry point |
| FR-4 | Encrypted service mesh | All service-to-service uses mTLS (P04) |
| FR-5 | Remote access via ZTNA | P09 provides secure remote access |
| FR-6 | Just-in-time database access | P08 provides ephemeral credentials |
| FR-7 | Continuous monitoring | P06 detects and responds to anomalies |
| FR-8 | Unified logging | All components log to central SIEM |
| FR-9 | Automated alerting | Security events trigger appropriate alerts |
| FR-10 | Incident response automation | Playbooks execute automatically or with guidance |

Non-Functional Requirements

| ID | Requirement | Target |
|----|-------------|--------|
| NFR-1 | End-to-end latency | < 100ms added by security stack |
| NFR-2 | Availability | 99.9% for security control plane |
| NFR-3 | Recovery time | < 5 minutes for any single component failure |
| NFR-4 | Log retention | 90 days hot, 1 year cold storage |
| NFR-5 | Alert response time | Critical: < 5 minutes, High: < 1 hour |
| NFR-6 | Policy update propagation | < 30 seconds to all enforcement points |
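As a worked example of how NFR-2 and NFR-3 interact: 99.9% availability over a 30-day month leaves a downtime budget of about 43.2 minutes, so if every outage took the full 5 minutes to recover, the control plane could afford at most 8 of them per month. The sketch below reproduces that arithmetic; the function names and the 30-day month are assumptions.

```python
def downtime_budget_minutes(availability_percent, period_days):
    """Minutes of allowed downtime in the period for a given availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def max_full_length_outages(availability_percent, period_days, recovery_minutes):
    """How many outages of `recovery_minutes` each fit inside the downtime budget."""
    budget = downtime_budget_minutes(availability_percent, period_days)
    return int(budget // recovery_minutes)


if __name__ == "__main__":
    monthly = downtime_budget_minutes(99.9, 30)
    yearly = downtime_budget_minutes(99.9, 365)
    print(f"monthly budget: {monthly:.1f} min, yearly budget: {yearly:.1f} min")
    print("outages per month at 5 min each:", max_full_length_outages(99.9, 30, 5))
```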

Component Checklist

Before considering the capstone complete, verify:

  • P01 (Identity Proxy) is deployed and validating JWTs
  • P02 (Policy Engine) is evaluating all access requests
  • P03 (Micro-segmentation) is enforcing network policies
  • P04 (mTLS Mesh) is encrypting all service communication
  • P05 (Device Trust) is validating device health
  • P06 (Continuous Auth) is monitoring user behavior
  • P07 (SDP Controller) is hiding services from unauthorized users
  • P08 (JIT Access) is providing ephemeral database credentials
  • P09 (ZTNA Tunnel) is providing secure remote access
  • All components are logging to a unified SIEM
  • Alerts are configured for key security events
  • At least 3 incident response playbooks are documented

Phased Implementation Guide

Phase 1: Foundation (Days 1-3)

Goal: Establish the shared control plane infrastructure.

Deliverables:

  • Unified logging stack (Loki or Elasticsearch)
  • Event bus (Redis Streams or NATS)
  • Identity format defined (SPIFFE or custom)
  • Policy repository initialized (Git)

Steps:

  1. Deploy logging infrastructure
  2. Configure all components to log in consistent format
  3. Set up event bus
  4. Define identity and event schemas
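If SPIFFE is chosen as the identity format, step 4 amounts to agreeing on IDs of the form `spiffe://<trust domain>/<path>` and rejecting anything else at every component. The parser below is a simplified sketch, not a full implementation of the SPIFFE ID rules; the trust domain `corp.example` and the path layout are made up for illustration.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(value):
    """Return (trust_domain, path) for a SPIFFE ID, or raise ValueError.

    Simplified checks only: scheme, non-empty trust domain, and no
    userinfo, port, query, or fragment.
    """
    parts = urlsplit(value)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc:
        raise ValueError("trust domain is required")
    if "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("userinfo and port are not allowed")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment are not allowed")
    return parts.netloc, parts.path


if __name__ == "__main__":
    # Hypothetical ID for the identity proxy workload.
    print(parse_spiffe_id("spiffe://corp.example/ns/prod/sa/identity-proxy"))
```

Sharing one such validation function (or its equivalent in each language) across components is one way to keep the "consistent identity format" requirement testable.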

Phase 2: Core Integration (Days 4-7)

Goal: Connect the core security components.

Deliverables:

  • P01 + P02 integration (proxy consults policy engine)
  • P05 + P09 integration (device trust gates tunnel access)
  • P04 mesh running between services

Steps:

  1. Deploy P01, P02, P05, P09 in connected configuration
  2. Verify request flow through all components
  3. Test policy enforcement at each layer
  4. Verify mTLS between all services

Phase 3: Access Management (Days 8-10)

Goal: Add JIT access and continuous authentication.

Deliverables:

  • P08 JIT broker integrated with policy engine
  • P06 continuous auth monitoring all sessions
  • Automatic session termination on anomaly

Steps:

  1. Deploy P08 JIT broker
  2. Integrate P08 with P02 for approval policies
  3. Deploy P06 continuous authentication
  4. Configure P06 to terminate sessions and emit events

Phase 4: Observability (Days 11-12)

Goal: Build the security observability stack.

Deliverables:

  • SIEM correlation rules
  • Security dashboards
  • Alert definitions
  • Integration with notification systems

Steps:

  1. Define SIEM correlation rules for attack scenarios
  2. Build Grafana dashboards for security metrics
  3. Configure Alertmanager with notification channels
  4. Test alert flow end-to-end

Phase 5: Incident Response (Days 13-14)

Goal: Automate incident response.

Deliverables:

  • 3+ incident response playbooks
  • Automated response for critical events
  • Documentation for manual procedures
  • Tested recovery procedures

Steps:

  1. Document incident response playbooks
  2. Implement automated responses (event -> action)
  3. Conduct tabletop exercises
  4. Create runbooks for recovery scenarios

Testing Strategy

Security Test Scenarios

| Scenario | Expected Result | Components Tested |
|----------|-----------------|-------------------|
| Valid user, healthy device | Access granted, logged | P01, P02, P05, P09 |
| Valid user, unhealthy device | Blocked at tunnel, logged | P05, P09, SIEM |
| Invalid JWT | 401 Unauthorized | P01 |
| Unauthorized resource access | 403 Forbidden | P01, P02 |
| Lateral movement attempt | Blocked by micro-seg | P03 |
| Behavior anomaly detected | Session terminated | P06 |
| JIT credential expired | Access denied | P08, Database |
| Certificate revoked | Connection rejected | P04 |

Integration Test Scenarios

| Scenario | Steps | Validation |
|----------|-------|------------|
| End-to-end access | User -> ZTNA -> Proxy -> App | Response received, all logs present |
| Policy update propagation | Update policy in Git | All PEPs enforce within 30s |
| Device compromise response | Emit device_compromised event | All sessions terminated within 60s |
| Credential rotation | Trigger JIT rotation | Old creds fail, new creds work |
| Component failure recovery | Kill policy engine | Access denied (fail-closed), recovery < 5min |
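The fail-closed behavior expected in the component failure scenario can be sketched as a wrapper around the policy call. This assumes the policy engine is reached through some callable that returns the string "ALLOW" on approval; that callable, its return values, and the function name `authorize` are hypothetical.

```python
def authorize(request, query_policy_engine):
    """Return True only on an explicit ALLOW; any error or other answer denies."""
    try:
        decision = query_policy_engine(request)
    except Exception:
        # Fail closed: an unreachable or crashing policy engine means no access.
        return False
    return decision == "ALLOW"


if __name__ == "__main__":
    def engine_down(request):
        raise ConnectionError("policy engine unreachable")

    def engine_allows(request):
        return "ALLOW"

    request = {"user": "douglas@corp.com", "resource": "jira.internal"}
    print(authorize(request, engine_allows))
    print(authorize(request, engine_down))
```

Catching every exception is deliberate here: in an enforcement point, an unexpected error should never be treated as permission.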

Performance Test Scenarios

| Metric | Target | Test Method |
|--------|--------|-------------|
| Authentication latency | < 50ms | Load test P01 with 1000 RPS |
| Policy evaluation latency | < 20ms | Load test P02 with 1000 RPS |
| Tunnel throughput | > 100 Mbps | iperf through P09 |
| Log ingestion rate | > 10,000 events/sec | Flood log collector |
| Alert latency | < 5 seconds | Measure event to alert time |
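The latency targets do not say which statistic they apply to. A rough sketch, assuming the 95th percentile is the figure compared against the target, is shown below; the nearest-rank method, the helper names, and the sequential timing loop are all assumptions, and a real 1000 RPS load test would need a dedicated load-testing tool.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latency_ms(func, iterations=200):
    """Call `func` repeatedly and return per-call latency in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def meets_target(samples, target_ms, pct=95):
    """True if the chosen percentile is strictly below the target."""
    return percentile(samples, pct) < target_ms


if __name__ == "__main__":
    # Stand-in workload; replace with a call to the component under test.
    samples = measure_latency_ms(lambda: sum(range(1000)))
    print("p95 under 50 ms:", meets_target(samples, 50))
```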

Self-Assessment Checklist

Architecture Understanding

  • I can explain how all five Zero Trust pillars (Identity, Device, Network, Application, Data) are addressed
  • I can trace a request through every security layer
  • I can describe how compromise of one component is mitigated
  • I understand the control plane vs data plane distinction
  • I can assess our maturity on the CISA Zero Trust model

Integration Verification

  • All components share a consistent identity format
  • Events propagate correctly between components
  • Policy changes affect all enforcement points
  • Logs from all components are in the SIEM
  • Alerts fire correctly for security events

Security Verification

  • Unhealthy devices cannot access resources
  • Unauthorized access attempts are blocked and logged
  • Compromised sessions are terminated automatically
  • Lateral movement is prevented by micro-segmentation
  • All connections are encrypted with mTLS

Operations Readiness

  • Dashboards show security posture at a glance
  • Alerts are configured for all critical scenarios
  • Incident response playbooks are documented
  • Recovery procedures are tested
  • Compliance evidence can be generated

Books That Will Help

| Book | Focus | Chapters |
|------|-------|----------|
| “Zero Trust Networks” by Gilman & Barth | Complete ZT architecture | All chapters |
| “Security in Computing” by Pfleeger | Security fundamentals | Ch. 1-5, Ch. 10 |
| “The Practice of Network Security Monitoring” by Bejtlich | Security observability | Ch. 1-8 |
| “Incident Response & Computer Forensics” by Luttgens | IR procedures | Ch. 1-5 |
| “Practical Cloud Security” by Dotson | Cloud security architecture | Ch. 1-4 |
| “Site Reliability Engineering” by Google | Operational excellence | Ch. 12-14 |

Interview Questions You Can Now Answer

  1. “Describe how you would architect a Zero Trust environment for a 500-person company.”

  2. “How do you ensure defense in depth when any single security control might fail?”

  3. “Walk me through what happens when a user’s device is compromised mid-session.”

  4. “How would you measure the maturity of a Zero Trust implementation?”

  5. “What’s the difference between a security event and a security incident, and how do you handle each?”

  6. “How do you balance security with user experience in a Zero Trust environment?”

  7. “Describe your approach to security observability and incident response automation.”


Conclusion

This capstone represents the culmination of your Zero Trust learning journey. By integrating all nine previous projects, you’ve learned that security is not about individual components - it’s about how they work together to create overlapping, reinforcing controls.

You now understand:

  • How to architect complete Zero Trust environments
  • Why integration is more important than any single technology
  • How defense in depth prevents single points of failure
  • Why observability is essential for security
  • How to automate incident response for rapid containment

Most importantly, you can now think like a security architect - seeing the whole system, understanding how attacks flow, and designing defenses that work together to protect what matters.


This guide was expanded from ZERO_TRUST_ARCHITECTURE_DEEP_DIVE.md. For the complete learning path, see the project index.