Project 10: The Secure Enclave (Capstone)
Core Zero Trust Principle: “Trust nothing, verify everything, assume breach.” This capstone project integrates all nine previous projects into a cohesive, production-ready Zero Trust environment that demonstrates defense in depth, continuous verification, and complete security observability.
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Level 5: Master |
| Time Estimate | 2 Weeks (80-120 hours) |
| Primary Language | Go, Rust, Python (Mixed) |
| Alternative Languages | N/A - Uses components from P01-P09 |
| Prerequisites | All previous projects (P01-P09) completed or understood |
| Main Book | “Zero Trust Networks” by Gilman & Barth |
| Software/Tool | Docker, Kubernetes (optional), Terraform (optional) |
| Knowledge Area | Security Architecture / System Integration |
What You’re Building: A complete “Secure Enclave” - an integrated Zero Trust environment that protects a set of internal applications. This is not a new component; it’s the orchestration of all previous projects into a unified security architecture where every request is authenticated, every access is authorized, every connection is encrypted, and every action is logged.
Why It Matters: Individual security components are useless if they don’t work together. Real-world breaches happen in the gaps between security tools. This capstone teaches you to think like a security architect - designing systems where every layer reinforces the others, where compromise of one component doesn’t mean compromise of all.
Learning Objectives
By completing this project, you will be able to:
- Architect complete Zero Trust environments - Design systems where all five pillars (Identity, Device, Network, Application, Data) work together
- Integrate disparate security components - Connect identity proxies, policy engines, mTLS meshes, and access brokers into a unified system
- Implement defense in depth - Create overlapping security controls where failure of one doesn’t compromise the whole
- Design for security observability - Build unified logging, alerting, and incident response capabilities
- Apply the CISA Zero Trust Maturity Model - Measure and improve your architecture against industry standards
- Conduct security architecture reviews - Evaluate systems for gaps, weaknesses, and improvement opportunities
- Build incident response readiness - Create runbooks and tooling for responding to security events
Deep Theoretical Foundation
The Integration Challenge
Individual security components are necessary but not sufficient. Consider this scenario:
+--------------------------------------------------------------------------+
| The Gap Between Components: Where Breaches Happen |
+--------------------------------------------------------------------------+
SCENARIO: Attacker compromises a developer's laptop via phishing
WITHOUT INTEGRATION:
--------------------
[ Laptop Compromised ]
|
v
[ Identity Proxy (P01) ] - "Valid JWT from douglas@corp.com" - ALLOWS
|
| (But is the device healthy? No one checks!)
v
[ Internal App ] - "X-ZT-Identity: douglas@corp.com" - TRUSTS
|
| (But is this normal behavior for douglas? No one asks!)
v
[ Database ] - Attacker exfiltrates data
|
v
[ Audit Logs ] - "douglas accessed 10,000 records"
(Noticed 3 weeks later during audit)
WITH FULL INTEGRATION (This Capstone):
--------------------------------------
[ Laptop Compromised ]
|
v
[ ZTNA Tunnel (P09) ] - Attempts connection
|
v
[ Device Trust (P05) ] - "Device posture: MALWARE_DETECTED" - BLOCKS
|
X---> Connection terminated, alert sent
|
[ If malware evades detection... ]
|
v
[ Identity Proxy (P01) ] - "Valid JWT from douglas@corp.com"
|
v
[ Policy Engine (P02) ] - "Context: New device, unusual time, VPN IP"
| "Risk score: HIGH"
| "Action: REQUIRE_STEP_UP_AUTH"
|
X---> User prompted for additional verification
|
[ If attacker has MFA... ]
|
v
[ Continuous Auth (P06) ] - "Behavior anomaly: 10x normal query rate"
| "Action: TERMINATE_SESSION"
|
X---> Session killed, SOC alerted, incident created
|
[ Even if data accessed... ]
|
v
[ JIT Access (P08) ] - Credential was scoped to specific tables
| Credential expired 30 minutes ago
|
v
[ mTLS Mesh (P04) ] - All connections logged with identity
|
v
[ Unified SIEM ] - Correlation: "Compromised device + stolen creds"
Auto-response: Revoke all douglas's access
+--------------------------------------------------------------------------+
Key Insight: Defense in depth means that an attacker must defeat EVERY control, not just one. Integration is what makes this possible.
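One rough way to put numbers on this insight is sketched below. It assumes, unrealistically, that each control fails independently, so the chance of defeating every layer is the product of the per-layer bypass probabilities. The function name and the probabilities are hypothetical illustrations, not measurements from any of the projects.

```python
from math import prod


def combined_bypass_probability(bypass_probs):
    """Chance an attacker defeats EVERY control, assuming independent failures."""
    probs = list(bypass_probs)
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(probs)


# Hypothetical per-control bypass probabilities (illustrative only).
controls = {
    "device_trust_p05": 0.5,
    "policy_step_up_p02": 0.5,
    "continuous_auth_p06": 0.25,
}

layered = combined_bypass_probability(controls.values())
print(f"chance of defeating all {len(controls)} layers: {layered}")
```

Real controls are rarely independent (a stolen, managed laptop can defeat several at once), so treat the product as an optimistic bound rather than a prediction.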
The Five Pillars in Practice
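Each previous project maps to one or more Zero Trust pillars. The five pillars (Identity, Device, Network, Application, Data) come from the CISA Zero Trust Maturity Model, which treats visibility as a cross-cutting capability rather than a pillar. In the capstone, you’ll see how they interconnect:
+--------------------------------------------------------------------------+
| CISA Zero Trust Pillars - Project Mapping |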
+--------------------------------------------------------------------------+
+------------------+ +------------------+ +------------------+
| IDENTITY | | DEVICE | | NETWORK |
| (Who are you?) | <==> | (Is device safe?)| <==> | (How connected?)|
+------------------+ +------------------+ +------------------+
| | |
+----+----+ +----+----+ +----+----+
| | | | | |
[P01] [P06] [P05] [P05] [P03] [P04]
Identity Continuous Device Health Micro- mTLS
Proxy Auth Trust Attestation Segment Mesh
| | | | | |
+----+----+ +----+----+ +----+----+
| | |
v v v
+------------------+ +------------------+ +------------------+
| APPLICATION | | DATA | | VISIBILITY |
| (What can access)| <==> | (What's protected)| <==> | (What happened?) |
+------------------+ +------------------+ +------------------+
| | |
+----+----+ +----+----+ +----+----+
| | | | | |
[P02] [P08] [P07] [P09] [ALL] [P10]
Policy JIT SDP ZTNA Audit SIEM
Engine Access Control Tunnel Logs Correlation
| | | | | |
+----+----+ +----+----+ +----+----+
| | |
+-------------------------+-------------------------+
|
v
+------------------------+
| INTEGRATED CONTROL |
| PLANE |
| |
| Policy as Code |
| Unified Identity |
| Centralized Logging |
| Automated Response |
+------------------------+
+--------------------------------------------------------------------------+
The Control Plane Integration Pattern
All Zero Trust components share a common control plane. This is the “brain” that coordinates policy across all enforcement points:
+--------------------------------------------------------------------------+
| Unified Control Plane Architecture |
+--------------------------------------------------------------------------+
+-------------------------+
| CONTROL PLANE |
| |
| +-------------------+ |
| | Policy Repository | |
| | (Git-versioned) | |
| +-------------------+ |
| | |
| +-------------------+ |
| | Policy Engine | |
| | (P02) | |
| +-------------------+ |
| | |
| +-------------------+ |
| | Identity Provider | |
| | (SSO/OIDC) | |
| +-------------------+ |
| | |
| +-------------------+ |
| | Device Registry | |
| | (P05 metadata) | |
| +-------------------+ |
| | |
| +-------------------+ |
| | SIEM / Logging | |
| | (Unified) | |
| +-------------------+ |
+-------------------------+
|
+---------------------------+---------------------------+
| | |
v v v
+------------------+ +------------------+ +------------------+
| DATA PLANE | | DATA PLANE | | DATA PLANE |
| (P01 Proxy) | | (P04 mTLS) | | (P09 Tunnel) |
| | | | | |
| - Enforce policy | | - Enforce mTLS | | - Enforce access |
| - Log access | | - Log connections| | - Log tunnel use |
| - Query PDP | | - Verify certs | | - Route traffic |
+------------------+ +------------------+ +------------------+
| | |
v v v
[Protected [Service-to-Service [Remote User
Application] Communication] Access]
+--------------------------------------------------------------------------+
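The control-plane/data-plane split in the diagram can be sketched as several enforcement points that all defer to one decision point. This is a minimal in-process illustration, not the P01 or P02 implementation: the class names, the tuple-based allow rules, and the audit record fields are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user: str
    device_healthy: bool
    resource: str
    action: str


class PolicyDecisionPoint:
    """Stand-in for the policy engine: default deny, explicit allow rules."""

    def __init__(self, allow_rules):
        # allow_rules is a set of (user, resource, action) tuples (hypothetical format).
        self._allow = set(allow_rules)

    def decide(self, req):
        if not req.device_healthy:
            return "DENY", "device_unhealthy"
        if (req.user, req.resource, req.action) in self._allow:
            return "ALLOW", "rule_matched"
        return "DENY", "no_matching_rule"


class PolicyEnforcementPoint:
    """Stand-in for a data-plane component that queries the PDP and logs."""

    def __init__(self, name, pdp, audit_log):
        self.name = name
        self.pdp = pdp
        self.audit_log = audit_log

    def handle(self, req):
        decision, reason = self.pdp.decide(req)
        # Every enforcement point writes to the same (unified) log.
        self.audit_log.append({
            "component": self.name,
            "user": req.user,
            "resource": req.resource,
            "decision": decision,
            "reason": reason,
        })
        return decision == "ALLOW"


audit_log = []
pdp = PolicyDecisionPoint({("douglas@corp.com", "jira.internal", "GET")})
proxy = PolicyEnforcementPoint("P01-proxy", pdp, audit_log)
tunnel = PolicyEnforcementPoint("P09-tunnel", pdp, audit_log)
```

Because both enforcement points share one `pdp`, a rule change takes effect everywhere at once in this toy version; in a distributed deployment that propagation is a design problem of its own.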
Real World Outcome
After completing this capstone, you will have a fully operational Zero Trust environment. Here’s what it looks like in practice:
Demo Scenario: Developer Accessing Production
# Terminal 1: The Secure Enclave is running
$ docker-compose -f secure-enclave.yml up
[P01-proxy] | Listening on :8443 (mTLS enabled)
[P02-policy] | Policy engine ready, 47 policies loaded
[P04-mesh] | mTLS mesh controller running
[P05-device] | Device trust service ready
[P06-auth] | Continuous authentication monitor active
[P08-jit] | JIT access broker running
[P09-tunnel] | ZTNA gateway accepting connections
[siem] | Unified logging ingesting from all components
[alerting] | Alert manager connected to Slack/PagerDuty
# Terminal 2: Developer's machine with ZTNA client
$ ztna-client connect --config corp.yaml
[INFO] Device health check: PASSED (macOS 14.2, FileVault enabled, no malware)
[INFO] mTLS certificate loaded: spiffe://corp.com/user/douglas/device/macbook-42
[INFO] Connected to gateway.corp.com
[INFO] Ready to access: jira.internal, gitlab.internal, postgres-prod.internal
# Terminal 3: Developer requests database access
$ jit request --db postgres-prod --tables orders --reason "PROD-1234"
[INFO] Request submitted: req-abc123
[INFO] Auto-approval policy matched: "senior_engineer_readonly_prod"
[INFO] Credentials generated (TTL: 30 minutes)
Username: jit_douglas_20241227_x7y8z9
Password: Kj8mN2pQ5rT7vX9yB3dF6gH8jL0nM1oP2qR3sT4u
Expires: 2024-12-27T15:30:00Z
$ psql "postgresql://jit_douglas_20241227_x7y8z9:Kj8m...@postgres-prod.internal/app"
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384)
app=> SELECT * FROM orders WHERE id = 12345;
id | status | created_at
-------+---------+------------
12345 | shipped | 2024-12-26
(1 row)
# SIEM shows the full trace
$ curl https://siem.corp.com/api/trace/douglas/today | jq
{
"user": "douglas@corp.com",
"device": "macbook-42",
"events": [
{
"time": "14:58:00Z",
"component": "P05-device",
"event": "device_health_check",
"result": "PASS",
"details": {"os": "macOS 14.2", "encryption": true, "firewall": true}
},
{
"time": "14:58:01Z",
"component": "P09-tunnel",
"event": "tunnel_established",
"client_ip": "203.0.113.42",
"gateway": "gateway-us-west-2"
},
{
"time": "14:59:00Z",
"component": "P08-jit",
"event": "access_requested",
"database": "postgres-prod",
"permissions": ["SELECT"],
"tables": ["orders"]
},
{
"time": "14:59:01Z",
"component": "P02-policy",
"event": "policy_evaluated",
"policy": "senior_engineer_readonly_prod",
"decision": "ALLOW"
},
{
"time": "14:59:02Z",
"component": "P08-jit",
"event": "credential_created",
"temp_user": "jit_douglas_20241227_x7y8z9",
"ttl_minutes": 30
},
{
"time": "14:59:30Z",
"component": "P04-mesh",
"event": "connection_established",
"source": "macbook-42",
"destination": "postgres-prod",
"mtls_verified": true
},
{
"time": "14:59:31Z",
"component": "postgres",
"event": "query_executed",
"user": "jit_douglas_20241227_x7y8z9",
"query": "SELECT * FROM orders WHERE id = 12345",
"rows_returned": 1
}
]
}
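A trace like the one above could be assembled from per-component log records roughly as follows. This is a sketch under assumptions: it supposes each record carries `user`, `device`, and a full ISO-8601 UTC `time` field, which is a hypothetical schema and not necessarily what the SIEM endpoint shown above uses internally.

```python
from datetime import datetime


def parse_ts(ts):
    """Parse an ISO-8601 timestamp; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def build_trace(records, user):
    """Collect one user's records from all components, oldest first."""
    mine = [r for r in records if r["user"] == user]
    mine.sort(key=lambda r: parse_ts(r["time"]))
    return {
        "user": user,
        "devices": sorted({r["device"] for r in mine}),
        # The user is hoisted to the top level, so drop it from each event.
        "events": [{k: v for k, v in r.items() if k != "user"} for r in mine],
    }
```

Sorting on parsed timestamps rather than raw strings keeps the order correct even if components emit different UTC offset notations.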
The Core Question You’re Answering
“How do all the pieces of Zero Trust architecture fit together to create a truly secure environment where every access is verified, every connection is encrypted, and every action is audited?”
This capstone integrates Projects P01-P09 because individually, each component solves only one piece of the Zero Trust puzzle:
- P01 (Identity-Aware Proxy) verifies WHO is making requests, but doesn’t know if their device is compromised
- P02 (Policy Engine) makes authorization decisions, but needs identity and context from other components
- P03 (Micro-segmentation) prevents lateral movement, but doesnât understand application-layer identity
- P04 (mTLS Mesh) encrypts connections, but relies on the PKI and identity systems to work correctly
- P05 (Device Trust) validates endpoints, but must feed into policy decisions to be actionable
- P06 (Continuous Authentication) detects anomalies, but needs integration to actually block access
- P07 (SDP Controller) hides services, but must coordinate with identity and policy systems
- P08 (JIT Access) eliminates standing privileges, but must be governed by policy and logged for audit
- P09 (ZTNA Tunnel) provides secure remote access, but is the integration point for all other controls
The capstone is where you learn that security architecture is not about components - it’s about how they work together. A chain of security controls is only as strong as the integration between its links.
Concepts You Must Understand First
Before implementing this capstone, ensure you deeply understand these integration concepts:
1. System Integration Patterns
Understanding how distributed systems communicate and share state:
+--------------------------------------------------------------------------+
| Integration Pattern: Event-Driven Security |
+--------------------------------------------------------------------------+
PATTERN: Event Bus for Security Signals
---------------------------------------
All components publish security events to a central bus:
[P05: Device Trust] [P06: Continuous Auth]
| |
| "device_compromised" | "behavior_anomaly"
v v
+------------------------------------------+
| SECURITY EVENT BUS |
| (Kafka, Redis Streams, or NATS) |
+------------------------------------------+
| |
| Subscribe | Subscribe
v v
[P01: Identity Proxy] [P08: JIT Broker]
| |
| Block sessions | Revoke credentials
| for device | for user
This pattern enables REACTIVE security:
- Device compromised -> All access revoked within seconds
- Behavior anomaly -> Step-up auth required immediately
- Policy change -> All PEPs updated within seconds
+--------------------------------------------------------------------------+
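The publish/subscribe pattern in the diagram can be tried out with an in-process stand-in before wiring up Kafka, Redis Streams, or NATS. The sketch below is only that stand-in: `SecurityEventBus` and the two handler functions are hypothetical names, and a real bus would add durability, ordering, and delivery guarantees that this toy version lacks.

```python
from collections import defaultdict


class SecurityEventBus:
    """In-memory stand-in for a real event bus (no persistence, no retries)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Deliver payload to every subscriber; return how many were notified."""
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


blocked_devices = set()   # state the identity proxy would consult
revoked_users = set()     # state the JIT broker would consult


def proxy_block_device(event):
    blocked_devices.add(event["device"])


def jit_revoke_user(event):
    revoked_users.add(event["user"])


bus = SecurityEventBus()
bus.subscribe("device_compromised", proxy_block_device)
bus.subscribe("behavior_anomaly", jit_revoke_user)

# Device trust reports a compromise; the proxy reacts without a direct API call.
bus.publish("device_compromised", {"device": "macbook-42", "user": "douglas@corp.com"})
```

The publisher never references the proxy or the broker directly, which is the loose coupling the pattern is meant to provide.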
2. End-to-End Security Architecture
The complete request lifecycle through all security layers:
+--------------------------------------------------------------------------+
| End-to-End Request Flow |
+--------------------------------------------------------------------------+
USER REQUEST
|
v
+--[Layer 1: Device]--+
| P05 Health Check |
| - OS patched? |
| - Firewall enabled? |
| - Malware scan? |
+---------------------+
| PASS
v
+--[Layer 2: Tunnel]--+
| P09 ZTNA Client |
| - mTLS to gateway |
| - Domain routing |
+---------------------+
| CONNECTED
v
+--[Layer 3: Identity]+
| P01 Identity Proxy |
| - JWT validation |
| - Header injection |
+---------------------+
| AUTHENTICATED
v
+--[Layer 4: Policy]--+
| P02 Policy Engine |
| - ABAC evaluation |
| - Risk assessment |
+---------------------+
| AUTHORIZED
v
+--[Layer 5: Network]-+
| P03 Micro-segment |
| - eBPF/iptables |
| P04 mTLS Mesh |
| - Service identity |
+---------------------+
| ENCRYPTED
v
+--[Layer 6: Access]--+
| P08 JIT Credentials |
| - Ephemeral access |
| - Scoped permissions|
+---------------------+
| PROVISIONED
v
+--[Layer 7: Observe]-+
| P06 Continuous Auth |
| - Behavior monitor |
| - Anomaly detection |
+---------------------+
| MONITORED
v
PROTECTED RESOURCE
Every layer can DENY. Every layer LOGS.
+--------------------------------------------------------------------------+
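The rule that every layer can deny and every layer logs can be sketched as an ordered list of checks that stops at the first failure. Only three of the seven layers are modeled here, and the context keys (`device_healthy`, `jwt_valid`, `allowed_resources`) are hypothetical placeholders for what the real components would compute.

```python
def device_layer(ctx):
    return ctx.get("device_healthy", False), "device_health"


def identity_layer(ctx):
    return ctx.get("jwt_valid", False), "jwt_validation"


def policy_layer(ctx):
    allowed = ctx.get("allowed_resources", ())
    return ctx.get("resource") in allowed, "policy_evaluation"


# Order matters: cheaper, earlier gates run first.
LAYERS = [
    ("P05-device", device_layer),
    ("P01-identity", identity_layer),
    ("P02-policy", policy_layer),
]


def run_pipeline(ctx, layers=LAYERS):
    """Run each layer in order; stop at the first DENY. Every layer logs."""
    log = []
    for name, check in layers:
        ok, what = check(ctx)
        log.append({"layer": name, "check": what, "result": "PASS" if ok else "DENY"})
        if not ok:
            return False, log
    return True, log
```

Missing context keys default to failure, so a layer that receives no signal denies rather than allows.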
3. Defense in Depth
Overlapping controls that assume each layer might fail:
+--------------------------------------------------------------------------+
| Defense in Depth: Overlapping Controls |
+--------------------------------------------------------------------------+
PRINCIPLE: If one control fails, others catch the breach
EXAMPLE: Compromised User Credential
Control 1: Device Trust (P05)
+------------------------------------------+
| Expected: Block unmanaged/unhealthy device|
| Bypassed: Attacker uses managed device |
+------------------------------------------+
| (Control failed)
v
Control 2: Continuous Auth (P06)
+------------------------------------------+
| Expected: Detect behavior anomaly |
| Catches: Unusual access patterns flagged |
+------------------------------------------+
| (Anomaly detected!)
v
Control 3: Policy Engine (P02)
+------------------------------------------+
| Response: Risk score elevated |
| Action: Require MFA re-verification |
+------------------------------------------+
| (Attacker blocked at MFA)
v
BREACH PREVENTED
Even if Control 2 missed the anomaly:
Control 4: JIT Access (P08)
+------------------------------------------+
| Limit: Credential only valid 30 minutes |
| Scope: Only SELECT on specific tables |
+------------------------------------------+
| (Blast radius limited)
v
Control 5: Audit Logging
+------------------------------------------+
| Record: All queries with full context |
| Alert: SIEM correlation triggers alert |
+------------------------------------------+
| (Incident detected)
v
Control 6: Automated Response
+------------------------------------------+
| Action: Revoke all user credentials |
| Action: Terminate all active sessions |
| Action: Alert SOC team |
+------------------------------------------+
+--------------------------------------------------------------------------+
4. Security Observability
Unified visibility across all security components:
+--------------------------------------------------------------------------+
| Security Observability Stack |
+--------------------------------------------------------------------------+
+----------------------+
| DASHBOARDS |
| (Grafana) |
+----------------------+
|
+----------------------+
| ALERTING |
| (AlertManager) |
+----------------------+
|
+----------------------+
| CORRELATION |
| (SIEM Rules) |
+----------------------+
|
+----------------------+
| LOG AGGREGATION |
| (Elasticsearch/ |
| Loki/Splunk) |
+----------------------+
^
+--------------------+--------------------+
| | |
P01 Logs P02 Logs P04-P09 Logs
- Access - Decisions - All other
- Deny - Policy hits components
- Identity - Risk scores
KEY METRICS TO OBSERVE:
-----------------------
1. Access Patterns
- Requests per user/device/time
- Denied vs allowed ratio
- New device/location access
2. Policy Effectiveness
- Policies triggered
- Auto-approve vs manual approval
- Time to approval
3. Security Events
- Device health failures
- Authentication anomalies
- Revoked credentials
4. System Health
- Component latency
- Error rates
- Certificate expiration
+--------------------------------------------------------------------------+
5. Incident Response Readiness
Pre-built playbooks for common security scenarios:
+--------------------------------------------------------------------------+
| Incident Response Playbooks |
+--------------------------------------------------------------------------+
PLAYBOOK 1: Compromised User Account
-------------------------------------
Trigger: P06 detects impossible travel or credential stuffing
Steps:
1. [AUTOMATIC] Terminate all active sessions for user
2. [AUTOMATIC] Revoke all JIT credentials for user
3. [AUTOMATIC] Block user at Identity Proxy
4. [ALERT] Notify SOC and user's manager
5. [MANUAL] Investigate activity in SIEM
6. [MANUAL] Reset user credentials if legitimate
7. [MANUAL] Re-enable access with step-up verification
PLAYBOOK 2: Compromised Device
------------------------------
Trigger: P05 reports malware detected or failed health check
Steps:
1. [AUTOMATIC] Block device at ZTNA gateway
2. [AUTOMATIC] Terminate tunnel connection
3. [AUTOMATIC] Revoke device certificate
4. [ALERT] Notify user and IT security
5. [MANUAL] Quarantine device for forensics
6. [MANUAL] Reimage device
7. [MANUAL] Re-enroll with new certificate
PLAYBOOK 3: Suspicious Database Access
---------------------------------------
Trigger: P08 detects unusual query patterns
Steps:
1. [AUTOMATIC] Revoke JIT credential immediately
2. [AUTOMATIC] Log complete query history
3. [ALERT] Notify DBA and SOC
4. [MANUAL] Review data accessed
5. [MANUAL] Assess data breach notification requirements
6. [MANUAL] Update policies to prevent recurrence
PLAYBOOK 4: Certificate Compromise
----------------------------------
Trigger: P04 detects certificate misuse or CA compromise
Steps:
1. [AUTOMATIC] Revoke compromised certificates
2. [AUTOMATIC] Force re-authentication for all services
3. [AUTOMATIC] Issue new certificates to legitimate entities
4. [ALERT] Notify security team
5. [MANUAL] Investigate root cause
6. [MANUAL] Rotate CA if necessary
+--------------------------------------------------------------------------+
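A playbook like those above can be kept as data so that the automatic and alert steps are executed by code while the manual steps become a checklist for the responder. The sketch below encodes Playbook 2; the `Step` type and `run_playbook` function are hypothetical, and `execute` is whatever callback your own tooling supplies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    mode: str          # "AUTOMATIC", "ALERT", or "MANUAL"
    description: str


COMPROMISED_DEVICE = [
    Step("AUTOMATIC", "Block device at ZTNA gateway"),
    Step("AUTOMATIC", "Terminate tunnel connection"),
    Step("AUTOMATIC", "Revoke device certificate"),
    Step("ALERT", "Notify user and IT security"),
    Step("MANUAL", "Quarantine device for forensics"),
    Step("MANUAL", "Reimage device"),
    Step("MANUAL", "Re-enroll with new certificate"),
]


def run_playbook(steps, execute):
    """Run AUTOMATIC/ALERT steps via `execute`; return (done, manual_todo)."""
    done, todo = [], []
    for step in steps:
        if step.mode in ("AUTOMATIC", "ALERT"):
            execute(step)
            done.append(step.description)
        elif step.mode == "MANUAL":
            todo.append(step.description)
        else:
            raise ValueError(f"unknown step mode: {step.mode}")
    return done, todo
```

Passing `print` as `execute` gives a dry run, which is a cheap way to rehearse a playbook during a tabletop exercise.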
6. Zero Trust Maturity Model (CISA)
The CISA Zero Trust Maturity Model provides a framework for measuring your implementation:
+--------------------------------------------------------------------------+
| CISA Zero Trust Maturity Model - Assessment |
+--------------------------------------------------------------------------+
MATURITY LEVELS:
----------------
1. Traditional - Perimeter-based, minimal ZT
2. Initial - Starting ZT journey, some components
3. Advanced - Integrated ZT, automation beginning
4. Optimal - Full ZT, continuous improvement
PILLAR ASSESSMENT:
IDENTITY
--------
Traditional: Passwords only
Initial: SSO + MFA for some apps
Advanced: Risk-based MFA, continuous verification [P01, P06]
Optimal: Passwordless, phishing-resistant MFA
Your implementation: [P01 Identity Proxy + P06 Continuous Auth]
DEVICE
------
Traditional: No device validation
Initial: Inventory, basic MDM
Advanced: Real-time posture, conditional access [P05]
Optimal: Automated remediation, hardware attestation
Your implementation: [P05 Device Trust]
NETWORK
-------
Traditional: Perimeter firewall only
Initial: Network segmentation
Advanced: Micro-segmentation, encrypted traffic [P03, P04]
Optimal: Software-defined, identity-aware networking
Your implementation: [P03 Micro-segmentation + P04 mTLS + P07 SDP]
APPLICATION
-----------
Traditional: VPN for remote access
Initial: ZTNA for some apps
Advanced: Per-app access, continuous authorization [P02, P09]
Optimal: Dynamic policy, threat intelligence integrated
Your implementation: [P02 Policy Engine + P08 JIT + P09 ZTNA]
DATA
----
Traditional: Perimeter-based data protection
Initial: Classification, DLP basics
Advanced: Automatic classification, encryption everywhere
Optimal: Dynamic data policies, user behavior analytics
Your implementation: [All projects contribute + SIEM]
VISIBILITY
----------
Traditional: Siloed logs
Initial: Centralized logging
Advanced: Correlation, automated alerting [P10 Capstone]
Optimal: ML-based threat detection, automated response
Your implementation: [Unified SIEM + Alerting + Playbooks]
+--------------------------------------------------------------------------+
Questions to Guide Your Design
Before implementing, work through these design questions:
Integration Questions
- Event Flow: How will security events propagate between components? (Event bus? API calls? Shared database?)
- Identity Consistency: How will user identity be represented consistently across all components? (SPIFFE ID? JWT claims? Custom format?)
- Policy Distribution: How will policy changes propagate to all enforcement points? (Push? Pull? Event-driven?)
- Certificate Management: How will certificates be issued, rotated, and revoked across the mesh? (Central CA? Distributed?)
- State Sharing: What state needs to be shared between components? (Session state? Device state? Risk scores?)
Testing Questions
- Component Testing: How will you verify each component works in isolation?
- Integration Testing: How will you test the interactions between components?
- Security Testing: How will you verify the security controls actually work? (Penetration testing scenarios?)
- Failure Testing: What happens when each component fails? (Chaos engineering?)
- Performance Testing: What’s the latency overhead of the full security stack?
Operations Questions
- Deployment: How will you deploy and update components without breaking security? (Blue-green? Canary?)
- Monitoring: What dashboards and alerts do you need? (Per-component? Cross-component?)
- Incident Response: What playbooks do you need for common scenarios?
- Compliance: How will you demonstrate compliance with SOC2/ISO27001/etc.?
- Disaster Recovery: How will you recover if a critical component fails?
Thinking Exercise: Architecture Design
Before writing any code, complete this design exercise on paper or whiteboard:
Exercise: Design the Complete Architecture
Task: Draw a complete architecture diagram showing:
- All Components: Where P01-P09 components are deployed
- Data Flows: How requests flow through the system
- Control Flows: How policy decisions are made and enforced
- Event Flows: How security events propagate
- Trust Boundaries: Where trust is established and verified
Guiding Questions:
- Where is the user? (Remote? Office? Mobile?)
- How do they authenticate? (SSO? MFA? Certificate?)
- How does their request reach the internal application?
- What checks happen at each step?
- What gets logged at each step?
- What happens if any check fails?
- What happens if the user’s device is compromised mid-session?
Deliverable: A diagram similar to this (but with YOUR understanding):
+--------------------------------------------------------------------------+
| YOUR Secure Enclave Architecture |
+--------------------------------------------------------------------------+
[Remote User] [Office User]
| |
v v
+----+----+ +-----+-----+
| ZTNA | | Direct |
| Client | | Network |
| (P09) | | |
+----+----+ +-----+-----+
| |
| mTLS | mTLS
v v
+------------------------------------------------+
| SECURITY GATEWAY TIER |
| |
| +----------+ +----------+ +----------+ |
| | Identity | | Policy | | Device | |
| | Proxy | | Engine | | Trust | |
| | (P01) | | (P02) | | (P05) | |
| +----------+ +----------+ +----------+ |
| | | | |
| +-------+------+------+-------+ |
| | |
+---------------|--------------------------------+
| Authorized & Verified
v
+------------------------------------------------+
| SERVICE TIER |
| |
| +----------+ +----------+ +----------+ |
| | App A | | App B | | Database | |
| | | | | | (via JIT)| |
| +----------+ +----------+ +----------+ |
| ^ ^ ^ |
| | mTLS (P04) | | |
| +-------+------+------+-------+ |
| | |
| +-------------------------------------------+ |
| | Micro-segmentation (P03) - eBPF/iptables | |
| +-------------------------------------------+ |
| |
+------------------------------------------------+
|
v
+------------------------------------------------+
| OBSERVABILITY TIER |
| |
| +----------+ +----------+ +----------+ |
| | Log | | SIEM | | Alert | |
| | Collector| | Correlate| | Manager | |
| +----------+ +----------+ +----------+ |
| |
+------------------------------------------------+
+--------------------------------------------------------------------------+
Hints in Layers: Progressive Implementation Guidance
Hint Layer 1: Start with the Control Plane
Before integrating data plane components, establish the shared control plane:
- Unified Identity: All components use the same identity representation
- Define your identity format (SPIFFE recommended)
- All components validate the same JWT issuer
- User identity flows through every request
- Shared Policy Repository: All policies in one place
- Git repository for policy-as-code
- Policies reference the same identity format
- Version control enables rollback
- Centralized Logging: All components log to the same destination
- Structured JSON logs with consistent schema
- Common fields: timestamp, component, user, device, action, result
- Correlation ID traces requests across components
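A small shared helper is one way to keep every component on the same log schema. The sketch below emits one JSON line with the common fields listed above plus a correlation ID; the function names and the `req-` prefix are assumptions for illustration, and `now` is expected to be a timezone-aware datetime.

```python
import json
import uuid
from datetime import datetime, timezone

COMMON_FIELDS = (
    "timestamp", "component", "user", "device",
    "action", "result", "correlation_id",
)


def new_correlation_id():
    """Generate an ID once at the edge and pass it along with the request."""
    return f"req-{uuid.uuid4().hex[:12]}"


def make_log_record(component, user, device, action, result, correlation_id, now=None):
    """Return one structured log line (a JSON string) with the shared schema."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "component": component,
        "user": user,
        "device": device,
        "action": action,
        "result": result,
        "correlation_id": correlation_id,
    }
    return json.dumps(record, sort_keys=True)


print(make_log_record("P01-proxy", "douglas@corp.com", "macbook-42",
                      "GET /admin", "DENY", new_correlation_id()))
```

If every component reuses the correlation ID it received instead of minting a new one, a single query on that field reconstructs the request's path through the stack.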
Hint Layer 2: Integration by Event
Connect components through events rather than direct API calls:
- Event Schema: Define a common event format
{
  "event_type": "access_denied",
  "timestamp": "2024-12-27T15:00:00Z",
  "component": "P01-identity-proxy",
  "correlation_id": "req-abc123",
  "subject": {
    "user": "douglas@corp.com",
    "device": "macbook-42"
  },
  "object": {
    "resource": "jira.internal",
    "action": "GET /admin"
  },
  "result": {
    "decision": "DENY",
    "reason": "insufficient_permissions"
  }
}
- Event Bus: Use Redis Streams, Kafka, or NATS
- Components publish events to topics
- Other components subscribe and react
- Enables loose coupling
- Reaction Rules: Define what happens when events occur
- device_compromised -> Revoke all sessions for device
- user_anomaly -> Elevate risk score, require step-up
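The event format and reaction rules above could be enforced with a small validator and a lookup table, sketched below. The required top-level fields mirror the sample event; the action names in `REACTIONS` are hypothetical labels that your own components would have to implement.

```python
REQUIRED_FIELDS = {
    "event_type": str,
    "timestamp": str,
    "component": str,
    "correlation_id": str,
    "subject": dict,
    "object": dict,
    "result": dict,
}

# Reaction rules: event type -> ordered list of (hypothetical) action names.
REACTIONS = {
    "device_compromised": ["revoke_all_sessions_for_device"],
    "user_anomaly": ["elevate_risk_score", "require_step_up"],
}


def validate_event(event):
    """Return a list of schema problems; an empty list means the event is valid."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems


def reactions_for(event):
    """Look up reactions for a valid event; malformed events trigger nothing."""
    if validate_event(event):
        return []
    return REACTIONS.get(event["event_type"], [])
```

Dropping malformed events silently is a simplification; a real deployment would probably log or alert on them, since a component emitting bad events is itself a signal.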
Hint Layer 3: Test the Attack Scenarios
Validate your integration by simulating attacks:
- Stolen Credential Attack
- Generate valid JWT for “attacker”
- Attempt access from new device -> P05 should block
- Attempt access at unusual time -> P06 should flag
- Attempt access to unauthorized resource -> P02 should deny
- Compromised Device Attack
- Simulate malware detection on device
- Verify device is blocked at ZTNA gateway
- Verify all sessions for device are terminated
- Verify credentials are revoked
- Lateral Movement Attack
- Gain access to one service
- Attempt to reach other services -> P03 should block
- Attempt to scan internal network -> P03 should block
- Verify all attempts are logged
Hint Layer 4: Build the Observability Stack
You can’t secure what you can’t see:
- Metrics Collection
- Prometheus for time-series metrics
- Each component exports metrics
- Custom dashboards in Grafana
- Log Aggregation
- Loki, Elasticsearch, or Splunk
- All components write to same destination
- Retention policy for compliance
- Alerting Rules
- Define thresholds and patterns
- denied_requests > 10/minute -> Alert
- same_user_different_locations -> Alert
- Integrate with PagerDuty/Slack
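The "more than 10 denied requests per minute" rule can be prototyped with a sliding window before expressing it in your alerting tool's own rule language. This is a sketch, not AlertManager or SIEM syntax: the class name is hypothetical, timestamps are plain seconds, and they are assumed to arrive in non-decreasing order.

```python
from collections import deque


class DeniedRateAlert:
    """Fires when more than `threshold` denials fall inside the sliding window."""

    def __init__(self, threshold=10, window_seconds=60):
        self.threshold = threshold
        self.window = window_seconds
        self._times = deque()

    def record_denial(self, ts):
        """Record a denial at time `ts` (seconds); return True if the alert fires."""
        self._times.append(ts)
        # Drop denials that have aged out of the window.
        while self._times and self._times[0] <= ts - self.window:
            self._times.popleft()
        return len(self._times) > self.threshold


alert = DeniedRateAlert(threshold=10, window_seconds=60)
fired = [alert.record_denial(t) for t in range(11)]  # 11 denials in 11 seconds
print(fired[-1])
```

In practice you would keep one window per user or per device, since a global counter lets a single noisy client mask everyone else.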
Hint Layer 5: Document and Automate Response
The final step is automating incident response:
- Runbook Documentation
- For each alert type, document response steps
- Include commands to run, people to notify
- Test runbooks regularly
- Automated Response
- Some responses should be automatic
- Device compromised -> Auto-block (no human needed)
- Credential suspected -> Auto-revoke, then investigate
- Post-Incident Review
- After each incident, review what happened
- Update policies to prevent recurrence
- Share learnings across team
Project Specification
Functional Requirements
| ID | Requirement | Acceptance Criteria |
|---|---|---|
| FR-1 | Unified authentication | All access flows through P01 with consistent identity |
| FR-2 | Policy-driven authorization | P02 evaluates all access requests against policy |
| FR-3 | Device trust enforcement | P05 blocks unhealthy devices at entry point |
| FR-4 | Encrypted service mesh | All service-to-service uses mTLS (P04) |
| FR-5 | Remote access via ZTNA | P09 provides secure remote access |
| FR-6 | Just-in-time database access | P08 provides ephemeral credentials |
| FR-7 | Continuous monitoring | P06 detects and responds to anomalies |
| FR-8 | Unified logging | All components log to central SIEM |
| FR-9 | Automated alerting | Security events trigger appropriate alerts |
| FR-10 | Incident response automation | Playbooks execute automatically or with guidance |
Non-Functional Requirements
| ID | Requirement | Target |
|---|---|---|
| NFR-1 | End-to-end latency | < 100ms added by security stack |
| NFR-2 | Availability | 99.9% for security control plane |
| NFR-3 | Recovery time | < 5 minutes for any single component failure |
| NFR-4 | Log retention | 90 days hot, 1 year cold storage |
| NFR-5 | Alert response time | Critical: < 5 minutes, High: < 1 hour |
| NFR-6 | Policy update propagation | < 30 seconds to all enforcement points |
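Two of these targets translate directly into budgets you can check with arithmetic. The sketch below converts an availability percentage into allowed downtime and sums per-layer latencies against the 100 ms target; the function names and the per-layer latency figures are hypothetical, chosen only to show the calculation.

```python
def downtime_budget_minutes(availability_pct, period_days=30):
    """Minutes of allowed downtime per period for a given availability target."""
    return (1 - availability_pct / 100) * period_days * 24 * 60


def within_latency_budget(per_layer_ms, budget_ms=100.0):
    """True if the summed per-layer overhead stays under the budget."""
    return sum(per_layer_ms.values()) < budget_ms


# Hypothetical measurements in milliseconds (illustrative only).
measured = {
    "P09-tunnel": 20.0,
    "P01-proxy": 15.0,
    "P02-policy": 10.0,
    "P04-mesh": 5.0,
}

print(f"99.9% over 30 days allows about {downtime_budget_minutes(99.9):.1f} minutes of downtime")
print(f"latency budget met: {within_latency_budget(measured)}")
```

Summing latencies assumes the layers run sequentially on the request path; checks that run in parallel or asynchronously (such as continuous monitoring) would not count against the budget in the same way.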
Component Checklist
Before considering the capstone complete, verify:
- P01 (Identity Proxy) is deployed and validating JWTs
- P02 (Policy Engine) is evaluating all access requests
- P03 (Micro-segmentation) is enforcing network policies
- P04 (mTLS Mesh) is encrypting all service communication
- P05 (Device Trust) is validating device health
- P06 (Continuous Auth) is monitoring user behavior
- P07 (SDP Controller) is hiding services from unauthorized users
- P08 (JIT Access) is providing ephemeral database credentials
- P09 (ZTNA Tunnel) is providing secure remote access
- All components are logging to a unified SIEM
- Alerts are configured for key security events
- At least 3 incident response playbooks are documented
Phased Implementation Guide
Phase 1: Foundation (Days 1-3)
Goal: Establish the shared control plane infrastructure.
Deliverables:
- Unified logging stack (Loki or Elasticsearch)
- Event bus (Redis Streams or NATS)
- Identity format defined (SPIFFE or custom)
- Policy repository initialized (Git)
Steps:
- Deploy logging infrastructure
- Configure all components to log in consistent format
- Set up event bus
- Define identity and event schemas
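A shared event schema is easier to agree on when it is written down as code. The following is a minimal sketch, assuming JSON-encoded events; the `SecurityEvent` class, its field names, and the example SPIFFE-style subject are illustrative assumptions, not a format defined by the earlier projects.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """ISO 8601 timestamp in UTC, so every component logs in the same timezone."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class SecurityEvent:
    """Hypothetical shared event schema; all field names are assumptions."""
    component: str            # emitting project, e.g. "P01"
    event_type: str           # e.g. "auth.denied"
    subject: str              # identity in the agreed format, e.g. a SPIFFE ID
    severity: str = "info"
    details: dict = field(default_factory=dict)
    timestamp: str = field(default_factory=_utc_now)

    def to_json(self) -> str:
        # sort_keys gives a stable field order, which keeps log diffs readable
        return json.dumps(asdict(self), sort_keys=True)


REQUIRED_FIELDS = {"component", "event_type", "subject", "severity", "timestamp"}


def validate_event(raw: str) -> dict:
    """Parse one log line and reject events that lack a required field."""
    event = json.loads(raw)
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"event missing fields: {sorted(missing)}")
    return event
```

Running `validate_event` at the log collector is one way to catch a component that drifts from the agreed format before its events reach the SIEM.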
Phase 2: Core Integration (Days 4-7)
Goal: Connect the core security components.
Deliverables:
- P01 + P02 integration (proxy consults policy engine)
- P05 + P09 integration (device trust gates tunnel access)
- P04 mesh running between services
Steps:
- Deploy P01, P02, P05, P09 in connected configuration
- Verify request flow through all components
- Test policy enforcement at each layer
- Verify mTLS between all services
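The P01 + P02 integration can be pictured as a small decision function in which the proxy asks the policy engine and denies access if the engine cannot answer. This is a sketch under stated assumptions: the request keys (`jwt_valid`, `resource`) and the `query_policy` callable are hypothetical stand-ins for whatever interface your P01 and P02 actually expose, and returning 503 on engine failure is one possible fail-closed choice.

```python
from typing import Callable


def authorize(request: dict, query_policy: Callable[[dict], bool]) -> int:
    """Return the HTTP status a proxy would send for this request.

    `request` uses hypothetical keys: jwt_valid, resource.
    `query_policy` stands in for the call from P01 to the P02 policy engine.
    """
    if not request.get("jwt_valid", False):
        return 401  # authentication failed, never consult the policy engine

    try:
        allowed = query_policy(request)
    except Exception:
        # Fail closed: an unreachable or erroring policy engine never grants access.
        return 503

    # Only an explicit True grants access; anything else is a denial.
    return 200 if allowed is True else 403
```

Checking `allowed is True` rather than truthiness is deliberate: a malformed response from the policy engine is treated as a denial.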
Phase 3: Access Management (Days 8-10)
Goal: Add JIT access and continuous authentication.
Deliverables:
- P08 JIT broker integrated with policy engine
- P06 continuous auth monitoring all sessions
- Automatic session termination on anomaly
Steps:
- Deploy P08 JIT broker
- Integrate P08 with P02 for approval policies
- Deploy P06 continuous authentication
- Configure P06 to terminate sessions and emit events
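To make the JIT flow concrete, here is a minimal in-memory sketch of a broker that asks a policy callback before issuing a short-lived credential and rejects the credential once its TTL has passed. The `JitBroker` class, the `policy_allows` callback, and the 300-second default TTL are assumptions for illustration; a real P08 would create and drop actual database roles.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class EphemeralCredential:
    username: str
    password: str
    expires_at: float  # deadline on the broker's clock


class JitBroker:
    """Hypothetical in-memory JIT broker; names and defaults are assumptions."""

    def __init__(self, policy_allows, ttl_seconds=300.0, clock=time.monotonic):
        self._policy_allows = policy_allows  # stand-in for the P02 approval check
        self._ttl = ttl_seconds
        self._clock = clock
        self._active = {}

    def issue(self, subject: str, database: str) -> EphemeralCredential:
        if not self._policy_allows(subject, database):
            raise PermissionError(f"{subject} is not approved for {database}")
        cred = EphemeralCredential(
            username=f"jit_{secrets.token_hex(4)}",
            password=secrets.token_urlsafe(24),
            expires_at=self._clock() + self._ttl,
        )
        self._active[cred.username] = cred
        return cred

    def is_valid(self, username: str, password: str) -> bool:
        cred = self._active.get(username)
        if cred is None:
            return False
        if self._clock() >= cred.expires_at:
            del self._active[username]  # expired credentials are removed on sight
            return False
        # Constant-time comparison avoids leaking password prefixes via timing.
        return secrets.compare_digest(cred.password.encode(), password.encode())
```

The injectable `clock` exists so that expiry can be tested without waiting for real time to pass.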
Phase 4: Observability (Days 11-12)
Goal: Build the security observability stack.
Deliverables:
- SIEM correlation rules
- Security dashboards
- Alert definitions
- Integration with notification systems
Steps:
- Define SIEM correlation rules for attack scenarios
- Build Grafana dashboards for security metrics
- Configure Alertmanager with notification channels
- Test alert flow end-to-end
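A correlation rule connects events that look harmless on their own. The sketch below shows one possible rule: several authentication failures for a subject inside a time window, followed by a success. The event keys (`subject`, `event_type`, `ts`), the event type names, and the alert name are assumptions; in practice the rule would be written in your SIEM's own rule language.

```python
from collections import defaultdict, deque


class FailedThenSuccessRule:
    """Alert when a success follows `threshold` recent failures for one subject."""

    def __init__(self, threshold=5, window_seconds=60.0):
        self.threshold = threshold
        self.window = window_seconds
        self._failures = defaultdict(deque)  # subject -> timestamps of recent failures

    def observe(self, event: dict):
        """Feed one event; return an alert dict or None.

        Hypothetical event keys: subject, event_type, ts (seconds).
        """
        subject, ts = event["subject"], event["ts"]
        failures = self._failures[subject]

        # Drop failures that have fallen out of the sliding window.
        while failures and ts - failures[0] > self.window:
            failures.popleft()

        if event["event_type"] == "auth.failure":
            failures.append(ts)
            return None

        if event["event_type"] == "auth.success" and len(failures) >= self.threshold:
            count = len(failures)
            failures.clear()  # avoid re-alerting on the same burst
            return {
                "alert": "possible_credential_stuffing",
                "subject": subject,
                "severity": "critical",
                "failed_attempts": count,
            }
        return None
```

Replaying recorded events through a rule like this is a cheap way to check the thresholds before testing the alert flow end to end.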
Phase 5: Incident Response (Days 13-14)
Goal: Automate incident response.
Deliverables:
- 3+ incident response playbooks
- Automated response for critical events
- Documentation for manual procedures
- Tested recovery procedures
Steps:
- Document incident response playbooks
- Implement automated responses (event -> action)
- Conduct tabletop exercises
- Create runbooks for recovery scenarios
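The "event -> action" step can be sketched as a small dispatcher that maps event types to playbook handlers and falls back to human escalation when no playbook exists. Everything here is hypothetical: the `PlaybookRunner` class, the in-memory `sessions` table, and the `device_id` event key are assumptions made for the example, with only the `device_compromised` event name taken from the integration tests.

```python
from collections import defaultdict


class PlaybookRunner:
    """Hypothetical event-to-action dispatcher for automated response."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event_type -> list of handlers

    def on(self, event_type: str):
        """Decorator that registers a handler for one event type."""
        def register(handler):
            self._handlers[event_type].append(handler)
            return handler
        return register

    def dispatch(self, event: dict) -> list:
        """Run every handler for the event and return their action summaries."""
        handlers = self._handlers.get(event["event_type"], [])
        if not handlers:
            # No automated playbook: hand the event to a human responder.
            return ["escalate_to_human"]
        return [handler(event) for handler in handlers]


# Illustrative session table: session ID -> device ID.
sessions = {"s1": "laptop-7", "s2": "laptop-7", "s3": "phone-2"}
runner = PlaybookRunner()


@runner.on("device_compromised")
def terminate_device_sessions(event: dict) -> str:
    """Containment step: end every session that belongs to the compromised device."""
    doomed = [sid for sid, dev in sessions.items() if dev == event["device_id"]]
    for sid in doomed:
        del sessions[sid]
    return f"terminated {len(doomed)} session(s)"
```

Calling `runner.dispatch({"event_type": "device_compromised", "device_id": "laptop-7"})` would remove both `laptop-7` sessions, while an event type with no registered playbook is escalated rather than silently dropped.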
Testing Strategy
Security Test Scenarios
| Scenario | Expected Result | Components Tested |
|---|---|---|
| Valid user, healthy device | Access granted, logged | P01, P02, P05, P09 |
| Valid user, unhealthy device | Blocked at tunnel, logged | P05, P09, SIEM |
| Invalid JWT | 401 Unauthorized | P01 |
| Unauthorized resource access | 403 Forbidden | P01, P02 |
| Lateral movement attempt | Blocked by micro-seg | P03 |
| Behavior anomaly detected | Session terminated | P06 |
| JIT credential expired | Access denied | P08, Database |
| Certificate revoked | Connection rejected | P04 |
Integration Test Scenarios
| Scenario | Steps | Validation |
|---|---|---|
| End-to-end access | User -> ZTNA -> Proxy -> App | Response received, all logs present |
| Policy update propagation | Update policy in Git | All PEPs enforce within 30s |
| Device compromise response | Emit device_compromised event | All sessions terminated within 60s |
| Credential rotation | Trigger JIT rotation | Old creds fail, new creds work |
| Component failure recovery | Kill policy engine | Access denied (fail-closed), recovery < 5min |
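Several of these validations have the shape "condition X holds within N seconds". A small polling helper covers that shape; the sketch below applies it to the policy propagation scenario. `StubPep` and its `policy_version` method are assumptions standing in for however your enforcement points report their loaded policy version.

```python
import time


def wait_until(predicate, timeout_s, interval_s=0.5,
               clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll `predicate` until it returns True or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class StubPep:
    """Stand-in for a policy enforcement point; a real one would be queried remotely."""

    def __init__(self, version: str):
        self.version = version

    def policy_version(self) -> str:
        return self.version


def all_peps_enforce(peps, expected_version: str) -> bool:
    """True only when every enforcement point reports the expected policy version."""
    return all(pep.policy_version() == expected_version for pep in peps)
```

A propagation test would then push a policy change and assert `wait_until(lambda: all_peps_enforce(peps, new_version), timeout_s=30)`, matching the 30-second target in the table.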
Performance Test Scenarios
| Metric | Target | Test Method |
|---|---|---|
| Authentication latency | < 50ms | Load test P01 with 1000 RPS |
| Policy evaluation latency | < 20ms | Load test P02 with 1000 RPS |
| Tunnel throughput | > 100 Mbps | iperf through P09 |
| Log ingestion rate | > 10,000 events/sec | Flood log collector |
| Alert latency | < 5 seconds | Measure event to alert time |
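The latency targets above do not say which statistic they apply to. One reasonable reading is a high percentile rather than the mean. The sketch below measures call latency and checks it against a target using a nearest-rank percentile; the choice of p99 and the function names are assumptions, and this single-threaded loop is not a substitute for a real load generator driving 1000 RPS.

```python
import math
import time


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def measure_latency_ms(call, iterations: int = 1000) -> list:
    """Time `call()` repeatedly and return per-call latency in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def meets_target(samples, target_ms: float, pct: float = 99) -> bool:
    """True when the chosen percentile is strictly below the target."""
    return percentile(samples, pct) < target_ms
```

For example, `meets_target(measure_latency_ms(check_policy), target_ms=20)` would test the policy evaluation target, where `check_policy` is a hypothetical function that issues one request to P02.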
Self-Assessment Checklist
Architecture Understanding
- I can explain how all five Zero Trust pillars (Identity, Device, Network, Application, Data) are addressed
- I can trace a request through every security layer
- I can describe how compromise of one component is mitigated
- I understand the control plane vs data plane distinction
- I can assess our maturity against the CISA Zero Trust Maturity Model
Integration Verification
- All components share a consistent identity format
- Events propagate correctly between components
- Policy changes affect all enforcement points
- Logs from all components are in the SIEM
- Alerts fire correctly for security events
Security Verification
- Unhealthy devices cannot access resources
- Unauthorized access attempts are blocked and logged
- Compromised sessions are terminated automatically
- Lateral movement is prevented by micro-segmentation
- All connections are encrypted with mTLS
Operations Readiness
- Dashboards show security posture at a glance
- Alerts are configured for all critical scenarios
- Incident response playbooks are documented
- Recovery procedures are tested
- Compliance evidence can be generated
Books That Will Help
| Book | Focus | Chapters |
|---|---|---|
| "Zero Trust Networks" by Gilman & Barth | Complete ZT architecture | All chapters |
| "Security in Computing" by Pfleeger | Security fundamentals | Ch. 1-5, Ch. 10 |
| "The Practice of Network Security Monitoring" by Bejtlich | Security observability | Ch. 1-8 |
| "Incident Response & Computer Forensics" by Luttgens | IR procedures | Ch. 1-5 |
| "Practical Cloud Security" by Dotson | Cloud security architecture | Ch. 1-4 |
| "Site Reliability Engineering" by Google | Operational excellence | Ch. 12-14 |
Interview Questions You Can Now Answer
- "Describe how you would architect a Zero Trust environment for a 500-person company."
- "How do you ensure defense in depth when any single security control might fail?"
- "Walk me through what happens when a user's device is compromised mid-session."
- "How would you measure the maturity of a Zero Trust implementation?"
- "What's the difference between a security event and a security incident, and how do you handle each?"
- "How do you balance security with user experience in a Zero Trust environment?"
- "Describe your approach to security observability and incident response automation."
Conclusion
This capstone represents the culmination of your Zero Trust learning journey. By integrating all nine previous projects, you've learned that security is not about individual components - it's about how they work together to create overlapping, reinforcing controls.
You now understand:
- How to architect complete Zero Trust environments
- Why integration is more important than any single technology
- How defense in depth prevents single points of failure
- Why observability is essential for security
- How to automate incident response for rapid containment
Most importantly, you can now think like a security architect - seeing the whole system, understanding how attacks flow, and designing defenses that work together to protect what matters.
This guide was expanded from ZERO_TRUST_ARCHITECTURE_DEEP_DIVE.md. For the complete learning path, see the project index.