CVE EXPLOIT DATABASES LEARNING PATH

Learn CVE/Exploit Databases: From Zero to Vulnerability Intelligence Master

Goal: Deeply understand the vulnerability ecosystem—how CVEs are discovered, assigned, scored, and tracked; how exploit databases work; how to query and analyze vulnerability data programmatically; and how to build tools that leverage this intelligence for defensive security. You’ll understand the entire lifecycle from vulnerability discovery to patch deployment, and build real tools that security professionals use daily.


Why CVE/Exploit Databases Matter

In 1999, MITRE Corporation created the Common Vulnerabilities and Exposures (CVE) system to solve a critical problem: security tools and databases were using different names for the same vulnerabilities, making it impossible to correlate data across systems. A buffer overflow in Apache might be called “Apache Bug #4532” in one scanner and “HTTPD-2001-0043” in another.

The scale is staggering:

  • The NVD contains 323,540+ CVE records (as of 2024)
  • CVE submissions increased 32% in 2024 alone
  • New CVEs are published at a rate of 50-100 per day
  • The Exploit Database contains 45,000+ public exploits

Why this knowledge is critical:

  1. Defensive Security: You can’t defend against what you don’t understand
  2. Vulnerability Management: Every organization needs to prioritize which vulnerabilities to fix first
  3. Threat Intelligence: Understanding exploitability helps predict attacks
  4. Security Research: Finding and responsibly disclosing vulnerabilities
  5. Compliance: PCI-DSS, HIPAA, SOC2 all require vulnerability management

Real-world impact:

  • The Log4Shell vulnerability (CVE-2021-44228) affected millions of systems worldwide
  • Equifax breach (2017) exploited CVE-2017-5638, a known vulnerability with a public exploit
  • WannaCry ransomware exploited CVE-2017-0144 (EternalBlue), which had patches available

Understanding CVE/Exploit databases is the foundation of modern security operations.


The Vulnerability Ecosystem: A Bird’s Eye View

┌─────────────────────────────────────────────────────────────────────────────┐
│                    THE VULNERABILITY LIFECYCLE                               │
└─────────────────────────────────────────────────────────────────────────────┘

  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
  │  DISCOVERY   │───▶│  REPORTING   │───▶│  ASSIGNMENT  │───▶│ PUBLICATION  │
  │              │    │              │    │              │    │              │
  │ • Research   │    │ • Vendor     │    │ • CNA assigns│    │ • CVE List   │
  │ • Bug Bounty │    │ • CNA        │    │   CVE ID     │    │ • NVD adds   │
  │ • Fuzzing    │    │ • Full Disc. │    │ • Reserved   │    │   CVSS/CPE   │
  │ • Audit      │    │              │    │   status     │    │              │
  └──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
                                                                      │
                                                                      ▼
  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
  │   PATCHING   │◀───│ PRIORITIZE   │◀───│   SCORING    │◀───│  ENRICHMENT  │
  │              │    │              │    │              │    │              │
  │ • Vendor     │    │ • Risk-based │    │ • CVSS Base  │    │ • CWE Type   │
  │   releases   │    │ • Asset      │    │ • Temporal   │    │ • CPE Match  │
  │ • Deploy     │    │   context    │    │ • Environ.   │    │ • References │
  │              │    │              │    │              │    │              │
  └──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
                                                │
                                                ▼
                                    ┌──────────────────────┐
                                    │   EXPLOIT DEVELOPMENT│
                                    │                      │
                                    │ • PoC Creation       │
                                    │ • Weaponization      │
                                    │ • Exploit-DB/MSF     │
                                    └──────────────────────┘

Core Concept 1: The CVE Identifier System

What is a CVE ID?

A CVE ID is a unique identifier for a publicly known cybersecurity vulnerability. It follows the format:

CVE-YYYY-NNNNN

Where:
  CVE  = Prefix (always "CVE")
  YYYY = Year the CVE was assigned (NOT when vuln was discovered)
  NNNNN = Number unique within that year (at least 4 digits, longer in high-volume years; not guaranteed to be issued in order)

Examples:
  CVE-2021-44228  (Log4Shell)
  CVE-2017-0144   (EternalBlue)
  CVE-2014-0160   (Heartbleed)
  CVE-2024-12345  (5-digit ID from high-volume year)
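
A minimal sketch of checking and splitting a CVE ID in Python. The pattern and the parse_cve_id helper are illustrative names, not part of any official library, and the check covers the format only: it cannot tell you whether the ID has actually been assigned.

```python
import re

# Format check only: "CVE-", a four-digit year, then four or more digits.
CVE_ID_PATTERN = re.compile(r"^CVE-(\d{4})-(\d{4,})$")


def parse_cve_id(cve_id: str) -> tuple[int, int]:
    """Return (year, number) for a well-formed CVE ID, else raise ValueError."""
    match = CVE_ID_PATTERN.match(cve_id.strip().upper())
    if match is None:
        raise ValueError(f"not a valid CVE ID: {cve_id!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_cve_id("CVE-2021-44228"))  # (2021, 44228)
print(parse_cve_id("CVE-2014-0160"))   # (2014, 160)
```

Note that the numeric part loses its leading zeros when converted to an integer, so keep the original string if you need to print the ID back out.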

CVE States

┌─────────────────────────────────────────────────────────────────┐
│                      CVE ID LIFECYCLE                           │
└─────────────────────────────────────────────────────────────────┘

  ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
  │ RESERVED │────▶│  PUBLIC  │────▶│ ANALYZED │────▶│ MODIFIED │
  │          │     │          │     │   (NVD)  │     │          │
  │ ID given │     │ Details  │     │ CVSS/CPE │     │ Updates  │
  │ no info  │     │ published│     │ added    │     │ applied  │
  └──────────┘     └──────────┘     └──────────┘     └──────────┘
       │                                                   │
       │           ┌──────────┐                           │
       └──────────▶│ REJECTED │◀──────────────────────────┘
                   │          │
                   │ Invalid/ │
                   │ Duplicate│
                   └──────────┘

Special States:
  • DISPUTED  - Vendor disagrees it's a vulnerability
  • DEFERRED  - NVD won't prioritize enrichment (pre-2018 CVEs)

Core Concept 2: CVE Numbering Authorities (CNAs)

CNAs are organizations authorized to assign CVE IDs within their defined scope.

┌─────────────────────────────────────────────────────────────────┐
│                    CNA HIERARCHY                                 │
└─────────────────────────────────────────────────────────────────┘

                    ┌─────────────────┐
                    │  PROGRAM ROOT   │
                    │    (MITRE)      │
                    └────────┬────────┘
                             │
            ┌────────────────┼────────────────┐
            │                │                │
            ▼                ▼                ▼
    ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
    │   ROOT CNA    │ │   ROOT CNA    │ │   ROOT CNA    │
    │    (CISA)     │ │  (Google)     │ │ (Microsoft)   │
    │ ICS systems   │ │ Google prods  │ │ MS products   │
    └───────┬───────┘ └───────┬───────┘ └───────┬───────┘
            │                 │                 │
            ▼                 ▼                 ▼
    ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
    │   SUB-CNAs    │ │   SUB-CNAs    │ │   SUB-CNAs    │
    │ (Vendors in   │ │ (Android,     │ │ (Azure,       │
    │  ICS space)   │ │  Chrome...)   │ │  GitHub...)   │
    └───────────────┘ └───────────────┘ └───────────────┘

Notable CNAs:
  • MITRE (Program Root) - Last resort CNA
  • CISA - US government, ICS vulnerabilities
  • Major vendors (Microsoft, Apple, Google, Red Hat, etc.)
  • Security companies (Rapid7, Tenable, etc.)
  • Open source projects (Apache, Linux kernel, etc.)

Core Concept 3: The National Vulnerability Database (NVD)

The NVD is the U.S. government’s repository of vulnerability data, maintained by NIST. It enriches CVE data with additional metadata.

┌─────────────────────────────────────────────────────────────────┐
│              CVE vs NVD: What Each Provides                     │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────┐        ┌─────────────────────────────┐
│      CVE LIST           │        │           NVD               │
│     (cve.mitre.org)     │        │      (nvd.nist.gov)         │
├─────────────────────────┤        ├─────────────────────────────┤
│ • CVE ID                │        │ Everything from CVE, PLUS:  │
│ • Description           │───────▶│ • CVSS Scores (v2, v3, v4)  │
│ • References (URLs)     │        │ • CWE Classification        │
│ • Affected products     │        │ • CPE (product matching)    │
│   (vendor statement)    │        │ • Fix information           │
│ • Date published        │        │ • Exploitability metrics    │
│                         │        │ • Impact metrics            │
│                         │        │ • References (expanded)     │
└─────────────────────────┘        └─────────────────────────────┘

NVD API 2.0 Endpoints:
  • /cves      - Query CVE data
  • /cpes      - Query product identifiers
  • /cpematch  - Find products matching criteria
  • /source    - CVE source information

NVD Data Flow

                    CVE Published
                         │
                         ▼
              ┌─────────────────────┐
              │   NVD Receives CVE  │
              │   (via CVE API)     │
              └──────────┬──────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │  Analyst Enrichment │
              │  • Calculate CVSS   │
              │  • Assign CWE       │
              │  • Create CPE match │
              └──────────┬──────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │   Published to NVD  │
              │   (API + Website)   │
              └──────────┬──────────┘
                         │
        ┌────────────────┼────────────────┐
        ▼                ▼                ▼
  ┌───────────┐   ┌───────────┐   ┌───────────┐
  │ NVD API   │   │ Data Feeds│   │  Website  │
  │ (JSON)    │   │ (JSON)    │   │ (Search)  │
  └───────────┘   └───────────┘   └───────────┘

Core Concept 4: CVSS - Common Vulnerability Scoring System

CVSS provides a numerical score (0.0-10.0) representing vulnerability severity.

┌─────────────────────────────────────────────────────────────────┐
│                    CVSS v4.0 METRIC GROUPS                      │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                     BASE METRICS (Required)                     │
│   Intrinsic qualities of a vulnerability - don't change        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  EXPLOITABILITY METRICS          IMPACT METRICS                 │
│  ─────────────────────          ──────────────                  │
│  • Attack Vector (AV)           • Confidentiality (VC/SC)       │
│    Network/Adjacent/Local/      • Integrity (VI/SI)             │
│    Physical                     • Availability (VA/SA)          │
│  • Attack Complexity (AC)                                       │
│    Low/High                     VULNERABLE SYSTEM vs            │
│  • Attack Requirements (AT)     SUBSEQUENT SYSTEM               │
│    None/Present                 (New in v4.0!)                  │
│  • Privileges Required (PR)                                     │
│    None/Low/High                                                │
│  • User Interaction (UI)                                        │
│    None/Passive/Active                                          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                   THREAT METRICS (Optional)                     │
│   Current state of exploit techniques or code availability     │
├─────────────────────────────────────────────────────────────────┤
│  • Exploit Maturity (E): Not Defined/Attacked/PoC/Unreported   │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                ENVIRONMENTAL METRICS (Optional)                 │
│   Customized scoring based on YOUR environment                 │
├─────────────────────────────────────────────────────────────────┤
│  • Modified Base metrics (your specific context)               │
│  • Confidentiality/Integrity/Availability Requirements         │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│               SUPPLEMENTAL METRICS (v4.0 New!)                  │
│   Additional context (not used in score calculation)           │
├─────────────────────────────────────────────────────────────────┤
│  • Safety, Automatable, Recovery, Value Density, etc.          │
└─────────────────────────────────────────────────────────────────┘

CVSS Score Ranges

Score Range    Severity     Color Code    Action Priority
───────────    ────────     ──────────    ───────────────
0.0            None         Gray          Informational
0.1 - 3.9      Low          Green         Schedule patch
4.0 - 6.9      Medium       Yellow        Patch soon
7.0 - 8.9      High         Orange        Patch urgently
9.0 - 10.0     Critical     Red           Patch immediately
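
The ranges in the table translate directly into a lookup function. This is a small sketch; the name cvss_severity is made up for illustration, and it assumes the score has already been rounded to one decimal place, as published CVSS scores are.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS score (0.0-10.0, one decimal place) to its severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


for score in (0.0, 3.9, 4.0, 7.5, 9.3):
    print(score, cvss_severity(score))
```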

Example CVSS v4.0 Vector String:
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N

Breakdown:
  AV:N  = Attack Vector: Network (remotely exploitable)
  AC:L  = Attack Complexity: Low (easy to exploit)
  AT:N  = Attack Requirements: None
  PR:N  = Privileges Required: None (unauthenticated)
  UI:N  = User Interaction: None (no victim action needed)
  VC:H  = Confidentiality Impact (Vulnerable): High
  VI:H  = Integrity Impact (Vulnerable): High
  VA:H  = Availability Impact (Vulnerable): High
  SC:N  = Confidentiality Impact (Subsequent): None
  SI:N  = Integrity Impact (Subsequent): None
  SA:N  = Availability Impact (Subsequent): None

This would score: 9.3 (Critical)
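
The breakdown above can be reproduced with two lookup tables. The sketch below only translates the base-metric abbreviations of a v4.0 vector into readable labels; it does not compute the score, and describe_v4_vector and the table names are illustrative, not taken from any CVSS library.

```python
# Human-readable names for the CVSS v4.0 base metrics.
METRIC_NAMES = {
    "AV": "Attack Vector",
    "AC": "Attack Complexity",
    "AT": "Attack Requirements",
    "PR": "Privileges Required",
    "UI": "User Interaction",
    "VC": "Confidentiality (Vulnerable System)",
    "VI": "Integrity (Vulnerable System)",
    "VA": "Availability (Vulnerable System)",
    "SC": "Confidentiality (Subsequent System)",
    "SI": "Integrity (Subsequent System)",
    "SA": "Availability (Subsequent System)",
}

# Allowed values for the exploitability metrics.
VALUE_NAMES = {
    "AV": {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"},
    "AC": {"L": "Low", "H": "High"},
    "AT": {"N": "None", "P": "Present"},
    "PR": {"N": "None", "L": "Low", "H": "High"},
    "UI": {"N": "None", "P": "Passive", "A": "Active"},
}
# All six impact metrics share the same three values.
IMPACT_VALUES = {"H": "High", "L": "Low", "N": "None"}


def describe_v4_vector(vector: str) -> dict[str, str]:
    """Translate the base metrics of a CVSS v4.0 vector into readable labels."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:4.0":
        raise ValueError(f"expected a CVSS:4.0 vector, got {vector!r}")
    described = {}
    for part in parts:
        metric, _, value = part.partition(":")
        if metric not in METRIC_NAMES:
            continue  # threat, environmental and supplemental metrics are skipped
        value_names = VALUE_NAMES.get(metric, IMPACT_VALUES)
        described[METRIC_NAMES[metric]] = value_names[value]
    return described


example = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N"
for name, value in describe_v4_vector(example).items():
    print(f"{name}: {value}")
```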

Core Concept 5: CWE - Common Weakness Enumeration

CWE categorizes the TYPE of vulnerability (the “how”), while CVE identifies a specific instance.

┌─────────────────────────────────────────────────────────────────┐
│                    CWE HIERARCHY                                │
└─────────────────────────────────────────────────────────────────┘

                        ┌──────────────────┐
                        │   CWE-699        │
                        │ Software Dev.    │
                        │   View           │
                        └────────┬─────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CWE-19        │    │   CWE-20        │    │   CWE-21        │
│ Data Processing │    │ Improper Input  │    │ Pathname        │
│    Errors       │    │   Validation    │    │  Traversal      │
└────────┬────────┘    └────────┬────────┘    └─────────────────┘
         │                      │
         │              ┌───────┴───────┐
         │              │               │
         ▼              ▼               ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  CWE-89     │  │  CWE-79     │  │  CWE-78     │
│ SQL         │  │ Cross-Site  │  │ OS Command  │
│ Injection   │  │ Scripting   │  │ Injection   │
└─────────────┘  └─────────────┘  └─────────────┘

OWASP Top 10 (2021) to CWE Mapping:
  A01 Broken Access Control    → CWE-200, CWE-284, CWE-352...
  A02 Cryptographic Failures   → CWE-259, CWE-327, CWE-328...
  A03 Injection                → CWE-79, CWE-89, CWE-78...
  A04 Insecure Design          → CWE-209, CWE-256, CWE-501...
  A05 Security Misconfiguration→ CWE-16, CWE-611...
  ...

Core Concept 6: CPE - Common Platform Enumeration

CPE provides standardized names for IT products, enabling vulnerability matching.

┌─────────────────────────────────────────────────────────────────┐
│                    CPE 2.3 FORMAT                               │
└─────────────────────────────────────────────────────────────────┘

cpe:2.3:part:vendor:product:version:update:edition:language:sw_edition:target_sw:target_hw:other

Example:
cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*

Breakdown:
  cpe:2.3     = CPE version
  a           = Application (o=OS, h=Hardware)
  apache      = Vendor
  log4j       = Product name
  2.14.1      = Version
  *           = Wildcard (any value)

More Examples:
  cpe:2.3:o:microsoft:windows_10:1903:*:*:*:*:*:*:*
  cpe:2.3:a:openssl:openssl:1.0.1:*:*:*:*:*:*:*
  cpe:2.3:h:cisco:catalyst_9300:-:*:*:*:*:*:*:*

CPE Matching Logic:
┌─────────────────────────────────────────────────────────────────┐
│  CVE-2021-44228 affects:                                        │
│    cpe:2.3:a:apache:log4j:*:*:*:*:*:*:*:*                       │
│    WHERE version >= 2.0-beta9 AND version < 2.15.0              │
│                                                                 │
│  Your inventory:                                                │
│    Server A: cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*  MATCH!│
│    Server B: cpe:2.3:a:apache:log4j:2.17.0:*:*:*:*:*:*:*  SAFE  │
└─────────────────────────────────────────────────────────────────┘
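
The matching logic in the box can be sketched in a few lines of Python. Everything here is a simplification under stated assumptions: parse_cpe splits naively on colons and ignores escaped characters, version_key understands only dotted numeric versions with an optional -alpha/-beta/-rc suffix, and all function names are hypothetical. Real version schemes are messier, so treat this as a starting point rather than a production matcher.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpe:
    part: str
    vendor: str
    product: str
    version: str


def parse_cpe(cpe: str) -> Cpe:
    """Split a CPE 2.3 string into its first four attributes (naive split)."""
    fields = cpe.split(":")
    if len(fields) != 13 or fields[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 string: {cpe!r}")
    return Cpe(*fields[2:6])


_PRE_RANK = {"alpha": 0, "beta": 1, "rc": 2}
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:-(alpha|beta|rc)(\d*))?$")


def version_key(version: str) -> tuple:
    """Build a sortable key; pre-releases sort before the final release."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    release = [int(piece) for piece in match.group(1).split(".")]
    release += [0] * (4 - len(release))  # pad so 2.0 compares equal to 2.0.0
    if match.group(2) is None:
        return (tuple(release), 3, 0)
    return (tuple(release), _PRE_RANK[match.group(2)], int(match.group(3) or 0))


def is_affected(inventory_cpe: str, vendor: str, product: str,
                start_including: str, end_excluding: str) -> bool:
    """True if the inventory item is the named product inside the version range."""
    item = parse_cpe(inventory_cpe)
    if (item.vendor, item.product) != (vendor, product):
        return False
    return (version_key(start_including)
            <= version_key(item.version)
            < version_key(end_excluding))


for server in ("cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*",
               "cpe:2.3:a:apache:log4j:2.17.0:*:*:*:*:*:*:*"):
    print(server, is_affected(server, "apache", "log4j", "2.0-beta9", "2.15.0"))
```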

Core Concept 7: Exploit Databases

Exploit databases collect and organize public exploit code and proof-of-concepts.

┌─────────────────────────────────────────────────────────────────┐
│                MAJOR EXPLOIT DATABASES                          │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  EXPLOIT-DB (exploit-db.com)                                    │
│  Maintained by: Offensive Security                              │
├─────────────────────────────────────────────────────────────────┤
│  • 45,000+ exploits                                             │
│  • Verified/reviewed submissions                                │
│  • Integrated with Metasploit                                   │
│  • CLI tool: searchsploit                                       │
│  • Categories: Remote, Local, WebApp, DoS, Shellcode, Papers    │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  RAPID7 VULNERABILITY DB (rapid7.com/db)                        │
│  Maintained by: Rapid7                                          │
├─────────────────────────────────────────────────────────────────┤
│  • 180,000+ vulnerabilities                                     │
│  • Direct Metasploit module links                               │
│  • Detailed technical analysis                                  │
│  • Remediation guidance                                         │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  PACKET STORM (packetstormsecurity.com)                         │
│  Maintained by: Community                                       │
├─────────────────────────────────────────────────────────────────┤
│  • Security tools & exploits                                    │
│  • News and advisories                                          │
│  • Historical archive                                           │
└─────────────────────────────────────────────────────────────────┘

Exploit Maturity Levels:
┌────────────────────────────────────────────────────────────────┐
│                                                                │
│   Theoretical ──▶ PoC ──▶ Functional ──▶ Weaponized ──▶ ITW   │
│        │           │          │              │            │    │
│   "Possible"   "Works in   "Reliable"   "Packaged"   "Active  │
│                  lab"                    for use"    attacks" │
│                                                                │
│   ITW = In The Wild                                           │
└────────────────────────────────────────────────────────────────┘

Core Concept 8: MITRE ATT&CK Integration

MITRE ATT&CK maps adversary tactics and techniques, which can be linked to CVEs.

┌─────────────────────────────────────────────────────────────────┐
│              CVE → ATT&CK MAPPING                               │
└─────────────────────────────────────────────────────────────────┘

CVE-2021-44228 (Log4Shell)
         │
         ▼
┌─────────────────────────────────────────────────────────────────┐
│  TACTIC: Initial Access                                         │
│  TECHNIQUE: T1190 - Exploit Public-Facing Application          │
├─────────────────────────────────────────────────────────────────┤
│  TACTIC: Execution                                              │
│  TECHNIQUE: T1059 - Command and Scripting Interpreter          │
├─────────────────────────────────────────────────────────────────┤
│  Detection: Monitor for unusual JNDI lookups                    │
│  Mitigation: Update Log4j, disable JNDI lookup                 │
└─────────────────────────────────────────────────────────────────┘

Why This Mapping Matters:
  • Understand attack chains (not just single vulns)
  • Prioritize based on threat actor TTPs
  • Build detection rules
  • Validate defensive coverage

Concept Summary Table

Concept Cluster        What You Need to Internalize
───────────────        ────────────────────────────
CVE System             A CVE is a unique ID for a specific vulnerability instance. The ID tells you when it was assigned, not discovered. States include Reserved, Published, Rejected, and Disputed.
CNA Hierarchy          CNAs assign CVE IDs within their scope. MITRE is the root, vendors are CNAs for their products. Understanding who assigns CVEs helps you track disclosure timelines.
NVD Enrichment         NVD adds CVSS scores, CWE types, and CPE matches to raw CVE data. The NVD API 2.0 is the primary programmatic interface. Data feeds are being deprecated.
CVSS Scoring           CVSS provides severity scores (0-10) based on exploitability and impact. v4.0 adds vulnerable vs subsequent system impacts. Environmental metrics customize scores for YOUR context.
CWE Classification     CWE describes the TYPE of weakness (SQL injection, buffer overflow). A CVE is a specific instance of a CWE. Understanding CWE helps you see patterns across vulnerabilities.
CPE Matching           CPE provides standardized product identifiers. Matching your inventory against CVE-affected CPEs is the core of vulnerability management.
Exploit Databases      Exploit-DB, Rapid7, and PacketStorm collect public exploits. SearchSploit enables offline searching. Exploit maturity ranges from theoretical to in-the-wild.
ATT&CK Integration     Mapping CVEs to ATT&CK techniques connects vulnerabilities to real attack patterns. This enables threat-informed prioritization.

Deep Dive Reading by Concept

This section maps each concept to specific resources for deeper understanding. Read these before or alongside the projects.

Vulnerability Fundamentals

Concept                   Resource
───────                   ────────
CVE System Overview       “Practical Vulnerability Management” by Andrew Magnusson, Ch. 2: “Vulnerability Intelligence”
NVD Data Model            NVD Developers Documentation, API Schema section
Vulnerability Lifecycle   “Effective Vulnerability Management” by Chris Hughes, Ch. 1-2

Scoring and Classification

Concept                   Resource
───────                   ────────
CVSS v4.0 Specification   FIRST.org CVSS v4.0 Specification
CVSS Calculator Usage     FIRST.org CVSS v4.0 User Guide
CWE Understanding         CWE Top 25 Most Dangerous Software Weaknesses

Exploit Research

Concept                      Resource
───────                      ────────
Exploit Development Basics   “Penetration Testing” by Georgia Weidman, Ch. 13-15
Using SearchSploit           Exploit-DB SearchSploit Manual
Metasploit Integration       “Ethical Hacking” by Daniel G. Graham, Ch. 8-10

Advanced Topics

Concept                        Resource
───────                        ────────
ATT&CK Framework               MITRE ATT&CK Getting Started
CVE-to-ATT&CK Mapping          Center for Threat-Informed Defense Project
Vulnerability Prioritization   “Effective Vulnerability Management” by Chris Hughes, Ch. 5-6

Essential Reading Order

For maximum comprehension, read in this order:

  1. Foundation (Week 1):
    • Practical Vulnerability Management Ch. 1-3 (vulnerability basics)
    • NVD Developers Documentation (API understanding)
  2. Scoring & Classification (Week 2):
    • FIRST.org CVSS v4.0 User Guide
    • CWE Top 25 overview
  3. Exploitation Context (Week 3):
    • SearchSploit manual
    • Penetration Testing exploit chapters
  4. Advanced Integration (Week 4):
    • ATT&CK Getting Started
    • CVE-to-ATT&CK methodology

Project List

Projects are ordered from fundamental understanding to advanced implementations. Each project builds on concepts from previous ones.


Project 1: CVE Data Explorer (Understand the Data Model)

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, JavaScript/Node.js
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: API Integration / Data Parsing / JSON
  • Software or Tool: NVD API, Python requests/httpx
  • Main Book: “Practical Vulnerability Management” by Andrew Magnusson

What you’ll build: A command-line tool that queries the NVD API 2.0 to fetch CVE details, display them in human-readable format, and export to JSON/CSV. You’ll implement searching by CVE ID, keyword, date range, and CVSS severity.

Why it teaches CVE/Exploit databases: This forces you to understand the exact structure of CVE data—every field, every nested object. You’ll see firsthand how CVSS vectors are encoded, how CPE matching works, and how NVD enriches raw CVE data.

Core challenges you’ll face:

  • Handling NVD API pagination → maps to understanding rate limits and chunked responses
  • Parsing nested JSON structures → maps to CVE data model comprehension
  • Decoding CVSS vector strings → maps to understanding each metric component
  • Implementing date-based filtering → maps to understanding CVE publication timeline

Key Concepts:

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic Python, understanding of REST APIs, JSON familiarity


Real World Outcome

You’ll have a CLI tool that security analysts would actually use daily. When you run it:

$ ./cve-explorer search --keyword "log4j" --severity critical

╔═══════════════════════════════════════════════════════════════════╗
║                    CVE Search Results                              ║
║                    Keyword: "log4j"                                ║
║                    Severity: CRITICAL                              ║
╠═══════════════════════════════════════════════════════════════════╣

┌─────────────────────────────────────────────────────────────────┐
│ CVE-2021-44228 (Log4Shell)                                      │
├─────────────────────────────────────────────────────────────────┤
│ Published: 2021-12-10    Modified: 2023-11-07                   │
│ CVSS v3.1: 10.0 (CRITICAL)                                      │
│ Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H           │
├─────────────────────────────────────────────────────────────────┤
│ Description:                                                    │
│ Apache Log4j2 2.0-beta9 through 2.15.0 (excluding security      │
│ releases 2.12.3, 2.12.4, and 2.3.1) JNDI features used in       │
│ configuration, log messages, and parameters do not protect      │
│ against attacker controlled LDAP and other JNDI related...      │
├─────────────────────────────────────────────────────────────────┤
│ CWE: CWE-502 (Deserialization of Untrusted Data)               │
│      CWE-400 (Uncontrolled Resource Consumption)               │
├─────────────────────────────────────────────────────────────────┤
│ Affected Products (CPE):                                        │
│   • cpe:2.3:a:apache:log4j:2.0:beta9:*:*:*:*:*:*               │
│   • cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*                │
│   • ... (47 more)                                               │
├─────────────────────────────────────────────────────────────────┤
│ References:                                                     │
│   • https://logging.apache.org/log4j/2.x/security.html         │
│   • https://www.cisa.gov/uscert/apache-log4j-vulnerability...  │
└─────────────────────────────────────────────────────────────────┘

Found 23 CVEs matching criteria. Showing 1 of 23.
Export to CSV? [y/N]:

$ ./cve-explorer get CVE-2021-44228 --format json

{
  "cve_id": "CVE-2021-44228",
  "published": "2021-12-10T10:15:00.000",
  "last_modified": "2023-11-07T03:35:00.000",
  "vuln_status": "Analyzed",
  "description": "Apache Log4j2 2.0-beta9 through 2.15.0...",
  "cvss_v31": {
    "score": 10.0,
    "severity": "CRITICAL",
    "vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H",
    "attack_vector": "Network",
    "attack_complexity": "Low",
    "privileges_required": "None",
    "user_interaction": "None",
    "scope": "Changed",
    "confidentiality_impact": "High",
    "integrity_impact": "High",
    "availability_impact": "High"
  },
  "cwes": ["CWE-502", "CWE-400"],
  "cpes": [...],
  "references": [...]
}

The Core Question You’re Answering

“What exactly IS a CVE record? What data does it contain, where does that data come from, and how is it structured?”

Before you write any code, sit with this question. Most developers have a vague sense of “CVE = security bug” but can’t explain the difference between CVE and NVD data, why CVSS scores differ between sources, or how CPE matching actually works.


Concepts You Must Understand First

Stop and research these before coding:

  1. REST API Fundamentals
    • What is pagination and why do APIs use it?
    • How do rate limits work and why do they exist?
    • What’s the difference between query parameters and path parameters?
    • Resource: Any REST API tutorial
  2. JSON Data Structures
    • How do nested objects work in JSON?
    • What’s the difference between arrays and objects?
    • How do you handle optional/nullable fields?
    • Book Reference: Any Python JSON handling guide
  3. CVE Data Model
    • What fields are in a CVE record?
    • What’s the difference between CVE List data and NVD enrichment?
    • How are CVSS vectors encoded as strings?
    • Resource: NVD API Documentation

Questions to Guide Your Design

Before implementing, think through these:

  1. API Interaction
    • How will you handle the 2000 result limit per request?
    • What happens if the API is down or rate-limited?
    • Should you cache results? For how long?
    • How will you handle API key authentication (optional but recommended)?
  2. Data Presentation
    • How do you make CVSS vectors human-readable?
    • What’s the best way to display long descriptions?
    • How should you handle CVEs with multiple CVSS versions (v2, v3.1, v4.0)?
  3. User Experience
    • What search filters are most useful?
    • Should you support both interactive and scriptable modes?
    • How do you handle “no results found” gracefully?

Thinking Exercise

Trace a CVE Record

Before coding, manually fetch and analyze this CVE using curl:

curl "https://services.nvd.nist.gov/rest/json/cves/2.0?cveId=CVE-2021-44228" | python -m json.tool > log4shell.json

Questions while analyzing the JSON:

  • How many top-level fields are there?
  • Where is the CVSS v3.1 score located? What’s the full path?
  • How many CPE matches are there? What’s the structure of each?
  • What’s the difference between descriptions and metrics?
  • Where do references come from? How are they categorized?
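
If the raw JSON feels overwhelming, a small path walker can help you explore it without spoiling the answers. The sketch below makes no assumptions about NVD field names: it just lists key paths and value types for any decoded JSON document. The iter_paths name and the tiny sample document are made up for illustration.

```python
import json


def iter_paths(node, prefix="", max_depth=4):
    """Yield (path, type name) for each key, descending up to max_depth levels."""
    if max_depth == 0:
        return
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path, type(value).__name__
            yield from iter_paths(value, path, max_depth - 1)
    elif isinstance(node, list) and node:
        # Inspect only the first element; list items usually share a shape.
        yield from iter_paths(node[0], f"{prefix}[0]", max_depth - 1)


# Tiny stand-in document so the sketch runs without the downloaded file.
sample = json.loads('{"a": {"b": [{"c": 1}]}, "d": "x"}')
for path, kind in iter_paths(sample):
    print(f"{path}: {kind}")
```

To use it on the exercise file, load log4shell.json with json.load inside a with open(...) block and pass the result to iter_paths, raising max_depth as you go deeper.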

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain the difference between CVE and NVD. Why do both exist?”
  2. “How would you query the NVD API to find all critical vulnerabilities from the last 30 days?”
  3. “What does each component of a CVSS vector string mean?”
  4. “How would you match a CVE to your organization’s software inventory?”
  5. “What’s the difference between a CVE being ‘Reserved’ vs ‘Published’ vs ‘Rejected’?”
  6. “How do you handle API rate limiting in a production application?”

Hints in Layers

Hint 1: Start Simple Begin with a single function that fetches one CVE by ID. Print the raw JSON. Get comfortable with the data structure before building anything complex.

Hint 2: Understand Pagination The NVD API uses startIndex and resultsPerPage parameters. The response includes totalResults and resultsPerPage to help you calculate how many more requests you need.

Hint 3: Parse CVSS Vectors CVSS vectors are formatted as CVSS:3.1/AV:N/AC:L/.... Split by /, then split each component by :. The first part after CVSS: is the version.
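The splitting described in Hint 3 can be sketched as follows. The function name parse_cvss_vector is illustrative, and the sketch assumes a vector that carries the CVSS: prefix (v3.x and v4.0); CVSS v2 vectors have no such prefix and would need separate handling.

```python
def parse_cvss_vector(vector: str) -> tuple[str, dict[str, str]]:
    """Split a prefixed CVSS vector string into (version, {metric: value})."""
    prefix, _, rest = vector.partition("/")
    if not prefix.startswith("CVSS:") or not rest:
        raise ValueError(f"not a prefixed CVSS vector: {vector!r}")
    version = prefix.split(":", 1)[1]

    metrics: dict[str, str] = {}
    for component in rest.split("/"):
        name, sep, value = component.partition(":")
        if not sep or not name or not value:
            raise ValueError(f"malformed component: {component!r}")
        metrics[name] = value
    return version, metrics


version, metrics = parse_cvss_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H")
print(version, metrics)
```

Keeping the metrics in a plain dict makes it easy to look up human-readable labels later (for example, mapping AV and N to "Attack Vector: Network").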

Hint 4: Use the API Explorer Test your queries in the browser first: https://services.nvd.nist.gov/rest/json/cves/2.0?keywordSearch=log4j&cvssV3Severity=CRITICAL. See exactly what comes back before writing code.


Books That Will Help

Topic | Book | Chapter
NVD data understanding | “Practical Vulnerability Management” by Magnusson | Ch. 2-3
Python API interaction | “Black Hat Python” by Seitz | Ch. 2
JSON handling patterns | “Fluent Python” by Ramalho | Ch. 17-18
CLI tool design | “The Linux Command Line” by Shotts | Ch. 25-26

Implementation Hints

Data structure to represent a CVE: Think about creating a class or dataclass that holds:

  • Core identifiers (CVE ID, status, dates)
  • Description (may have multiple languages)
  • CVSS scores (multiple versions possible)
  • CWE classifications (list)
  • CPE matches (complex nested structure)
  • References (list with tags)
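One possible shape for that data structure is sketched below. The class and function names (CveRecord, record_from_nvd) are illustrative, and the key names assume the NVD API 2.0 response layout, where each entry of "vulnerabilities" wraps a "cve" object; verify them against the log4shell.json you saved earlier before relying on them.

```python
from dataclasses import dataclass, field


@dataclass
class CveRecord:
    cve_id: str
    status: str
    published: str
    last_modified: str
    descriptions: dict[str, str] = field(default_factory=dict)  # language -> text
    cvss: dict[str, dict] = field(default_factory=dict)         # metric key -> cvssData
    cwes: list[str] = field(default_factory=list)
    cpe_matches: list[dict] = field(default_factory=list)
    references: list[dict] = field(default_factory=list)


def record_from_nvd(item: dict) -> CveRecord:
    """Build a CveRecord from one entry of the response's "vulnerabilities" array."""
    cve = item["cve"]
    cvss = {}
    for key, entries in cve.get("metrics", {}).items():
        if not entries:
            continue
        # Prefer the "Primary" assessment when several sources scored the CVE.
        chosen = next((e for e in entries if e.get("type") == "Primary"), entries[0])
        cvss[key] = chosen["cvssData"]
    return CveRecord(
        cve_id=cve["id"],
        status=cve.get("vulnStatus", ""),
        published=cve.get("published", ""),
        last_modified=cve.get("lastModified", ""),
        descriptions={d["lang"]: d["value"] for d in cve.get("descriptions", [])},
        cvss=cvss,
        cwes=[d["value"] for w in cve.get("weaknesses", []) for d in w.get("description", [])],
        cpe_matches=[m for c in cve.get("configurations", [])
                     for n in c.get("nodes", []) for m in n.get("cpeMatch", [])],
        references=[{"url": r["url"], "tags": r.get("tags", [])}
                    for r in cve.get("references", [])],
    )
```

Every optional section is read with .get() and a default, since older or freshly published records may lack metrics, weaknesses, or configurations.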

API request flow:

  1. Build query parameters based on user input
  2. Make initial request
  3. Check totalResults vs resultsPerPage
  4. If more results exist, loop with incrementing startIndex
  5. Aggregate all results
  6. Parse and display

Error handling considerations:

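  • 404: Invalid request, such as a malformed CVE ID (a well-formed but unknown ID may instead return 200 with totalResults of 0, so check both)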
  • 403: Rate limited (wait and retry)
  • 503: Service unavailable
  • Timeout: Network issues
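A minimal retry sketch for these cases, using only the standard library. The function name get_json, the set of retryable status codes, and the 6-second starting delay are assumptions for illustration; the opener and sleep parameters exist only so the logic can be exercised without real network calls.

```python
import json
import time
import urllib.error
import urllib.request

RETRYABLE_STATUS = {403, 429, 503}  # assumed: rate limiting and temporary outages


def get_json(url, max_attempts=4, base_delay=6.0,
             opener=urllib.request.urlopen, sleep=time.sleep):
    """GET a URL and decode JSON, retrying with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url, timeout=30) as response:
                return json.load(response)
        except urllib.error.HTTPError as exc:
            # Non-retryable codes (such as 404) are raised immediately.
            if exc.code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
        except (urllib.error.URLError, TimeoutError):
            # Network-level failure or timeout.
            if attempt == max_attempts:
                raise
        sleep(base_delay * 2 ** (attempt - 1))  # 6s, 12s, 24s, ...
```

HTTPError is caught before URLError because it is a subclass of it; reversing the order would treat every HTTP status as a generic network failure.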

Learning Milestones

  1. You can fetch and display a single CVE → You understand the basic API and data structure
  2. You can search with filters and handle pagination → You understand how to work within API constraints
  3. You can parse and explain every CVSS metric → You deeply understand vulnerability scoring
  4. You can export structured data for further analysis → You can integrate this into larger workflows

Project 2: CVSS Calculator & Analyzer

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: JavaScript, Go, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Security Metrics / Risk Analysis
  • Software or Tool: CVSS Calculator libraries, FIRST.org specification
  • Main Book: “Effective Vulnerability Management” by Chris Hughes

What you’ll build: A CVSS v4.0 calculator that takes metric values interactively or via vector string, calculates scores, and explains the reasoning. It will also analyze existing CVEs and show how environmental modifications change severity for YOUR context.

Why it teaches CVE/Exploit databases: CVSS is the universal language of vulnerability severity. Building a calculator forces you to understand every metric, the mathematical formulas, and why different environments yield different scores.

Core challenges you’ll face:

  • Implementing CVSS v4.0 scoring (equivalence classes, MacroVector lookup, interpolation) → maps to understanding severity calculation mathematics
  • Handling metric dependencies → maps to how some metrics affect others
  • Environmental score customization → maps to context-specific risk
  • Comparing v3.1 vs v4.0 scores → maps to understanding scoring evolution

Key Concepts:

  • CVSS v4.0 Specification: FIRST.org Full Specification
  • Score Calculation: CVSS v4.0 Specification — Section 7
  • Environmental Metrics: CVSS v4.0 User Guide — Environmental section
  • Risk Context: “Effective Vulnerability Management” Ch. 5

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Project 1 completed, understanding of basic statistics


Real World Outcome

$ ./cvss-calc interactive

╔═══════════════════════════════════════════════════════════════════╗
║                 CVSS v4.0 Interactive Calculator                  ║
╚═══════════════════════════════════════════════════════════════════╝

Let's calculate a CVSS score. I'll ask about each metric.

═══════════════════════════════════════════════════════════════════
EXPLOITABILITY METRICS
═══════════════════════════════════════════════════════════════════

Attack Vector (AV) - How can the attacker reach the vulnerable component?
  [N] Network    - Remotely exploitable over the internet
  [A] Adjacent   - Requires same network segment (LAN/WiFi)
  [L] Local      - Requires local access to the system
  [P] Physical   - Requires physical access to the device

Your choice: N

Attack Complexity (AC) - How complex is the attack to execute?
  [L] Low  - No special conditions, works reliably
  [H] High - Requires specific conditions or preparation

Your choice: L

... (continues for all metrics)

═══════════════════════════════════════════════════════════════════
RESULTS
═══════════════════════════════════════════════════════════════════

Vector String: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N

┌─────────────────────────────────────────────────────────────────┐
│                        BASE SCORE: 9.3                          │
│                        Severity: CRITICAL                        │
└─────────────────────────────────────────────────────────────────┘

Score Breakdown:
  Exploitability: HIGH
    • Network-accessible (AV:N)
    • Low complexity (AC:L)
    • No privileges needed (PR:N)
    • No user interaction (UI:N)

  Impact: HIGH
    • Full confidentiality loss (VC:H)
    • Full integrity loss (VI:H)
    • Full availability loss (VA:H)

Would you like to apply Environmental modifications? [y/N]: y

Your organization context:
  Confidentiality Requirement: [H]igh / [M]edium / [L]ow? M
  Integrity Requirement: H
  Availability Requirement: L

┌─────────────────────────────────────────────────────────────────┐
│                   ENVIRONMENTAL SCORE: 8.1                      │
│                   Severity: HIGH (was CRITICAL)                 │
└─────────────────────────────────────────────────────────────────┘

Why the difference?
  The vulnerability itself affects confidentiality, integrity,
  and availability equally. Your lower Confidentiality and
  Availability Requirements reduce how much those impacts
  count in YOUR specific context.

The Core Question You’re Answering

“How do we objectively measure how ‘bad’ a vulnerability is? And why might the same vulnerability be critical for one organization but medium for another?”

CVSS provides a framework for consistent severity assessment, but the “right” score depends on context. Understanding this transforms how you think about vulnerability prioritization.


Concepts You Must Understand First

Stop and research these before coding:

  1. CVSS Metric Groups
    • What’s the difference between Base, Threat, and Environmental metrics?
    • Why are some metrics “required” and others “optional”?
    • How does the Supplemental group differ from the others?
    • Resource: CVSS v4.0 User Guide
  2. Impact vs Exploitability
    • What makes something “exploitable”?
    • How do you measure “impact” on confidentiality, integrity, availability?
    • What’s the difference between “Vulnerable System” and “Subsequent System”?
    • Resource: CVSS v4.0 Specification, Base Metrics section (Section 2)
  3. Environmental Context
    • What are Confidentiality/Integrity/Availability Requirements?
    • How do Modified metrics work?
    • When should an organization use environmental scoring?
    • Book: “Effective Vulnerability Management” Ch. 5

Questions to Guide Your Design

Before implementing, think through these:

  1. Score Calculation
    • How does the v4.0 formula differ from v3.1?
    • What happens when metrics are “Not Defined”?
    • How do you handle metric dependencies?
  2. User Interface
    • How do you explain each metric clearly to non-experts?
    • Should you support both interactive and vector-string input?
    • How do you visualize the score breakdown?
  3. Practical Application
    • How would you batch-process many CVEs?
    • Can you compare scores across CVSS versions?
    • How do you export results for reporting?

Thinking Exercise

Manually Calculate a Score

Take the Log4Shell CVSS vector and manually trace through each metric:

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H

Questions:

  • What does each metric abbreviation mean?
  • Why is Scope “Changed” (S:C)?
  • How would this score change if Attack Complexity were High?
  • What if it required user interaction (UI:R)?
  • How would YOUR organization’s environmental context modify this?

The Interview Questions They’ll Ask

  1. “Walk me through how you’d assess the severity of a newly discovered vulnerability.”
  2. “Why might two organizations assign different CVSS scores to the same CVE?”
  3. “What’s the difference between CVSS v3.1 and v4.0?”
  4. “How do you decide whether to use Base, Temporal (renamed Threat in v4.0), or Environmental scores?”
  5. “A vendor says a vulnerability is ‘Medium’ but CVSS says ‘Critical’. How do you resolve this?”

Hints in Layers

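Hint 1: Start with the Specification The CVSS v4.0 specification defines the scoring method precisely: metric values are grouped into equivalence classes that form a MacroVector, a lookup table gives the MacroVector’s score, and the result is interpolated within that class. (The closed-form equations belong to v3.1.) Start by implementing the method exactly as documented.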

Hint 2: Use the Official Calculator to Verify Test your implementation against FIRST.org’s official calculator. Your scores should match exactly.

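Hint 3: Metric Value Mappings In v3.1, each metric value maps to a numerical constant. For example, Attack Vector: Network = 0.85, Adjacent = 0.62, Local = 0.55, Physical = 0.2. In v4.0, metric values instead determine the equivalence-class levels that index the score lookup table.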

Hint 4: Handle Edge Cases

  • What if all impacts are “None”? (Score should be 0.0)
  • What if Scope is “Changed”? (v3.1 only: impacts need adjustment; v4.0 replaces Scope with separate Subsequent System impact metrics)
  • What about “Not Defined” values? (Use defaults)
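For v4.0, the first scoring step is deriving the six-digit MacroVector from the metric values. The sketch below follows the equivalence-class definitions (EQ1 to EQ6) as I understand them from the v4.0 specification; treat it as an assumption to check against the specification and the official calculator. It handles Base metrics plus defaults for E, CR, IR, and AR, ignores the other Modified (M-prefixed) metrics, and omits the lookup table and interpolation entirely.

```python
# "Not Defined" (X) values are scored as the worst case.
DEFAULTS = {"E": "A", "CR": "H", "IR": "H", "AR": "H"}


def macrovector(vector: str) -> str:
    """Return the EQ1..EQ6 levels of a CVSS:4.0 vector as a 6-digit string."""
    parts = vector.split("/")
    if parts[0] != "CVSS:4.0":
        raise ValueError("expected a CVSS:4.0 vector")
    m = dict(p.split(":", 1) for p in parts[1:])
    for name, default in DEFAULTS.items():
        if m.get(name, "X") == "X":
            m[name] = default

    # EQ1: how exposed the attack path is (AV, PR, UI).
    exposed = [m["AV"] == "N", m["PR"] == "N", m["UI"] == "N"]
    if all(exposed):
        eq1 = 0
    elif any(exposed) and m["AV"] != "P":
        eq1 = 1
    else:
        eq1 = 2

    # EQ2: attack complexity and requirements.
    eq2 = 0 if (m["AC"] == "L" and m["AT"] == "N") else 1

    # EQ3: impact on the vulnerable system.
    if m["VC"] == "H" and m["VI"] == "H":
        eq3 = 0
    elif "H" in (m["VC"], m["VI"], m["VA"]):
        eq3 = 1
    else:
        eq3 = 2

    # EQ4: impact on subsequent systems (Safety only via MSI/MSA).
    if m.get("MSI") == "S" or m.get("MSA") == "S":
        eq4 = 0
    elif "H" in (m["SC"], m["SI"], m["SA"]):
        eq4 = 1
    else:
        eq4 = 2

    # EQ5: exploit maturity.
    eq5 = {"A": 0, "P": 1, "U": 2}[m["E"]]

    # EQ6: a High requirement paired with a High impact.
    pairs = (("CR", "VC"), ("IR", "VI"), ("AR", "VA"))
    eq6 = 0 if any(m[r] == "H" and m[i] == "H" for r, i in pairs) else 1

    return f"{eq1}{eq2}{eq3}{eq4}{eq5}{eq6}"


print(macrovector("CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N"))
```

The resulting string is the key into the specification's score lookup table; comparing it with the MacroVector shown by the FIRST.org calculator is a quick way to validate this step in isolation.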

Books That Will Help

Topic | Book | Chapter
Risk assessment theory | “Effective Vulnerability Management” by Hughes | Ch. 4-6
CVSS deep dive | FIRST.org CVSS v4.0 Specification | Sections 5-7
Security metrics | “Security Metrics” by Jaquith | Ch. 3-5
Python math | “Python for Data Analysis” by McKinney | Ch. 4

Implementation Hints

Metric value mappings (partial example for v3.1):

Attack Vector:
  Network (N) = 0.85
  Adjacent (A) = 0.62
  Local (L) = 0.55
  Physical (P) = 0.2

Privileges Required (when Scope=Unchanged):
  None (N) = 0.85
  Low (L) = 0.62
  High (H) = 0.27

Score formula approach (v3.1; v4.0 replaces steps 3-5 with a MacroVector lookup and interpolation):

  1. Parse vector string into metric values
  2. Look up numerical constants for each value
  3. Calculate Exploitability sub-score
  4. Calculate Impact sub-score
  5. Combine using the specified formula
  6. Apply Environmental modifications if present
  7. Round up to one decimal place (the specification’s Roundup function, not ordinary rounding)
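As a sketch of steps 1-5 and 7 for the v3.1 Base score only (no Temporal or Environmental metrics), the following uses the constants and equations from the CVSS v3.1 specification. The function names are illustrative, and input validation is omitted for brevity.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required depends on Scope: "U" (Unchanged) or "C" (Changed).
PR = {"U": {"N": 0.85, "L": 0.62, "H": 0.27},
      "C": {"N": 0.85, "L": 0.68, "H": 0.5}}


def roundup(value: float) -> float:
    """Round up to one decimal, avoiding floating-point artifacts (v3.1 Roundup)."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_v31(vector: str) -> float:
    m = dict(part.split(":", 1) for part in vector.split("/")[1:])
    scope = m["S"]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[scope][m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score_v31("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))  # Log4Shell vector
```

Changing one metric at a time in the vector and re-running the function is a quick way to work through the questions in the Thinking Exercise, and the results can be checked against the official calculator.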

Learning Milestones

  1. You can parse any CVSS vector string → You understand the encoding format
  2. Your Base scores match official calculator → You implemented the formula correctly
  3. You can explain why each metric affects the score → Deep understanding of severity factors
  4. You can calculate Environmental scores → You understand context-specific risk

Project 3: Exploit Database Searcher (SearchSploit Clone)

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, Bash
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Exploit Research / Text Search / Database
  • Software or Tool: Exploit-DB, SQLite, Full-text search
  • Main Book: “Penetration Testing” by Georgia Weidman

What you’ll build: An offline exploit database searcher that mirrors Exploit-DB locally, indexes exploits in SQLite with full-text search, and finds relevant exploits by CVE ID, platform, or keyword. Essentially, your own SearchSploit.

Why it teaches CVE/Exploit databases: This connects CVE theory to practical exploitation. You’ll understand exploit categorization, how CVEs map to actual attack code, and why exploit availability dramatically changes vulnerability priority.

Core challenges you’ll face:

  • Downloading and parsing Exploit-DB archive → maps to understanding exploit metadata structure
  • Building full-text search index → maps to efficient exploit discovery
  • CVE-to-Exploit mapping → maps to connecting vulnerabilities to real attacks
  • Platform/type filtering → maps to understanding exploit categories

Key Concepts:

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Project 1 completed, basic SQL knowledge, understanding of file I/O


Real World Outcome

$ ./exploit-search sync
Syncing Exploit-DB repository...
Cloning from gitlab.com/exploit-database/exploitdb...
Processing exploits: 45,234 total
  - Remote: 12,456
  - Local: 8,234
  - WebApps: 18,543
  - DoS: 3,421
  - Shellcode: 2,580
Building full-text search index...
Done! Database ready at ~/.exploit-search/exploits.db

$ ./exploit-search find "apache log4j"

╔═══════════════════════════════════════════════════════════════════╗
║                    Exploit Search Results                          ║
║                    Query: "apache log4j"                          ║
╠═══════════════════════════════════════════════════════════════════╣

┌─────────────────────────────────────────────────────────────────┐
│ [50592] Apache Log4j 2 - Remote Code Execution (RCE)           │
├─────────────────────────────────────────────────────────────────┤
│ Platform: Multiple    Type: Remote    Date: 2021-12-14         │
│ CVE: CVE-2021-44228                                            │
│ Author: marcioalm                                               │
│ Path: exploits/multiple/remote/50592.py                        │
│ Verified: Yes                                                   │
├─────────────────────────────────────────────────────────────────┤
│ Description:                                                    │
│ This module exploits the Log4Shell vulnerability in Apache     │
│ Log4j by triggering a JNDI lookup to a malicious server...     │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ [50590] Apache Log4j - JNDI LDAP Injection Scanner             │
├─────────────────────────────────────────────────────────────────┤
│ Platform: Multiple    Type: Remote    Date: 2021-12-12         │
│ CVE: CVE-2021-44228                                            │
│ Path: exploits/multiple/remote/50590.py                        │
└─────────────────────────────────────────────────────────────────┘

Found 7 exploits.

$ ./exploit-search cve CVE-2017-0144

╔═══════════════════════════════════════════════════════════════════╗
║                    Exploits for CVE-2017-0144                     ║
║                    (EternalBlue / MS17-010)                       ║
╠═══════════════════════════════════════════════════════════════════╣

  [42315] Microsoft Windows - 'EternalBlue' SMB Remote Code Execution
          Platform: Windows  |  Type: Remote  |  Verified: Yes
          Path: exploits/windows/remote/42315.py
          Metasploit: Yes (exploit/windows/smb/ms17_010_eternalblue)

  [42030] Microsoft Windows 7/2008 R2 - SMB Remote Code Execution
          Platform: Windows  |  Type: Remote  |  Verified: Yes
          Path: exploits/windows/remote/42030.py

Found 4 exploits for this CVE.

The Core Question You’re Answering

“A CVE exists for a vulnerability, but does an exploit? How do I quickly find if there’s actual attack code available?”

Exploit availability transforms risk assessment. A CVSS 9.8 with no public exploit is a different risk from a CVSS 7.5 with a weaponized Metasploit module. This project teaches you to bridge that gap.


Concepts You Must Understand First

Stop and research these before coding:

  1. Exploit Categories
    • What’s the difference between Remote, Local, and WebApp exploits?
    • What is shellcode? Why is it categorized separately?
    • What does “Verified” mean in Exploit-DB?
    • Resource: Exploit-DB Categories
  2. Full-Text Search
    • How does FTS5 indexing work in SQLite?
    • What’s the difference between exact match and fuzzy search?
    • How do you rank search results by relevance?
    • Resource: SQLite FTS5 documentation
  3. Exploit Maturity
    • What’s a PoC vs weaponized exploit?
    • How do Metasploit modules relate to Exploit-DB entries?
    • Why does exploit age matter?
    • Book: “Penetration Testing” Ch. 13-14

Questions to Guide Your Design

Before implementing, think through these:

  1. Data Architecture
    • How do you structure the database schema?
    • What fields should be indexed for search?
    • How do you handle exploits without CVE references?
  2. Search Experience
    • How do you rank results (newest? most relevant? verified first?)
    • Should you support boolean queries (AND, OR, NOT)?
    • How do you handle platform filtering?
  3. Maintenance
    • How often should the database sync?
    • How do you handle incremental updates vs full refresh?
    • What about disk space management?

Thinking Exercise

Analyze an Exploit Entry

Look at this Exploit-DB entry structure:

# Exploit Title: Apache Log4j 2 - Remote Code Execution (RCE)
# Date: 2021-12-14
# Exploit Author: marcioalm
# Vendor Homepage: https://logging.apache.org/log4j/2.x/
# Software Link: https://logging.apache.org/log4j/2.x/download.html
# Version: 2.0-beta9 <= 2.14.1
# Tested on: Linux
# CVE: CVE-2021-44228

Questions:

  • What metadata can you extract programmatically?
  • How reliable is the CVE field? (Hint: not all exploits have it)
  • What’s the relationship between “Tested on” and actual applicability?
  • How would you determine if this works on Windows?

The Interview Questions They’ll Ask

  1. “How would you determine if a vulnerability has public exploit code?”
  2. “What’s the difference between Exploit-DB and Metasploit?”
  3. “How do you assess if an exploit is reliable?”
  4. “Walk me through how you’d find exploits for a specific CVE.”
  5. “What’s the risk difference between a PoC and a Metasploit module?”

Hints in Layers

Hint 1: Start with the Data Clone the Exploit-DB repo: git clone https://gitlab.com/exploit-database/exploitdb.git. Explore the directory structure. Each exploit is a file with metadata in comments.

Hint 2: Parse the Header Exploit metadata is in comments at the top of each file. Lines starting with # followed by field names contain the data. Regular expressions work well here.
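A regular-expression sketch of that header parsing, assuming the "# Field: value" layout shown in the Thinking Exercise. Many real files use other comment styles (//, /* */, or none at all), so treat a missing field as normal; the function name and the 40-line scan limit are arbitrary choices for illustration.

```python
import re

# "# Field Name: value" where the field name contains only letters and spaces.
HEADER_FIELD = re.compile(r"^#\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(.+?)\s*$")
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def parse_exploit_header(text: str, max_lines: int = 40) -> dict[str, str]:
    """Extract header fields from the leading comment lines of an exploit file."""
    fields: dict[str, str] = {}
    for line in text.splitlines()[:max_lines]:
        match = HEADER_FIELD.match(line)
        if match:
            # Keep the first occurrence of each field, keyed in lowercase.
            fields.setdefault(match.group(1).lower(), match.group(2))
    return fields


sample = """#!/usr/bin/env python3
# Exploit Title: Apache Log4j 2 - Remote Code Execution (RCE)
# Date: 2021-12-14
# Vendor Homepage: https://logging.apache.org/log4j/2.x/
# Tested on: Linux
# CVE: CVE-2021-44228
import sys
"""
header = parse_exploit_header(sample)
print(header["exploit title"], CVE_ID.findall(header.get("cve", "")))
```

Extracting CVE IDs with a separate pattern, rather than trusting the field value as-is, copes with entries that list several IDs or add surrounding text.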

Hint 3: SQLite FTS5 Create a virtual table: CREATE VIRTUAL TABLE exploits_fts USING fts5(title, description, content). Then search with: SELECT * FROM exploits_fts WHERE exploits_fts MATCH 'apache AND log4j'.

Hint 4: CSV Shortcut Exploit-DB provides files_exploits.csv in the repo with all metadata. This is easier than parsing every file, though you lose the actual exploit code content for searching.


Books That Will Help

Topic | Book | Chapter
Exploit categorization | “Penetration Testing” by Weidman | Ch. 13-15
SQLite full-text search | SQLite official documentation | FTS5 section
Search ranking algorithms | “Introduction to Information Retrieval” | Ch. 6-7
Python file handling | “Automate the Boring Stuff” by Sweigart | Ch. 9-10

Implementation Hints

Database schema approach:

exploits:
  - id (INTEGER PRIMARY KEY)
  - edb_id (TEXT UNIQUE) - Exploit-DB ID
  - title (TEXT)
  - date_published (DATE)
  - author (TEXT)
  - platform (TEXT) - windows, linux, multiple, etc.
  - type (TEXT) - remote, local, webapps, dos, shellcode
  - cve (TEXT) - Can be NULL
  - verified (BOOLEAN)
  - file_path (TEXT)
  - content (TEXT) - Full exploit code

exploits_fts (FTS5):
  - Maps to title, content for full-text search
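The schema above could be realized roughly as follows with Python's built-in sqlite3 module. This sketch assumes your SQLite build includes FTS5 (most do, but it is a compile-time option) and uses an external-content FTS table so the exploit text is stored only once. The sample rows are trimmed placeholders, not real exploit content.

```python
import sqlite3

SCHEMA = """
CREATE TABLE exploits (
    id             INTEGER PRIMARY KEY,
    edb_id         TEXT UNIQUE,
    title          TEXT,
    date_published TEXT,
    author         TEXT,
    platform       TEXT,
    type           TEXT,
    cve            TEXT,
    verified       INTEGER,
    file_path      TEXT,
    content        TEXT
);
CREATE VIRTUAL TABLE exploits_fts
    USING fts5(title, content, content='exploits', content_rowid='id');
"""


def fts_phrase(term: str) -> str:
    """Quote a term so characters like '-' in CVE IDs are not read as FTS5 syntax."""
    return '"' + term.replace('"', '""') + '"'


def search(conn: sqlite3.Connection, query: str) -> list[tuple]:
    sql = """
        SELECT e.edb_id, e.title
        FROM exploits_fts
        JOIN exploits AS e ON e.id = exploits_fts.rowid
        WHERE exploits_fts MATCH ?
        ORDER BY exploits_fts.rank
    """
    return conn.execute(sql, (query,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO exploits (edb_id, title, cve, content) VALUES (?, ?, ?, ?)",
    [
        ("50592", "Apache Log4j 2 - Remote Code Execution (RCE)",
         "CVE-2021-44228", "# CVE: CVE-2021-44228 (placeholder body)"),
        ("42315", "Microsoft Windows - 'EternalBlue' SMB Remote Code Execution",
         "CVE-2017-0144", "# CVE: CVE-2017-0144 (placeholder body)"),
    ],
)
# External-content tables are not updated automatically; rebuild after bulk loads.
conn.execute("INSERT INTO exploits_fts(exploits_fts) VALUES ('rebuild')")

print(search(conn, "apache AND log4j"))
print(search(conn, fts_phrase("CVE-2017-0144")))
```

Passing raw user input straight to MATCH fails on strings such as CVE-2021-44228, because the hyphen is query syntax; quoting it as a phrase is one simple way around that.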

Sync strategy:

  1. Git pull (or clone if first time)
  2. Parse files_exploits.csv for metadata
  3. Optionally read full file content for FTS
  4. Upsert into SQLite
  5. Rebuild FTS index
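Steps 2 and 4 might look like the sketch below. The column names (id, file, description, date_published, author, type, platform, verified, codes) are assumptions about the layout of files_exploits.csv; check them against the header row of your copy. The second sample row is a made-up entry used to show the no-CVE case, and the upsert syntax requires SQLite 3.24 or newer.

```python
import csv
import io
import re
import sqlite3

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS exploits (
    id INTEGER PRIMARY KEY, edb_id TEXT UNIQUE, title TEXT, date_published TEXT,
    author TEXT, platform TEXT, type TEXT, cve TEXT, verified INTEGER, file_path TEXT
)"""

UPSERT = """
INSERT INTO exploits (edb_id, title, date_published, author, platform, type, cve, verified, file_path)
VALUES (:edb_id, :title, :date_published, :author, :platform, :type, :cve, :verified, :file_path)
ON CONFLICT(edb_id) DO UPDATE SET
    title = excluded.title, date_published = excluded.date_published,
    author = excluded.author, platform = excluded.platform, type = excluded.type,
    cve = excluded.cve, verified = excluded.verified, file_path = excluded.file_path
"""


def parse_metadata(csv_text: str) -> list[dict]:
    """Turn the metadata CSV into rows matching the exploits table."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cves = CVE_RE.findall(row.get("codes") or "")
        rows.append({
            "edb_id": row["id"],
            "title": row["description"],
            "date_published": row.get("date_published"),
            "author": row.get("author"),
            "platform": row.get("platform"),
            "type": row.get("type"),
            "cve": ";".join(cves) or None,  # NULL when no CVE is referenced
            "verified": int(row.get("verified") or 0),
            "file_path": row["file"],
        })
    return rows


def sync(conn: sqlite3.Connection, csv_text: str) -> int:
    rows = parse_metadata(csv_text)
    with conn:  # one transaction for the whole batch
        conn.executemany(UPSERT, rows)
    return len(rows)


SAMPLE_CSV = (
    "id,file,description,date_published,author,type,platform,verified,codes\n"
    "50592,exploits/multiple/remote/50592.py,Apache Log4j 2 - Remote Code Execution (RCE),"
    "2021-12-14,marcioalm,remote,multiple,1,CVE-2021-44228\n"
    "99999,exploits/linux/dos/99999.txt,Hypothetical entry without a CVE,"
    "2000-01-01,nobody,dos,linux,0,\n"
)
```

Because the upsert is keyed on edb_id, running sync repeatedly after each git pull updates changed rows without creating duplicates.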

Learning Milestones

  1. You can download and parse Exploit-DB locally → You understand the data source
  2. You can search by keyword with relevant results → Full-text search works
  3. You can map CVEs to exploits → You connect vulnerabilities to attacks
  4. You can filter by platform/type/verified → You understand exploit categorization

Project 4: Vulnerability Scanner Integration (CPE Matching)

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Asset Management / Vulnerability Matching
  • Software or Tool: NVD API, CPE Dictionary
  • Main Book: “Practical Vulnerability Management” by Andrew Magnusson

What you’ll build: A tool that takes a software inventory (list of installed software with versions) and matches it against the NVD to find applicable CVEs. This is the core logic behind commercial vulnerability scanners.

Why it teaches CVE/Exploit databases: CPE matching is the bridge between “vulnerabilities exist” and “YOUR systems are vulnerable.” Understanding this transforms abstract CVE data into actionable intelligence.

Core challenges you’ll face:

  • CPE string construction → maps to standardized product identification
  • Version range matching → maps to understanding affected version logic
  • Handling CPE wildcards → maps to partial and flexible matching
  • False positive reduction → maps to practical vulnerability management

Key Concepts:

  • CPE 2.3 Specification: NIST CPE Specification
  • CPE Match API: NVD CPE Match API
  • Version Comparison: Semantic versioning principles
  • Inventory Management: “Practical Vulnerability Management” Ch. 4

Difficulty: Intermediate Time estimate: 2 weeks Prerequisites: Projects 1-2 completed, understanding of version numbering


Real World Outcome

$ cat inventory.json
{
  "hostname": "web-server-01",
  "software": [
    {"name": "Apache HTTP Server", "vendor": "Apache", "version": "2.4.49"},
    {"name": "OpenSSL", "vendor": "OpenSSL", "version": "1.1.1k"},
    {"name": "Log4j", "vendor": "Apache", "version": "2.14.1"},
    {"name": "nginx", "vendor": "nginx", "version": "1.21.0"},
    {"name": "PHP", "vendor": "PHP Group", "version": "7.4.3"}
  ]
}

$ ./vuln-matcher scan inventory.json

╔═══════════════════════════════════════════════════════════════════╗
║              Vulnerability Scan Results                            ║
║              Host: web-server-01                                   ║
║              Software items: 5                                     ║
║              Scan time: 2024-12-22 14:32:01                       ║
╠═══════════════════════════════════════════════════════════════════╣

┌─────────────────────────────────────────────────────────────────┐
│ CRITICAL (2)                                                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ ▶ Apache Log4j 2.14.1                                          │
│   CPE: cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*             │
│                                                                 │
│   CVE-2021-44228 (CVSS 10.0) - Log4Shell RCE                   │
│     Affected: 2.0-beta9 <= version < 2.15.0                    │
│     Your version: 2.14.1 ✗ VULNERABLE                          │
│     Fix: Upgrade to 2.17.1+                                    │
│     Exploits available: 7 (Metasploit module exists)           │
│                                                                 │
│   CVE-2021-45046 (CVSS 9.0) - Log4j DoS/RCE                   │
│     Affected: 2.0-beta9 <= version < 2.16.0                    │
│     Your version: 2.14.1 ✗ VULNERABLE                          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ HIGH (2)                                                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ ▶ Apache HTTP Server 2.4.49                                    │
│   CPE: cpe:2.3:a:apache:http_server:2.4.49:*:*:*:*:*:*:*       │
│                                                                 │
│   CVE-2021-41773 (CVSS 7.5) - Path Traversal/RCE               │
│     Affected: 2.4.49 exactly                                   │
│     Your version: 2.4.49 ✗ VULNERABLE                          │
│     Fix: Upgrade to 2.4.51+                                    │
│     Exploits available: 12 (actively exploited ITW)            │
│                                                                 │
│ ▶ OpenSSL 1.1.1k                                               │
│   CVE-2021-3711 (CVSS 7.5) - SM2 Decryption Buffer Overflow    │
│     Fix: Upgrade to 1.1.1l+                                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ MEDIUM (0) | LOW (0)                                           │
└─────────────────────────────────────────────────────────────────┘

╔═══════════════════════════════════════════════════════════════════╗
║ SUMMARY                                                           ║
╠═══════════════════════════════════════════════════════════════════╣
║ Critical: 2  |  High: 2  |  Medium: 0  |  Low: 0                 ║
║                                                                   ║
║ Priority Actions:                                                 ║
║   1. URGENT: Upgrade Log4j to 2.17.1+ (actively exploited)       ║
║   2. URGENT: Upgrade Apache HTTPD to 2.4.51+ (actively exploited)║
║   3. HIGH: Upgrade OpenSSL to 1.1.1l+                            ║
╚═══════════════════════════════════════════════════════════════════╝

The Core Question You’re Answering

“I know what software is installed on my systems. How do I efficiently find which CVEs actually affect MY specific versions?”

This is the fundamental question of vulnerability management. Answering it requires understanding CPE construction, version comparison logic, and the NVD’s matching algorithms.


Concepts You Must Understand First

Stop and research these before coding:

  1. CPE Format
    • What does each CPE field represent?
    • How do wildcards (*) work?
    • What’s the difference between CPE 2.2 and 2.3?
    • Resource: NVD CPE Specification
  2. Version Matching
    • How do “versionStartIncluding” and “versionEndExcluding” work, and how do they differ from “versionStartExcluding” and “versionEndIncluding”?
    • What about non-semantic versions (dates, codenames)?
    • How do you handle “all versions” matches?
    • Resource: NVD API documentation on CPE Match
  3. Product Identification
    • How do you map “Apache HTTP Server” to the correct CPE vendor/product?
    • What about products with multiple names (nginx vs engine-x)?
    • How do you handle vendor ambiguity?
    • Book: “Practical Vulnerability Management” Ch. 4

Questions to Guide Your Design

Before implementing, think through these:

  1. Inventory Input
    • What formats should you accept? (JSON, CSV, SBOM?)
    • How do you handle unknown/unrecognized software?
    • What about software without clear version numbers?
  2. Matching Logic
    • Should you use the CPE Match API or local matching?
    • How do you handle version range comparisons?
    • What about CVEs that affect “all versions”?
  3. Output & Prioritization
    • How do you sort results? (CVSS? Exploit availability? Age?)
    • What remediation info should you include?
    • How do you handle false positives?

Thinking Exercise

Manual CPE Matching

Given this inventory item:

Name: "Apache Tomcat"
Vendor: "Apache Software Foundation"
Version: "9.0.50"

Tasks:

  1. Construct the CPE string for this software
  2. Find 3 CVEs that affect this version using NVD
  3. Identify which version ranges apply
  4. Determine if your version is actually vulnerable

Questions:

  • How did you determine the correct CPE vendor name?
  • What if the version was “9.0.50-beta”?
  • How would you automate this process?

The Interview Questions They’ll Ask

  1. “How do vulnerability scanners match CVEs to installed software?”
  2. “What is CPE and why is it important?”
  3. “How would you handle software that doesn’t have an official CPE?”
  4. “What causes false positives in vulnerability scanning?”
  5. “How do you prioritize which vulnerabilities to fix first?”

Hints in Layers

Hint 1: Use the CPE Dictionary NVD provides a CPE dictionary that maps products to their official CPE strings. This helps with the “Apache HTTP Server” → “cpe:2.3:a:apache:http_server” mapping.

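Hint 2: Version Comparison is Hard Don’t try to parse all version formats. Start with semver (1.2.3), then handle common patterns. The packaging.version module (from Python’s third-party packaging library) helps with PEP 440-style versions, but it rejects strings such as 1.1.1k, so plan on custom handling for those.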

Hint 3: CVE Configuration Logic Each CVE has “configurations” that specify affected products. These use AND/OR logic with CPE matches. A CVE might require BOTH a specific OS AND a specific application version.
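The AND/OR logic from Hint 3 can be sketched as a small recursive evaluation. The key names (operator, negate, nodes, cpeMatch, criteria) assume the NVD API 2.0 configuration layout, and treating a missing operator as OR is my assumption. The matches callback is hypothetical: it decides whether a single cpeMatch entry applies to your host, and here it compares only vendor and product.

```python
from typing import Callable


def _combine(results: list[bool], operator: str, negate: bool) -> bool:
    if operator == "AND":
        hit = bool(results) and all(results)
    else:  # assumed default when no operator is given
        hit = any(results)
    return not hit if negate else hit


def node_applies(node: dict, matches: Callable[[dict], bool]) -> bool:
    results = [matches(m) for m in node.get("cpeMatch", [])]
    return _combine(results, node.get("operator", "OR"), node.get("negate", False))


def configuration_applies(config: dict, matches: Callable[[dict], bool]) -> bool:
    results = [node_applies(n, matches) for n in config.get("nodes", [])]
    return _combine(results, config.get("operator", "OR"), config.get("negate", False))


def product_matcher(installed: set[tuple[str, str]]) -> Callable[[dict], bool]:
    """Match on (vendor, product) only; version ranges are checked elsewhere."""
    def matches(cpe_match: dict) -> bool:
        fields = cpe_match["criteria"].split(":")
        return (fields[3], fields[4]) in installed
    return matches


# A CVE that needs BOTH the application AND a specific operating system.
config = {"operator": "AND", "nodes": [
    {"operator": "OR", "negate": False, "cpeMatch": [
        {"vulnerable": True, "criteria": "cpe:2.3:a:apache:tomcat:*:*:*:*:*:*:*:*"}]},
    {"operator": "OR", "negate": False, "cpeMatch": [
        {"vulnerable": False, "criteria": "cpe:2.3:o:microsoft:windows:-:*:*:*:*:*:*:*"}]},
]}
print(configuration_applies(config, product_matcher({("apache", "tomcat"), ("microsoft", "windows")})))
```

Ignoring this logic and flagging every host that has any listed CPE is a common source of false positives, for example reporting a Windows-only issue on a Linux server.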

Hint 4: Cache Aggressively Fetching all CVEs for each scan is slow. Cache the NVD data locally and use the lastModStartDate and lastModEndDate parameters to fetch only records that changed since your last sync.
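A sketch of building those incremental queries. It assumes, per my reading of the NVD API documentation, that both date parameters must be supplied together and that a single request may span at most 120 days; the function name is illustrative, and you should confirm that the API accepts the exact timestamp format produced here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"
MAX_WINDOW = timedelta(days=120)  # assumed maximum date range per request


def incremental_urls(last_sync: datetime, now: datetime) -> list[str]:
    """Split [last_sync, now] into windows and build one query URL per window."""
    urls = []
    start = last_sync
    while start < now:
        end = min(start + MAX_WINDOW, now)
        params = {
            "lastModStartDate": start.isoformat(timespec="seconds"),
            "lastModEndDate": end.isoformat(timespec="seconds"),
        }
        # urlencode percent-encodes ':' and the '+' of the UTC offset.
        urls.append(f"{NVD_CVE_API}?{urlencode(params)}")
        start = end
    return urls


urls = incremental_urls(
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 6, 1, tzinfo=timezone.utc),
)
for url in urls:
    print(url)
```

Each URL still needs the usual startIndex pagination, and the timestamp of a successful sync should be persisted only after every window has been fetched.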


Books That Will Help

Topic | Book | Chapter
Vulnerability management process | “Practical Vulnerability Management” by Magnusson | Ch. 3-5
Asset inventory | “Effective Vulnerability Management” by Hughes | Ch. 3
Version comparison | “Semantic Versioning” specification | Full doc
Python packaging | “Python Packaging User Guide” | Version specifiers

Implementation Hints

CPE construction approach:

Input: {"name": "Apache HTTP Server", "vendor": "Apache", "version": "2.4.49"}

Step 1: Normalize vendor → "apache"
Step 2: Normalize product → "http_server"
Step 3: Construct CPE → "cpe:2.3:a:apache:http_server:2.4.49:*:*:*:*:*:*:*"
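Those three steps can be sketched as below. The alias table is a hypothetical stand-in for a lookup against the CPE dictionary, and the fallback normalization is a guess that will often be wrong (for example, it turns "PHP Group" into php_group). Escaping of special characters required by the CPE 2.3 formatted-string rules is not handled.

```python
import re

# Hypothetical alias table: (vendor, name) as seen in inventories -> CPE (vendor, product).
PRODUCT_ALIASES = {
    ("apache", "apache http server"): ("apache", "http_server"),
}


def normalize(text: str) -> str:
    """Lowercase and replace runs of other characters with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.strip().lower()).strip("_")


def build_cpe(vendor: str, name: str, version: str, part: str = "a") -> str:
    key = (vendor.strip().lower(), name.strip().lower())
    cpe_vendor, cpe_product = PRODUCT_ALIASES.get(key, (normalize(vendor), normalize(name)))
    # Seven trailing wildcards: update, edition, language, sw_edition,
    # target_sw, target_hw, other.
    return ":".join(["cpe", "2.3", part, cpe_vendor, cpe_product, version] + ["*"] * 7)


print(build_cpe("Apache", "Apache HTTP Server", "2.4.49"))
print(build_cpe("Apache", "Log4j", "2.14.1"))
```

In practice the alias table does most of the work, so it is worth logging every inventory item that falls through to the fallback and reviewing those by hand.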

Version range matching pseudo-logic:

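For each CVE configuration (nodes combine with AND/OR):
  For each CPE match criteria marked vulnerable:
    If our_cpe matches criteria.cpe_pattern (vendor, product, wildcards):
      If criteria.versionStartIncluding exists:
        Check our_version >= start_version
      If criteria.versionStartExcluding exists:
        Check our_version > start_version
      If criteria.versionEndIncluding exists:
        Check our_version <= end_version
      If criteria.versionEndExcluding exists:
        Check our_version < end_version
      If all checks pass: VULNERABLE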
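A standard-library sketch of the range check, using a deliberately simple comparison scheme of my own rather than a published standard: numeric parts compare as integers, alpha/beta/rc tags sort before the plain release, and any other letter suffix (as in 1.1.1k) sorts after it. Real inventories will contain formats this does not order correctly.

```python
import re

PRERELEASE_TAGS = {"alpha", "beta", "rc"}  # assumed to sort before the release


def _tokens(version: str) -> list[tuple]:
    out = []
    for tok in re.findall(r"\d+|[a-z]+", version.lower()):
        if tok.isdigit():
            out.append((1, int(tok)))
        elif tok in PRERELEASE_TAGS:
            out.append((0, tok))   # lower rank than any number
        else:
            out.append((2, tok))   # patch letters rank above numbers
    return out


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0, or 1 as a is older than, equal to, or newer than b."""
    ta, tb = _tokens(a), _tokens(b)
    width = max(len(ta), len(tb))
    ta += [(1, 0)] * (width - len(ta))  # pad so that 2.0 == 2.0.0
    tb += [(1, 0)] * (width - len(tb))
    return (ta > tb) - (ta < tb)


def in_range(version: str, cpe_match: dict) -> bool:
    """Apply whichever of the four NVD version bounds are present."""
    checks = (
        ("versionStartIncluding", lambda c: c >= 0),
        ("versionStartExcluding", lambda c: c > 0),
        ("versionEndIncluding", lambda c: c <= 0),
        ("versionEndExcluding", lambda c: c < 0),
    )
    return all(ok(compare_versions(version, cpe_match[key]))
               for key, ok in checks if key in cpe_match)


log4shell = {"versionStartIncluding": "2.0-beta9", "versionEndExcluding": "2.15.0"}
print(in_range("2.14.1", log4shell), in_range("2.15.0", log4shell))
```

An entry with no bounds at all passes every check, which corresponds to the "all versions" case; whether that is correct depends on the version field of the CPE pattern itself.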

Learning Milestones

  1. You can construct valid CPE strings from software names → You understand product identification
  2. You can query NVD for CVEs affecting a CPE → You understand the API
  3. Your version matching logic works correctly → You understand range comparisons
  4. You produce actionable output with priorities → You understand vulnerability management

Project 5: CVE Feed Monitor & Alerting System

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, JavaScript/Node.js
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Monitoring / Alerting / Automation
  • Software or Tool: NVD API, SQLite, Email/Slack APIs
  • Main Book: “Practical Vulnerability Management” by Andrew Magnusson

What you’ll build: A daemon/service that continuously monitors NVD for new CVEs matching your criteria (products, severity, keywords), stores state, and sends alerts via email/Slack/webhook when relevant vulnerabilities appear.

Why it teaches CVE/Exploit databases: Real-world security operations require proactive monitoring, not just ad-hoc searches. This project teaches you about CVE publication timelines, change tracking, and building operational security tooling.

Core challenges you’ll face:

  • Incremental API polling → maps to tracking what’s new vs modified
  • State management → maps to avoiding duplicate alerts
  • Alert deduplication → maps to preventing alert fatigue
  • Multi-channel notification → maps to operational integration

Key Concepts:

  • NVD Change Tracking: NVD CVE API 2.0 - lastModStartDate / lastModEndDate parameters
  • Pub/Sub Patterns: Event-driven architecture basics
  • Alert Fatigue: “Effective Vulnerability Management” Ch. 7
  • Daemon Design: Linux daemon patterns

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Projects 1-4 completed, basic understanding of background services


Real World Outcome

$ cat ~/.cve-monitor/config.yaml
monitors:
  - name: "Our Tech Stack"
    cpe_patterns:
      - "cpe:2.3:a:apache:*"
      - "cpe:2.3:a:nginx:*"
      - "cpe:2.3:a:postgresql:*"
      - "cpe:2.3:a:redis:*"
    min_severity: "HIGH"

  - name: "Critical Everything"
    keywords: ["remote code execution", "authentication bypass"]
    min_severity: "CRITICAL"

  - name: "Log4j Watch"
    keywords: ["log4j", "log4shell"]
    min_severity: "LOW"

notifications:
  slack:
    webhook_url: "https://hooks.slack.com/services/XXX"
    channel: "#security-alerts"
  email:
    smtp_server: "smtp.company.com"
    recipients: ["security@company.com"]

$ ./cve-monitor start
[2024-12-22 14:00:00] CVE Monitor started
[2024-12-22 14:00:00] Loaded 3 monitor rules
[2024-12-22 14:00:01] Initial sync: fetching CVEs from last 24 hours...
[2024-12-22 14:00:15] Found 47 new CVEs, 12 match your criteria
[2024-12-22 14:00:15] Sending alerts...
[2024-12-22 14:00:16] ✓ Slack notification sent
[2024-12-22 14:00:17] ✓ Email sent to security@company.com
[2024-12-22 14:05:00] Polling for updates...
[2024-12-22 14:05:02] No new matching CVEs
[2024-12-22 14:10:00] Polling for updates...
[2024-12-22 14:10:03] NEW: CVE-2024-XXXXX matches "Our Tech Stack" (CRITICAL)
[2024-12-22 14:10:04] Sending alert...

# Slack message received:
┌─────────────────────────────────────────────────────────────────┐
│ 🚨 NEW CRITICAL CVE ALERT                                      │
├─────────────────────────────────────────────────────────────────┤
│ CVE-2024-XXXXX                                                 │
│ CVSS: 9.8 (CRITICAL)                                           │
│                                                                 │
│ Apache HTTP Server 2.4.x - Remote Code Execution               │
│                                                                 │
│ Affected: Apache HTTP Server 2.4.0 - 2.4.58                   │
│ Fix: Upgrade to 2.4.59                                         │
│                                                                 │
│ Matched rule: "Our Tech Stack"                                 │
│ Published: 2024-12-22 14:08:00 UTC                            │
│                                                                 │
│ Details: https://nvd.nist.gov/vuln/detail/CVE-2024-XXXXX      │
└─────────────────────────────────────────────────────────────────┘
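
One possible reading of the monitor rules in the config above, sketched in Python. The shape of the `cve` dictionary (`description`, `severity`, `cpes`) and the function name `matches_rule` are assumptions made for illustration; a rule here fires when the severity threshold is met and either a keyword or a CPE pattern matches.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def matches_rule(cve: dict, rule: dict) -> bool:
    """Check a simplified CVE record against one monitor rule."""
    # Severity gate: skip anything below the rule's min_severity.
    if SEVERITY_ORDER.index(cve["severity"]) < SEVERITY_ORDER.index(rule["min_severity"]):
        return False
    text = cve["description"].lower()
    keyword_hit = any(k.lower() in text for k in rule.get("keywords", []))
    # A trailing '*' in a pattern is treated as a simple prefix match.
    cpe_hit = any(
        cpe.startswith(pattern.rstrip("*"))
        for pattern in rule.get("cpe_patterns", [])
        for cpe in cve.get("cpes", [])
    )
    return keyword_hit or cpe_hit


rule = {"name": "Our Tech Stack", "cpe_patterns": ["cpe:2.3:a:apache:*"], "min_severity": "HIGH"}
cve = {
    "description": "Remote code execution in an HTTP server",
    "severity": "CRITICAL",
    "cpes": ["cpe:2.3:a:apache:http_server:2.4.49:*:*:*:*:*:*:*"],
}
print(matches_rule(cve, rule))
```

Whether keywords and CPE patterns in the same rule should combine with OR (as here) or AND is a design decision the config format leaves open.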

The Core Question You’re Answering

“How do I know when a new vulnerability affecting my systems is published, without manually checking every day?”

Security teams can’t refresh NVD constantly. Automated monitoring transforms reactive “we got hacked” responses into proactive “patch before exploit” defense.


Concepts You Must Understand First

Stop and research these before coding:

  1. NVD Update Cadence
    • How often does NVD publish new CVEs?
    • What’s the difference between pubStartDate and lastModStartDate?
    • How do you track modifications to existing CVEs?
    • Resource: NVD API documentation
  2. State Management
    • How do you remember what you’ve already alerted on?
    • What if the service restarts?
    • How do you handle “modified” CVEs (re-alert or not)?
    • Pattern: Persistent state with SQLite or file-based storage
  3. Alert Design
    • What makes an alert actionable?
    • How do you prevent alert fatigue?
    • When should you batch alerts vs send immediately?
    • Book: “Effective Vulnerability Management” Ch. 7

Questions to Guide Your Design

Before implementing, think through these:

  1. Polling Strategy
    • How often should you poll? (Balance freshness vs API limits)
    • Should you use exponential backoff on errors?
    • What’s your retry strategy?
  2. Matching Rules
    • How flexible should the rule syntax be?
    • Should you support AND/OR logic?
    • How do you handle keyword matching (exact vs fuzzy)?
  3. Notification Channels
    • Which channels are essential? (Email, Slack, webhook?)
    • Should alerts be batched or individual?
    • How do you format alerts for each channel?

Thinking Exercise

Design an Alerting Strategy

Your organization has:

  • 50 production servers running various software
  • A Slack channel for security alerts
  • An email list for critical issues
  • A ticketing system (Jira)

Questions:

  • What severity levels warrant which notification channels?
  • How would you prevent duplicate tickets?
  • What information must be in an alert to be actionable?
  • How do you handle “noisy” rules that match too much?

The Interview Questions They’ll Ask

  1. “How would you design a system to monitor for new vulnerabilities?”
  2. “What’s the trade-off between polling frequency and API rate limits?”
  3. “How do you prevent alert fatigue in a vulnerability monitoring system?”
  4. “How would you handle CVEs that are modified after initial publication?”
  5. “What makes a good security alert?”

Hints in Layers

Hint 1: Use lastModStartDate and lastModEndDate
Query with lastModStartDate set to your last successful poll time and lastModEndDate set to the current time. The NVD API 2.0 requires both parameters together and rejects ranges longer than 120 days. The query returns both new and modified CVEs in that window. Store the end timestamp after each successful poll.

Hint 2: Hash for Deduplication
Create a hash of each CVE’s key fields (ID + CVSS + description). Store hashes. Only alert if the hash is new or changed.

Hint 3: Batch Small Time Windows
Don’t poll every minute. Every 5-15 minutes is reasonable. Batch alerts that arrive in the same poll cycle into a single notification.

Hint 4: Graceful Degradation
If Slack fails, try email. If email fails, log locally. Never lose an alert just because one channel is down.
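
The fallback chain from Hint 4 could look roughly like this. It is a sketch under the assumption that every channel is a plain callable taking the alert text; `deliver` and the channel functions are hypothetical names, and real Slack or SMTP senders would replace the stand-ins.

```python
import logging
from typing import Callable, Sequence


def deliver(alert: str, channels: Sequence[Callable[[str], None]]) -> bool:
    """Try each channel in order and stop at the first one that succeeds."""
    for send in channels:
        try:
            send(alert)
            return True
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            logging.warning("channel %s failed: %s", getattr(send, "__name__", send), exc)
    # Last resort: the alert text still ends up in the local log.
    logging.error("all channels failed, alert logged locally: %s", alert)
    return False


def broken_slack(alert: str) -> None:
    raise ConnectionError("webhook unreachable")  # simulated outage


inbox = []
print(deliver("NEW: CVE-2024-XXXXX (CRITICAL)", [broken_slack, inbox.append]))
```

Ordering the channel list by preference keeps the policy in one place, so adding a webhook or ticketing channel does not change the delivery logic.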


Books That Will Help

Topic                 Book                                                  Chapter
Alert design          “Effective Vulnerability Management” by Hughes        Ch. 7
Daemon patterns       “Linux System Programming” by Love                    Ch. 5-6
Event-driven design   “Designing Data-Intensive Applications” by Kleppmann  Ch. 11
Notification systems  Slack/webhook API docs (various)                      -

Implementation Hints

State storage schema:

alerts_sent:
  - cve_id (TEXT)
  - content_hash (TEXT)  -- Hash of CVE details
  - alerted_at (TIMESTAMP)
  - channels (TEXT)  -- JSON array of channels notified

poll_state:
  - last_poll_time (TIMESTAMP)
  - last_success_time (TIMESTAMP)
  - consecutive_failures (INTEGER)
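
A minimal sketch of the `alerts_sent` table with Python's built-in `sqlite3` module, combined with content hashing for deduplication. The helper names and the simplified `cve` dictionary (`id`, `cvss`, `description`) are assumptions for illustration; the `poll_state` table would follow the same pattern.

```python
import hashlib
import json
import sqlite3


def content_hash(cve: dict) -> str:
    """Stable hash over the fields whose change should trigger a re-alert."""
    key_fields = {k: cve.get(k) for k in ("id", "cvss", "description")}
    return hashlib.sha256(json.dumps(key_fields, sort_keys=True).encode()).hexdigest()


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS alerts_sent (
               cve_id TEXT PRIMARY KEY,
               content_hash TEXT NOT NULL,
               alerted_at TEXT NOT NULL,
               channels TEXT NOT NULL  -- JSON array of channels notified
           )"""
    )
    return conn


def should_alert(conn: sqlite3.Connection, cve: dict) -> bool:
    """True if the CVE was never alerted on, or its key fields changed."""
    row = conn.execute(
        "SELECT content_hash FROM alerts_sent WHERE cve_id = ?", (cve["id"],)
    ).fetchone()
    return row is None or row[0] != content_hash(cve)


def record_alert(conn: sqlite3.Connection, cve: dict, channels: list) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO alerts_sent VALUES (?, ?, datetime('now'), ?)",
        (cve["id"], content_hash(cve), json.dumps(channels)),
    )
    conn.commit()
```

Passing a file path instead of `:memory:` to `open_state` is what lets the state survive a restart.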

Polling loop pseudo-logic:

while running:
  try:
    window_start = get_last_success_time()
    window_end = now()   # capture before fetching so nothing published mid-poll is skipped
    cves = fetch_cves_between(window_start, window_end)
    matching = filter_by_rules(cves)
    new_alerts = filter_already_sent(matching)
    if new_alerts:
      send_notifications(new_alerts)
      record_alerts_sent(new_alerts)
    update_last_success_time(window_end)
    reset_failure_count()
  except APIError:
    increment_failure_count()
    backoff()
  sleep(POLL_INTERVAL)
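
The fetch step of the loop above needs a query window. The sketch below only builds the request URL and performs no network call. The endpoint and the `lastModStartDate`/`lastModEndDate` parameter names come from the NVD CVE API 2.0; the function names and the exact timestamp formatting are assumptions worth checking against the current NVD documentation.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"
MAX_WINDOW = timedelta(days=120)  # NVD caps date ranges at 120 consecutive days


def nvd_timestamp(moment: datetime) -> str:
    """Format an aware datetime as extended ISO-8601 in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000+00:00")


def build_poll_url(last_success: datetime, now: datetime) -> str:
    """Build the URL for CVEs added or modified between two instants."""
    if now - last_success > MAX_WINDOW:
        # A real monitor would split the gap into several windows instead.
        raise ValueError("gap exceeds 120 days; split into several windows")
    params = {
        "lastModStartDate": nvd_timestamp(last_success),
        "lastModEndDate": nvd_timestamp(now),
    }
    return f"{NVD_CVE_API}?{urlencode(params)}"


start = datetime(2024, 12, 22, 14, 0, tzinfo=timezone.utc)
print(build_poll_url(start, start + timedelta(minutes=5)))
```

`urlencode` percent-encodes the `+` in the UTC offset, which would otherwise be decoded as a space on the server side.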

Learning Milestones

  1. You can poll NVD incrementally → You understand the API’s change tracking
  2. Your monitor correctly detects new CVEs → The core logic works
  3. Alerts arrive in Slack/email → Integration works
  4. No duplicate alerts on restarts → State management works

Project 6: CWE Pattern Analyzer

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: R, Julia, JavaScript
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Data Analysis / Security Patterns / Visualization
  • Software or Tool: NVD API, pandas, matplotlib/plotly
  • Main Book: “Effective Vulnerability Management” by Chris Hughes

What you’ll build: An analysis tool that downloads CVE data, extracts CWE classifications, and produces reports on vulnerability patterns—which weakness types are most common, trending, or most severe. Visualizations show security trends over time.

Why it teaches CVE/Exploit databases: Understanding CWE patterns transforms you from “fixing individual bugs” to “understanding systemic weaknesses.” This is how security researchers identify emerging threat categories.

Core challenges you’ll face:

  • Bulk CVE data processing → maps to handling large datasets efficiently
  • CWE hierarchy navigation → maps to understanding weakness relationships
  • Time-series analysis → maps to identifying trends
  • Data visualization → maps to communicating insights

Key Concepts:

  • CWE Hierarchy: CWE View
  • CWE Top 25: MITRE CWE Top 25
  • Data Analysis: pandas and data visualization basics
  • Trend Analysis: “Effective Vulnerability Management” Ch. 8

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Projects 1-2 completed, basic data analysis skills


Real World Outcome

$ ./cwe-analyzer fetch --years 2020-2024
Fetching CVE data for 2020-2024...
  2020: 18,352 CVEs (processed)
  2021: 20,150 CVEs (processed)
  2022: 25,081 CVEs (processed)
  2023: 28,902 CVEs (processed)
  2024: 31,234 CVEs (processed)
Total: 123,719 CVEs with CWE data

$ ./cwe-analyzer report --top 10

╔═══════════════════════════════════════════════════════════════════╗
║           CWE Analysis Report (2020-2024)                         ║
╠═══════════════════════════════════════════════════════════════════╣

TOP 10 MOST COMMON WEAKNESS TYPES
─────────────────────────────────────────────────────────────────────
Rank  CWE       Name                           Count    % of Total
─────────────────────────────────────────────────────────────────────
  1   CWE-79    Cross-site Scripting (XSS)     15,234    12.3%
  2   CWE-89    SQL Injection                   9,876     8.0%
  3   CWE-787   Out-of-bounds Write             8,543     6.9%
  4   CWE-20    Improper Input Validation       7,234     5.8%
  5   CWE-125   Out-of-bounds Read              6,987     5.6%
  6   CWE-22    Path Traversal                  5,432     4.4%
  7   CWE-352   Cross-Site Request Forgery      4,876     3.9%
  8   CWE-78    OS Command Injection            4,321     3.5%
  9   CWE-416   Use After Free                  3,987     3.2%
 10   CWE-434   Unrestricted Upload             3,654     3.0%

SEVERITY BY CWE TYPE
─────────────────────────────────────────────────────────────────────
CWE        Avg CVSS   Critical%   High%    Medium%   Low%
─────────────────────────────────────────────────────────────────────
CWE-78        8.9       45%        42%       12%      1%
CWE-787       8.7       41%        44%       14%      1%
CWE-89        8.4       38%        45%       16%      1%
CWE-416       8.3       35%        48%       16%      1%
CWE-79        5.8        2%        18%       72%      8%

YEAR-OVER-YEAR TRENDS
─────────────────────────────────────────────────────────────────────
                2020    2021    2022    2023    2024    Trend
CWE-79         2,345   2,876   3,102   3,456   3,455    ↗ +47%
CWE-89         2,012   1,987   1,923   1,876   2,078    → flat
CWE-787        1,234   1,567   1,876   2,012   1,854    ↗ +50%
CWE-400 (DoS)    876   1,234   1,765   2,345   2,987    ↗ +241%

EMERGING THREATS (Fastest Growing)
─────────────────────────────────────────────────────────────────────
  1. CWE-1321 (Prototype Pollution)         +312% YoY
  2. CWE-918  (SSRF)                         +189% YoY
  3. CWE-502  (Deserialization)              +156% YoY
  4. CWE-400  (Resource Exhaustion)          +134% YoY

$ ./cwe-analyzer visualize --output report.html
Generated interactive report: report.html

The Core Question You’re Answering

“What types of vulnerabilities are most common? Which are getting worse? What should developers focus on preventing?”

Moving from individual CVEs to patterns reveals where the industry is failing repeatedly. This is the kind of analysis behind ranked lists like the CWE Top 25.


Concepts You Must Understand First

Stop and research these before coding:

  1. CWE Hierarchy
    • What’s the difference between a Pillar, Class, Base, and Variant CWE?
    • How do you roll up specific CWEs to broader categories?
    • What are the major CWE views (Research, Development, etc.)?
    • Resource: CWE Hierarchy Documentation
  2. Statistical Analysis
    • What metrics are meaningful for CWE analysis?
    • How do you identify statistically significant trends?
    • What about accounting for overall CVE volume growth?
    • Skill: Basic statistics and pandas
  3. Data Visualization
    • What chart types work for this data?
    • How do you make trends visible?
    • What makes a security report actionable?
    • Tool: matplotlib, plotly, or similar

Questions to Guide Your Design

Before implementing, think through these:

  1. Data Collection
    • How much historical data do you need for meaningful trends?
    • How do you handle CVEs with multiple CWEs?
    • What about CVEs with “NVD-CWE-noinfo”?
  2. Analysis Approaches
    • Should you normalize by total CVE count per year?
    • How do you define “emerging” vs “declining”?
    • What’s a meaningful severity breakdown?
  3. Output Formats
    • CLI tables vs HTML reports vs interactive dashboards?
    • What visualizations tell the story best?
    • How do you make it useful for different audiences?

Thinking Exercise

Look at the CWE Top 25 from 2020, 2021, 2022, 2023:

Questions:

  • Which CWEs moved up significantly?
  • Which dropped?
  • What does this tell you about the threat landscape?
  • How would you automate this analysis?

The Interview Questions They’ll Ask

  1. “What are the most common types of vulnerabilities?”
  2. “How would you analyze vulnerability trends over time?”
  3. “What’s the relationship between CWE and CVE?”
  4. “How do you identify emerging vulnerability categories?”
  5. “What insights can you derive from bulk CVE data?”

Hints in Layers

Hint 1: Bulk Download Strategy
Use the NVD API’s date range parameters (pubStartDate/pubEndDate) to fetch CVEs year by year. The API caps each range at 120 consecutive days, so split every year into several windows. Store the results in SQLite for analysis. Don’t hit the API repeatedly for the same data.

Hint 2: Handle Multiple CWEs
A CVE can have multiple CWEs (e.g., both XSS and input validation). Decide whether to count once per CVE or once per CWE-CVE pair.

Hint 3: Normalize for Growth
CVE volume tends to grow each year, so raw counts are misleading. Use percentages or per-1000-CVE rates to identify real trends.

Hint 4: Roll Up to Categories
CWE-79 (XSS), CWE-89 (SQLi), CWE-78 (Command Injection) are all “Injection” problems, descendants of CWE-74. Sometimes rolling up to parent CWEs gives clearer insights.
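
For the bulk download in Hint 1, note that the NVD API 2.0 accepts date ranges of at most 120 consecutive days, so a full year has to be fetched in several windows. A minimal sketch of the splitting step follows; `date_windows` is a hypothetical helper, and each yielded pair would feed one `pubStartDate`/`pubEndDate` query.

```python
from datetime import date, timedelta


def date_windows(start: date, end: date, max_days: int = 120):
    """Yield inclusive (window_start, window_end) pairs covering start..end."""
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)


for window_start, window_end in date_windows(date(2020, 1, 1), date(2020, 12, 31)):
    print(window_start, window_end)
```

The windows are contiguous and non-overlapping, so the same CVE is not fetched twice when the results are stored by publication date.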


Books That Will Help

Topic              Book                                            Chapter
CWE understanding  CWE Top 25 documentation                        Full list
Data analysis      “Python for Data Analysis” by McKinney          Ch. 5-10
Visualization      “Fundamentals of Data Visualization” by Wilke   Ch. 4-8
Security trends    “Effective Vulnerability Management” by Hughes  Ch. 8

Implementation Hints

Data model for analysis:

cves:
  - cve_id (TEXT PRIMARY KEY)
  - published_year (INTEGER)
  - cvss_score (REAL)
  - severity (TEXT)

cve_cwes:
  - cve_id (TEXT)
  - cwe_id (TEXT)
  PRIMARY KEY (cve_id, cwe_id)
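
The data model above maps directly onto SQLite. The sketch below creates both tables in memory, inserts a few made-up rows (the CVE IDs are placeholders, not real records), and runs the per-CWE, per-year count that the rest of the analysis builds on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cves (
    cve_id TEXT PRIMARY KEY,
    published_year INTEGER,
    cvss_score REAL,
    severity TEXT
);
CREATE TABLE cve_cwes (
    cve_id TEXT REFERENCES cves(cve_id),
    cwe_id TEXT,
    PRIMARY KEY (cve_id, cwe_id)
);
""")

# Placeholder rows for illustration only.
conn.executemany("INSERT INTO cves VALUES (?, ?, ?, ?)", [
    ("CVE-A", 2023, 6.1, "MEDIUM"),
    ("CVE-B", 2023, 9.8, "CRITICAL"),
    ("CVE-C", 2024, 5.4, "MEDIUM"),
])
conn.executemany("INSERT INTO cve_cwes VALUES (?, ?)", [
    ("CVE-A", "CWE-79"), ("CVE-A", "CWE-20"),  # one CVE, two CWEs
    ("CVE-B", "CWE-89"), ("CVE-C", "CWE-79"),
])

# Counts once per CWE-CVE pair, so CVE-A contributes to two CWEs.
counts = conn.execute("""
    SELECT w.cwe_id, c.published_year, COUNT(*)
    FROM cve_cwes w JOIN cves c USING (cve_id)
    GROUP BY w.cwe_id, c.published_year
    ORDER BY w.cwe_id, c.published_year
""").fetchall()
print(counts)
```

Keeping the CWE links in a separate table is what makes the "count per CVE or per CWE-CVE pair" choice a query decision rather than a storage decision.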

Trend calculation approach:

1. Count CVEs per CWE per year
2. Calculate percentage of total per year
3. Compare year-over-year percentage change
4. Identify statistically significant changes
5. Rank by growth rate for "emerging"
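
Steps 1 to 3 can be sketched in plain Python. The function names are illustrative, and the numbers are the illustrative figures from the sample report earlier in this project (CWE-79 counts and yearly totals), not real measurements. Step 4, statistical significance, is left out of this sketch.

```python
def yearly_share(cwe_counts: dict, totals: dict) -> dict:
    """Percent of each year's CVEs attributed to one CWE."""
    return {year: 100.0 * cwe_counts[year] / totals[year] for year in sorted(cwe_counts)}


def yoy_change(shares: dict) -> dict:
    """Relative change in share versus the previous year, in percent."""
    years = sorted(shares)
    return {
        cur: 100.0 * (shares[cur] - shares[prev]) / shares[prev]
        for prev, cur in zip(years, years[1:])
    }


# Illustrative figures copied from the sample report output.
cwe79 = {2020: 2345, 2021: 2876, 2022: 3102, 2023: 3456, 2024: 3455}
totals = {2020: 18352, 2021: 20150, 2022: 25081, 2023: 28902, 2024: 31234}

shares = yearly_share(cwe79, totals)
for year, change in yoy_change(shares).items():
    print(year, round(shares[year], 1), round(change, 1))
```

With these sample figures the raw CWE-79 count grows by roughly 47% from 2020 to 2024, while its share of all CVEs drops from about 12.8% to about 11.1%, which is why normalizing matters.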

Learning Milestones

  1. You can bulk download and store CVE data → You can handle large datasets
  2. You can extract and count CWE patterns → You understand the classification
  3. Your trend analysis shows meaningful patterns → You can interpret the data
  4. Your visualizations tell a clear story → You can communicate insights

Project 7: CVE-to-ATT&CK Mapper

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, JavaScript
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Threat Intelligence / NLP / Classification
  • Software or Tool: NVD API, MITRE ATT&CK STIX data, ML libraries
  • Main Book: “Practical Threat Intelligence and Data-Driven Threat Hunting” by Valentina Costa-Gazcon

What you’ll build: A tool that analyzes CVE descriptions and automatically maps them to MITRE ATT&CK techniques, enabling threat-informed vulnerability prioritization. This bridges vulnerability data with adversary behavior models.

Why it teaches CVE/Exploit databases: This is the cutting edge of vulnerability intelligence. Understanding how vulnerabilities enable specific attack techniques transforms static CVE data into actionable threat intelligence.

Core challenges you’ll face:

  • NLP/keyword extraction from CVE descriptions → maps to understanding vulnerability semantics
  • ATT&CK technique matching → maps to adversary behavior modeling
  • Handling ambiguous mappings → maps to real-world data messiness
  • Confidence scoring → maps to making automated decisions trustworthy

Key Concepts:

  • MITRE ATT&CK Framework: ATT&CK Matrix
  • CVE-to-ATT&CK Methodology: Center for Threat-Informed Defense (CTID) mapping project
  • NLP Basics: Text classification and keyword extraction
  • STIX/TAXII: Threat intelligence data formats

Difficulty: Advanced
Time estimate: 2-4 weeks
Prerequisites: Projects 1-6 completed, basic NLP/ML understanding


Real World Outcome

$ ./cve-attack-mapper analyze CVE-2021-44228

╔═══════════════════════════════════════════════════════════════════╗
║          CVE-to-ATT&CK Analysis: CVE-2021-44228                  ║
╠═══════════════════════════════════════════════════════════════════╣

VULNERABILITY SUMMARY
─────────────────────────────────────────────────────────────────────
CVE-2021-44228 (Log4Shell)
CVSS: 10.0 (Critical)
Description: Apache Log4j2 JNDI features... remote code execution

ATT&CK TECHNIQUE MAPPINGS
─────────────────────────────────────────────────────────────────────

┌─────────────────────────────────────────────────────────────────┐
│ T1190 - Exploit Public-Facing Application                       │
│ Tactic: Initial Access                                          │
│ Confidence: HIGH (0.95)                                         │
├─────────────────────────────────────────────────────────────────┤
│ Rationale: CVE enables RCE via network-accessible service       │
│ Keywords matched: "remote", "JNDI", "injection"                 │
│                                                                 │
│ Detection: Monitor for unusual outbound LDAP/RMI connections    │
│ Mitigation: Update Log4j, disable JNDI lookup feature          │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ T1059 - Command and Scripting Interpreter                       │
│ Tactic: Execution                                               │
│ Confidence: MEDIUM (0.72)                                       │
├─────────────────────────────────────────────────────────────────┤
│ Rationale: JNDI can load and execute arbitrary code            │
│ Keywords matched: "execute", "code", "JNDI"                     │
│                                                                 │
│ Detection: Monitor process creation from Java processes         │
│ Mitigation: Application whitelisting, EDR                      │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ T1071.001 - Application Layer Protocol: Web Protocols          │
│ Tactic: Command and Control                                     │
│ Confidence: MEDIUM (0.68)                                       │
├─────────────────────────────────────────────────────────────────┤
│ Rationale: Exploitation uses LDAP/HTTP for payload delivery    │
│                                                                 │
│ Detection: Monitor for LDAP connections to external hosts      │
└─────────────────────────────────────────────────────────────────┘

THREAT ACTOR RELEVANCE
─────────────────────────────────────────────────────────────────────
Groups known to use these techniques:
  • APT41 (China) - T1190, T1059
  • Lazarus Group (North Korea) - T1190
  • Various ransomware groups - T1190

DEFENSIVE RECOMMENDATIONS
─────────────────────────────────────────────────────────────────────
Based on ATT&CK mappings, prioritize these defenses:
  1. Web Application Firewall rules for JNDI patterns
  2. Egress filtering for LDAP/RMI protocols
  3. Java process monitoring for unusual child processes
  4. Network segmentation for vulnerable services

The Core Question You’re Answering

“This CVE exists, but what can an attacker actually DO with it? What attack techniques does it enable?”

CVEs tell you something is broken. ATT&CK tells you how adversaries will use it. This mapping transforms vulnerability data into threat intelligence.


Concepts You Must Understand First

Stop and research these before coding:

  1. MITRE ATT&CK Structure
    • What are Tactics, Techniques, and Sub-techniques?
    • How do you navigate the ATT&CK matrix?
    • What data does ATT&CK provide per technique?
    • Resource: ATT&CK Getting Started
  2. Text Classification Approaches
    • How do keyword-based classifiers work?
    • What about embedding-based similarity?
    • How do you handle multi-label classification?
    • Skill: Basic NLP (can be rule-based)
  3. Threat Intelligence Integration
    • What is STIX/TAXII?
    • How do you consume ATT&CK data programmatically?
    • How do threat actors map to techniques?
    • Resource: ATT&CK STIX Data

Questions to Guide Your Design

Before implementing, think through these:

  1. Mapping Strategy
    • Rule-based (keywords) vs ML-based vs hybrid?
    • How do you handle CVEs that don’t clearly map?
    • What’s an acceptable confidence threshold?
  2. ATT&CK Data
    • Which ATT&CK matrix? (Enterprise, Mobile, ICS?)
    • How do you stay updated with ATT&CK changes?
    • Should you map to sub-techniques or just techniques?
  3. Output Utility
    • What makes the mapping actionable?
    • Should you include detection/mitigation from ATT&CK?
    • How do you present confidence levels?

Thinking Exercise

Manual Mapping Exercise

Read these CVE descriptions and determine ATT&CK mappings:

  1. CVE-2020-1472 (Zerologon): “An elevation of privilege vulnerability exists when an attacker establishes a vulnerable Netlogon secure channel connection to a domain controller…”

  2. CVE-2019-11510 (Pulse Secure): “In Pulse Secure Pulse Connect Secure, an unauthenticated remote attacker can send a specially crafted URI to perform an arbitrary file reading vulnerability…”

Questions:

  • What techniques does each enable?
  • What tactics do they fall under?
  • How confident are you in your mappings?
  • What keywords helped you decide?

The Interview Questions They’ll Ask

  1. “How would you map a CVE to MITRE ATT&CK techniques?”
  2. “What’s the value of integrating vulnerability data with ATT&CK?”
  3. “How do you handle CVEs that could map to multiple techniques?”
  4. “How would you validate your CVE-to-ATT&CK mappings?”
  5. “What’s the difference between vulnerability management and threat intelligence?”

Hints in Layers

Hint 1: Start with Rules
Create a keyword-to-technique mapping. “remote code execution” → T1190. “privilege escalation” → T1068. A small rule set like this already handles many common descriptions.

Hint 2: Use CWE as a Bridge
Some CWEs map cleanly to ATT&CK. CWE-89 (SQL Injection) → T1190. Use CWE data from the CVE to inform your mapping.

Hint 3: Leverage Existing Work
The Center for Threat-Informed Defense (CTID) has published CVE-to-ATT&CK mappings. Use these as training data or validation.

Hint 4: Confidence Scoring
Multiple keyword matches = higher confidence. Single weak match = lower confidence. No matches = “Unable to map.”


Books That Will Help

Topic                Book                                             Chapter
ATT&CK Framework     MITRE ATT&CK documentation                       All
Threat Intelligence  “Practical Threat Intelligence” by Costa-Gazcon  Ch. 4-6
NLP basics           “Natural Language Processing with Python”        Ch. 1-3
Classification       “Hands-On Machine Learning” by Géron             Ch. 3-4

Implementation Hints

Simple rule-based approach:

Rules (keyword → technique):
  "remote code execution" → T1190
  "authentication bypass" → T1078
  "privilege escalation" → T1068
  "SQL injection" → T1190
  "command injection" → T1059
  "deserialization" → T1190 + T1059

For each CVE:
  1. Lowercase description
  2. Match against rules
  3. For each technique, calculate confidence = matched_keywords / keywords_defined_for_that_technique
  4. Return techniques with confidence >= threshold
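
A minimal sketch of the rule-based approach, using the keyword rules listed above grouped by technique. One variation to note: confidence here is scored per technique, as the fraction of that technique's keywords found in the description. The `RULES` layout, the `map_cve` name, and the 0.3 threshold are assumptions chosen for illustration.

```python
# Keyword rules from the list above, grouped by ATT&CK technique ID.
RULES = {
    "T1190": ["remote code execution", "sql injection", "deserialization"],
    "T1078": ["authentication bypass"],
    "T1068": ["privilege escalation"],
    "T1059": ["command injection", "deserialization"],
}


def map_cve(description: str, threshold: float = 0.3) -> dict:
    """Return {technique_id: confidence} for techniques above the threshold."""
    text = description.lower()
    results = {}
    for technique, keywords in RULES.items():
        hits = [k for k in keywords if k in text]
        confidence = len(hits) / len(keywords)
        if hits and confidence >= threshold:
            results[technique] = round(confidence, 2)
    return results


print(map_cve("Remote code execution via unsafe deserialization of user input"))
```

Plain substring matching is crude: it misses paraphrases such as "execute arbitrary code", so synonym lists or CWE-based rules are a natural next step.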

ATT&CK data loading:

Download STIX data from github.com/mitre/cti
Parse techniques into lookup structure:
  technique_id → {name, tactic, description, detection, mitigation}
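
A sketch of the parsing step, run here against a tiny hand-written bundle rather than the real download. The field names (`attack-pattern`, `external_references`, `kill_chain_phases`, `x_mitre_detection`) follow the ATT&CK STIX 2 layout as commonly published, but they are worth checking against the release you download; `index_techniques` and `SAMPLE_BUNDLE` are illustrative names.

```python
# Hand-written stand-in for a downloaded bundle; contents are trimmed.
SAMPLE_BUNDLE = {
    "type": "bundle",
    "objects": [
        {
            "type": "attack-pattern",
            "name": "Exploit Public-Facing Application",
            "description": "(trimmed)",
            "x_mitre_detection": "(trimmed)",
            "kill_chain_phases": [
                {"kill_chain_name": "mitre-attack", "phase_name": "initial-access"}
            ],
            "external_references": [
                {"source_name": "mitre-attack", "external_id": "T1190"}
            ],
        },
        {"type": "intrusion-set", "name": "(not a technique, skipped)"},
    ],
}


def index_techniques(bundle: dict) -> dict:
    """Build {technique_id: details} from the attack-pattern objects."""
    lookup = {}
    for obj in bundle.get("objects", []):
        if obj.get("type") != "attack-pattern" or obj.get("revoked"):
            continue
        attack_id = next(
            (ref["external_id"] for ref in obj.get("external_references", [])
             if ref.get("source_name") == "mitre-attack" and "external_id" in ref),
            None,
        )
        if attack_id is None:
            continue
        lookup[attack_id] = {
            "name": obj.get("name"),
            "tactics": [p["phase_name"] for p in obj.get("kill_chain_phases", [])
                        if p.get("kill_chain_name") == "mitre-attack"],
            "description": obj.get("description", ""),
            "detection": obj.get("x_mitre_detection", ""),
        }
    return lookup


print(index_techniques(SAMPLE_BUNDLE)["T1190"]["tactics"])
```

For the real data, the bundle would come from `json.load` on the downloaded file. Mitigations are not stored on the technique itself: they are separate objects linked through relationship objects, so filling the mitigation field takes a second pass.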

Learning Milestones

  1. You can load and query ATT&CK data → You understand the framework structure
  2. Your rule-based mapper produces reasonable results → Core logic works
  3. Confidence scores correlate with mapping quality → You can trust the output
  4. You include actionable detection/mitigation info → The output is useful

Project 8: Vulnerability Disclosure Timeline Tracker

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, JavaScript
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Data Analysis / Timeline Visualization
  • Software or Tool: NVD API, CVE data, timeline visualization
  • Main Book: “Effective Vulnerability Management” by Chris Hughes

What you’ll build: A tool that tracks and visualizes the timeline from vulnerability discovery to patch availability to exploit publication, showing the “window of exposure” for different products and vendors.

Why it teaches CVE/Exploit databases: Understanding disclosure timelines reveals how the vulnerability ecosystem actually works—vendor response times, exploit development speed, and the race between attackers and defenders.

Core challenges you’ll face:

  • Extracting dates from multiple sources → maps to data integration complexity
  • Handling missing/incomplete data → maps to real-world messiness
  • Timeline visualization → maps to communicating temporal data
  • Vendor performance analysis → maps to actionable metrics

Key Concepts:

  • Coordinated Disclosure: Standard vulnerability disclosure process
  • Patch Window: Time between disclosure and patch
  • Exploit Window: Time between disclosure and public exploit
  • Timeline Visualization: Gantt-style charts, waterfall diagrams

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Projects 1-3 completed, basic data visualization


Real World Outcome

$ ./disclosure-tracker analyze CVE-2021-44228

╔═══════════════════════════════════════════════════════════════════╗
║         Disclosure Timeline: CVE-2021-44228 (Log4Shell)          ║
╠═══════════════════════════════════════════════════════════════════╣

TIMELINE
─────────────────────────────────────────────────────────────────────

2021-11-24  │█░░░░░░░░░░░░░░░░░░░│  Reported to Apache
            │                     │
2021-12-05  │████░░░░░░░░░░░░░░░░│  CVE Reserved (11 days)
            │                     │
2021-12-09  │█████░░░░░░░░░░░░░░░│  Patch Released (2.15.0) (15 days)
            │                     │
2021-12-10  │██████░░░░░░░░░░░░░░│  CVE Published (16 days)
            │                     │
2021-12-10  │██████░░░░░░░░░░░░░░│  First PoC on GitHub (<1 day from pub)
            │                     │
2021-12-11  │███████░░░░░░░░░░░░░│  First Exploit-DB entry (1 day from pub)
            │                     │
2021-12-11  │███████░░░░░░░░░░░░░│  Active exploitation observed (1 day)
            │                     │
2021-12-14  │██████████░░░░░░░░░░│  Metasploit module (4 days from pub)

KEY METRICS
─────────────────────────────────────────────────────────────────────
  Report → CVE Reserved:     11 days
  Report → Patch:            15 days
  CVE Publish → PoC:         <1 day   ⚠️ EXTREMELY FAST
  CVE Publish → ITW Exploit: 1 day    ⚠️ CRITICAL URGENCY
  Patch → ITW Exploit:       2 days   (Patch available before exploit)

RISK ASSESSMENT
─────────────────────────────────────────────────────────────────────
  ✓ Patch was available before active exploitation
  ⚠️ Exploit development was extremely rapid
  ⚠️ Weaponized within 24 hours of disclosure

$ ./disclosure-tracker vendor-stats apache

╔═══════════════════════════════════════════════════════════════════╗
║              Apache Software Foundation                           ║
║              Disclosure Statistics (2020-2024)                   ║
╠═══════════════════════════════════════════════════════════════════╣

                          Avg Days    Median    Best    Worst
─────────────────────────────────────────────────────────────────────
Report → Patch:              23         18        3       180
Report → CVE:                15         12        1        90
CVE → Exploit (if exists):    7          3       <1        45

COMPARISON TO INDUSTRY
─────────────────────────────────────────────────────────────────────
             Report→Patch
Microsoft    ████████████████████░░░░  31 days
Google       ██████████░░░░░░░░░░░░░░  15 days
Apache       ███████████████░░░░░░░░░  23 days (this vendor)
Industry Avg ██████████████████░░░░░░  28 days

Total CVEs analyzed: 847
CVEs with exploit data: 234

The Core Question You’re Answering

“How fast do vulnerabilities go from discovery to patch to exploit? How do different vendors compare?”

This data drives security strategy. If a vendor patches slowly, you need compensating controls. If exploits appear before patches, you need zero-day detection.


Concepts You Must Understand First

Stop and research these before coding:

  1. Disclosure Process
    • What’s coordinated disclosure vs full disclosure?
    • What’s the typical timeline?
    • Who are the stakeholders?
    • Resource: CERT/CC disclosure guidelines
  2. Date Extraction
    • Where do different dates come from? (CVE pub, patch release, exploit date)
    • How reliable are these dates?
    • What about embargoed vulnerabilities?
    • Pattern: Multi-source data aggregation
  3. Timeline Visualization
    • What chart types show timelines effectively?
    • How do you compare multiple timelines?
    • How do you handle missing data points?
    • Tool: matplotlib, plotly, or similar

The Interview Questions They’ll Ask

  1. “How long does it typically take from vulnerability discovery to patch?”
  2. “How do you measure vendor security responsiveness?”
  3. “What’s the ‘window of exposure’ and why does it matter?”
  4. “How do you handle incomplete timeline data?”
  5. “What metrics would you use to compare vendor security practices?”

Hints in Layers

Hint 1: Multiple Data Sources
NVD gives CVE publish date. Vendor advisories give patch dates. Exploit-DB gives exploit dates. You need to correlate across sources.

Hint 2: Proxy Dates
You often can’t know the actual “discovery” date. Use CVE reserved date as a proxy for when the disclosure process started.

Hint 3: Normalize for Comparison
“Days from CVE publication to exploit” is comparable across vendors. “Days from discovery to patch” varies based on how “discovery” is defined.

Hint 4: Handle Missing Data Gracefully
Not all CVEs have exploits. Not all have clear patch dates. Your analysis should specify sample sizes and data completeness.
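
Computing the interval metrics while tolerating gaps (Hint 4) might look like this. The event names and the `days_between` helper are hypothetical, and the dates are the ones shown in the sample Log4Shell output earlier in this project, reused purely as illustrative input.

```python
from datetime import date
from typing import Optional


def days_between(events: dict, start: str, end: str) -> Optional[int]:
    """Whole days from one event to another, or None if either date is missing."""
    a, b = events.get(start), events.get(end)
    if a is None or b is None:
        return None
    return (b - a).days


# Dates copied from the sample timeline output; None marks an unknown date.
events = {
    "reported": date(2021, 11, 24),
    "cve_reserved": date(2021, 12, 5),
    "patch": date(2021, 12, 9),
    "cve_published": date(2021, 12, 10),
    "exploit_db": date(2021, 12, 11),
    "vendor_advisory": None,  # hypothetical missing data point
}

for start, end in [("reported", "patch"), ("cve_published", "exploit_db"), ("reported", "vendor_advisory")]:
    print(f"{start} -> {end}: {days_between(events, start, end)}")
```

Returning `None` instead of a default such as 0 keeps missing data out of averages and makes it easy to report how many CVEs each metric is actually based on.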


Learning Milestones

  1. You can extract dates from CVE/NVD data → You understand the data model
  2. You can correlate exploit dates from Exploit-DB → Multi-source integration
  3. Your timeline visualizations are clear → You can communicate findings
  4. Your vendor comparisons are statistically meaningful → Actionable analysis

Project 9: SBOM Vulnerability Analyzer

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, JavaScript
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Software Supply Chain / SBOM / Dependency Analysis
  • Software or Tool: SPDX/CycloneDX parsers, NVD API, dependency resolvers
  • Main Book: “Software Transparency” by Chris Hughes and Tony Turner

What you’ll build: A tool that ingests Software Bill of Materials (SBOM) files in SPDX or CycloneDX format, resolves all dependencies to CPEs, and produces a comprehensive vulnerability report for the entire software supply chain.

Why it teaches CVE/Exploit databases: Software supply chain security is the modern frontier. Understanding how to trace vulnerabilities through dependency trees is critical as attackers increasingly compromise or exploit the supply chain (the SolarWinds compromise, the Log4j vulnerability, etc.).

Core challenges you’ll face:

  • Parsing multiple SBOM formats → maps to industry standardization efforts
  • Package-to-CPE resolution → maps to the hard problem of identification
  • Transitive dependency handling → maps to supply chain depth
  • License and vulnerability correlation → maps to multi-dimensional risk

Key Concepts:

  • SBOM Formats: SPDX, CycloneDX, SWID
  • Package URL (PURL): Universal package identifier
  • Transitive Dependencies: Dependencies of dependencies
  • Supply Chain Security: “Effective Vulnerability Management” Ch. 9

Difficulty: Advanced
Time estimate: 3-4 weeks
Prerequisites: Projects 1-4 completed, understanding of package managers


Real World Outcome

$ ./sbom-analyzer scan my-app-sbom.json

╔═══════════════════════════════════════════════════════════════════╗
║              SBOM Vulnerability Analysis                          ║
║              File: my-app-sbom.json                               ║
║              Format: CycloneDX 1.4                                ║
╠═══════════════════════════════════════════════════════════════════╣

SBOM SUMMARY
─────────────────────────────────────────────────────────────────────
  Direct dependencies:        47
  Transitive dependencies:   312
  Total components:          359
  Components with CVEs:       23

DEPENDENCY TREE WITH VULNERABILITIES
─────────────────────────────────────────────────────────────────────

my-application@1.0.0
├── express@4.17.1
│   ├── body-parser@1.19.0
│   │   └── qs@6.7.0 ⚠️ CVE-2022-24999 (HIGH 7.5)
│   └── cookie@0.4.0
├── lodash@4.17.19 🔴 CVE-2021-23337 (HIGH 7.2)
│                  🔴 CVE-2020-28500 (MEDIUM 5.3)
├── log4j@2.14.1 🔴 CVE-2021-44228 (CRITICAL 10.0) ⚡ EXPLOIT EXISTS
│               🔴 CVE-2021-45046 (CRITICAL 9.0)
│               🔴 CVE-2021-45105 (HIGH 7.5)
└── axios@0.21.1
    └── follow-redirects@1.14.1 ⚠️ CVE-2022-0155 (MEDIUM 6.5)

VULNERABILITY SUMMARY BY SEVERITY
─────────────────────────────────────────────────────────────────────
  🔴 Critical:  4 (2 components)
  ⚠️  High:      3 (2 components)
  ⚠️  Medium:    8 (5 components)
  ℹ️  Low:       2 (2 components)

TOP PRIORITY REMEDIATIONS
─────────────────────────────────────────────────────────────────────
1. CRITICAL: Upgrade log4j 2.14.1 → 2.17.1
   - Fixes 3 CVEs including Log4Shell
   - Exploit actively used in the wild
   - Breaking changes: None expected

2. HIGH: Upgrade lodash 4.17.19 → 4.17.21
   - Fixes 2 CVEs
   - Breaking changes: Minor

3. HIGH: Upgrade qs 6.7.0 → 6.11.0
   - Transitive via body-parser
   - Consider upgrading body-parser instead

SUPPLY CHAIN INSIGHTS
─────────────────────────────────────────────────────────────────────
  Most vulnerable dependency: log4j (3 CVEs)
  Deepest vulnerable dep:     qs (depth: 3)
  Unmapped components:        12 (no CPE found)

Export options: [J]SON  [C]SV  [H]TML  [S]ARIF
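The severity labels in a report like the one above are usually derived from the numeric score rather than stored separately. A minimal sketch, assuming the CVSS v3.x qualitative rating scale (0.1-3.9 Low, 4.0-6.9 Medium, 7.0-8.9 High, 9.0-10.0 Critical); the function name and the sample scores are illustrative only:

```python
from collections import Counter

def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"

# Tally a list of scores into the buckets shown in a severity summary.
sample_scores = [10.0, 9.0, 7.5, 7.2, 6.5, 5.3]
tally = Counter(cvss_v3_severity(s) for s in sample_scores)
print(dict(tally))
```

Deriving the label from the score in one place keeps the dependency tree, the summary counts, and the remediation list from drifting out of sync.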

The Core Question You’re Answering

“My application has 300 dependencies. Which ones have vulnerabilities, including the dependencies of my dependencies?”

Modern software is built on towers of dependencies. A vulnerability anywhere in that tower is a vulnerability in your application. SBOM analysis reveals this hidden attack surface.


Concepts You Must Understand First

Stop and research these before coding:

  1. SBOM Formats
    • What’s the difference between SPDX and CycloneDX?
    • What information does an SBOM contain?
    • How are components identified? (PURL, CPE, etc.)
    • Resource: SPDX and CycloneDX specifications
  2. Package-to-CPE Mapping
    • How do you map “lodash@4.17.19” to a CPE?
    • What about packages without official CPEs?
    • How do you handle version format differences?
    • Challenge: This is the hardest part
  3. Dependency Resolution
    • What are transitive dependencies?
    • How do you handle version conflicts?
    • What about optional dependencies?
    • Pattern: Dependency tree traversal
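The dependency tree traversal pattern can be sketched as a breadth-first walk. This example assumes a CycloneDX JSON document that has already been loaded into a dict, with the root in `metadata.component` and edges in the top-level `dependencies` array (`ref` / `dependsOn`); the sample BOM and the function name `dependency_depths` are made up for illustration.

```python
from collections import deque

def dependency_depths(bom: dict) -> dict:
    """Return {bom-ref: depth} for every component reachable from the root."""
    root = bom["metadata"]["component"]["bom-ref"]
    graph = {d["ref"]: d.get("dependsOn", []) for d in bom.get("dependencies", [])}
    depths = {root: 0}
    queue = deque([root])
    while queue:
        ref = queue.popleft()
        for child in graph.get(ref, []):
            # First visit is the shortest path; the check also guards against cycles.
            if child not in depths:
                depths[child] = depths[ref] + 1
                queue.append(child)
    return depths

# Illustrative BOM fragment: app -> express -> body-parser -> qs
sample_bom = {
    "metadata": {"component": {"bom-ref": "pkg:npm/my-application@1.0.0"}},
    "dependencies": [
        {"ref": "pkg:npm/my-application@1.0.0", "dependsOn": ["pkg:npm/express@4.17.1"]},
        {"ref": "pkg:npm/express@4.17.1", "dependsOn": ["pkg:npm/body-parser@1.19.0"]},
        {"ref": "pkg:npm/body-parser@1.19.0", "dependsOn": ["pkg:npm/qs@6.7.0"]},
        {"ref": "pkg:npm/qs@6.7.0"},
    ],
}

depths = dependency_depths(sample_bom)
print(depths["pkg:npm/qs@6.7.0"])
```

A component reachable by several paths is recorded at its shallowest depth, which is usually the one that matters for choosing which direct dependency to upgrade.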

The Interview Questions They’ll Ask

  1. “What is an SBOM and why does it matter for security?”
  2. “How do you map npm/PyPI packages to CVEs?”
  3. “What’s a transitive dependency vulnerability?”
  4. “How do you handle components without known CPEs?”
  5. “What was the Log4j incident and how did it relate to supply chain security?”

Hints in Layers

Hint 1: Use Package URL (PURL)
PURL (pkg:npm/lodash@4.17.19) is a universal package identifier. Some databases map PURLs directly to vulnerabilities.

Hint 2: OSV Database
Google’s Open Source Vulnerabilities database (osv.dev) maps directly to package ecosystem identifiers, avoiding the CPE problem.

Hint 3: Start with One Ecosystem
npm, PyPI, and Maven each have different identifier schemes. Start with one (npm is well-documented) before generalizing.

Hint 4: Depth Matters
Flag vulnerabilities in direct dependencies differently from deeply nested transitive ones. Remediation strategies differ.
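A rough sketch of the PURL-plus-OSV approach from the hints above. It assumes a CycloneDX JSON document whose components carry a `purl` field, and it uses the OSV query endpoint at `https://api.osv.dev/v1/query` with a PURL-only package object. The helper names are hypothetical, and the network call is defined but not executed here; check the OSV API documentation for current request limits and response fields before relying on it.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

def component_purls(bom: dict) -> list:
    """Collect PURLs from a CycloneDX document, skipping components without one."""
    return [c["purl"] for c in bom.get("components", []) if c.get("purl")]

def osv_payload(purl: str) -> dict:
    """Build an OSV query body; the version travels inside the PURL itself."""
    return {"package": {"purl": purl}}

def query_osv(purl: str, timeout: float = 10.0) -> list:
    """Return the vulnerability IDs OSV reports for one PURL (performs a network call)."""
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(osv_payload(purl)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return [v["id"] for v in body.get("vulns", [])]

# Illustrative BOM fragment; the second component has no PURL and stays unmapped.
sample_bom = {
    "components": [
        {"name": "lodash", "version": "4.17.19", "purl": "pkg:npm/lodash@4.17.19"},
        {"name": "internal-lib", "version": "0.3.0"},
    ]
}

purls = component_purls(sample_bom)
print(purls)
# Example (requires network access): query_osv(purls[0])
```

Components without a PURL are exactly the ones that end up in an "unmapped components" count, so it is worth reporting them rather than dropping them.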


Learning Milestones

  1. You can parse SPDX and CycloneDX SBOMs → Format understanding
  2. You can map packages to CPEs/vulnerabilities → The hard problem
  3. You correctly trace transitive dependencies → Supply chain depth
  4. Your remediation advice is actionable → Practical utility

Project 10: Mini Vulnerability Scanner (Network-Based)

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Network Security / Service Detection / Fingerprinting
  • Software or Tool: socket programming, nmap-style detection, NVD API
  • Main Book: “Nmap Network Scanning” by Gordon Lyon

What you’ll build: A network vulnerability scanner that discovers services on target hosts, fingerprints their versions, and cross-references against your CVE database to find vulnerabilities. This is a mini-Nessus/OpenVAS.

Why it teaches CVE/Exploit databases: This is the ultimate integration project—combining network reconnaissance, service detection, version fingerprinting, and vulnerability matching into a working security tool.

Core challenges you’ll face:

  • TCP/UDP service detection → maps to network programming fundamentals
  • Version fingerprinting → maps to banner grabbing and protocol analysis
  • Accurate CPE construction → maps to matching detected services to CVEs
  • Safe scanning practices → maps to ethical/legal considerations

Key Concepts:

  • Service Detection: Banner grabbing, protocol probes
  • Version Fingerprinting: Identifying exact software versions
  • Network Scanning Ethics: Only scan what you own/have permission
  • Rate Limiting: Don’t DoS your targets

Difficulty: Expert
Time estimate: 4-6 weeks
Prerequisites: All previous projects, network programming experience


Real World Outcome

$ ./mini-scanner scan 192.168.1.0/24 --ports 22,80,443,3306,5432

╔═══════════════════════════════════════════════════════════════════╗
║              Mini Vulnerability Scanner v1.0                      ║
║              Target: 192.168.1.0/24                               ║
║              Ports: 22, 80, 443, 3306, 5432                       ║
╠═══════════════════════════════════════════════════════════════════╣

SCANNING...
  [====================================] 100% (254 hosts, 1270 ports)

DISCOVERED HOSTS: 12

HOST: 192.168.1.10 (web-server-01)
─────────────────────────────────────────────────────────────────────
  PORT      SERVICE           VERSION               VULNERABILITIES
  22/tcp    SSH               OpenSSH 7.9p1        1 Medium
  80/tcp    HTTP              Apache/2.4.49        🔴 2 Critical
  443/tcp   HTTPS             Apache/2.4.49        🔴 2 Critical

  CRITICAL FINDINGS:
  ┌─────────────────────────────────────────────────────────────────┐
  │ Apache HTTP Server 2.4.49                                       │
  │ CVE-2021-41773 (CVSS 7.5) - Path Traversal → RCE               │
  │   Exploit: Available (Exploit-DB #50383)                        │
  │   Fix: Upgrade to 2.4.51+                                       │
  │                                                                 │
  │ CVE-2021-42013 (CVSS 9.8) - Path Traversal RCE                 │
  │   Exploit: Available (Metasploit)                               │
  │   Fix: Upgrade to 2.4.51+                                       │
  └─────────────────────────────────────────────────────────────────┘

HOST: 192.168.1.20 (db-server-01)
─────────────────────────────────────────────────────────────────────
  PORT       SERVICE          VERSION               VULNERABILITIES
  22/tcp     SSH              OpenSSH 8.2p1        0
  5432/tcp   PostgreSQL       PostgreSQL 12.3      ⚠️ 3 High

HOST: 192.168.1.30 (dev-workstation)
─────────────────────────────────────────────────────────────────────
  PORT       SERVICE          VERSION               VULNERABILITIES
  22/tcp     SSH              OpenSSH 8.4p1        0
  3306/tcp   MySQL            MySQL 5.7.29         ⚠️ 5 Medium

SCAN SUMMARY
─────────────────────────────────────────────────────────────────────
  Hosts scanned:      254
  Hosts alive:         12
  Services found:      28
  Vulnerabilities:
    🔴 Critical:        4 (on 1 host)
    ⚠️  High:            3 (on 1 host)
    ⚠️  Medium:          6 (on 2 hosts)
    ℹ️  Low:             2 (on 2 hosts)

TOP PRIORITY: web-server-01 (192.168.1.10)
  Apache 2.4.49 has actively exploited RCE vulnerabilities!
  IMMEDIATE ACTION REQUIRED

Export: [J]SON  [X]ML  [H]TML  [N]essus

The Core Question You’re Answering

“What services are running on my network, what versions are they, and which have known vulnerabilities?”

This is the core question of vulnerability assessment. Building a scanner from scratch teaches you how commercial tools work and what their limitations are.


Concepts You Must Understand First

Stop and research these before coding:

  1. Network Scanning Basics
    • How does TCP connect scanning work?
    • What’s the difference between SYN scan and connect scan?
    • How do you detect UDP services?
    • Book: “Nmap Network Scanning” Ch. 1-5
  2. Service Fingerprinting
    • What’s banner grabbing?
    • How do you identify services without banners?
    • What are service probes?
    • Book: “Nmap Network Scanning” Ch. 7
  3. Legal/Ethical Considerations
    • NEVER scan systems without permission
    • What’s the Computer Fraud and Abuse Act?
    • How do you set up a safe test environment?
    • Critical: Only scan YOUR systems or authorized test ranges

Questions to Guide Your Design

Before implementing, think through these:

  1. Scanning Approach
    • Sequential or parallel scanning?
    • How do you handle rate limiting?
    • What about firewalls and IDS?
  2. Fingerprinting Strategy
    • Banner-based vs probe-based?
    • How accurate does version detection need to be?
    • How do you handle encrypted services (HTTPS)?
  3. Vulnerability Matching
    • How do you construct CPEs from detected versions?
    • How do you handle fuzzy version matching?
    • What about false positives?
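For the version-matching question, one possible starting point is a range check modeled on the bounds NVD attaches to CPE match entries (`versionStartIncluding`, `versionStartExcluding`, `versionEndIncluding`, `versionEndExcluding`). This is a simplified sketch: `version_key` and `in_range` are hypothetical helpers, only the leading dotted-numeric part of a version is compared, and suffixes such as `p1` or distro revisions are ignored, which is itself a source of false positives. The range in the usage lines is illustrative.

```python
import re

def version_key(version: str) -> tuple:
    """Turn '2.4.49' into (2, 4, 49); trailing zeros are dropped so 2.4 == 2.4.0."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version in {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)

def in_range(version, start_incl=None, start_excl=None, end_incl=None, end_excl=None):
    """Check a detected version against optional inclusive/exclusive bounds."""
    v = version_key(version)
    if start_incl is not None and v < version_key(start_incl):
        return False
    if start_excl is not None and v <= version_key(start_excl):
        return False
    if end_incl is not None and v > version_key(end_incl):
        return False
    if end_excl is not None and v >= version_key(end_excl):
        return False
    return True

# Illustrative range: affected from 2.4.49 up to, but not including, 2.4.51.
print(in_range("2.4.49", start_incl="2.4.49", end_excl="2.4.51"))
print(in_range("2.4.51", start_incl="2.4.49", end_excl="2.4.51"))
```

Comparing parsed tuples instead of raw strings avoids the classic mistake of ranking "2.4.9" above "2.4.49".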

The Interview Questions They’ll Ask

  1. “How does a vulnerability scanner work?”
  2. “What’s the difference between authenticated and unauthenticated scanning?”
  3. “How do you fingerprint a service version?”
  4. “What causes false positives in vulnerability scanning?”
  5. “How do you handle scanning at scale?”

Hints in Layers

Hint 1: Start with TCP Connect
Don’t try to implement SYN scanning (requires raw sockets/root). TCP connect scanning works and is simpler.

Hint 2: Banner Grabbing First
Many services send banners on connect. SSH, SMTP, FTP all do. HTTP requires sending a request. Start with easy protocols.

Hint 3: Use Service Probes
nmap-service-probes is a public database of probes and match patterns. You can use it to identify services, subject to the Nmap license terms.

Hint 4: Test Environment
Use VMs (VulnHub, Metasploitable) for testing. NEVER scan production systems or networks you don’t own.
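One way to combine TCP connect scanning with a cap on parallelism is a small thread pool. The sketch below is an assumption-laden starting point, not how nmap does it: `expand_targets`, `is_open`, and `scan` are hypothetical names, and `max_workers` bounds concurrent connections rather than enforcing a true packets-per-second rate limit. Run it only against hosts you are authorized to test.

```python
import ipaddress
import socket
from concurrent.futures import ThreadPoolExecutor

def expand_targets(cidr: str, ports: list) -> list:
    """Expand a CIDR range and port list into (host, port) pairs."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [(str(host), port) for host in network.hosts() for port in ports]

def is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """TCP connect check: True when the three-way handshake completes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0

def scan(targets: list, max_workers: int = 32, timeout: float = 1.0) -> list:
    """Probe targets with at most max_workers connections in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda t: is_open(t[0], t[1], timeout), targets)
        return [target for target, ok in zip(targets, results) if ok]

targets = expand_targets("192.168.1.0/24", [22, 80, 443, 3306, 5432])
print(len(targets))  # 254 usable hosts x 5 ports
# Example (authorized lab only): scan(targets)
```

Threads are a reasonable first choice because each probe spends nearly all its time waiting on the network; an asyncio version is a natural later refinement.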


Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Network scanning | “Nmap Network Scanning” by Lyon | All |
| Service detection | “Nmap Network Scanning” by Lyon | Ch. 7-9 |
| Socket programming | “Linux Socket Programming” by Gay | Ch. 1-5 |
| Python networking | “Black Hat Python” by Seitz | Ch. 2-4 |

Implementation Hints

Basic TCP scanner structure:

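import socket

def scan_port(host, port, timeout=2):
  """TCP connect scan of one port, with a best-effort banner grab."""
  # The context manager closes the socket on every return path.
  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.settimeout(timeout)
    if sock.connect_ex((host, port)) != 0:
      return {"port": port, "state": "closed"}  # closed or filtered
    try:
      banner = sock.recv(1024)  # Only some services (SSH, FTP, SMTP) speak first
    except OSError:  # Includes timeouts: the port is open but sent no banner
      banner = b""
    return {"port": port, "state": "open", "banner": banner}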

Version extraction from banners:

SSH banner: "SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2"
  → Service: SSH
  → Product: OpenSSH
  → Version: 7.9p1
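  → CPE: cpe:2.3:a:openbsd:openssh:7.9:p1:*:*:*:*:*:*  (NVD's dictionary appears to put the p1 suffix in the update field; confirm against the CPE dictionary)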
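The extraction above can be sketched with a regular expression. This is a narrow illustration for OpenSSH banners only; the pattern and the function name `openssh_cpe` are hypothetical. It places the `p1` portable suffix in the CPE update field, which appears to be how NVD's CPE dictionary records OpenSSH releases, but treat that as an assumption to verify before matching against real data.

```python
import re

# Matches banners such as "SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2".
SSH_BANNER = re.compile(
    r"^SSH-[\d.]+-OpenSSH_(?P<version>\d+(?:\.\d+)*)(?P<update>p\d+)?"
)

def openssh_cpe(banner: str):
    """Return a CPE 2.3 string for an OpenSSH banner, or None if it does not match."""
    match = SSH_BANNER.match(banner)
    if match is None:
        return None
    version = match.group("version")
    update = match.group("update") or "*"  # no portable suffix: leave update as ANY
    return f"cpe:2.3:a:openbsd:openssh:{version}:{update}:*:*:*:*:*:*"

cpe = openssh_cpe("SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2")
print(cpe)
```

The trailing `Debian-10+deb10u2` is a reminder that distributions backport security fixes without changing the upstream version, so a banner-derived CPE can report CVEs that are already patched on that host.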

Learning Milestones

  1. You can scan ports and detect open services → Network fundamentals
  2. You can extract banners and identify services → Protocol understanding
  3. You can map services to CPEs and find CVEs → Integration complete
  4. Your results match what nmap/Nessus would find → Production quality

⚠️ ONLY scan systems you own or have explicit written permission to test.

Unauthorized network scanning can violate computer crime laws in many jurisdictions, including:

  • Computer Fraud and Abuse Act (USA)
  • Computer Misuse Act (UK)
  • Similar laws worldwide

Safe Testing Environments:

  • VulnHub VMs (intentionally vulnerable)
  • Metasploitable (designed for testing)
  • HackTheBox (legal CTF environment)
  • Your own lab network

Project Comparison Table

| # | Project Name | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|---|
| 1 | CVE Data Explorer | Beginner | Weekend | ⭐⭐⭐ | ⭐⭐ |
| 2 | CVSS Calculator & Analyzer | Intermediate | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 3 | Exploit Database Searcher | Intermediate | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 4 | Vulnerability Scanner (CPE Matching) | Intermediate | 2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 5 | CVE Feed Monitor & Alerting | Intermediate | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 6 | CWE Pattern Analyzer | Intermediate | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 7 | CVE-to-ATT&CK Mapper | Advanced | 2-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 8 | Disclosure Timeline Tracker | Intermediate | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐ |
| 9 | SBOM Vulnerability Analyzer | Advanced | 3-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 10 | Mini Vulnerability Scanner | Expert | 4-6 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

Recommendation

For Beginners (New to Security/CVEs)

Start with Project 1: CVE Data Explorer

This builds your foundation. You’ll understand the data model, learn to work with the NVD API, and become comfortable with CVE structure. Without this knowledge, the other projects will be frustrating.

Then proceed to: Project 2 (CVSS understanding) → Project 3 (Exploit context)

For Intermediate Developers (Know some security)

Start with Project 4: Vulnerability Scanner (CPE Matching)

This is the “aha moment” project where everything clicks. You’ll understand how commercial vulnerability scanners work and why vulnerability management is hard.

Then proceed to: Project 3 (add exploit context) → Project 5 (operationalize it)

For Advanced Practitioners (Security professionals)

Start with Project 7: CVE-to-ATT&CK Mapper

This is cutting-edge threat intelligence integration. It’s the kind of project that impresses in interviews and produces genuinely useful tooling.

Then proceed to: Project 9 (supply chain focus) → Project 10 (network scanning)

Fastest Path to “I Understand CVE Databases”

  1. Project 1 (Weekend) - Understand the data
  2. Project 3 (1 week) - Connect to exploits
  3. Project 4 (2 weeks) - Build a scanner

Total time: ~4 weeks to solid understanding


Final Overall Project: Integrated Vulnerability Intelligence Platform

  • File: CVE_EXPLOIT_DATABASES_LEARNING_PATH.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go (for performance-critical components)
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 5: Master
  • Knowledge Area: Full-Stack Security / Platform Engineering
  • Software or Tool: All previous tools integrated
  • Main Book: “Building Secure and Reliable Systems” by Heather Adkins, Betsy Beyer, et al. (Google)

What you’ll build: A complete vulnerability intelligence platform that combines:

  • Network scanning to discover assets and services
  • SBOM ingestion for software inventory
  • NVD/Exploit-DB integration for vulnerability data
  • CVSS environmental scoring for your context
  • ATT&CK mapping for threat intelligence
  • Alerting and reporting for operations
  • A web dashboard for visualization

This is essentially a mini-Tenable/Qualys/Rapid7 built from scratch.

Why it’s the capstone: This project synthesizes EVERYTHING from the previous 10 projects into a coherent platform. You’ll face real integration challenges, performance issues, and user experience decisions that commercial products face.

Core challenges you’ll face:

  • Data model design → How do you unify assets, vulnerabilities, exploits, and threats?
  • Performance at scale → Scanning networks and processing 300k+ CVEs
  • User experience → Making complex data actionable
  • Operational concerns → Running as a service, handling errors, maintaining state
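For the data model challenge, one possible shape is a small set of records linked by a "finding" that ties an asset and component to a vulnerability. Everything below is a hypothetical sketch: the class names, fields, placeholder CVE IDs, and the ranking rule (known exploitation first, then public exploit, then CVSS) are assumptions chosen for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Vulnerability:
    cve_id: str
    cvss: float
    exploit_public: bool = False      # e.g. listed in an exploit database
    exploited_in_wild: bool = False   # e.g. reported as actively exploited
    attack_tactics: tuple = ()        # ATT&CK tactic names, if mapped

@dataclass
class Asset:
    hostname: str
    components: list = field(default_factory=list)  # CPE or PURL strings

@dataclass(frozen=True)
class Finding:
    asset: str        # hostname of the affected asset
    component: str    # CPE or PURL that matched
    vuln: Vulnerability

def priority(finding: Finding) -> tuple:
    """Sort key: active exploitation outranks public exploits, which outrank raw CVSS."""
    v = finding.vuln
    return (v.exploited_in_wild, v.exploit_public, v.cvss)

# Placeholder data; the CVE IDs are deliberately fake.
findings = [
    Finding("db-01", "pkg:npm/example@1.0.0", Vulnerability("CVE-0000-0001", 9.8)),
    Finding("web-01", "cpe:2.3:a:example:server:1.0:*:*:*:*:*:*:*",
            Vulnerability("CVE-0000-0002", 7.5, exploit_public=True, exploited_in_wild=True)),
    Finding("api-01", "pkg:pypi/example@2.0", Vulnerability("CVE-0000-0003", 8.1, exploit_public=True)),
]

ranked = sorted(findings, key=priority, reverse=True)
print([f.vuln.cve_id for f in ranked])
```

Keeping the finding separate from both the asset and the vulnerability lets one CVE appear on many hosts without duplicating its metadata, which maps cleanly onto relational tables later.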

Architecture Vision:

┌─────────────────────────────────────────────────────────────────┐
│                    Web Dashboard (React/Vue)                    │
├─────────────────────────────────────────────────────────────────┤
│                         REST API (FastAPI)                      │
├─────────────┬─────────────┬─────────────┬─────────────┬────────┤
│   Scanner   │   SBOM      │    CVE      │  Exploit    │ ATT&CK │
│   Engine    │   Ingester  │   Database  │  Matcher    │ Mapper │
├─────────────┴─────────────┴─────────────┴─────────────┴────────┤
│                    Unified Data Store (PostgreSQL)              │
├─────────────────────────────────────────────────────────────────┤
│              Background Jobs (Celery/Redis)                     │
│   • NVD Sync    • Scan Jobs    • Alert Processing              │
└─────────────────────────────────────────────────────────────────┘

Real World Outcome:

╔═══════════════════════════════════════════════════════════════════╗
║              VulnIntel Dashboard                                  ║
║              Last scan: 5 minutes ago                            ║
╠═══════════════════════════════════════════════════════════════════╣

┌─ RISK OVERVIEW ──────────────────────────────────────────────────┐
│                                                                  │
│   Critical: ████████████░░░░░░░░  12 (↑3 from yesterday)        │
│   High:     ██████████████████░░  45 (↓2 from yesterday)        │
│   Medium:   ████████░░░░░░░░░░░░  89                            │
│   Low:      ███░░░░░░░░░░░░░░░░░  234                           │
│                                                                  │
│   Exploitable: 23    In-the-Wild: 8    Zero-Day: 2              │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

┌─ TOP PRIORITIES ─────────────────────────────────────────────────┐
│                                                                  │
│  1. web-server-01: CVE-2021-44228 (Log4Shell)                   │
│     CVSS: 10.0 | Exploit: Metasploit | Tactic: Initial Access   │
│     Affected: Log4j 2.14.1 | Fix: Upgrade to 2.17.1+           │
│     [VIEW] [REMEDIATE] [SUPPRESS]                               │
│                                                                  │
│  2. api-gateway-03: CVE-2021-41773 (Apache Path Traversal)      │
│     CVSS: 9.8 | Exploit: Active ITW | Tactic: Initial Access    │
│     [VIEW] [REMEDIATE] [SUPPRESS]                               │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

┌─ ATTACK SURFACE BY ATT&CK TACTIC ────────────────────────────────┐
│                                                                  │
│  Initial Access    ██████████████████░░░░░░  23 vulns          │
│  Execution         ████████████░░░░░░░░░░░░  15 vulns          │
│  Persistence       ████████░░░░░░░░░░░░░░░░   8 vulns          │
│  Privilege Esc.    ██████████████░░░░░░░░░░  18 vulns          │
│  Defense Evasion   ████░░░░░░░░░░░░░░░░░░░░   5 vulns          │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

Difficulty: Master
Time estimate: 3-6 months
Prerequisites: All 10 previous projects completed


The Core Question You’re Answering

“How do I build a complete vulnerability management platform from scratch?”

This is the ultimate integration challenge. You’ll understand not just individual components, but how they work together to create actionable security intelligence.


Learning Milestones for Capstone

  1. All components work individually → You’ve mastered each domain
  2. Data flows between components → Integration works
  3. The dashboard shows actionable intelligence → User value delivered
  4. You can explain trade-offs in your design → Architectural maturity
  5. Someone else can use and extend it → Production quality

Summary

This learning path covers CVE/Exploit databases through 10 hands-on projects plus one capstone. Here’s the complete list:

| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | CVE Data Explorer | Python | Beginner | Weekend |
| 2 | CVSS Calculator & Analyzer | Python | Intermediate | 1-2 weeks |
| 3 | Exploit Database Searcher | Python | Intermediate | 1-2 weeks |
| 4 | Vulnerability Scanner (CPE Matching) | Python | Intermediate | 2 weeks |
| 5 | CVE Feed Monitor & Alerting | Python | Intermediate | 1-2 weeks |
| 6 | CWE Pattern Analyzer | Python | Intermediate | 1-2 weeks |
| 7 | CVE-to-ATT&CK Mapper | Python | Advanced | 2-4 weeks |
| 8 | Disclosure Timeline Tracker | Python | Intermediate | 1-2 weeks |
| 9 | SBOM Vulnerability Analyzer | Python | Advanced | 3-4 weeks |
| 10 | Mini Vulnerability Scanner | Python | Expert | 4-6 weeks |
| 🏆 | Integrated Vuln Intel Platform | Python | Master | 3-6 months |

For beginners: Start with projects #1, #2, #3
For intermediate: Jump to projects #4, #3, #5
For advanced: Focus on projects #7, #9, #10, then capstone

Expected Outcomes

After completing these projects, you will:

  1. Deeply understand the CVE ecosystem — from ID assignment through NVD enrichment to exploit publication
  2. Master CVSS scoring — including the mathematics, environmental customization, and practical application
  3. Connect vulnerabilities to exploits — understanding exploit maturity and its impact on risk
  4. Build production vulnerability tools — scanners, monitors, analyzers that security professionals actually use
  5. Integrate threat intelligence — mapping CVEs to ATT&CK techniques for threat-informed defense
  6. Understand supply chain security — SBOM analysis and transitive dependency vulnerabilities
  7. Communicate security effectively — visualizations, reports, and dashboards that drive action
  8. Think like a security architect — trade-offs, integration patterns, and operational concerns

You’ll have built 11 working projects that demonstrate deep understanding of vulnerability intelligence from first principles.


Key Resources

Books

  • “Practical Vulnerability Management” by Andrew Magnusson (No Starch Press)
  • “Effective Vulnerability Management” by Chris Hughes (Wiley)
  • “Penetration Testing” by Georgia Weidman
  • “Nmap Network Scanning” by Gordon Lyon
  • “Black Hat Python” by Justin Seitz

Final Notes

The vulnerability ecosystem is constantly evolving:

  • CVE volume is increasing 30%+ year over year
  • NVD is modernizing its APIs (2.0 is now standard)
  • CVSS 4.0 (published by FIRST in November 2023) is the newest scoring standard, though many records still carry only v3.x scores
  • SBOM requirements are becoming regulatory mandates
  • Threat intelligence integration (ATT&CK) is becoming standard practice

The projects in this learning path teach timeless fundamentals while using current standards. The specific APIs may evolve, but the concepts—vulnerability identification, severity assessment, exploit correlation, and risk prioritization—remain constant.

Good luck on your journey to mastering CVE/Exploit databases!