Project 14: Secure String and Buffer Library

A security-focused string and buffer library with strict bounds, threat modeling, and constant-time operations.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Level 4 - Expert |
| Time Estimate | 2-3 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | None |
| Coolness Level | Level 4 - Hardcore Tech Flex |
| Business Potential | Level 2 - Micro-SaaS |
| Prerequisites | Memory safety basics, C strings, error handling |
| Key Topics | Buffer overflow prevention, threat modeling, safe APIs |

1. Learning Objectives

By completing this project, you will:

  1. Design a buffer/string library that prevents common memory vulnerabilities.
  2. Implement constant-time comparison and secure erase functions.
  3. Define explicit threat models and misuse cases.
  4. Integrate fuzzing and sanitizer testing.
  5. Produce security-focused documentation and usage guidelines.

2. All Theory Needed (Per-Concept Breakdown)

Concept 1: Memory Safety Vulnerabilities and Threat Modeling

Fundamentals

C gives you direct access to memory, which makes it powerful but dangerous. Common vulnerabilities include buffer overflows, use-after-free, and out-of-bounds reads. These bugs can lead to crashes or security exploits. A secure library must assume adversarial inputs and explicitly define a threat model: what attackers can control, what assets must be protected, and what failures are unacceptable.

Deep Dive into the concept

Memory safety vulnerabilities arise because C does not enforce bounds checks or lifetime rules. A buffer overflow occurs when code writes past the end of an array, overwriting adjacent memory. This can corrupt data, change control flow, or expose secrets. Out-of-bounds reads can leak sensitive data by reading memory that was never intended to be exposed. Use-after-free occurs when code accesses memory after it has been freed, which can lead to crashes or exploitation depending on allocator behavior.

Threat modeling is the process of identifying what attackers can do and what assets the system must defend. For a string library, the typical threat model includes untrusted input from users or network sources. Attackers may try to pass overly long strings, malformed encodings, or special inputs that trigger edge cases. The library must ensure that these inputs cannot cause memory corruption or leaks. The threat model should also include side-channel risks: for example, comparing secret tokens with strcmp can leak information via timing differences, because strcmp returns as soon as it finds a mismatch.

A secure library must explicitly define its failure behavior. If an operation fails due to insufficient capacity, it must return a clear error and leave the destination in a valid state. If an input pointer is NULL, it must detect and reject it. If an attacker attempts to overflow, the library must reject or truncate safely without compromising invariants. Security also requires secure erasure: sensitive data should be overwritten before freeing to prevent leakage through memory reuse.
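The failure-behavior rule above can be sketched as a bounds-checked copy that rejects bad input and leaves the destination untouched on error. This is a minimal sketch: the function name, the `secstr_t` layout, and the error codes follow the shapes defined later in this project, but the exact API is yours to design.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { SEC_OK = 0, SEC_EINVAL = -1 } sec_err_t;

typedef struct { uint8_t *data; size_t len; size_t cap; } secstr_t;

/* Copy src into dst only if every precondition holds; on any failure,
 * return an error and leave dst exactly as it was. */
sec_err_t secstr_copy(secstr_t *dst, const uint8_t *src, size_t src_len) {
    if (dst == NULL || dst->data == NULL || src == NULL) return SEC_EINVAL;
    if (src_len > dst->cap) return SEC_EINVAL;  /* reject, don't truncate */
    memcpy(dst->data, src, src_len);
    dst->len = src_len;
    return SEC_OK;
}
```

Note that the capacity check happens before any byte is written, so a failed call cannot leave `dst` half-modified.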

In this project, you will document the threat model and map each API to its expected safety guarantees. You will then implement defensive checks, bounds enforcement, and fuzzing to validate that the library holds up under adversarial input. This is the difference between a “safe” library and a “secure” one.

A deeper security view includes how vulnerabilities are exploited. Buffer overflows can overwrite return addresses or function pointers, enabling control-flow hijacking. Use-after-free can be exploited by manipulating allocator behavior to place attacker-controlled data where freed objects used to be. Even if your library is not directly exposed to untrusted input, library bugs can become attack primitives when composed into larger systems. This is why threat modeling is not optional: you must decide which classes of attacks you are defending against and design your APIs accordingly. For example, you might decide that all input lengths must be explicit and validated, and that no API accepts untrusted null-terminated strings without a maximum length. This type of explicit policy is what makes security maintainable across a codebase.

To operationalize this concept in a real codebase, create a short checklist of invariants and a set of micro-experiments. Start with a minimal, deterministic test that isolates one rule or behavior, then vary a single parameter at a time (inputs, flags, platform, or data layout) and record the outcome. Keep a table of assumptions and validate them with assertions or static checks so violations are caught early. Whenever the concept touches the compiler or OS, capture tool output such as assembly, warnings, or system call traces and attach it to your lab notes. Finally, define explicit failure modes: what does a violation look like at runtime, and how would you detect it in logs or tests? This turns abstract theory into repeatable engineering practice and makes results comparable across machines and compiler versions.

How this fits into the project

Definitions & key terms

  • Buffer overflow: Writing past the end of a buffer.
  • Use-after-free: Accessing memory after it is freed.
  • Threat model: Assumptions about attacker capabilities.
  • Side-channel: Leaks information through timing or other signals.

Mental model diagram (ASCII)

[buffer][buffer][buffer][guard]  <-- overflow crosses boundary

How it works (step-by-step, with invariants and failure modes)

  1. Define threat model and attacker inputs.
  2. Identify safety invariants for each API.
  3. Add checks for all inputs and sizes.
  4. Validate with fuzzing and sanitizers.

Invariant: Inputs never cause out-of-bounds access. Failure mode: Missing checks allow memory corruption.

Minimal concrete example

if (len > dst->cap) return SEC_EINVAL;

Common misconceptions

  • “Sanitizers make code safe.” → They only detect bugs, they don’t fix them.
  • “Bounds checks are enough.” → Timing and error semantics matter.
  • “Security is only about memory writes.” → Reads and side channels matter too.

Check-your-understanding questions

  1. What is the difference between safe and secure?
  2. Why is strcmp unsafe for secret comparisons?
  3. How does a threat model guide API design?
  4. What is the impact of an out-of-bounds read?
  5. Why should errors leave objects valid?

Check-your-understanding answers

  1. Safe means no crashes; secure means resilient to attacks and leaks.
  2. It exits early, leaking information via timing.
  3. It defines attacker input and required defenses.
  4. It can leak secrets or cause undefined behavior.
  5. Invalid state can be exploited or cause later crashes.

Real-world applications

  • Password/token handling in authentication systems.
  • Secure parsing of network protocols.

Where you’ll apply it

References

  • CERT C Secure Coding Standard
  • “Secure Coding in C and C++” — Seacord

Key insights

Security requires explicit threat models and strict invariants, not just defensive coding.

Summary

Memory safety vulnerabilities are the biggest security risk in C. A secure library must be designed with an attacker in mind, enforcing strict invariants and clear failure behavior.

Homework/Exercises to practice the concept

  1. Write a threat model for a login token API.
  2. Identify three ways an attacker could misuse a string API.
  3. Add guard checks to a vulnerable function.

Solutions to the homework/exercises

  1. Identify attacker input, assets, and unacceptable failures.
  2. Overlong input, NULL pointer, malformed UTF-8.
  3. Validate input length and pointer before copying.

Concept 2: Defensive API Design and Constant-Time Operations

Fundamentals

A secure API must be hard to misuse. That means explicit sizes, clear return codes, and no silent truncation. For sensitive data, comparisons and equality checks should be constant-time to avoid timing attacks. Secure erase functions should overwrite memory before freeing to reduce the chance of secret leakage.

Deep Dive into the concept

Defensive API design starts with explicit sizes. Every function that copies or appends data should accept lengths or operate on length-tracked structures. Return values should indicate success, failure, and truncation explicitly. For example, a secure append might return SEC_ETRUNC if truncation occurred. This ensures that callers cannot ignore errors accidentally.
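The append contract described above can be sketched as follows. The function name and the choice to copy what fits before reporting truncation are illustrative, not a fixed API; you could equally choose to reject oversized input outright.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { SEC_OK = 0, SEC_EINVAL = -1, SEC_ETRUNC = 1 } sec_err_t;

typedef struct { uint8_t *data; size_t len; size_t cap; } secstr_t;

/* Append src to dst; if it does not fit, copy what fits and report
 * SEC_ETRUNC so the caller cannot miss the truncation. */
sec_err_t secstr_append(secstr_t *dst, const uint8_t *src, size_t src_len) {
    if (dst == NULL || dst->data == NULL || (src == NULL && src_len > 0))
        return SEC_EINVAL;
    size_t room = dst->cap - dst->len;
    size_t n = src_len <= room ? src_len : room;
    memcpy(dst->data + dst->len, src, n);
    dst->len += n;
    return n == src_len ? SEC_OK : SEC_ETRUNC;
}
```

Because SEC_ETRUNC is a distinct, positive value, callers can distinguish "partial success" from both success and hard failure.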

Constant-time operations are essential for comparing secrets. Standard functions like memcmp often return early when a mismatch is found, which makes execution time dependent on the input. Attackers can exploit this with timing measurements. A constant-time comparison loops through all bytes and accumulates differences, returning only at the end. This does not remove all timing leaks (cache effects still exist) but significantly reduces exploitable differences.

Secure erase is another requirement: when sensitive data is no longer needed, it should be overwritten. Compilers may optimize away memset if the data is not used afterward, so you must use a technique that prevents optimization, such as volatile pointers or platform-specific functions like explicit_bzero. Your library should provide a secure_zero function that is guaranteed not to be optimized away.
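A common portable sketch writes through a volatile pointer so the compiler cannot elide the stores. Where available, platform functions such as explicit_bzero (BSD/glibc) or C23's memset_explicit are preferable; this is a fallback, not a guarantee against every optimizer.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrite n bytes with zero through a volatile pointer; the volatile
 * qualifier forces the compiler to perform each store even though the
 * memory may never be read again. */
void secure_zero(void *p, size_t n) {
    volatile uint8_t *vp = (volatile uint8_t *)p;
    while (n--) *vp++ = 0;
}
```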

A professional secure library also includes clear documentation of safe usage patterns and anti-patterns. For example, it should warn against using standard string APIs on secret data. It should provide “safe defaults,” such as refusing to operate on uninitialized objects or returning errors when invariants are violated. This design philosophy turns security from an optional feature into the default behavior.


Another way to deepen understanding is to map the concept to a small decision table: list inputs, expected outcomes, and the assumptions that must hold. Create at least one negative test that violates an assumption and observe the failure mode, then document how you would detect it in production. Add a short trade-off note: what you gain by following the rule and what you pay in complexity or performance. Where possible, instrument the implementation with debug-only checks so violations are caught early without affecting release builds. If the concept admits multiple approaches, implement two and compare them; the act of measuring and documenting the difference is part of professional practice. This habit turns theoretical understanding into an engineering decision framework you can reuse across projects.

How this fits into the project

Definitions & key terms

  • Constant-time: Execution time independent of input values.
  • Secure erase: Overwriting memory so secrets are removed.
  • Defensive API: An interface designed to prevent misuse.
  • Truncation: Partial output due to capacity limits.

Mental model diagram (ASCII)

for i in 0..n-1:
  diff |= a[i] ^ b[i]
return diff == 0

How it works (step-by-step, with invariants and failure modes)

  1. Validate inputs and lengths.
  2. Perform operation with bounds checks.
  3. Return explicit status codes.
  4. For secrets, use constant-time comparisons and secure erase.

Invariant: APIs never silently ignore errors or truncate without notice. Failure mode: Silent truncation leaks or corrupts data.

Minimal concrete example

int secure_equals(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++) diff |= a[i] ^ b[i];
    return diff == 0;
}

Common misconceptions

  • “Constant-time is unnecessary unless you’re doing crypto.” → It matters for any comparison of secret data, such as tokens.
  • “memset is enough for secure erase.” → The compiler can optimize it away.
  • “Truncation is harmless.” → It can cause security logic errors.

Check-your-understanding questions

  1. Why is constant-time comparison important?
  2. How can a compiler remove a memset?
  3. What should a secure API return on error?
  4. Why avoid silent truncation?
  5. How can you ensure secure erase is not optimized away?

Check-your-understanding answers

  1. It prevents timing attacks on secret comparisons.
  2. If the memory is not used afterward, it may remove the write.
  3. Explicit error codes with defined semantics.
  4. It can hide data loss and cause security bugs.
  5. Use volatile or platform-specific secure erase APIs.

Real-world applications

  • Authentication token checks.
  • Secure password handling.

Where you’ll apply it

References

  • “Cryptography Engineering” — Ferguson, Schneier, Kohno (timing attacks)
  • OpenBSD explicit_bzero docs

Key insights

Security is in the API contract as much as the implementation.

Summary

Defensive APIs, constant-time operations, and secure erase functions turn a safe string library into a secure one. These features protect against both memory corruption and information leaks.

Homework/Exercises to practice the concept

  1. Implement a secure compare and test its timing behavior.
  2. Write a secure erase function that resists optimization.
  3. Design error codes for a secure buffer API.

Solutions to the homework/exercises

  1. Loop over all bytes and measure constant execution time.
  2. Use volatile pointer or explicit_bzero.
  3. Use distinct codes for invalid input, truncation, and allocation failure.

3. Project Specification

3.1 What You Will Build

A security-focused string and buffer library (secstr) with explicit length tracking, strict bounds enforcement, constant-time comparisons, and secure erase. Includes fuzz tests and sanitizer integration.

3.2 Functional Requirements

  1. Secure string type: length, capacity, invariant checking.
  2. Safe operations: append, copy, compare, slice with bounds checks.
  3. Constant-time compare: for secret buffers.
  4. Secure erase: guaranteed overwrite of sensitive data.
  5. Fuzz testing: harness for randomized inputs.

3.3 Non-Functional Requirements

  • Performance: Constant-time operations remain O(n).
  • Reliability: Library passes fuzz and sanitizer tests.
  • Usability: Clear API documentation and threat model notes.

3.4 Example Usage / Output

secstr_t s;
if (secstr_init(&s, "token") != SEC_OK) { /* handle error */ }
if (secstr_equals_ct(&s, "token")) { /* ok */ }

3.5 Data Formats / Schemas / Protocols

Error enum:

typedef enum { SEC_OK=0, SEC_EINVAL=-1, SEC_ENOMEM=-2, SEC_ETRUNC=1 } sec_err_t;
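A small caller-facing helper that maps these codes to messages might look like this (a hypothetical convenience function, not part of the specified API):

```c
#include <string.h>  /* for strcmp in usage/tests */

typedef enum { SEC_OK=0, SEC_EINVAL=-1, SEC_ENOMEM=-2, SEC_ETRUNC=1 } sec_err_t;

/* Map each status code to a human-readable message. */
const char *sec_strerror(sec_err_t e) {
    switch (e) {
    case SEC_OK:     return "ok";
    case SEC_EINVAL: return "invalid input";
    case SEC_ENOMEM: return "out of memory";
    case SEC_ETRUNC: return "output truncated";
    default:         return "unknown error";
    }
}
```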

3.6 Edge Cases

  • NULL inputs.
  • Extremely long attacker-controlled strings.
  • Attempts to compare strings of different lengths.

3.7 Real World Outcome

What you will see:

  1. A secure string library with constant-time operations.
  2. Fuzz tests that do not crash or leak.
  3. Documentation mapping APIs to threat model.

3.7.1 How to Run (Copy/Paste)

make
./secstr_demo

3.7.2 Golden Path Demo (Deterministic)

Compare a known secret token and show success.

3.7.3 If CLI: exact terminal transcript

$ ./secstr_demo
compare secret: match
Exit: 0

Failure demo (deterministic):

$ ./secstr_demo --null
ERROR: invalid input
Exit: 2

4. Solution Architecture

4.1 High-Level Design

+-------------------+
|    secstr API     |
+---------+---------+
          |
          v
+-------------------+     +-------------------+
|   bounds checks   | --> | constant-time ops |
+-------------------+     +-------------------+

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Core API | String ops | Strict invariants |
| CT compare | Constant-time equality | Always O(n) |
| Secure erase | Overwrite secrets | Prevent the compiler from optimizing it out |

4.3 Data Structures (No Full Code)

typedef struct { uint8_t *data; size_t len; size_t cap; } secstr_t;
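A debug-only invariant check over this structure could look like the sketch below; the exact invariants (and the function name) are yours to define, but every API should be able to assert them on entry and exit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t *data; size_t len; size_t cap; } secstr_t;

/* Returns true iff the object satisfies the core invariants:
 * a non-NULL buffer whenever cap > 0, and len never exceeding cap. */
bool secstr_valid(const secstr_t *s) {
    if (s == NULL) return false;
    if (s->cap > 0 && s->data == NULL) return false;
    return s->len <= s->cap;
}
```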

4.4 Algorithm Overview

  1. Validate inputs and lengths.
  2. Enforce capacity checks.
  3. Perform operation with constant-time logic if needed.
  4. Return explicit status codes.

Complexity Analysis:

  • Time: O(n)
  • Space: O(n)

5. Implementation Guide

5.1 Development Environment Setup

clang -std=c23 -Wall -Wextra -Werror -fsanitize=address,undefined -g

5.2 Project Structure

secstr/
├── src/
│   ├── secstr.c
│   ├── demo.c
│   └── fuzz.c
├── include/
│   └── secstr.h
├── tests/
└── Makefile

5.3 The Core Question You’re Answering

“How do I design string APIs that remain safe even with hostile input?”

5.4 Concepts You Must Understand First

  1. Memory safety vulnerabilities.
  2. Defensive API design.
  3. Constant-time operations.

5.5 Questions to Guide Your Design

  1. What is your explicit threat model?
  2. How will you ensure errors leave valid state?
  3. Which operations must be constant-time?

5.6 Thinking Exercise

Write a misuse case for every API function.

5.7 The Interview Questions They’ll Ask

  1. Why is strcmp unsafe for secrets?
  2. What is a threat model?
  3. How do you prevent buffer overflows in C?

5.8 Hints in Layers

  • Hint 1: Start with a safe string type and invariant checks.
  • Hint 2: Add constant-time compare and secure erase.
  • Hint 3: Add fuzz tests and sanitizer builds.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Secure C | “Secure Coding in C and C++” — Seacord | Ch. 5-7 |

5.10 Implementation Phases

Phase 1: Foundation (1 week)

  • Build core string type with invariants.
  • Checkpoint: Basic append/copy works.

Phase 2: Core Functionality (1 week)

  • Add constant-time comparisons and secure erase.
  • Checkpoint: Tests pass under sanitizers.

Phase 3: Polish & Edge Cases (3-4 days)

  • Add fuzzing and threat model docs.
  • Checkpoint: Fuzz tests run without crashes.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Error handling | Return codes, abort | Return codes | Library-safe |
| Secret compare | memcmp, constant-time | Constant-time | Prevent timing leaks |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit tests | Validate each API | append, compare |
| Integration tests | Demo program | secstr_demo |
| Fuzz tests | Random inputs | fuzz harness |

6.2 Critical Test Cases

  1. Fuzzing with long random inputs.
  2. Constant-time compare returns correct results.
  3. Secure erase overwrites buffers.

6.3 Test Data

Secret: "token123"

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Silent truncation | Data loss | Return explicit truncation error |
| Using memcmp for secrets | Timing leaks | Use constant-time compare |
| Secure erase optimized out | Secrets remain | Use explicit_bzero or volatile |

7.2 Debugging Strategies

  • Run with ASan and UBSan.
  • Inspect timing behavior with micro-benchmarks.

7.3 Performance Traps

Constant-time comparisons are slower; use only for secrets.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add hex encoding/decoding helpers.

8.2 Intermediate Extensions

  • Add secure allocator integration.

8.3 Advanced Extensions

  • Integrate with a FIPS-compliant crypto library.

9. Real-World Connections

9.1 Industry Applications

  • Secure credential storage and authentication.
  • Hardening network services.

9.2 Open-Source Examples

  • OpenBSD libc (explicit_bzero)
  • libsodium memory utilities

9.3 Interview Relevance

  • Security and memory safety questions are common.

10. Resources

10.1 Essential Reading

  • CERT C Secure Coding Standard
  • “Cryptography Engineering” — Ferguson et al.

10.2 Video Resources

  • Talks on secure C coding and timing attacks

10.3 Tools & Documentation

  • AFL/libFuzzer, ASan, UBSan

11. Self-Assessment Checklist

11.1 Understanding

  • I can define a threat model for string handling.
  • I can explain constant-time comparisons.
  • I can implement secure erase correctly.

11.2 Implementation

  • All APIs enforce bounds.
  • Fuzz tests run without crashes.
  • Documentation includes misuse cases.

11.3 Growth

  • I can design secure APIs for other C libraries.
  • I can audit code for memory vulnerabilities.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Secure string type with bounds checks.
  • Constant-time compare and secure erase.
  • Fuzz tests and sanitizer builds.

Full Completion:

  • All minimum criteria plus:
  • Threat model documentation and misuse cases.

Excellence (Going Above & Beyond):

  • Integration with external security tools and static analyzers.