Project 7: String Library from Scratch
A complete string library with safe string functions, UTF-8 support, and bounds-checking interfaces that prevent buffer overflows.
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | C |
| Alternative Languages | None |
| Difficulty | Level 3 - Advanced |
| Time Estimate | See main guide |
| Knowledge Area | String Handling, Security |
| Tooling | GCC, Valgrind, AddressSanitizer |
| Prerequisites | See main guide |
What You Will Build
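A C string library that provides bounds-checked replacements for `strcpy`, `strcat`, and `snprintf`, UTF-8 decoding and code point counting, and explicit error results such as `ERROR_BUFFER_TOO_SMALL` in place of silent buffer overflows.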
Why It Matters
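String handling is a common source of bugs in C programs: missing null terminators, writes past the end of a buffer, and byte counts mistaken for character counts in UTF-8 text. Building the library from scratch exercises each of these failure modes directly.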
Core Challenges
- Null terminator handling → Maps to understanding C strings
- Buffer overflow prevention → Maps to secure coding
- UTF-8 encoding → Maps to Unicode support
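The first challenge, null terminator handling, can be made concrete with a bounded length scan. The following is a minimal sketch, not an API prescribed by the guide: `str_bounded_len` and `str_is_terminated` are hypothetical names, and the first behaves like POSIX `strnlen`, returning the capacity when no terminator is found within it.

```c
#include <stddef.h>

/* Hypothetical helper (the name is an assumption, not from the guide).
 * Scans at most `cap` bytes of `s` for a null terminator.
 * Returns the string length if a terminator is found, or `cap` if
 * none appears within the first `cap` bytes. A NULL pointer yields 0. */
size_t str_bounded_len(const char *s, size_t cap)
{
    size_t n = 0;

    if (s == NULL) {
        return 0;
    }
    while (n < cap && s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical helper: returns 1 if `buf` holds a terminated string
 * somewhere within its `cap` bytes, 0 otherwise. */
int str_is_terminated(const char *buf, size_t cap)
{
    return buf != NULL && str_bounded_len(buf, cap) < cap;
}
```

Because the scan never reads past `cap` bytes, it is safe to call on a buffer whose terminator may be missing, which plain `strlen` is not.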
Key Concepts
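- Before you code, map each challenge to its underlying concept: how C represents strings and their null terminator, the difference between buffer size and string length, and UTF-8 byte sequences versus code points.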
Real-World Outcome
```
# 1. Basic safe operations
$ ./string_test safe_ops
Testing safe_strcpy:
Source: "Hello, World!" (13 chars)
Dest buffer: 10 bytes
Result: ERROR_BUFFER_TOO_SMALL
Dest contents: "Hello, Wo" (truncated with null terminator)

Testing safe_strcat:
Dest: "Hello" (5 chars)
Source: ", World!" (8 chars)
Dest buffer: 15 bytes
Result: SUCCESS
Dest contents: "Hello, World!" (13 chars)
```
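One way the `safe_strcpy` and `safe_strcat` behavior in this sample output could be implemented is sketched below. The function names and the `SUCCESS` / `ERROR_BUFFER_TOO_SMALL` results come from the output above; the parameter order, the `str_result` type name, and the `ERROR_INVALID_ARGUMENT` code are assumptions made for illustration. The sketch also assumes the source and destination buffers do not overlap.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    SUCCESS = 0,
    ERROR_BUFFER_TOO_SMALL,
    ERROR_INVALID_ARGUMENT   /* assumed code; not shown in the sample output */
} str_result;

/* Copies `src` into `dest`, writing at most `dest_size` bytes including
 * the terminator. On truncation the destination is still terminated and
 * ERROR_BUFFER_TOO_SMALL is returned. Buffers must not overlap. */
str_result safe_strcpy(char *dest, size_t dest_size, const char *src)
{
    size_t src_len;
    size_t n;

    if (dest == NULL || src == NULL || dest_size == 0) {
        return ERROR_INVALID_ARGUMENT;
    }
    src_len = strlen(src);
    n = (src_len < dest_size) ? src_len : dest_size - 1;
    memcpy(dest, src, n);
    dest[n] = '\0';
    return (src_len < dest_size) ? SUCCESS : ERROR_BUFFER_TOO_SMALL;
}

/* Appends `src` to the string already in `dest`. `dest_size` is the total
 * capacity of `dest`, not the remaining space. */
str_result safe_strcat(char *dest, size_t dest_size, const char *src)
{
    size_t used = 0;

    if (dest == NULL || src == NULL || dest_size == 0) {
        return ERROR_INVALID_ARGUMENT;
    }
    /* Find the current length without reading past the buffer. */
    while (used < dest_size && dest[used] != '\0') {
        used++;
    }
    if (used == dest_size) {
        return ERROR_INVALID_ARGUMENT;   /* dest is not terminated */
    }
    return safe_strcpy(dest + used, dest_size - used, src);
}
```

Calling `safe_strcpy(buf, sizeof buf, "Hello, World!")` with a 10-byte `buf` leaves `"Hello, Wo"` in the buffer, matching the sample run. Truncation here happens at a byte boundary, so it can split a multi-byte UTF-8 sequence; a fuller implementation might back up to the previous code point boundary.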
```
# 2. UTF-8 handling
$ ./string_test utf8 "Hello, 世界! 🌍"
Input string bytes: 19
ASCII character count: 19 (wrong!)
UTF-8 codepoint count: 12 (correct!)
Codepoints:
H (U+0048) - 1 byte
e (U+0065) - 1 byte
l (U+006C) - 1 byte
l (U+006C) - 1 byte
o (U+006F) - 1 byte
, (U+002C) - 1 byte
(space) (U+0020) - 1 byte
世 (U+4E16) - 3 bytes
界 (U+754C) - 3 bytes
! (U+0021) - 1 byte
(space) (U+0020) - 1 byte
🌍 (U+1F30D) - 4 bytes
```
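The byte count and code point count differ because UTF-8 encodes each code point in one to four bytes, with the lead byte announcing the sequence length. A minimal decoding sketch follows. The names `utf8_decode` and `utf8_count` and their signatures are assumptions, not the guide's API. The sketch rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates, and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder: reads one code point from `s` (at most `len` bytes).
 * Returns the number of bytes consumed (1 to 4), or 0 if the input is
 * empty, truncated, or not well-formed UTF-8. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
    size_t need, i;
    uint32_t cp, min_cp;

    if (s == NULL || out == NULL || len == 0) {
        return 0;
    }
    if (s[0] < 0x80) {
        need = 1; cp = s[0]; min_cp = 0;
    } else if ((s[0] & 0xE0) == 0xC0) {
        need = 2; cp = s[0] & 0x1F; min_cp = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 3; cp = s[0] & 0x0F; min_cp = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 4; cp = s[0] & 0x07; min_cp = 0x10000;
    } else {
        return 0;   /* stray continuation byte or invalid lead byte */
    }
    if (need > len) {
        return 0;   /* sequence cut off by the end of the buffer */
    }
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return 0;   /* expected a 10xxxxxx continuation byte */
        }
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;   /* overlong form, out of range, or surrogate */
    }
    *out = cp;
    return need;
}

/* Hypothetical counter: stores the number of code points in the first
 * `len` bytes of `s`. Returns 0 on success, -1 on invalid input. */
int utf8_count(const char *s, size_t len, size_t *count)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t i = 0, n = 0;

    if (s == NULL || count == NULL) {
        return -1;
    }
    while (i < len) {
        uint32_t cp;
        size_t used = utf8_decode(p + i, len - i, &cp);
        if (used == 0) {
            return -1;
        }
        i += used;
        n++;
    }
    *count = n;
    return 0;
}
```

For the sample input, `strlen` reports 19 bytes while `utf8_count` over the same bytes yields 12, since 世 and 界 take three bytes each and 🌍 takes four.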
```
# 3. Overflow prevention demo
$ ./string_test overflow
Standard strcpy (DANGEROUS):
Attempting to copy 100 bytes into 10-byte buffer...
[AddressSanitizer would catch: stack-buffer-overflow]

Safe strcpy:
Attempting to copy 100 bytes into 10-byte buffer...
Result: ERROR_BUFFER_TOO_SMALL
No overflow occurred. Buffer contains: "123456789" (truncated safely)
```
```
# 4. Format string safety
$ ./string_test format
safe_snprintf(buf, 10, "Value: %d", 12345)
Result: "Value: 12" (truncated, no overflow)
Return value: 12 (would need 12 chars for full output)
```
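The `safe_snprintf` result shown here matches what the standard `vsnprintf` already guarantees: output is truncated to fit, the buffer is terminated whenever its size is nonzero, and the return value is the length the full output would have needed. A thin wrapper along these lines is one possible sketch. The argument validation and the `-1` error return are assumptions, and the `format` attribute is a GCC/Clang extension that lets the compiler check arguments against the format string.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Prototype carries the GCC/Clang format attribute so that mismatched
 * arguments (for example "%s" given an int) are flagged at compile time. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int safe_snprintf(char *buf, size_t size, const char *fmt, ...);

/* Formats into `buf`, never writing more than `size` bytes.
 * Returns the number of characters the full output requires (excluding
 * the terminator), or -1 on invalid arguments or an encoding error.
 * Truncation occurred when the return value is >= size. */
int safe_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int needed;

    if (buf == NULL || size == 0 || fmt == NULL) {
        return -1;   /* assumed policy: reject rather than write nothing */
    }
    va_start(args, fmt);
    needed = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (needed < 0) {
        buf[0] = '\0';   /* leave a valid empty string on failure */
        return -1;
    }
    return needed;
}
```

With a 10-byte buffer, `safe_snprintf(buf, 10, "Value: %d", 12345)` returns 12 and leaves `"Value: 12"` in `buf`. A caller can compare the return value against the buffer size to detect truncation. Passing untrusted text as `fmt` remains unsafe; the format string should be a literal.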
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
- Milestone 1: Minimal working program that runs end-to-end.
- Milestone 2: Correct outputs for typical inputs.
- Milestone 3: Robust handling of edge cases.
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output matches the real-world outcome example
- Handles invalid inputs safely
- Provides clear errors and exit codes
- Repeatable results across runs
References
- Main guide: PROFESSIONAL_C_PROGRAMMING_MASTERY.md
- Effective C, 2nd Edition by Robert C. Seacord