Project 7: Static Analysis Tool for Vulnerabilities
A command-line tool that scans a C source file and flags calls to dangerous, legacy functions like
gets, strcpy, strcat, and sprintf (without a size-limiting format string).
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | Python |
| Alternative Languages | C++ (using libclang), Go |
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1-2 weeks |
| Knowledge Area | Static Analysis / Parsing / Tooling |
| Tooling | Python re module or libclang bindings |
| Prerequisites | Basic Python or another scripting language. |
What You Will Build
A command-line tool that scans a C source file and flags calls to dangerous, legacy functions like gets, strcpy, strcat, and sprintf (without a size-limiting format string).
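The sprintf condition is the least obvious part of the description. One possible reading, and it is only an assumption, is that a format string counts as "size-limiting" when every %s conversion carries a precision such as %.9s. A minimal sketch of that check follows; `SPEC_RE` and `has_unbounded_string_spec` are hypothetical names, and the function expects the format literal to have been extracted from the call already.

```python
import re

# One printf-style conversion: either a literal "%%" or
# flags, width, optional precision, length modifier, conversion letter.
SPEC_RE = re.compile(
    r"%(?:%|[-+ #0]*(?:\d+|\*)?"
    r"(?P<precision>\.(?:\d+|\*)?)?"
    r"(?:hh|h|ll|l|j|z|t|L)?"
    r"(?P<conv>[A-Za-z]))"
)


def has_unbounded_string_spec(format_string):
    """Return True if any %s conversion has no precision.

    Assumption: "size-limiting" means a precision on %s. A field width
    such as %10s is only a minimum, so it does not bound the output.
    """
    for match in SPEC_RE.finditer(format_string):
        if match.group("conv") == "s" and match.group("precision") is None:
            return True
    return False
```

Under this reading, `has_unbounded_string_spec("%s")` is true and `has_unbounded_string_spec("%.9s")` is false. The check says nothing about whether the destination buffer is actually large enough, so a finding is a hint for review rather than proof of a bug.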
Why It Matters
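Unbounded copy functions such as gets and strcpy are a classic source of buffer overflows, and gets was removed from the C standard in C11. Building a scanner for them exercises file handling, pattern matching, and parsing, skills that appear repeatedly in real-world systems and security tooling.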
Core Challenges
- Reading and processing a source file → maps to basic file I/O
- Using regular expressions to find function calls → maps to the simple but brittle approach
- (Advanced) Using a C parser like libclang → maps to the robust approach using Abstract Syntax Trees (AST)
- Reporting findings with file names and line numbers → maps to making the tool useful
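The regex-based challenge can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: `DANGEROUS`, `CALL_RE`, and `find_dangerous_calls` are hypothetical names, and every sprintf call is flagged without looking at its format string.

```python
import re

# Functions to flag; the list comes from the project description.
DANGEROUS = ("gets", "strcpy", "strcat", "sprintf")

# \b stops 'fgets' or 'my_strcpy' from matching; \s*\( requires a call.
CALL_RE = re.compile(r"\b(" + "|".join(DANGEROUS) + r")\s*\(")


def find_dangerous_calls(source):
    """Return (line_number, function_name) pairs with 1-based line numbers."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            findings.append((lineno, match.group(1)))
    return findings
```

Because the scan is purely textual, it also reports names that appear inside comments or string literals, and it misses calls made through macros or function pointers. That is the brittleness the list above refers to.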
Key Concepts
- Static Application Security Testing (SAST): The formal name for this type of tool.
- Regular Expressions: Essential for the simple version of this tool.
- Abstract Syntax Trees (AST): The output of a true compiler front-end, which provides a much more accurate way to analyze code.
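To make the accuracy gap between regular expressions and an AST concrete, the sketch below shows one partial mitigation for the regex version: blanking out comments and string literals before matching. It is not an AST and is not a substitute for one; `mask_comments_and_strings` is a hypothetical helper, and the pattern ignores corner cases such as preprocessor directives.

```python
import re

# Alternatives are tried at each position, so whichever construct
# starts first (comment or literal) consumes its whole span.
_MASK_RE = re.compile(
    r'/\*.*?\*/'              # block comment
    r'|//[^\n]*'              # line comment
    r'|"(?:\\.|[^"\\\n])*"'   # string literal
    r"|'(?:\\.|[^'\\\n])*'",  # character literal
    re.DOTALL,
)


def mask_comments_and_strings(source):
    """Replace comments and literals with spaces, keeping newlines.

    Line numbers and column offsets stay the same, so findings on the
    masked text can be reported against the original file.
    """
    def blank(match):
        return re.sub(r"[^\n]", " ", match.group(0))

    return _MASK_RE.sub(blank, source)
```

Running a call-matching regex over the masked text avoids warnings for something like `puts("gets(buf)")`, but only a real parser knows whether an identifier refers to the C library function or to a local definition with the same name.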
Real-World Outcome
$ cat test.c
#include <stdio.h>
int main() {
    char buf[10];
    gets(buf); // Dangerous!
    return 0;
}
$ ./c_linter test.c
[WARNING] test.c:4: Call to dangerous function 'gets'. Use 'fgets' instead.
Found 1 potential issue(s).
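The report format shown above can be produced by a small formatting function. This is a sketch only: `SAFER` and `format_report` are hypothetical names, the gets to fgets suggestion comes from the sample output, and the sprintf to snprintf entry is an added assumption. Functions without an entry get a warning with no suggestion.

```python
# Suggested replacements. Only gets -> fgets appears in the sample
# output; sprintf -> snprintf is an illustrative assumption.
SAFER = {
    "gets": "fgets",
    "sprintf": "snprintf",
}


def format_report(path, findings):
    """Build the report text from (line_number, function_name) pairs."""
    lines = []
    for lineno, name in findings:
        message = f"[WARNING] {path}:{lineno}: Call to dangerous function '{name}'."
        if name in SAFER:
            message += f" Use '{SAFER[name]}' instead."
        lines.append(message)
    lines.append(f"Found {len(findings)} potential issue(s).")
    return "\n".join(lines)
```

Calling `print(format_report("test.c", [(4, "gets")]))` prints the two lines from the example session.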
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
- Milestone 1: Minimal working program that runs end-to-end.
- Milestone 2: Correct outputs for typical inputs.
- Milestone 3: Robust handling of edge cases.
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output matches the real-world outcome example
- Handles invalid inputs safely
- Provides clear errors and exit codes
- Repeatable results across runs
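The checklist items on invalid input and exit codes could be handled along the following lines. The exit-code convention here (0 for a clean file, 1 when issues are found, 2 for usage or read errors) is an assumption, not something the project specifies, and `main` and `CALL_RE` are hypothetical names.

```python
import re
import sys

CALL_RE = re.compile(r"\b(gets|strcpy|strcat|sprintf)\s*\(")


def main(argv):
    """Scan one file; return 0 if clean, 1 if issues found, 2 on error."""
    if len(argv) != 2:
        program = argv[0] if argv else "c_linter"
        print(f"usage: {program} FILE.c", file=sys.stderr)
        return 2

    path = argv[1]
    try:
        # errors="replace" keeps odd encodings from aborting the scan.
        with open(path, encoding="utf-8", errors="replace") as handle:
            lines = handle.read().splitlines()
    except OSError as exc:
        print(f"error: cannot read {path}: {exc}", file=sys.stderr)
        return 2

    count = 0
    for lineno, line in enumerate(lines, start=1):
        for match in CALL_RE.finditer(line):
            count += 1
            print(f"[WARNING] {path}:{lineno}: "
                  f"Call to dangerous function '{match.group(1)}'.")
    print(f"Found {count} potential issue(s).")
    return 1 if count else 0
```

In a script, the entry point would end with `sys.exit(main(sys.argv))`. Returning the code from `main` instead of calling `sys.exit` inside it keeps the function easy to call from tests.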
References
- Main guide: LEARN_C_SECURE_CODING_DEEP_DIVE.md
- “Language Implementation Patterns” by Terence Parr