Project 8: The Wi-Fi Sniffer & Traffic Analyzer (XIAO ESP32S3/C3)
Build a Wi-Fi sniffer that captures 802.11 frames in promiscuous mode, parses headers, and logs nearby traffic patterns.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1-2 weeks |
| Main Programming Language | C (ESP-IDF) |
| Alternative Programming Languages | None recommended (ESP-IDF exposes the promiscuous-mode API in C) |
| Coolness Level | Very High |
| Business Potential | Medium (RF analysis tools) |
| Prerequisites | Wi-Fi basics, C parsing, bitfields |
| Key Topics | 802.11 frames, promiscuous mode, channel hopping |
1. Learning Objectives
By completing this project, you will:
- Explain 802.11 frame types and header fields.
- Capture Wi-Fi frames using promiscuous mode callbacks.
- Implement a parser for beacon and probe frames.
- Design a channel hopping strategy to scan 2.4 GHz.
- Log RSSI, SSID, and frame statistics in real time.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Concept 1: 802.11 Frame Structure and Management Frames
Fundamentals
Wi-Fi traffic consists of 802.11 frames, which include management, control, and data frames. Management frames include beacons and probe requests, which advertise network presence and search for networks. A sniffer needs to parse the 802.11 MAC header to extract frame type, addresses, and SSIDs. Understanding frame structure is essential for correct parsing and for distinguishing between beacon, probe, and data frames.
Deep Dive into the Concept
An 802.11 frame starts with a Frame Control field that encodes type and subtype. Type 0 indicates management frames, type 1 control frames, and type 2 data frames. The subtype differentiates beacons, probe requests, association requests, etc. The MAC header includes addresses that represent different roles depending on the frame type: source, destination, BSSID, transmitter, and receiver. For beacons, the BSSID is the AP’s MAC address. For probe requests, the source is the client searching for networks. Understanding these address fields allows you to interpret who is talking to whom.
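As a minimal sketch of the Frame Control decoding described above, the following reads the first two bytes of a raw frame and extracts the type and subtype. The names `fc_info_t` and `decode_frame_control` are illustrative and not part of ESP-IDF. The code assumes the buffer starts at the first byte of the MAC header.

```c
#include <stddef.h>
#include <stdint.h>

/* 802.11 frame types (bits 2-3 of the Frame Control field). */
enum { FT_MGMT = 0, FT_CTRL = 1, FT_DATA = 2 };

/* Management subtypes used in this project (bits 4-7). */
enum { ST_PROBE_REQ = 4, ST_BEACON = 8 };

/* Hypothetical result type for this sketch. */
typedef struct {
    uint8_t type;
    uint8_t subtype;
} fc_info_t;

/* Decode type and subtype from the first two bytes of a raw frame.
 * Returns 0 on success, -1 if the arguments are invalid or too short. */
static int decode_frame_control(const uint8_t *frame, size_t len, fc_info_t *out)
{
    if (frame == NULL || out == NULL || len < 2) {
        return -1;
    }
    /* Frame Control is sent least-significant byte first. */
    uint16_t fc = (uint16_t)(frame[0] | (frame[1] << 8));
    out->type = (uint8_t)((fc >> 2) & 0x3);
    out->subtype = (uint8_t)((fc >> 4) & 0xF);
    return 0;
}
```

A beacon starts with the bytes `0x80 0x00`, which decode to type 0 and subtype 8; a probe request starts with `0x40 0x00`, which decode to type 0 and subtype 4.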
Management frames often include information elements (IEs). A beacon includes an SSID IE, supported rates, capabilities, and channel information. A probe request can include the SSID the client is searching for (or a wildcard). Parsing IEs requires walking a variable-length list: each IE has an ID, length, and value. If you parse incorrectly, you may read out of bounds or misinterpret data. This makes robust parsing a key skill.
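The IE walk can be sketched as below. This is one possible shape under stated assumptions, not the only correct one: `find_ssid` is a hypothetical helper, and `ies` is assumed to point at the first information element (after the MAC header and any fixed fields). Every read is checked against the remaining length before it happens.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IE_ID_SSID   0
#define SSID_MAX_LEN 32

/* Walk a list of information elements and copy the SSID into out,
 * which must hold at least SSID_MAX_LEN + 1 bytes.
 * Returns the SSID length (0 for an empty SSID), or -1 if no
 * well-formed SSID element is found. */
static int find_ssid(const uint8_t *ies, size_t ies_len, char *out)
{
    size_t pos = 0;
    while (pos + 2 <= ies_len) {
        uint8_t id = ies[pos];
        uint8_t len = ies[pos + 1];
        if (pos + 2 + len > ies_len) {
            return -1; /* element claims more bytes than remain */
        }
        if (id == IE_ID_SSID) {
            if (len > SSID_MAX_LEN) {
                return -1; /* oversized SSID, treat as malformed */
            }
            memcpy(out, &ies[pos + 2], len);
            out[len] = '\0';
            return (int)len;
        }
        pos += 2 + (size_t)len; /* skip to the next element */
    }
    return -1;
}
```

Returning the length separately from the string lets the caller tell an empty SSID (length 0) apart from a missing or malformed element (-1).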
RSSI is provided by the radio with each captured frame. It is a coarse measure of signal strength and can be used to approximate distance or detect movement. However, RSSI is noisy and depends on antenna orientation and environment. A sniffer should log RSSI but not assume it is precise. For analysis, you might average RSSI across multiple frames or use it to identify the strongest APs.
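One way to smooth noisy RSSI readings is an exponential moving average. The sketch below is an illustration, not a prescribed method: the smoothing factor of 1/8 and the names `rssi_ema_t`, `rssi_ema_add`, and `rssi_ema_dbm` are assumptions. It uses fixed-point integer math so it needs no floating point.

```c
#include <stdint.h>

#define EMA_DIV 8 /* assumed smoothing factor: each new sample weighs 1/8 */

/* Smoothed RSSI stored in units of 1/256 dBm. */
typedef struct {
    int32_t q8;
    int primed; /* 0 until the first sample arrives */
} rssi_ema_t;

/* Fold one RSSI sample (in dBm) into the running average. */
static void rssi_ema_add(rssi_ema_t *e, int8_t rssi)
{
    int32_t sample = (int32_t)rssi * 256;
    if (!e->primed) {
        e->q8 = sample; /* seed with the first sample */
        e->primed = 1;
        return;
    }
    e->q8 += (sample - e->q8) / EMA_DIV;
}

/* Return the smoothed value in whole dBm (truncated toward zero). */
static int rssi_ema_dbm(const rssi_ema_t *e)
{
    return (int)(e->q8 / 256);
}
```

Starting from -40 dBm, a single -56 dBm outlier moves the average only to -42 dBm, which is why smoothed values are better for ranking APs than individual readings.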
Because ESP32 promiscuous mode delivers each raw 802.11 frame plus metadata through a callback, the callback is where capture happens, but it should not be where parsing happens. The callback runs in a high-priority context inside the Wi-Fi driver, so it must do minimal work and never block. A good design copies the relevant bytes into a queue and parses them in a lower-priority task. This avoids watchdog resets and packet drops.
How This Fits in the Project
You will parse management frames to extract SSID, BSSID, and RSSI. This is the core of the analyzer.
Definitions & Key Terms
- Frame control: Field that encodes type/subtype.
- Beacon: Management frame that advertises an AP.
- Probe request: Management frame sent by clients to find APs.
- SSID: Network name.
- BSSID: MAC address of the AP.
Mental Model Diagram (ASCII)
[802.11 Frame]
| FC | Duration | Addr1 | Addr2 | Addr3 | Seq | Fixed fields (beacon: 12 bytes) | IEs...
How It Works (Step-by-Step)
- Capture raw frame in promiscuous callback.
- Read frame control and determine subtype.
- Extract MAC addresses from header.
- Parse IEs to find SSID.
- Log RSSI and frame details.
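The header steps above can be sketched as a single function that validates the length, extracts the addresses, and reports where the IEs begin. The names are hypothetical. The sketch assumes a 24-byte management header with no HT Control field, and it treats beacons and probe responses as carrying 12 bytes of fixed fields (timestamp, beacon interval, capability) before the first IE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MGMT_HDR_LEN     24 /* FC(2) + Duration(2) + 3 addresses(18) + Seq(2) */
#define BEACON_FIXED_LEN 12 /* timestamp(8) + beacon interval(2) + capability(2) */

/* Hypothetical parsed view of a management header. */
typedef struct {
    uint8_t subtype;
    uint8_t src[6];   /* Addr2: transmitter */
    uint8_t bssid[6]; /* Addr3: BSSID for beacons and probe requests */
    size_t ie_offset; /* byte offset of the first information element */
} mgmt_hdr_t;

/* Returns 0 on success, -1 if the frame is too short or not management. */
static int parse_mgmt_header(const uint8_t *f, size_t len, mgmt_hdr_t *out)
{
    if (f == NULL || out == NULL || len < MGMT_HDR_LEN) {
        return -1;
    }
    if (((f[0] >> 2) & 0x3) != 0) {
        return -1; /* type 0 = management */
    }
    out->subtype = (uint8_t)((f[0] >> 4) & 0xF);
    memcpy(out->src, &f[10], 6);
    memcpy(out->bssid, &f[16], 6);

    /* Subtype 8 (beacon) and 5 (probe response) have fixed fields;
     * subtype 4 (probe request) goes straight to the IEs. */
    size_t fixed = (out->subtype == 8 || out->subtype == 5) ? BEACON_FIXED_LEN : 0;
    if (len < MGMT_HDR_LEN + fixed) {
        return -1;
    }
    out->ie_offset = MGMT_HDR_LEN + fixed;
    return 0;
}
```

The IE parser then receives `f + ie_offset` and `len - ie_offset`, so it never has to know about header layout.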
Minimal Concrete Example
uint16_t fc = frame->frame_ctrl;
uint8_t type = (fc >> 2) & 0x3;
uint8_t subtype = (fc >> 4) & 0xF;
Common Misconceptions
- “All frames include SSID.” Only management frames like beacons/probes include it.
- “RSSI is distance.” It is affected by many environmental factors.
- “Parsing is trivial.” Variable-length IEs require careful bounds checking.
Check-Your-Understanding Questions
- What field identifies a beacon frame?
- Why do probe requests sometimes have an empty SSID?
- Which address field is the BSSID in a beacon frame?
Check-Your-Understanding Answers
- Frame control type=0, subtype=8 (beacon).
- The client is searching for any network.
- Addr3 is typically the BSSID for beacons.
Real-World Applications
- Wi-Fi site surveys and channel planning.
- Security monitoring and intrusion detection.
Where You’ll Apply It
- See Section 3.2 Functional Requirements and Section 5.10 Phase 2.
- Also used in: P03 ESP-NOW Remote for MAC-level understanding.
References
- 802.11 specification summaries
- Wireshark 802.11 dissector docs
Key Insights
If you can parse a beacon, you can understand the Wi-Fi environment around you.
Summary
802.11 management frames contain the metadata that defines Wi-Fi networks. Parsing them correctly is the foundation of a sniffer.
Homework/Exercises to Practice the Concept
- Write a small parser that extracts SSID IE from a byte buffer.
- List the subtype values for beacon and probe request.
Solutions to the Homework/Exercises
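- Skip the 24-byte MAC header (plus the 12 bytes of fixed fields in a beacon), then iterate through the IEs: read the ID and length, check that the value fits in the remaining buffer, and stop at ID=0 (SSID).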
- Beacon subtype=8, probe request subtype=4.
2.2 Concept 2: Promiscuous Mode, Callbacks, and Buffering
Fundamentals
Promiscuous mode lets the Wi-Fi radio receive all frames on a channel, not just those addressed to it. The ESP32 provides a callback for each captured frame. This callback runs in a high-priority context, so heavy processing can cause dropped packets or watchdog resets. The correct approach is to copy minimal metadata and push it to a queue for processing in a separate task.
Deep Dive into the Concept
In normal operation, the Wi-Fi MAC filters frames so the CPU only sees relevant traffic. Promiscuous mode disables this filter and delivers all frames. This can be a huge volume of data, especially in crowded environments. The ESP32 provides metadata such as RSSI, channel, and timestamp. The callback receives a pointer to the raw frame buffer. It is crucial to avoid heavy parsing or logging inside the callback because it can block the Wi-Fi driver and cause packet loss. Instead, you should copy only what you need (for example, the first 64 bytes and metadata) into a ring buffer or queue. A separate task can then parse and log the data.
Buffering introduces tradeoffs: if the queue is too small, packets will be dropped when the environment is busy. If too large, you waste RAM. You must also avoid memory fragmentation, so using a fixed-size ring buffer is often better than dynamic allocation. The ESP-IDF provides FreeRTOS queues that are easy to use but have overhead. For high capture rates, you may prefer a custom ring buffer.
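A fixed-size ring buffer with a drop counter might look like the sketch below. It is a host-testable illustration with hypothetical names (`ring_t`, `snap_t`, `ring_push`, `ring_pop`) and an assumed capacity of 8 slots. As written it is not safe for concurrent use; on the ESP32 the producer and consumer run in different tasks, so you would need to add synchronization or use a FreeRTOS queue instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8  /* assumed capacity; must be a power of two */
#define SNAP_LEN   64 /* bytes of each frame kept for parsing */

typedef struct {
    uint8_t channel;
    int8_t rssi;
    uint16_t len; /* number of valid bytes in data */
    uint8_t data[SNAP_LEN];
} snap_t;

typedef struct {
    snap_t slots[RING_SLOTS];
    uint32_t head;    /* total records written */
    uint32_t tail;    /* total records read */
    uint32_t dropped; /* records rejected because the ring was full */
} ring_t;

/* Copy a frame prefix into the ring. Returns false and counts a drop if full. */
static bool ring_push(ring_t *r, uint8_t channel, int8_t rssi,
                      const uint8_t *frame, size_t frame_len)
{
    if (r->head - r->tail == RING_SLOTS) {
        r->dropped++;
        return false;
    }
    snap_t *s = &r->slots[r->head & (RING_SLOTS - 1)];
    size_t n = frame_len < SNAP_LEN ? frame_len : SNAP_LEN;
    s->channel = channel;
    s->rssi = rssi;
    s->len = (uint16_t)n;
    memcpy(s->data, frame, n);
    r->head++;
    return true;
}

/* Remove the oldest record. Returns false if the ring is empty. */
static bool ring_pop(ring_t *r, snap_t *out)
{
    if (r->head == r->tail) {
        return false;
    }
    *out = r->slots[r->tail & (RING_SLOTS - 1)];
    r->tail++;
    return true;
}
```

Dropping the newest record when full keeps the push path constant-time, and the `dropped` counter gives the parser task something concrete to report in the periodic summary.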
Another key consideration is that the callback may be invoked at a high frequency. If you log every frame, the UART will become the bottleneck. A more realistic strategy is to aggregate statistics: count frames per SSID, track RSSI averages, and log summaries once per second. This reduces output and improves stability. The sniffer becomes a real analyzer rather than a raw dump.
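The aggregation idea can be sketched with a small fixed-size table keyed by SSID. The table size of 16 and the names `ssid_table_t`, `ssid_table_update`, and `ssid_mean_rssi` are assumptions made for illustration. The sketch also assumes SSIDs are NUL-terminated strings, which does not hold for SSIDs containing embedded zero bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_SSIDS 16 /* assumed table size */

typedef struct {
    char ssid[33];
    uint32_t frames;  /* frames seen for this SSID */
    int32_t rssi_sum; /* sum of RSSI samples, for the mean */
} ssid_stat_t;

typedef struct {
    ssid_stat_t entries[MAX_SSIDS];
    size_t used;
} ssid_table_t;

/* Record one frame for ssid. Returns the entry, or NULL if the table is full. */
static ssid_stat_t *ssid_table_update(ssid_table_t *t, const char *ssid, int8_t rssi)
{
    for (size_t i = 0; i < t->used; i++) {
        if (strcmp(t->entries[i].ssid, ssid) == 0) {
            t->entries[i].frames++;
            t->entries[i].rssi_sum += rssi;
            return &t->entries[i];
        }
    }
    if (t->used == MAX_SSIDS) {
        return NULL;
    }
    ssid_stat_t *e = &t->entries[t->used++];
    snprintf(e->ssid, sizeof e->ssid, "%s", ssid); /* truncates safely */
    e->frames = 1;
    e->rssi_sum = rssi;
    return e;
}

/* Mean RSSI in dBm for an entry with at least one frame. */
static int ssid_mean_rssi(const ssid_stat_t *e)
{
    return (int)(e->rssi_sum / (int32_t)e->frames);
}
```

A summary task can walk `entries[0..used)` once per second and print one line per SSID, which keeps UART output proportional to the number of networks rather than the number of frames.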
Finally, promiscuous mode can interfere with normal Wi-Fi usage. You cannot be connected to an AP on another channel while sniffing; the radio can only be on one channel. For this project, you will run in STA mode without connecting to an AP, set the channel manually, and use promiscuous mode to capture traffic. This is a deliberate choice that simplifies the environment and makes your results predictable.
How This Fits in the Project
You will register a promiscuous callback, push metadata to a queue, and parse in a worker task. This structure makes the sniffer stable and scalable.
Definitions & Key Terms
- Promiscuous mode: Receive all frames on a channel.
- Callback: Function invoked for each captured frame.
- Ring buffer: Fixed-size buffer for streaming data.
- Backpressure: When processing cannot keep up with input.
Mental Model Diagram (ASCII)
Radio -> Promiscuous callback -> Queue -> Parser task -> Logs
How It Works (Step-by-Step)
- Enable Wi-Fi in STA mode without connecting.
- Enable promiscuous mode and register callback.
- In callback, copy metadata to queue.
- Parser task reads queue, parses frames, updates stats.
- Log summaries periodically.
Minimal Concrete Example
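/* Registered with esp_wifi_set_promiscuous_rx_cb(). */
void wifi_sniffer_cb(void *buf, wifi_promiscuous_pkt_type_t type) {
    /* buf points to a wifi_promiscuous_pkt_t; copy what you need and return. */
    enqueue_metadata(buf, type);  /* placeholder helper: must not block */
}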
Common Misconceptions
- “You can parse everything in the callback.” This causes drops and resets.
- “Bigger queue is always better.” It wastes RAM and increases latency.
- “Promiscuous mode works with AP connection.” It usually conflicts: a connected station must stay on the AP's channel, so it cannot hop.
Check-Your-Understanding Questions
- Why should the promiscuous callback be minimal?
- What happens if the queue fills?
- Why can you not sniff all channels at once?
Check-Your-Understanding Answers
- It runs in high-priority context and must not block.
- Packets are dropped.
- The radio can only tune to one channel at a time.
Real-World Applications
- Packet sniffers and wireless security tools.
- RF monitoring for IoT deployments.
Where You’ll Apply It
- See Section 4.2 Key Components and Section 6.2 Critical Test Cases.
- Also used in: P10 Web Oscilloscope for streaming patterns.
References
- ESP-IDF Wi-Fi promiscuous API docs
- Wireshark capture strategies
Key Insights
Stable sniffing requires separating capture from parsing.
Summary
Promiscuous mode delivers a firehose of frames. You must buffer and process carefully to avoid drops and watchdog resets.
Homework/Exercises to Practice the Concept
- Design a ring buffer size for 100 frames per second with 64-byte metadata.
- Propose a logging strategy that avoids UART saturation.
Solutions to the Homework/Exercises
- 100 * 64 = 6400 bytes per second; a 4 KB buffer holds about 0.64 seconds of traffic, and a 16 KB buffer holds about 2.5 seconds.
- Aggregate stats and log once per second.
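The sizing arithmetic in the first solution can be captured in a helper so you can try other frame rates and record sizes. The function name `buffer_holdoff_ms` is hypothetical; the calculation is simply buffer size divided by the incoming byte rate.

```c
#include <stdint.h>

/* How long (in ms) a buffer of buf_bytes can absorb traffic arriving at
 * frames_per_s records of record_bytes each, if nothing is consumed.
 * Returns UINT32_MAX when the input rate is zero. */
static uint32_t buffer_holdoff_ms(uint32_t buf_bytes, uint32_t frames_per_s,
                                  uint32_t record_bytes)
{
    uint64_t bytes_per_s = (uint64_t)frames_per_s * record_bytes;
    if (bytes_per_s == 0) {
        return UINT32_MAX;
    }
    uint64_t ms = (uint64_t)buf_bytes * 1000u / bytes_per_s;
    return ms > UINT32_MAX ? UINT32_MAX : (uint32_t)ms;
}
```

For 100 frames per second at 64 bytes each, `buffer_holdoff_ms(4096, 100, 64)` gives 640 ms and `buffer_holdoff_ms(16384, 100, 64)` gives 2560 ms.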
2.3 Concept 3: Channel Hopping and Coverage Strategy
Fundamentals
Wi-Fi in 2.4 GHz uses multiple channels. A sniffer must either stay on one channel or hop between channels to capture wider activity. Channel hopping trades completeness for coverage: you see more networks but miss some frames. A good hopping strategy balances dwell time on each channel and total scan cycle.
Deep Dive into the Concept
If you stay on one channel, you capture the frames on that channel but miss all others. This is useful for focused analysis of a specific network. If you want a broader view, you hop channels. The simplest strategy is round-robin: dwell on each channel for a fixed time (e.g., 200 ms) and then switch. However, too short a dwell time means you may miss beacons, which most APs send every 100 TU (102.4 ms). To capture beacons reliably, dwell for at least one beacon interval, and preferably two, per channel (roughly 100-200 ms). For 11 channels at 200 ms each, a full scan takes about 2.2 seconds. If you need more complete data, increase the dwell time.
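The round-robin strategy reduces to two small pure functions, sketched here with hypothetical names. Keeping the selection logic separate from the radio call makes it testable on a host; on the device, a timer would pass the result to `esp_wifi_set_channel`.

```c
#include <stdint.h>

/* Return the channel that follows current in a first..last round-robin.
 * Any value outside the range restarts at first. */
static uint8_t next_channel(uint8_t current, uint8_t first, uint8_t last)
{
    if (current < first || current >= last) {
        return first;
    }
    return (uint8_t)(current + 1);
}

/* Time for one full pass over first..last at dwell_ms per channel. */
static uint32_t scan_cycle_ms(uint8_t first, uint8_t last, uint32_t dwell_ms)
{
    return (uint32_t)(last - first + 1) * dwell_ms;
}
```

With channels 1-11 and a 200 ms dwell, `scan_cycle_ms(1, 11, 200)` is 2200 ms, so any given channel is observed for 200 ms out of every 2.2 seconds.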
Channel hopping also interacts with parsing. When you switch channels, the RSSI and frame counts are no longer continuous. You must tag each captured frame with the channel and timestamp. This allows you to build per-channel statistics and avoid mixing data. A good analyzer logs counts per channel and can show which channels are busiest.
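Per-channel statistics can be kept in a simple array indexed by the channel tag, as in this sketch. The names are hypothetical, and the upper bound of 14 is an assumption that covers every 2.4 GHz channel number; out-of-range tags are counted separately rather than silently ignored.

```c
#include <stdint.h>

#define MAX_CHANNEL 14 /* highest 2.4 GHz channel number */

typedef struct {
    uint32_t frames[MAX_CHANNEL + 1]; /* index 0 is unused */
    uint32_t invalid;                 /* frames with an out-of-range tag */
} chan_stats_t;

/* Count one frame against the channel it was captured on. */
static void chan_stats_count(chan_stats_t *s, uint8_t channel)
{
    if (channel == 0 || channel > MAX_CHANNEL) {
        s->invalid++;
        return;
    }
    s->frames[channel]++;
}

/* Return the channel with the most frames, or 0 if none were counted. */
static int chan_stats_busiest(const chan_stats_t *s)
{
    int best = 0;
    uint32_t best_count = 0;
    for (int ch = 1; ch <= MAX_CHANNEL; ch++) {
        if (s->frames[ch] > best_count) {
            best_count = s->frames[ch];
            best = ch;
        }
    }
    return best;
}
```

Because each record carries its own channel tag, the parser can update these counters correctly even when it processes a frame after the radio has already moved on.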
The ESP32 Wi-Fi API allows you to set the channel. Channel switching itself takes time, and some frames may be missed during the switch. You should avoid switching too frequently. A robust approach is to run a timer that triggers channel changes, while the callback continues to capture. The parser task can detect the channel from metadata and update per-channel stats.
Channel selection also depends on regulatory domain. Some regions have channels 1-11, others 1-13. For a learning project, stick to 1-11. If you want more accuracy, you can read the current regulatory domain and adjust your hopping list.
How This Fits in the Project
You will implement a channel hopping timer and log per-channel frame counts and SSIDs.
Definitions & Key Terms
- Dwell time: Time spent on each channel.
- Channel hop: Switching the radio to a different channel.
- Regulatory domain: Region-specific allowed channels.
- Beacon interval: Period between beacon frames (commonly 100 TU, about 102.4 ms).
Mental Model Diagram (ASCII)
Channel 1 -> Channel 2 -> Channel 3 -> ... -> Channel 11 -> repeat
How It Works (Step-by-Step)
- Define channel list (1-11).
- Set timer to change channel every dwell period.
- Capture frames and tag with channel.
- Log stats per channel.
Minimal Concrete Example
esp_wifi_set_channel(chan, WIFI_SECOND_CHAN_NONE);
Common Misconceptions
- “Shorter dwell always better.” You can miss beacons and probes.
- “Channel hopping gives full coverage.” It samples, not captures everything.
- “Channel numbers are universal.” They vary by region.
Check-Your-Understanding Questions
- Why should dwell time be at least 100 ms?
- What is the tradeoff of hopping vs staying on one channel?
- Why is tagging frames with channel important?
Check-Your-Understanding Answers
- Many APs send beacons every 100 ms.
- Hopping gives broader view but misses some frames.
- It allows per-channel stats and avoids mixing data.
Real-World Applications
- Wi-Fi site surveys and channel utilization measurement.
- Security monitoring across multiple channels.
Where You’ll Apply It
- See Section 5.10 Phase 3 and Section 6.2 Critical Test Cases.
- Also used in: P03 ESP-NOW Remote for channel awareness.
References
- 802.11 channel planning guides
- ESP-IDF Wi-Fi channel APIs
Key Insights
Channel hopping is sampling, not full capture.
Summary
A sniffer can only listen to one channel at a time. Channel hopping expands coverage but must be designed with dwell time and beacon intervals in mind.
Homework/Exercises to Practice the Concept
- Calculate total scan time for 11 channels with 200 ms dwell.
- Design a hopping schedule that focuses more on channels 1, 6, and 11.
Solutions to the Homework/Exercises
- 11 * 200 ms = 2200 ms.
- Repeat 1-6-11 more often, e.g., 1,6,11,1,6,11,2,3,4,5,7,8,9,10.
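The weighted schedule from the second solution can be expressed as a lookup table, sketched below with hypothetical names. A step counter that increments once per dwell period indexes into the table, so changing the strategy means editing only the array.

```c
#include <stddef.h>
#include <stdint.h>

/* Weighted schedule: channels 1, 6, and 11 appear twice per cycle. */
static const uint8_t HOP_SCHEDULE[] = {1, 6, 11, 1, 6, 11, 2, 3, 4, 5, 7, 8, 9, 10};

#define HOP_LEN (sizeof HOP_SCHEDULE / sizeof HOP_SCHEDULE[0])

/* Channel to use at a given step; wraps around at the end of the table. */
static uint8_t hop_channel_at(uint32_t step)
{
    return HOP_SCHEDULE[step % HOP_LEN];
}

/* How many times a channel is visited in one full cycle. */
static unsigned visits_per_cycle(uint8_t channel)
{
    unsigned n = 0;
    for (size_t i = 0; i < HOP_LEN; i++) {
        if (HOP_SCHEDULE[i] == channel) {
            n++;
        }
    }
    return n;
}
```

This schedule has 14 steps, so at a 200 ms dwell one cycle takes 2.8 seconds, and channels 1, 6, and 11 each receive 400 ms of that while the others receive 200 ms.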
3. Project Specification
3.1 What You Will Build
A Wi-Fi sniffer firmware that captures management frames (beacons and probe requests), parses SSIDs and BSSIDs, logs RSSI, and produces per-channel statistics using a channel hopping strategy.
3.2 Functional Requirements
- Promiscuous Mode: Capture frames on the current channel.
- Frame Parsing: Extract frame type, SSID, and BSSID.
- RSSI Logging: Record RSSI and channel for each frame.
- Channel Hopping: Cycle through channels 1-11.
- Statistics: Report frame counts per channel and per SSID.
3.3 Non-Functional Requirements
- Stability: Runs for 10 minutes without watchdog reset.
- Performance: Handles at least 50 frames per second.
- Usability: Logs are readable and rate-limited.
3.4 Example Usage / Output
[CH 1] Beacon SSID=HomeNet BSSID=aa:bb:cc:dd:ee:ff RSSI=-40
[CH 6] Probe from 11:22:33:44:55:66 RSSI=-55
[STATS] CH1=120 CH6=98 CH11=60
3.5 Data Formats / Schemas / Protocols
- Parsed frame structure:
{type, subtype, bssid, ssid, rssi, channel}
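As an illustration, the parsed frame structure could be rendered into the beacon log format shown in Section 3.4 as follows. The struct layout and the name `format_beacon` are assumptions for this sketch; only the field list comes from the schema above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical C layout for the parsed frame structure. */
typedef struct {
    uint8_t type;
    uint8_t subtype;
    uint8_t bssid[6];
    char ssid[33]; /* up to 32 bytes plus NUL terminator */
    int8_t rssi;
    uint8_t channel;
} parsed_frame_t;

/* Write a beacon log line into out. Returns the snprintf result:
 * the number of characters the full line needs, excluding the NUL. */
static int format_beacon(const parsed_frame_t *f, char *out, size_t out_len)
{
    return snprintf(out, out_len,
                    "[CH %u] Beacon SSID=%s BSSID=%02x:%02x:%02x:%02x:%02x:%02x RSSI=%d",
                    (unsigned)f->channel, f->ssid,
                    f->bssid[0], f->bssid[1], f->bssid[2],
                    f->bssid[3], f->bssid[4], f->bssid[5],
                    (int)f->rssi);
}
```

Formatting into a caller-supplied buffer keeps the function independent of the logging backend, so the same line can go to the UART or into a unit test.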
3.6 Edge Cases
- Hidden SSID (empty SSID IE).
- Oversized or malformed IEs.
- High traffic causing queue overflow.
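The first two edge cases can be handled at the point where raw SSID bytes become a printable string. The sketch below uses a hypothetical helper, `ssid_display`. It treats both a zero-length SSID and an all-zero SSID as hidden, clamps the length to 32, and replaces non-printable bytes with a dot. The last choice is a simplification that also masks valid UTF-8 names.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SSID_MAX_LEN 32

/* Convert raw SSID bytes into a log-safe, NUL-terminated string. */
static void ssid_display(const uint8_t *raw, size_t len, char *out, size_t out_len)
{
    if (out_len == 0) {
        return;
    }
    if (len > SSID_MAX_LEN) {
        len = SSID_MAX_LEN; /* never trust an oversized length */
    }

    /* Hidden networks send either an empty SSID or one filled with zeros. */
    int all_zero = 1;
    for (size_t i = 0; i < len; i++) {
        if (raw[i] != 0) {
            all_zero = 0;
            break;
        }
    }
    if (all_zero) {
        snprintf(out, out_len, "%s", "<hidden>");
        return;
    }

    size_t n = len < out_len - 1 ? len : out_len - 1;
    for (size_t i = 0; i < n; i++) {
        out[i] = isprint(raw[i]) ? (char)raw[i] : '.';
    }
    out[n] = '\0';
}
```

Sanitizing here means the logging code never prints control characters from an untrusted frame to the console.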
3.7 Real World Outcome
A live console log showing nearby SSIDs, RSSI, and channel statistics.
3.7.1 How to Run (Copy/Paste)
idf.py set-target esp32s3   # use esp32c3 for the XIAO ESP32C3
idf.py build
idf.py -p /dev/ttyUSB0 flash monitor
3.7.2 Golden Path Demo (Deterministic)
- Fixed dwell 200 ms per channel.
- Log summary once per second.
Expected log format (counts are illustrative and depend on nearby traffic):
[STATS] t=1s CH1=60 CH6=45 CH11=30
3.7.3 Failure Demo (Queue Overflow)
E (1234) SNIFF: frame queue full, dropping packets
3.7.4 If CLI
No standalone CLI. Exit codes not applicable.
3.7.5 If Web App
Not applicable.
3.7.6 If API
No API is exposed. Error JSON shape not applicable.
3.7.7 If GUI / Desktop / Mobile
Not applicable.
3.7.8 If TUI
Not applicable.
4. Solution Architecture
4.1 High-Level Design
Wi-Fi Radio -> Promiscuous Callback -> Queue -> Parser Task -> Logs/Stats
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Promiscuous callback | Capture frames | Minimal work |
| Parser task | Decode headers and IEs | Bounds checking |
| Channel hopper | Cycle channels | 200 ms dwell |
| Stats aggregator | Count frames per channel/SSID | Periodic logging |
4.3 Data Structures (No Full Code)
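struct sniff_record {
    uint8_t channel;    /* channel the frame was captured on */
    int8_t rssi;        /* signal strength in dBm */
    uint8_t type;       /* 0 = management, 1 = control, 2 = data */
    uint8_t subtype;    /* e.g. 8 = beacon, 4 = probe request */
    uint8_t bssid[6];   /* AP MAC address */
    char ssid[33];      /* up to 32 bytes plus NUL terminator */
};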
4.4 Algorithm Overview
Key Algorithm: Capture and Parse
- Capture raw frame in callback.
- Push minimal data to queue.
- Parser extracts subtype and SSID.
- Update stats tables.
Complexity Analysis:
- Time: O(frame length) per parse
- Space: O(stats tables)
5. Implementation Guide
5.1 Development Environment Setup
idf.py set-target esp32s3   # use esp32c3 for the XIAO ESP32C3
idf.py build
5.2 Project Structure
p08_wifi_sniffer/
+-- main/
| +-- sniffer.c
| +-- parser.c
| +-- stats.c
+-- README.md
5.3 The Core Question You’re Answering
“What is actually flying through the air right now?”
5.4 Concepts You Must Understand First
- 802.11 frame structure
- Promiscuous mode callback constraints
- Channel hopping strategies
5.5 Questions to Guide Your Design
- Which frame types will you parse first?
- How will you avoid blocking the callback?
- What dwell time balances coverage and completeness?
5.6 Thinking Exercise
Design a channel hopping strategy that maximizes beacon capture on channels 1, 6, and 11.
5.7 The Interview Questions They’ll Ask
- What is the difference between a beacon and a probe request?
- Why can promiscuous callbacks trigger watchdogs?
- What does RSSI tell you?
5.8 Hints in Layers
Hint 1: Start with beacons only before parsing probes.
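Hint 2: Copy only a fixed-size prefix in the callback. 64 bytes covers most beacons, but a beacon with a full 32-byte SSID needs 70 bytes (24-byte header + 12 fixed bytes + 2-byte IE header + 32).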
Hint 3: Use a timer task to hop channels.
Hint 4: Log summaries instead of every frame.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Wi-Fi protocols | 802.11 Wireless Networks | Frame formats |
| Networking | Computer Networks | Link layer |
5.10 Implementation Phases
Phase 1: Capture (2 days)
Goals:
- Enable promiscuous mode.
- Count frames per second.
Tasks:
- Register callback.
- Log frame count.
Checkpoint: Frames captured on one channel.
Phase 2: Parsing (2-3 days)
Goals:
- Parse beacon frames and SSIDs.
- Log RSSI.
Tasks:
- Parse frame control and SSID IE.
- Add RSSI logging.
Checkpoint: SSIDs appear in logs.
Phase 3: Channel Hopping (2 days)
Goals:
- Scan channels 1-11.
- Aggregate stats.
Tasks:
- Implement channel hopping timer.
- Log per-channel stats.
Checkpoint: Stats show multiple channels.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Parsing depth | Full frame vs headers only | Headers + SSID | Enough for analysis |
| Logging | Per-frame vs summary | Summary | Avoid UART saturation |
| Channel dwell | 50 ms vs 200 ms | 200 ms | Capture beacons reliably |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Parser correctness | SSID extraction |
| Integration Tests | Live capture | Beacon logs |
| Edge Case Tests | Hidden SSID | Empty SSID handling |
6.2 Critical Test Cases
- Beacon Parsing: SSID and BSSID extracted correctly.
- Hidden SSID: Parser handles empty SSID safely.
- Queue Overflow: Logs drop count without crashing.
6.3 Test Data
SSID="TestNet"
BSSID=aa:bb:cc:dd:ee:ff
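For parser unit tests, the test data above can be turned into a synthetic beacon without any radio hardware. The builder below is a sketch with a hypothetical name, `build_test_beacon`. It fills in only the fields the parser reads (Frame Control, the three addresses, the beacon interval, and the SSID IE) and leaves the rest zeroed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a minimal beacon frame into buf. Returns the frame length,
 * or 0 if the SSID is too long or the buffer is too small.
 * Layout: 24-byte header, 12 bytes of fixed fields, then one SSID IE. */
static size_t build_test_beacon(uint8_t *buf, size_t cap,
                                const uint8_t bssid[6], const char *ssid)
{
    size_t ssid_len = strlen(ssid);
    size_t total = 24 + 12 + 2 + ssid_len;
    if (ssid_len > 32 || total > cap) {
        return 0;
    }
    memset(buf, 0, total);
    buf[0] = 0x80;                  /* Frame Control: type 0, subtype 8 */
    memset(&buf[4], 0xff, 6);       /* Addr1: broadcast destination */
    memcpy(&buf[10], bssid, 6);     /* Addr2: transmitter (the AP) */
    memcpy(&buf[16], bssid, 6);     /* Addr3: BSSID */
    buf[32] = 0x64;                 /* beacon interval: 100 TU, little-endian */
    buf[36] = 0x00;                 /* IE ID 0 = SSID */
    buf[37] = (uint8_t)ssid_len;    /* IE length */
    memcpy(&buf[38], ssid, ssid_len);
    return total;
}
```

Calling it with the BSSID `aa:bb:cc:dd:ee:ff` and the SSID `"TestNet"` yields a 45-byte frame you can feed straight into the parser under test.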
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Parsing beyond buffer | Crash or garbage SSID | Add bounds checks |
| Heavy callback | Watchdog resets | Move parsing to task |
| Channel mismatch | Missing networks | Use hopping or set correct channel |
7.2 Debugging Strategies
- Start on a single channel and compare with a known AP.
- Use Wireshark on a laptop to validate frame parsing.
7.3 Performance Traps
- Logging every frame overwhelms UART and slows capture.
8. Extensions & Challenges
8.1 Beginner Extensions
- Count data frames vs management frames.
- Add a simple RSSI average per SSID.
8.2 Intermediate Extensions
- Detect probe requests and log client MACs.
- Track channel utilization over time.
8.3 Advanced Extensions
- Export captures in PCAP format.
- Implement a small web UI to visualize stats.
9. Real-World Connections
9.1 Industry Applications
- Wi-Fi site surveys and troubleshooting.
- Security monitoring for rogue AP detection.
9.2 Related Open Source Projects
- Kismet - wireless detector and sniffer.
- Aircrack-ng - Wi-Fi analysis tools.
9.3 Interview Relevance
- 802.11 frame parsing and wireless debugging are advanced interview topics.
10. Resources
10.1 Essential Reading
- 802.11 frame format documentation
- ESP-IDF Wi-Fi promiscuous API docs
10.2 Video Resources
- “Wi-Fi Frame Analysis” (tutorial)
- “Promiscuous Mode on ESP32” (talk)
10.3 Tools & Documentation
- Wireshark
- ESP-IDF Wi-Fi API reference
10.4 Related Projects in This Series
- P03 ESP-NOW Remote - MAC layer understanding
- P10 Web Oscilloscope - real-time data streaming
11. Self-Assessment Checklist
11.1 Understanding
- I can identify beacon and probe frames.
- I can parse SSID IEs safely.
- I can design a channel hopping schedule.
11.2 Implementation
- Sniffer runs without watchdog resets.
- Logs show SSIDs and RSSI.
- Channel stats are updated each second.
11.3 Growth
- I can explain tradeoffs between capture depth and performance.
- I can extend parsing to new frame types.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Capture and parse beacon frames on one channel.
Full Completion:
- Channel hopping with per-channel statistics.
Excellence (Going Above & Beyond):
- PCAP export or visualization dashboard.
- Advanced parsing of probe requests and data frames.