Project 2: Universal IR Remote with Learning Capability

Build a handheld IR learning remote that captures raw pulse timings, decodes common protocols, stores profiles on microSD, and replays signals with precise microsecond timing.

Quick Reference

Attribute                         | Value
----------------------------------+-----------------------------------------------------------------------
Difficulty                        | Advanced
Time Estimate                     | 2–3 weeks
Main Programming Language         | C/C++ (ESP-IDF)
Alternative Programming Languages | Arduino
Coolness Level                    | High
Business Potential                | Medium (universal control, field service tooling)
Prerequisites                     | GPIO timers, basic interrupts, microSD basics
Key Topics                        | IR modulation, timing capture, protocol decoding, deterministic replay

1. Learning Objectives

By completing this project, you will:

  1. Capture microsecond-accurate IR pulse trains using timers and interrupts.
  2. Decode common protocols (NEC-style) and store raw captures safely.
  3. Implement a replay engine with precise carrier modulation.
  4. Build a UI for learning, naming, and replaying IR profiles.
  5. Validate timing accuracy using a logic analyzer or loopback tests.

2. All Theory Needed (Per-Concept Breakdown)

2.1 IR Modulation and Protocol Structure (NEC-style)

Fundamentals

Infrared remotes work by turning an IR LED on and off rapidly, usually around a 38 kHz carrier. The receiver in a TV or appliance filters that carrier and outputs a demodulated digital signal: a sequence of high/low pulses with specific durations. Protocols like NEC encode bits by pulse widths: a fixed leading burst, then a space, then a sequence of short/long spaces representing 0s and 1s. Learning an IR remote means capturing these durations accurately and either decoding them into protocol fields (address, command) or storing the raw timing sequence. The challenge is timing precision: a few hundred microseconds of error can break decoding or cause a device to ignore the replay.

Deep Dive into the concept

The NEC protocol is a good reference because it is widely used and encodes bits in pulse distance rather than phase. A typical NEC frame starts with a 9 ms leading burst (carrier on), followed by a 4.5 ms space (carrier off). Each data bit is then a 562.5 µs burst followed by either a 562.5 µs space (logical 0) or a 1.6875 ms space (logical 1). A standard frame carries 32 bits, sent least significant bit first: an 8-bit address, its bitwise inverse, an 8-bit command, and its bitwise inverse, closed by a final 562.5 µs burst that terminates the last space. The receiver outputs a digital line on which a burst appears as low or high depending on the module (most integrated receiver modules are active-low), so you must know whether your IR receiver inverts the signal. To decode, you measure the time between edges, classify each space as short or long, and reconstruct the bytes. A repeat code is a short frame (9 ms burst, 2.25 ms space, one 562.5 µs burst) sent roughly every 108 ms while a button is held; you must detect it to implement long-press behavior.
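To make the frame layout concrete, the sketch below builds the alternating burst/space list for one NEC frame. It assumes the classic variant with an 8-bit address and 8-bit command, each followed by its bitwise inverse and sent LSB first, and it rounds 562.5 µs and 1687.5 µs to 560 µs and 1690 µs. The name `nec_encode` and the constants are illustrative and not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Nominal NEC timings in microseconds (rounded; assumed values). */
#define NEC_HDR_BURST_US  9000u
#define NEC_HDR_SPACE_US  4500u
#define NEC_BIT_BURST_US   560u
#define NEC_ZERO_SPACE_US  560u
#define NEC_ONE_SPACE_US  1690u
#define NEC_FRAME_LEN       67u  /* 2 header + 32 * 2 bit entries + 1 stop burst */

/* Fill out[] with alternating ON/OFF durations for one frame.
 * Returns the number of entries written, or 0 if out[] is too small. */
size_t nec_encode(uint8_t addr, uint8_t cmd, uint16_t *out, size_t cap)
{
    if (cap < NEC_FRAME_LEN) return 0;

    /* Wire order: address, ~address, command, ~command; LSB of each byte first. */
    uint32_t raw = (uint32_t)addr
                 | ((uint32_t)(uint8_t)~addr << 8)
                 | ((uint32_t)cmd << 16)
                 | ((uint32_t)(uint8_t)~cmd << 24);

    size_t n = 0;
    out[n++] = NEC_HDR_BURST_US;
    out[n++] = NEC_HDR_SPACE_US;
    for (int i = 0; i < 32; i++) {
        out[n++] = NEC_BIT_BURST_US;
        out[n++] = ((raw >> i) & 1u) ? NEC_ONE_SPACE_US : NEC_ZERO_SPACE_US;
    }
    out[n++] = NEC_BIT_BURST_US;  /* trailing burst terminates the last space */
    return n;
}
```

A generated list like this is also handy as known-good input when you unit test a decoder.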

Carrier modulation is equally important for replay. You cannot simply toggle the LED at arbitrary speed; you must reproduce a 38 kHz carrier (or whatever frequency the target device expects), and then gate that carrier on and off according to the pulse sequence. On the ESP32-S3, you can use hardware peripherals (RMT) or timers to generate carrier PWM and modulate it. If you generate carrier in software, timing jitter will be high and devices may not respond. Therefore, the “learning” side is about measuring the envelope durations, while the “replay” side is about regenerating the carrier with accurate envelope gating.
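A quick way to build intuition for carrier gating is to work out how many carrier cycles fit inside one burst. The helpers below are a small arithmetic sketch with hypothetical names; they only compute numbers and do not touch any peripheral.

```c
#include <stdint.h>

/* Carrier period in nanoseconds, rounded to nearest. Returns 0 for 0 Hz. */
uint32_t carrier_period_ns(uint32_t carrier_hz)
{
    if (carrier_hz == 0) return 0;
    return (1000000000u + carrier_hz / 2u) / carrier_hz;
}

/* Number of carrier cycles in a burst, rounded to nearest. */
uint32_t carrier_cycles(uint32_t burst_us, uint32_t carrier_hz)
{
    return (uint32_t)(((uint64_t)burst_us * carrier_hz + 500000u) / 1000000u);
}
```

At 38 kHz the period is about 26.3 µs, so a 9 ms leading burst holds 342 cycles and a 560 µs bit burst about 21. An error of one carrier cycle in a bit burst is therefore already close to 5% of its length.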

Protocol decoding helps you compress storage and support editing (e.g., change a command or address), but raw storage is more general and supports unknown protocols. A robust learning remote stores both: raw durations for faithful replay, and decoded fields if possible for display and organization. When decoding fails, the UI should show “Unknown” but still allow replay. This dual-path design is common in professional learning remotes.
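One possible shape for the dual-path design is a record that always carries the raw durations and optionally carries decoded fields. The type `ir_capture_t`, the bound `IR_MAX_DURATIONS`, and the function `ir_describe` are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define IR_MAX_DURATIONS 128u  /* assumed upper bound for this sketch */

typedef struct {
    uint16_t durations_us[IR_MAX_DURATIONS]; /* raw path: always filled */
    size_t   count;
    bool     decoded;                        /* decoded path: optional */
    uint8_t  addr;
    uint8_t  cmd;
} ir_capture_t;

/* Write a short UI label into buf; replay would still use durations_us. */
void ir_describe(const ir_capture_t *c, char *buf, size_t len)
{
    if (c->decoded)
        snprintf(buf, len, "NEC addr=0x%02X cmd=0x%02X",
                 (unsigned)c->addr, (unsigned)c->cmd);
    else
        snprintf(buf, len, "Unknown (%u raw entries)", (unsigned)c->count);
}
```

Keeping the label logic separate from the replay data means a failed decode changes only what the UI shows, not what gets transmitted.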

How this fits in projects

This is the core of the IR learning engine. Timing capture and replay discipline also appears in P03-real-time-audio-spectrum-analyzer.md (timing + sampling) and the capstone P08-complete-cardputer-security-toolkit.md where IR is one module among many.

Definitions & key terms

  • Carrier frequency → high-frequency PWM (commonly 38 kHz) that carries IR data
  • Burst → period where carrier is ON
  • Space → period where carrier is OFF
  • Envelope → the on/off pattern of bursts and spaces
  • Repeat code → shortened frame indicating a button is held
  • Demodulated output → IR receiver output after carrier filtering

Mental model diagram (ASCII)

[Carrier 38kHz] --- gated by ---> [Envelope Pulses] ---> [IR LED]
[Receiver] ---> [Edge timestamps] ---> [Pulse durations] ---> [Decode/Store]

How it works (step-by-step, with invariants and failure modes)

  1. IR receiver outputs edges; ISR records timestamps.
  2. Convert timestamp deltas into pulse durations.
  3. Identify the leading burst and sync space.
  4. Classify each space as 0 or 1 based on threshold.
  5. If decode succeeds, store address/command; if not, store raw durations.
  6. Replay by generating carrier and gating it with stored durations.

Invariants: timestamps are monotonic; thresholds are calibrated; carrier frequency stable. Failure modes: inverted signal leads to wrong decoding; jitter causes misclassification; wrong carrier frequency causes no response.
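Steps 3 and 4 can be sketched with tolerance windows instead of a single threshold, so that noise is rejected rather than silently classified as a bit. The window widths (25% and 35%) and the function names are assumptions for illustration; calibrate them against your own receiver.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if value is within pct percent of nominal. */
static bool within_pct(uint32_t value, uint32_t nominal, uint32_t pct)
{
    uint32_t tol = nominal * pct / 100u;
    return value + tol >= nominal && value <= nominal + tol;
}

/* Header check: 9 ms burst followed by 4.5 ms space. */
bool nec_is_header(uint32_t burst_us, uint32_t space_us)
{
    return within_pct(burst_us, 9000u, 25u) && within_pct(space_us, 4500u, 25u);
}

/* Returns 0 or 1 for a valid bit space, -1 if it matches neither window. */
int nec_classify_space(uint32_t space_us)
{
    if (within_pct(space_us, 562u, 35u))  return 0;
    if (within_pct(space_us, 1687u, 25u)) return 1;
    return -1;
}
```

A return value of -1 is a natural trigger for the "decode failed, keep raw" path.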

Minimal concrete example

// Example thresholding of space durations
if (space_us > 1200) bit = 1; else bit = 0;

Common misconceptions

  • “IR is just on/off.” → It is modulated at a carrier frequency.
  • “Timing errors don’t matter.” → They do; errors beyond the receiver’s tolerance cause misread bits or ignored frames.
  • “All remotes use NEC.” → Many protocols exist; raw storage matters.

Check-your-understanding questions

  1. Why is carrier modulation required for replay?
  2. What is the role of the leading burst in NEC?
  3. Why store raw timings even if decoding succeeds?

Check-your-understanding answers

  1. Receivers are tuned to a specific carrier; without it, they ignore signals.
  2. It provides frame sync and allows the receiver to detect a new command.
  3. Raw storage preserves unknown protocols and timing nuances.

Real-world applications

  • Universal remotes for AV systems.
  • Field service tools for HVAC and industrial equipment.

Where you’ll apply it

  • This project: see §4.4 for the decoding logic and §4.2 for the replay engine.
  • Also used in: P08-complete-cardputer-security-toolkit.md.

References

  • NEC IR protocol timing reference (Renesas/industry docs).
  • ESP-IDF RMT peripheral documentation.

Key insight

IR learning works because you precisely capture the envelope and precisely regenerate it with a stable carrier.

Summary

IR protocols are timing-based. Capture edges, classify durations, and replay using a stable carrier to match receiver expectations.

Homework/Exercises to practice the concept

  1. Measure a remote’s pulse durations with a logic analyzer.
  2. Write a simple decoder that prints 0/1 bits for NEC frames.

Solutions to the homework/exercises

  1. Sample at 1 MHz or faster (1 µs resolution), then measure the leading burst and the bit spaces.
  2. Use thresholds around 1.1–1.3 ms to distinguish 0 vs 1 spaces.

2.2 High-Resolution Timing Capture with ISR and Ring Buffers

Fundamentals

Capturing IR pulses requires microsecond timing precision. Your MCU must record edges (transitions) and compute pulse durations without missing any events. A straightforward approach is to trigger an interrupt on each edge and record the current timer count. Because edges in a demodulated frame can arrive only a few hundred microseconds apart, the ISR must be very small: record the timestamp and push it into a ring buffer. A separate task converts timestamps into durations, applies protocol decoding, and updates the UI. This separation keeps your capture reliable even when UI or storage tasks are busy.

Deep Dive into the concept

On the ESP32-S3, you can implement timing capture with either hardware peripherals (RMT) or a general-purpose timer plus GPIO interrupts. The RMT peripheral is ideal because it captures high/low durations in hardware and produces a buffer of pulse widths with minimal CPU load. However, using RMT means you must configure its memory blocks and manage its RX buffer size. If the buffer overflows, you lose part of the signal. With GPIO interrupts, you manage a ring buffer of timestamps. The ISR reads a high-resolution timer (e.g., esp_timer_get_time() or a hardware timer register) and stores it. The decoder task then computes delta times between consecutive timestamps, producing high/low durations.

Backpressure matters here too. If the ring buffer fills because the decoder task is blocked, you lose edges and corrupt the sequence. Therefore, the decoder task must have a high priority and minimal blocking. UI updates should be throttled or done in a separate task that reads decoded results from a queue. If you store captures to SD, do that after decoding, not during capture. A safe strategy is to store decoded sequences into a small in-memory structure, then write them when the capture finishes.
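The ring buffer with a drop counter described above might look like the following single-producer, single-consumer sketch. The type and function names are hypothetical, and the capacity is kept small so overflow is easy to provoke in a test. It is written as portable C; on a dual-core target you would likely need atomics or memory barriers in addition to `volatile`.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_CAP 16u  /* must be a power of two; small here for testing */

typedef struct {
    volatile uint32_t head;     /* written only by the producer (ISR) */
    volatile uint32_t tail;     /* written only by the consumer (task) */
    volatile uint32_t dropped;  /* edges lost because the buffer was full */
    uint32_t ts[RING_CAP];
} ts_ring_t;

/* Producer side: constant time, never blocks. */
bool ring_push(ts_ring_t *r, uint32_t t)
{
    if (r->head - r->tail == RING_CAP) {  /* full: count the loss and bail */
        r->dropped++;
        return false;
    }
    r->ts[r->head & (RING_CAP - 1u)] = t;
    r->head++;
    return true;
}

/* Consumer side: returns false when the buffer is empty. */
bool ring_pop(ts_ring_t *r, uint32_t *t)
{
    if (r->head == r->tail) return false;
    *t = r->ts[r->tail & (RING_CAP - 1u)];
    r->tail++;
    return true;
}
```

If `dropped` is non-zero at the end of a capture, the safest reaction is to discard the whole frame and ask the user to press the button again.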

Timing accuracy must be validated. Use a logic analyzer or a loopback method: drive the IR LED and capture it with the receiver, then compare measured durations to expected values. If your timestamps drift or jitter, adjust the timer source. Avoid software delays in the ISR. If you must debounce or filter noise, do it in the decoder task, not in the ISR.
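For the loopback comparison, a pair of small helpers is usually enough: one reports the worst absolute error, the other applies a percentage tolerance per entry. Both names are assumptions of this sketch, and both expect the two lists to have the same length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest absolute difference between paired entries, in microseconds. */
uint32_t max_abs_error_us(const uint16_t *expected, const uint16_t *measured, size_t n)
{
    uint32_t worst = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t e = expected[i], m = measured[i];
        uint32_t diff = (e > m) ? e - m : m - e;
        if (diff > worst) worst = diff;
    }
    return worst;
}

/* True if every measured entry is within tol_pct percent of its expected value. */
bool durations_match(const uint16_t *expected, const uint16_t *measured,
                     size_t n, uint32_t tol_pct)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t e = expected[i], m = measured[i];
        uint32_t diff = (e > m) ? e - m : m - e;
        if (diff * 100u > e * tol_pct) return false;
    }
    return true;
}
```

Short entries dominate the percentage check, so a fixed offset of a few tens of microseconds shows up first on the 560 µs bursts and spaces.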

How this fits in projects

High-resolution capture techniques are reused in P03-real-time-audio-spectrum-analyzer.md for stable sampling and in the capstone P08-complete-cardputer-security-toolkit.md when multiple timing-sensitive modules coexist.

Definitions & key terms

  • Edge interrupt → interrupt triggered on rising/falling edge
  • Timestamp → high-resolution timer reading at an edge
  • Delta → time difference between edges, representing pulse width
  • RMT → ESP32 Remote Control peripheral for capturing/generating pulses

Mental model diagram (ASCII)

[IR Receiver] -> [GPIO Edge ISR] -> [Timestamp Ring Buffer] -> [Decoder Task]

How it works (step-by-step, with invariants and failure modes)

  1. Configure GPIO for edge interrupts (both edges).
  2. ISR records timer value into ring buffer.
  3. Decoder task reads timestamps and computes deltas.
  4. Apply thresholds to classify pulses and decode.
  5. Store decoded results and raw durations.

Invariants: ISR must be constant time; ring buffer size must exceed max edges per frame. Failure modes: buffer overflow corrupts decoding; noisy edges create false pulses; timer rollover not handled.
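The timer rollover failure mode is cheap to avoid if timestamps come from a free-running 32-bit counter, because unsigned subtraction wraps the same way the counter does. The sketch below assumes such a counter in microseconds; the function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Elapsed microseconds between two readings of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so a single wrap is handled for free. */
static inline uint32_t ts_delta_us(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}

/* Convert n edge timestamps into n - 1 durations, saturating at 65535 us.
 * Returns the number of durations written. */
size_t ts_to_durations(const uint32_t *ts, size_t n, uint16_t *out)
{
    if (n < 2) return 0;
    for (size_t i = 1; i < n; i++) {
        uint32_t d = ts_delta_us(ts[i - 1], ts[i]);
        out[i - 1] = (d > UINT16_MAX) ? UINT16_MAX : (uint16_t)d;
    }
    return n - 1;
}
```

Saturating instead of truncating keeps a long idle gap from turning into a plausible-looking short pulse.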

Minimal concrete example

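// Runs on every edge; keep it constant-time. timer_get_us() and
// ringbuf_push_ts() are placeholders for your timer read and ring buffer push.
void IRAM_ATTR ir_isr(void *arg) {
    (void)arg;
    uint32_t t = timer_get_us();  // e.g., (uint32_t)esp_timer_get_time()
    ringbuf_push_ts(t);
}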

Common misconceptions

  • “I can debounce in ISR.” → Debounce belongs in the decoder task.
  • “Timer overflow won’t happen.” → It will; handle wrap-around.
  • “UI updates are free.” → They can block decoding if not decoupled.

Check-your-understanding questions

  1. Why should ISR only push timestamps, not decode?
  2. What happens if the ring buffer is too small?
  3. Why is RMT often better than GPIO interrupts?

Check-your-understanding answers

  1. ISR must be minimal to avoid missing edges.
  2. You lose edges and decode invalid frames.
  3. RMT offloads timing capture to hardware, reducing jitter.

Real-world applications

  • IR learning remotes and industrial controllers.
  • Ultrasonic distance capture and pulse-width sensors.

Where you’ll apply it

  • This project: see §5.10 Phase 1 and §4.3 data structures.
  • Also used in: P03-real-time-audio-spectrum-analyzer.md (timing discipline).

References

  • ESP-IDF RMT capture examples.
  • Making Embedded Systems – timing constraints and ISR design.

Key insight

Your ISR should act like a camera shutter: fast, deterministic, and leaving interpretation for later.

Summary

Accurate IR learning requires high-resolution timestamps and disciplined separation between ISR capture and task-level decoding.

Homework/Exercises to practice the concept

  1. Capture 100 edges and print deltas to serial.
  2. Simulate buffer overflow and confirm error handling.

Solutions to the homework/exercises

  1. Record timer_get_us() on each edge and compute differences in a task.
  2. Limit buffer to 16 entries, flood with edges, and verify drop counter.

2.3 Signal Storage, Profiling, and Deterministic Replay

Fundamentals

A learning remote is only useful if it can store and replay signals reliably. Storage means defining a data format for captured pulses, metadata (device name, command name), and optional decoded protocol fields. Replay means generating a stable carrier and gating it according to stored durations. Deterministic replay requires that you preserve the exact sequence of on/off durations or a decoded representation that reproduces them. If storage corrupts or timestamps drift, devices will not respond.

Deep Dive into the concept

There are two broad storage strategies: raw timing lists and decoded protocol fields. Raw lists are universal: you store an array of durations, alternating on/off. This guarantees replay fidelity but costs more memory. Decoded fields (address, command, repeat count) are compact and human-readable but depend on protocol correctness. The safest approach is to store raw timings and, if decoding succeeds, also store protocol metadata. Your UI can show decoded values while replay uses raw timings by default.

File formats should be simple and resilient. A compact binary format might store a header with version, carrier frequency, and count, followed by 16-bit durations (microseconds). If you use CSV or JSON, it is easier to debug but larger. Because IR command lists are small, binary storage is fine. Use checksums to detect corruption. On load, validate the count and ensure durations are within a reasonable range (e.g., 100–20000 µs). This prevents malformed files from triggering unsafe replays.
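The checksum and range checks could be sketched as below. The CRC is the common reflected CRC-32 (polynomial 0xEDB88320) computed bit by bit for clarity; the source does not say which checksum the format uses, so treat that choice, the function names, and the `max_count` parameter as assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with reflected polynomial 0xEDB88320. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Reject profiles whose count or durations fall outside sane bounds
 * (100 to 20000 us per entry, as suggested in the text). */
bool durations_valid(const uint16_t *d, size_t count, size_t max_count)
{
    if (count == 0 || count > max_count) return false;
    for (size_t i = 0; i < count; i++)
        if (d[i] < 100u || d[i] > 20000u) return false;
    return true;
}
```

On load, a reasonable order is: check the count, verify the CRC over the stored bytes, then run the range check before anything reaches the replay engine. A table-driven CRC is faster if profile files grow, though for lists this small the bitwise loop is unlikely to matter.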

Replay requires generating a stable carrier. The ESP32-S3 can use LEDC or RMT for PWM carrier generation. The replay engine should schedule “carrier on” and “carrier off” intervals based on the stored durations. If you use RMT for replay, you can preload the pulse sequence and let hardware drive the timing, which is more accurate than software loops. If you do it in software, you will need to disable interrupts or accept jitter, which can cause missed commands on strict devices. Therefore, hardware-driven replay is the recommended design.
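Hardware pulse generators typically consume the envelope as pairs of "on for X, off for Y" rather than as a flat list. The sketch below pairs up a stored duration list into such items. `ir_symbol_t` is a hypothetical type that only mirrors the idea of a hardware symbol; it is not the ESP-IDF RMT type, and converting to real RMT symbols would also require scaling to the channel's tick resolution.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol: one carrier-on interval followed by one carrier-off
 * interval. Not an ESP-IDF type. */
typedef struct {
    uint16_t on_us;
    uint16_t off_us;
} ir_symbol_t;

/* Pair up an alternating ON/OFF list. Returns the number of symbols written,
 * or 0 if out[] is too small. An odd trailing ON entry gets off_us = 0. */
size_t ir_pack_symbols(const uint16_t *durations, size_t n,
                       ir_symbol_t *out, size_t cap)
{
    size_t need = (n + 1u) / 2u;
    if (need > cap) return 0;
    for (size_t i = 0; i < need; i++) {
        out[i].on_us  = durations[2u * i];
        out[i].off_us = (2u * i + 1u < n) ? durations[2u * i + 1u] : 0u;
    }
    return need;
}
```

A 67-entry NEC frame packs into 34 symbols, with the final stop burst carrying a zero-length space.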

Determinism also involves state management: a command should behave identically every time. That means resetting the output pin, clearing any previous modulation state, and ensuring the same timing profile is used. Provide a “test replay” that repeats a command N times with fixed delays so you can verify reliability. Use a loopback test (IR LED pointing at receiver on the same device or an external photodiode) to measure replay timing.
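For a "test replay" with fixed spacing, it helps to compute the idle gap from the frame's own length so that frames start at a constant period. The helpers below are a sketch with assumed names; the period is a parameter, for example 108000 µs if you want to mimic the NEC repeat interval.

```c
#include <stddef.h>
#include <stdint.h>

/* Total air time of one frame: the sum of all ON and OFF entries. */
uint32_t frame_total_us(const uint16_t *durations, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) total += durations[i];
    return total;
}

/* Idle time to insert after a frame so that frames start every period_us.
 * Returns 0 if the frame is already longer than the period. */
uint32_t repeat_gap_us(const uint16_t *durations, size_t n, uint32_t period_us)
{
    uint32_t total = frame_total_us(durations, n);
    return (total >= period_us) ? 0u : period_us - total;
}
```

Deriving the gap from the stored durations, rather than hard-coding a delay, keeps the spacing identical across commands of different lengths.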

How this fits in projects

Signal storage and deterministic replay patterns appear again in the capstone P08-complete-cardputer-security-toolkit.md, where IR modules coexist with other tools and must be stable under load.

Definitions & key terms

  • Raw timing list → array of on/off durations for a captured IR command
  • Carrier gate → enabling/disabling carrier according to timing list
  • Checksum → validation field to detect corruption
  • Replay jitter → timing error during playback

Mental model diagram (ASCII)

[Captured Durations] -> [Profile File] -> [Replay Engine] -> [Carrier Gate] -> [IR LED]

How it works (step-by-step, with invariants and failure modes)

  1. Capture pulse durations and store them in RAM.
  2. Normalize durations and validate ranges.
  3. Save to microSD with metadata and checksum.
  4. When replaying, load durations and configure carrier PWM.
  5. Gate carrier on/off according to durations with hardware timing.

Invariants: durations are positive and ordered; carrier frequency constant; replay engine resets state. Failure modes: corrupted files cause invalid durations; jitter leads to non-responsive devices; missing carrier results in no effect.

Minimal concrete example

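// Illustrative pseudocode: a profile is a carrier frequency plus alternating ON/OFF durations.
profile.header.carrier_hz = 38000;                              // Hz
profile.durations_us = {9000, 4500, 560, 560, 560, 1690, ...};  // µs, list truncated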

Common misconceptions

  • “Decoded protocol is always enough.” → Unknown protocols require raw timings.
  • “Software delay loops are accurate.” → They are not under load.
  • “Replaying once is enough.” → Some devices expect repeats or long presses.

Check-your-understanding questions

  1. Why store a checksum with each profile?
  2. Why is hardware PWM preferred for carrier generation?
  3. What could happen if a duration is out of range?

Check-your-understanding answers

  1. To detect corrupted files before replaying invalid signals.
  2. It provides stable frequency and duty cycle independent of CPU load.
  3. The target device may ignore the command or interpret it incorrectly.

Real-world applications

  • Universal control systems in AV and smart home installations.
  • Automated testing rigs for IR-controlled devices.

Where you’ll apply it

  • This project: see §3.5 data format and §4.2 replay engine.
  • Also used in: P08-complete-cardputer-security-toolkit.md.

References

  • IR protocol timing notes (NEC, RC5 resources).
  • ESP-IDF LEDC/RMT PWM documentation.

Key insight

Learning is only half the problem; reliable replay is what makes the tool useful.

Summary

Store raw timings with validation, and replay using hardware-timed carrier gating for deterministic behavior.

Homework/Exercises to practice the concept

  1. Define a binary profile format with a header and checksum.
  2. Replay a captured command 5 times and measure success rate.

Solutions to the homework/exercises

  1. Use a 16-byte header + CRC32, then 16-bit duration entries.
  2. Count device responses; if <5, adjust carrier frequency or timing.

3. Project Specification

3.1 What You Will Build

A universal IR remote that can:

  • learn commands from other remotes,
  • display and name learned commands,
  • store profiles on microSD,
  • replay commands with accurate timing.

3.2 Functional Requirements

  1. Learning mode: capture pulse timings from an external IR receiver module.
  2. Protocol decode: attempt to decode NEC-style frames and show results.
  3. Storage: save profiles with metadata and checksums.
  4. Replay: send learned commands with accurate carrier frequency.
  5. UI: provide menus for learn, list, and replay.

3.3 Non-Functional Requirements

  • Performance: capture without missed edges for typical remotes.
  • Reliability: stored profiles survive power cycles and validate on load.
  • Usability: commands can be named and grouped by device.

3.4 Example Usage / Output

1) Choose “Learn” and press a button on a TV remote.
2) Name it “TV Power”.
3) Press “Replay” and the TV toggles power.

3.5 Data Formats / Schemas / Protocols

Profile header (binary):

version, carrier_hz, count, crc32, name_len, name_bytes...

Durations:

uint16_t durations_us[count]; // alternating ON/OFF
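The header listed above does not fix field widths or byte order, so the following is only one possible encoding: little-endian, with version as u16, carrier_hz as u32, count as u16, crc32 as u32, name_len as u8, then the name bytes. The function name and these widths are assumptions of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static size_t put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    return 2;
}

static size_t put_u32le(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) p[i] = (uint8_t)(v >> (8 * i));
    return 4;
}

/* Serialize the header fields in the listed order.
 * Returns bytes written, or 0 if the name is too long or buf is too small. */
size_t profile_write_header(uint8_t *buf, size_t cap, uint16_t version,
                            uint32_t carrier_hz, uint16_t count,
                            uint32_t crc32, const char *name)
{
    size_t name_len = strlen(name);
    if (name_len > 255u || cap < 13u + name_len) return 0;  /* 13 fixed bytes */

    size_t off = 0;
    off += put_u16le(buf + off, version);
    off += put_u32le(buf + off, carrier_hz);
    off += put_u16le(buf + off, count);
    off += put_u32le(buf + off, crc32);
    buf[off++] = (uint8_t)name_len;
    memcpy(buf + off, name, name_len);
    return off + name_len;
}
```

Writing fields one by one, instead of dumping a struct with `fwrite`, avoids compiler padding and makes the file layout independent of the target's endianness.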

3.6 Edge Cases

  • Receiver sees noise or partial capture.
  • Unknown protocol (decode fails).
  • microSD missing or full.
  • Carrier frequency mismatch with target device.

3.7 Real World Outcome

A successful build learns a command from a commercial remote, stores it, and replays it reliably. The UI shows the learned command name, protocol (if decoded), and carrier frequency.

3.7.1 How to Run (Copy/Paste)

idf.py set-target esp32s3
idf.py build
idf.py -p /dev/ttyUSB0 flash monitor

3.7.2 Golden Path Demo (Deterministic)

  • Learn “TV Power” from a known NEC remote.
  • Save as “TV_PWR”.
  • Replay three times; TV toggles each time.

Failure demo (deterministic):

  • Start learning with the IR receiver disconnected. Expected: UI shows “No IR input,” capture times out after 5 seconds, and no profile is saved. Exit code: 2.

3.7.3 If CLI: exact terminal transcript

I (2100) ir: capture start
I (2101) ir: decoded NEC addr=0x00FF cmd=0x1AE5
I (2102) ir: profile saved /sd/tv_pwr.ir
I (2120) ir: replay ok (3/3)

Exit codes: 0 = success, 2 = no IR input/timeout, 3 = SD write error.

3.7.4 If Web App

Not applicable.

3.7.5 If API

Not applicable.

3.7.6 If Library

Not applicable.

3.7.7 If GUI / Desktop / Mobile

Not applicable.

3.7.8 If TUI

+----------------------------+
| IR Learning Remote         |
| [1] Learn New Command      |
| [2] TV Power               |
| [3] AC Temp Up             |
| Status: READY              |
+----------------------------+

4. Solution Architecture

4.1 High-Level Design

[IR Receiver] -> [Edge Capture] -> [Decoder] -> [Profile Store]
                                          |
                                          v
                               [Replay Engine] -> [IR LED]

4.2 Key Components

Component     | Responsibility              | Key Decisions
--------------+-----------------------------+--------------------------
Capture ISR   | Timestamp edges             | Minimal ISR, ring buffer
Decoder       | Convert durations to bits   | Threshold tuning
Profile store | Save/load profiles          | Binary format + checksum
Replay engine | Generate carrier + envelope | RMT for timing accuracy
UI            | Learn/list/replay           | Simple menus, safe prompts

4.3 Data Structures (No Full Code)

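#include <stdint.h>

typedef struct {
    uint16_t version;     // format version, as listed in §3.5
    uint32_t carrier_hz;  // carrier frequency in Hz, e.g. 38000
    uint16_t count;       // number of entries in the durations array
    uint32_t crc32;       // checksum used to detect corruption
    char     name[16];    // display name (fixed width in this in-memory form)
} ir_profile_hdr_t;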

4.4 Algorithm Overview

Key Algorithm: NEC Decode

  1. Detect leading burst and sync space.
  2. For each bit, classify space length as 0 or 1.
  3. Assemble bytes and validate inverse bytes.
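Step 3 can be sketched as two small functions: one packs the 32 classified bits into a word, the other splits it into bytes and checks the inverses. This assumes the standard NEC variant with 8-bit address and command, LSB first; some remotes use an extended variant with a 16-bit address, which this check would reject. The function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack 32 classified bits into a word; the first received bit becomes bit 0. */
uint32_t nec_bits_to_raw(const uint8_t bits[32])
{
    uint32_t raw = 0;
    for (int i = 0; i < 32; i++)
        raw |= (uint32_t)(bits[i] & 1u) << i;
    return raw;
}

/* Split the word into bytes and check each inverse.
 * Returns false on mismatch and leaves addr and cmd untouched. */
bool nec_unpack(uint32_t raw, uint8_t *addr, uint8_t *cmd)
{
    uint8_t a  = (uint8_t)(raw);
    uint8_t na = (uint8_t)(raw >> 8);
    uint8_t c  = (uint8_t)(raw >> 16);
    uint8_t nc = (uint8_t)(raw >> 24);
    if ((uint8_t)(a ^ na) != 0xFFu || (uint8_t)(c ^ nc) != 0xFFu) return false;
    *addr = a;
    *cmd  = c;
    return true;
}
```

With address 0x00 and command 0x1A, the received bytes are 0x00, 0xFF, 0x1A, 0xE5, which pack to the word 0xE51AFF00.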

Complexity Analysis:

  • Time: O(n) per frame
  • Space: O(n) for durations

5. Implementation Guide

5.1 Development Environment Setup

idf.py set-target esp32s3
idf.py build

5.2 Project Structure

project-root/
├── main/
│   ├── ir_capture.c
│   ├── ir_decode.c
│   ├── ir_replay.c
│   ├── ui.c
│   └── storage.c
└── README.md

5.3 The Core Question You’re Answering

“How can I capture and replay IR signals precisely enough that real devices respond?”

5.4 Concepts You Must Understand First

  1. IR carrier modulation and timing.
  2. Edge capture with hardware timers.
  3. Deterministic replay and storage validation.

5.5 Questions to Guide Your Design

  1. What thresholds distinguish 0 vs 1 in your protocol?
  2. How will you handle unknown protocols?
  3. How will you validate stored profiles?

5.6 Thinking Exercise

Sketch a timeline of an NEC frame and label the burst and space durations. Decide which durations are most critical for decoding.

5.7 The Interview Questions They Will Ask

  1. Why is carrier frequency required for IR control?
  2. What happens if timing is off by 20%?
  3. Why store raw timings alongside decoded fields?

5.8 Hints in Layers

Hint 1: Start with raw duration capture and print to serial.

Hint 2: Implement replay of the raw durations with carrier PWM.

Hint 3: Add decoding and name storage once replay works.

5.9 Books That Will Help

Topic               | Book                    | Chapter
--------------------+-------------------------+--------
Timing & ISR design | Making Embedded Systems | Ch. 5
Defensive parsing   | Effective C             | Ch. 10

5.10 Implementation Phases

Phase 1: Capture & Replay (4–5 days)

Goals: record durations and replay with carrier.

Phase 2: Decode & UI (5–7 days)

Goals: decode NEC, show results, save profiles.

Phase 3: Storage & Robustness (5–7 days)

Goals: checksum, load/validate, error handling.

5.11 Key Implementation Decisions

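Decision       | Options             | Recommendation | Rationale
---------------+---------------------+----------------+-------------------------
Capture method | RMT, GPIO ISR       | RMT            | Hardware timing accuracy
Storage format | JSON, binary        | Binary         | Compact and fast
Replay         | Software delay, RMT | RMT            | Deterministic timing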

6. Testing Strategy

6.1 Test Categories

Category          | Purpose         | Examples
------------------+-----------------+------------------
Unit Tests        | Decode logic    | NEC bit parsing
Integration Tests | Capture->replay | learn then replay
Edge Tests        | Noise handling  | partial frames

6.2 Critical Test Cases

  1. Known NEC remote command decodes correctly.
  2. Replay of stored profile toggles device reliably.
  3. Corrupt file is rejected with an error message.

6.3 Test Data

Known NEC pulse timings for Power command
Noise burst with short pulses

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                 | Symptom               | Solution
------------------------+-----------------------+-----------------------
Wrong carrier frequency | Device ignores replay | Adjust PWM frequency
ISR too slow            | Missing edges         | Use RMT or minimal ISR
No checksum             | Corrupt profiles      | Add CRC validation

7.2 Debugging Strategies

  • Use a logic analyzer to compare captured vs replayed pulses.
  • Print raw durations to verify thresholds.

7.3 Performance Traps

  • Software delays in replay under multitasking.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add favorites list for common commands.

8.2 Intermediate Extensions

  • Support RC5 protocol decode.

8.3 Advanced Extensions

  • Implement automatic carrier frequency detection.

9. Real-World Connections

9.1 Industry Applications

  • Universal remotes in AV integrators.
  • IR control in industrial automation.

9.2 Related Open Source Projects

  • IRremote: popular Arduino IR library.

9.3 Interview Relevance

  • Timing capture, protocol decoding, and ISR design are core embedded topics.

10. Resources

10.1 Essential Reading

  • NEC IR protocol timing references.
  • ESP-IDF RMT documentation.

10.2 Video Resources

  • IR signal analysis tutorials (logic analyzer demos).

10.3 Tools & Documentation

  • Logic analyzer software (PulseView).
  • P08-complete-cardputer-security-toolkit.md – IR module integration.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain how IR carrier modulation works.
  • I can describe how NEC encodes bits.

11.2 Implementation

  • Learning mode captures full commands reliably.
  • Replay works on at least two devices.

11.3 Growth

  • I can explain why RMT is better than software delay.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Learn a command and replay it successfully.
  • Store and load profiles from microSD.

Full Completion:

  • Decode NEC and display address/command.
  • Provide UI for naming and organizing profiles.

Excellence (Going Above & Beyond):

  • Auto-detect carrier frequency.
  • Support multiple protocols with auto-selection.