Project 9: Pitch Shifter / Time Stretcher (Breaking the Speed of Sound)
Build a basic phase-based pitch shifter or time stretcher with audible results.
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 2-3 weeks |
| Main Language | C |
| Alternative Languages | Python, Rust, C++ |
| Knowledge Area | Phase and windowed processing |
| Tools | Audio player, plotting tool |
| Main Book | “Understanding Digital Signal Processing” by Richard G. Lyons |
What you’ll build: A tool that changes pitch without changing tempo, or tempo without changing pitch, using short-time analysis.
Why it teaches DSP: You must understand phase, window overlap, and the limits of time-frequency manipulation.
Core challenges you’ll face:
- Maintaining phase continuity between frames
- Managing artifacts like echo or warbling
- Tuning window size and overlap for clarity
Real World Outcome
You will be able to shift the pitch of a voice, or stretch a sound without changing its pitch, with clearly audible though imperfect results.
Example Output:
$ ./pitch_shift --semitones +3 --input voice.wav --output voice_shifted.wav
Frames: 320
Window: 1024
Output: voice_shifted.wav
Verification steps:
- Compare original and processed audio
- Listen for phase-related artifacts
The Core Question You’re Answering
“How can I change the pitch or duration of a signal while preserving its character?”
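This project shows the practical cost of pushing against the time-frequency trade-off: short windows localize events in time but blur frequency, and long windows do the reverse.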
Concepts You Must Understand First
Stop and research these before coding:
- Phase vocoder basics
- Why does phase continuity matter between frames?
- Book Reference: “Understanding Digital Signal Processing” by Richard G. Lyons, Ch. 11
- Window overlap and synthesis
- How do overlap-add methods reconstruct the signal?
- Book Reference: “The Scientist and Engineer’s Guide to DSP” by Steven W. Smith, Ch. 17
- Artifacts and tradeoffs
- What causes smearing or metallic sounds?
- Book Reference: “Understanding Digital Signal Processing” by Richard G. Lyons, Ch. 11
Questions to Guide Your Design
- Processing strategy
- Will you adjust playback rate or phase progression?
- How will you keep frames aligned during synthesis?
- Quality tuning
- What window size gives the best balance for speech vs music?
- How will you expose quality vs speed tradeoffs?
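One common answer to the playback-rate question is to build pitch shifting from two stages: time-stretch by the pitch ratio, then resample back to the original length. The sketch below covers only the resampling stage, using linear interpolation for simplicity. The function `resample_linear` and its parameters are hypothetical, and linear interpolation is a stand-in here; a better interpolator would reduce aliasing.

```c
#include <assert.h>
#include <stddef.h>

/* Read `in` at a fixed step (in input samples per output sample) using
   linear interpolation. step > 1 shortens the signal and raises pitch on
   playback; step < 1 lengthens it and lowers pitch.
   Returns the number of samples written to `out` (at most out_cap). */
size_t resample_linear(const float *in, size_t in_len, double step,
                       float *out, size_t out_cap)
{
    assert(step > 0.0);
    size_t n = 0;
    for (;;) {
        /* Recompute the position from n to avoid accumulating error. */
        double pos = (double)n * step;
        size_t i = (size_t)pos;
        /* Stop when the output is full or a neighbor sample is missing. */
        if (n >= out_cap || i + 1 >= in_len)
            break;
        double frac = pos - (double)i;
        out[n] = (float)((1.0 - frac) * in[i] + frac * in[i + 1]);
        n++;
    }
    return n;
}
```

To raise pitch by a ratio `r` without changing duration under this scheme, you would first stretch the signal to `r` times its length and then call the resampler with `step = r`.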
Thinking Exercise
Frame Alignment
Imagine you analyze a signal with 50 percent overlap and then reconstruct it with the same hop size. If you then double the analysis hop size while keeping the synthesis hop size fixed, what happens to the output duration?
Questions while working:
- Why does changing hop size affect time scaling?
- How does pitch shifting differ from time stretching?
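One way to check your reasoning is to count frames and output samples directly. The sketch below assumes frames are read every `hop_a` samples and written every `hop_s` samples, with partial frames at the end dropped; the function names and both hop parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of full analysis frames of length win that fit in len samples
   when a new frame starts every hop_a samples. */
size_t frame_count(size_t len, size_t win, size_t hop_a)
{
    assert(win > 0 && hop_a > 0);
    return len < win ? 0 : 1 + (len - win) / hop_a;
}

/* Output length in samples when `frames` frames of length win are
   overlap-added with a new frame starting every hop_s samples. */
size_t stretched_length(size_t frames, size_t win, size_t hop_s)
{
    assert(win > 0 && hop_s > 0);
    return frames == 0 ? 0 : (frames - 1) * hop_s + win;
}
```

Try a fixed input length with `win = 1024` and `hop_s = 512`, then compare the output length for `hop_a = 512` against `hop_a = 1024`. The ratio `hop_s / hop_a` is the quantity to watch.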
The Interview Questions They’ll Ask
Prepare to answer these:
- “What is a phase vocoder?”
- “Why does phase continuity matter?”
- “How does overlap-add reconstruction work?”
- “What artifacts are common in pitch shifting?”
- “How would you evaluate audio quality objectively?”
Hints in Layers
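Hint 1: Starting Point. Start with time stretching: use a different hop size for synthesis than for analysis, so that the ratio of the two sets the stretch factor.
Hint 2: Next Level. For each frequency bin, measure the phase difference between consecutive analysis frames, then advance the synthesis phase by that amount scaled by the ratio of synthesis hop to analysis hop.
Hint 3: Technical Details. Use the same window and overlap for every frame so that overlapping frames sum smoothly and discontinuities at frame boundaries are reduced.
Hint 4: Tools/Debugging. Test with simple tones first, and verify that the pitch or duration changes by the amount you asked for.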
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Phase vocoder | “Understanding Digital Signal Processing” by Richard G. Lyons | Ch. 11 |
| Overlap-add | “The Scientist and Engineer’s Guide to DSP” by Steven W. Smith | Ch. 17 |
| Audio artifacts | “Understanding Digital Signal Processing” by Richard G. Lyons | Ch. 11 |
Implementation Hints
- Focus on clarity, not perfection, in the first version.
- Use a test tone sweep to reveal artifacts.
- Keep window and hop sizes configurable.
Learning Milestones
- First milestone: You can time-stretch a signal and the result is still recognizable.
- Second milestone: You can shift pitch by a few semitones.
- Final milestone: You can explain the artifacts and how to reduce them.