Project 10: The Search-Replace Master (Substitutions)

Perform powerful file-wide transformations using Vim’s :substitute, regex, and global commands.

Quick Reference

Attribute                         | Value
----------------------------------+-------------------------------------------
Difficulty                        | Level 4: Expert
Time Estimate                     | 6-10 hours
Main Programming Language         | Text (SQL/CSV/Code)
Alternative Programming Languages | Any
Coolness Level                    | Level 4: Power user
Business Potential                | 4: Bulk editor
Prerequisites                     | Search, basic regex, command-line mode
Key Topics                        | :substitute, regex groups, ranges, :global

1. Learning Objectives

By completing this project, you will:

  1. Use :substitute with correct ranges and flags.
  2. Build regex patterns with capture groups and backreferences.
  3. Use :global to target only matching lines.
  4. Apply substitutions safely with confirmation (c flag).
  5. Design deterministic, reversible transformations.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Ex Substitution and Ranges

Fundamentals

:substitute is Vim’s built-in search/replace command. Its general form is :[range]s/pattern/replacement/flags. The range determines where the substitution applies: current line (:s), whole file (:%s), or a selected range. Flags like g (global) and c (confirm) modify behavior. Understanding ranges is essential to avoid accidental changes.

Deep Dive into the concept

Ranges can be explicit line numbers (:1,10s), marks (:'a,'b), or patterns (:/start/,/end/). The % range means the entire file. The g flag applies the replacement to all matches on each line; without it, only the first match per line is replaced. The c flag prompts for confirmation, which is critical in risky changes. The deeper skill is combining range and flags to create a safe, precise transformation.
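
As a sketch of how these range forms look in practice, the commands below apply the same substitution over different scopes. The text old/new, the line numbers, the marks a and b, and the start/end patterns are placeholders chosen for illustration, not part of the project data.

```vim
" Current line only (no range given).
:s/old/new/

" Lines 1 through 10.
:1,10s/old/new/

" From mark a to mark b (set them first with ma and mb in Normal mode).
:'a,'bs/old/new/

" From the next line matching start to the next line matching end.
" With a comma, both searches begin at the cursor line.
:/start/,/end/s/old/new/

" With a semicolon, the search for end begins at the start match.
:/start/;/end/s/old/new/

" Every line in the file; g replaces all matches per line, c asks first.
:%s/old/new/gc
```

These are meant to be typed one at a time at the : prompt; each one is independent of the others.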

How this fits on projects

  • You will apply substitutions over the whole file and within ranges.
  • You will use confirmation for safety.

Definitions & key terms

  • Range: The set of lines targeted by the command.
  • Flags: Modifiers like g, c, i.
  • Substitution: Replacement of matched text.

Mental model diagram (ASCII)

[range] -> [pattern] -> [replacement] -> [flags]

Substitution parts flow

How it works (step-by-step)

  1. Define the range (current line, visual, entire file).
  2. Write the pattern.
  3. Write the replacement.
  4. Choose flags (g, c, i).
  5. Execute and review.

Minimal concrete example

:%s/old/new/gc

Common misconceptions

  • “% is optional”: Without it, only the current line changes.
  • “g means global file”: It means all matches on each line in the range, not the whole file.
  • “c is slow”: It takes longer, but it prevents mistakes.
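
The flag behavior can be checked on a one-line scratch buffer. The sketch below assumes the 'gdefault' and 'ignorecase' options are off (their defaults) and that the buffer holds the made-up sample line shown in the first comment. The commands are meant to be typed one at a time.

```vim
" Sample line:  foo Foo foo

" No flags: only the first match on the line changes.
:s/foo/bar/
" Result:       bar Foo foo

" Undo, then add g: every case-sensitive match on the line changes.
:undo
:s/foo/bar/g
" Result:       bar Foo bar

" Undo, then add i as well: matching ignores case.
:undo
:s/foo/bar/gi
" Result:       bar bar bar
```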

Check-your-understanding questions

  1. What does :%s do?
  2. What does the g flag do?
  3. When would you use c?

Check-your-understanding answers

  1. Substitutes across the entire file.
  2. Replaces all matches on each line.
  3. When you want confirmation before replacing.

Real-world applications

  • Renaming identifiers across a codebase.
  • Reformatting CSV or SQL statements.

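Where you’ll apply it

  • See Section 3.3 Non-Functional Requirements and Section 5.10 Implementation Phases in this project.

References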

  • :help :substitute
  • “Practical Vim” Ch. 14

Key insights

Range selection is what makes substitution safe.

Summary

Substitution is powerful but dangerous without the correct range and flags. The range is your safety guardrail.

Homework/Exercises to practice the concept

  1. Replace the first occurrence of “foo” on each line.
  2. Replace all occurrences of “foo” with confirmation.

Solutions to the homework/exercises

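  1. :%s/foo/bar/ (no g flag, so only the first match on each line changes; :s without % would touch only the current line)
  2. :%s/foo/bar/gc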

2.2 Regex Groups and Backreferences

Fundamentals

Regex groups let you capture parts of a match and reuse them in the replacement. Vim uses \( and \) for groups in “magic” mode, or () in very magic mode (\v). Backreferences like \1 and \2 refer to captured groups. This enables powerful transformations such as reordering names or swapping fields.

Deep Dive into the concept

Vim regex has multiple magic modes. The easiest for complex patterns is very magic (\v), which reduces escaping. In this mode, () creates capture groups, and \1 refers to the first group. This lets you transform data such as Last, First into First Last in a single command. The deeper skill is testing patterns with / before running the substitution.
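
To make the difference in escaping concrete, here is the same name swap written in both modes. This is a sketch that assumes the default 'magic' option is on. Run one form or the other, not both, since the second would find nothing left to match.

```vim
" Default (magic) mode: grouping parentheses and + need backslashes.
:%s/\(\w\+\), \(\w\+\)/\2 \1/

" Very magic mode: \v at the start lets ( ) and + work unescaped.
:%s/\v(\w+), (\w+)/\2 \1/

" Either form turns   Doe, John   into   John Doe
```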

How this fits on projects

  • You will use capture groups to reorder fields.
  • You will test regex patterns before applying them globally.

Definitions & key terms

  • Capture group: A subpattern you can reference later.
  • Backreference: A reference to a captured group (e.g., \1).
  • Very magic (\v): A regex mode with fewer escapes.

Mental model diagram (ASCII)

(Last), (First) -> \2 \1

Capture swap flow

How it works (step-by-step)

  1. Write a pattern that captures the parts you need.
  2. Use backreferences in the replacement.
  3. Test with /pattern.
  4. Apply with :%s and flags.
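
Steps 3 and 4 can be combined so the tested pattern is never retyped. The sketch below is meant to be typed interactively and relies on :s reusing the last search pattern when its own pattern is left empty.

```vim
" Step 3: search first. With :set hlsearch, every match is highlighted.
/\v(\w+), (\w+)

" Step 4: the empty pattern reuses the search above.
" \1 and \2 still refer to the groups in that pattern.
:%s//\2 \1/
```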

Minimal concrete example

:%s/\v(\w+), (\w+)/\2 \1/

Common misconceptions

  • “Regex is too complex”: Simple capture groups cover most needs.
  • “Backreferences are optional”: They are required for reordering.
  • “Magic modes are confusing”: \v simplifies patterns.

Check-your-understanding questions

  1. What does \1 refer to?
  2. Why use \v?
  3. How do you test a regex before replacement?

Check-your-understanding answers

  1. The first capture group.
  2. It reduces escaping in the pattern.
  3. Use /pattern.

Real-world applications

  • Reformatting names or dates.
  • Swapping fields in CSV.
  • Normalizing log formats.

Where you’ll apply it

  • See Section 3.7 Real World Outcome and Section 5.8 Hints in Layers in this project.
  • Also used in: Project 5: Log Parser.

References

  • :help pattern
  • “Practical Vim” Ch. 12

Key insights

Capture groups let you reshape text without retyping.

Summary

Regex groups and backreferences unlock transformations that would be tedious manually. Test before running globally.

Homework/Exercises to practice the concept

  1. Swap first:last into last:first.
  2. Change YYYY-MM-DD into DD/MM/YYYY.

Solutions to the homework/exercises

  1. :%s/\v(\w+):(\w+)/\2:\1/
  2. :%s/\v(\d{4})-(\d{2})-(\d{2})/\3\/\2\/\1/
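
The date rewrite needs escaped slashes only because / is also the delimiter. As an alternative sketch, typed at the : prompt, another delimiter such as # removes that need. The sample date in the comment is made up.

```vim
" Same date rewrite, using # as the delimiter so the slashes
" in the replacement need no backslashes.
:%s#\v(\d{4})-(\d{2})-(\d{2})#\3/\2/\1#

" 2024-03-15  becomes  15/03/2024
```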

2.3 Global Command and Targeted Changes

Fundamentals

:global applies a command to lines that match a pattern. Combined with substitution, it allows you to target only specific lines, such as those containing ERROR. This reduces the risk of unintended changes. The syntax is :g/pattern/command.

Deep Dive into the concept

You can use :g to apply :s on matching lines, or :v (inverse global) to apply commands to non-matching lines. This is powerful when dealing with mixed data, because you can isolate the subset you want. The deeper idea is to build deterministic pipelines: select lines with :g, then apply a safe substitution with confirmation.
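
A short sketch of both directions, using the ERROR example from this section. The "# " prefix is only an illustrative edit for the non-matching lines. The last two commands are equivalent, so run only one of them.

```vim
" Apply the substitution only on lines that contain ERROR, confirming each.
:g/ERROR/s/timeout/deadline/gc

" :v is the inverse: act on lines that do NOT contain ERROR.
" Here every such line gets "# " added at the start.
:v/ERROR/s/^/# /

" :g! is another spelling of :v.
:g!/ERROR/s/^/# /
```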

How this fits on projects

  • You will use :g to target a subset of lines.
  • You will combine :g with :s for controlled edits.

Definitions & key terms

  • Global command (:g): Apply a command to matching lines.
  • Inverse global (:v): Apply command to non-matching lines.
  • Targeted edit: Change that applies only to a subset.

Mental model diagram (ASCII)

Match lines -> apply command
Non-match lines -> skip

Global command flow

How it works (step-by-step)

  1. Identify the pattern that defines your target lines.
  2. Use :g/pattern/ to select them.
  3. Apply :s inside the global command.
  4. Confirm results.
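
The steps above can be sketched as a preview, an edit, and a second preview. The ERROR/timeout/deadline names come from the example in this section; the :# command is used here only to print matching lines with their line numbers.

```vim
" Steps 1-2: list the target lines before changing anything.
:g/ERROR/#

" Step 3: apply the substitution only on those lines.
:g/ERROR/s/timeout/deadline/g

" Step 4: list them again to confirm the result.
:g/ERROR/#
```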

Minimal concrete example

:g/ERROR/s/timeout/deadline/g

Common misconceptions

  • “Global replaces everything”: It only applies to matching lines.
  • “You cannot combine commands”: You can chain :g and :s.
  • “Confirmation is not needed”: It prevents mistakes.

Check-your-understanding questions

  1. What does :g/ERROR/ do?
  2. How do you apply a substitution only to matching lines?
  3. What does :v do?

Check-your-understanding answers

  1. Selects lines containing ERROR for the command that follows. With no command given, Vim prints them (the default command is :p).
  2. Use :g/pattern/s/old/new/.
  3. Applies the command to non-matching lines.

Real-world applications

  • Fixing only error lines in logs.
  • Updating only deprecated API calls in code.

Where you’ll apply it

  • See Section 3.2 Functional Requirements and Section 5.10 Implementation Phases in this project.

References

  • :help :global
  • “Practical Vim” Ch. 15

Key insights

Global commands let you edit only what you intend to change.

Summary

:global is a selector. It lets you target specific lines and apply safe substitutions.

Homework/Exercises to practice the concept

  1. Replace TODO with DONE only on lines containing #.
  2. Delete empty lines using :g.

Solutions to the homework/exercises

  1. :g/#/s/TODO/DONE/ (add the g flag to replace every TODO on a matching line)
  2. :g/^$/d

3. Project Specification

3.1 What You Will Build

You will transform a file of structured text (CSV, SQL, or code) using Vim substitutions and global commands. The deliverable is a before/after file plus a command log.

3.2 Functional Requirements

  1. Input File: Create a file with at least 50 lines.
  2. Substitutions: Perform at least three different substitutions.
  3. Capture Groups: Use at least one substitution with backreferences.
  4. Global Command: Use :g to target specific lines.
  5. Command Log: Record every command used.

3.3 Non-Functional Requirements

  • Safety: Use c flag at least once.
  • Determinism: Replaying the logged commands on the original input always produces the same output.
  • Clarity: Log explains intent of each substitution.

3.4 Example Usage / Output

Before:
Doe, John

Command:
:%s/\v(\w+), (\w+)/\2 \1/

After:
John Doe

3.5 Data Formats / Schemas / Protocols

Any structured text file. Example CSV:

last,first
Doe,John
Smith,Jane
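
For a file shaped like this CSV, one possible column swap is sketched below. It assumes exactly two word-like fields per line and a header on line 1. Lines that do not fit this shape (for example, a missing field) are left unchanged.

```vim
" Swap the two columns on every line except the header (line 1).
" \s* tolerates optional spaces after the comma.
:2,$s/\v^(\w+),\s*(\w+)$/\2,\1/

" Doe,John    becomes  John,Doe
" Smith,Jane  becomes  Jane,Smith
```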

3.6 Edge Cases

  • Matches that should not be replaced.
  • Lines with missing fields.
  • Ambiguous regex matches.

3.7 Real World Outcome

You will have a transformed file with consistent formatting and a repeatable substitution log.

3.7.1 How to Run (Copy/Paste)

vim data.txt

3.7.2 Golden Path Demo (Deterministic)

  1. Run :%s/\v(\w+), (\w+)/\2 \1/.
  2. Confirm the output lines are reordered.

3.7.3 Failure Demo (Deterministic)

If you forget the c flag, you may replace matches you did not intend to change, especially in mixed-format files.
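
A small, made-up illustration of this failure: the pattern id is too broad and also matches inside another word. The sketch is meant to be typed one command at a time on a buffer containing only the sample line.

```vim
" Sample line:  valid id

" Too broad: "id" also matches inside "valid".
:%s/id/key/g
" Result:       valkey key

" Safer: \< and \> anchor the match to whole words, and c asks before
" each replacement so a surprise match can be rejected with n.
:undo
:%s/\<id\>/key/gc
" Result after answering y:  valid key
```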

3.7.4 If CLI

Not applicable.

3.7.5 If Web App

Not applicable.

3.7.6 If API

Not applicable.

3.7.7 If Library

Not applicable.

3.7.8 If TUI

+--------------------------+
| data.txt                 |
| Doe, John                |
| Smith, Jane              |
| -- NORMAL --             |
+--------------------------+

Data screen

Key interactions:

  • :%s for file-wide changes
  • c flag for confirmation
  • :g for targeted edits

4. Solution Architecture

4.1 High-Level Design

+-------------+    +-------------------+    +---------------+
| Input File  | -> | Vim Substitutions | -> | Output File   |
+-------------+    +-------------------+    +---------------+

Substitution pipeline

4.2 Key Components

Component   | Responsibility          | Key Decisions
------------+-------------------------+-----------------------
Input file  | Provide structured text | Use consistent format
Command log | Track substitutions     | Include ranges + flags
Output file | Store transformed text  | Save separately

4.3 Data Structures (No Full Code)

SubstitutionStep:
- range
- pattern
- replacement
- flags

4.4 Algorithm Overview

Key Algorithm: Safe Substitution

  1. Test the pattern with /.
  2. Run the substitution on a small range with the c flag.
  3. Apply it across the full range.
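
One way to sketch this safe workflow with the name-swap pattern is shown below. The split at line 5 is an arbitrary choice for illustration. The n flag only counts matches and changes nothing.

```vim
" Check: count matches without changing anything (n flag).
:%s/\v(\w+), (\w+)//gn

" Try: run on a small range first, confirming each match.
:1,5s/\v(\w+), (\w+)/\2 \1/c

" Apply: if the results look right, cover the rest of the file.
:6,$s/\v(\w+), (\w+)/\2 \1/

" Undo the most recent change if something looks wrong.
:undo
```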

Complexity Analysis:

  • Time: O(n) in the number of lines in the range
  • Space: O(1)

5. Implementation Guide

5.1 Development Environment Setup

vim --version

5.2 Project Structure

search-replace/
|-- data.txt
|-- data_after.txt
`-- command_log.md

5.3 The Core Question You’re Answering

“How do I perform programmatic text processing inside Vim?”

5.4 Concepts You Must Understand First

  1. Substitution ranges and flags
  2. Regex capture groups
  3. Global command

5.5 Questions to Guide Your Design

  1. Do I need a range or the whole file?
  2. Should I use \v to simplify regex?
  3. Do I need confirmation on this change?

5.6 Thinking Exercise

Write a command that removes trailing whitespace on every line.
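
One possible answer, offered as a sketch rather than the only solution:

```vim
" \s\+ matches one or more whitespace characters; $ anchors at end of line.
" The e flag suppresses the error when no line has trailing whitespace.
:%s/\s\+$//e
```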

5.7 The Interview Questions They’ll Ask

  1. “What does % mean in :%s?”
  2. “How do you make a search case-insensitive?”
  3. “How do you reuse the whole match in a replacement?”

5.8 Hints in Layers

Hint 1 (Very magic): Use \v to reduce escaping.

Hint 2 (Confirmation): Add c to confirm each match.

Hint 3 (Reuse match): Use & to reuse the entire match in replacement.

Hint 4 (Range): Use :'<,'> to limit to a visual selection.
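
The hints can be tried with commands like the following. The ERROR and old/new text is illustrative only. The last command assumes a Visual selection was made first, so it is meant for interactive use.

```vim
" Hint 3: & in the replacement stands for the whole matched text.
" Wraps every ERROR in brackets:  ERROR  becomes  [ERROR]
:%s/ERROR/[&]/g

" Case-insensitive matching: \c inside the pattern, or the i flag.
:%s/\cerror/ERROR/g
:%s/error/ERROR/gi

" Hint 4: select lines in Visual mode, then press : and Vim inserts '<,'>
:'<,'>s/old/new/g
```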

5.9 Books That Will Help

Topic           | Book            | Chapter
----------------+-----------------+--------
Substitution    | “Practical Vim” | Ch. 14
Regex           | “Practical Vim” | Ch. 12
Global commands | “Practical Vim” | Ch. 15

5.10 Implementation Phases

Phase 1: Prepare Data (1-2 hours)

Goals:

  • Create a file with structured text.

Tasks:

  1. Create data.txt with 50 lines.
  2. Include a variety of patterns.

Checkpoint: Data file has mixed cases and patterns.

Phase 2: Build Substitutions (2-4 hours)

Goals:

  • Create three substitutions.

Tasks:

  1. Reorder names with a capture group.
  2. Change a prefix only on matching lines using :g.
  3. Remove trailing whitespace with a range.

Checkpoint: Output is correct and repeatable.

Phase 3: Validate and Document (1-2 hours)

Goals:

  • Confirm correctness and document commands.

Tasks:

  1. Review the output file.
  2. Document each substitution in the log.

Checkpoint: Another person can reproduce the edits.
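
One way to make the edits reproducible is to keep the logged commands in a file that Vim can replay. The sketch below assumes a hypothetical file named commands.vim next to data.txt. The three substitutions are illustrative stand-ins for whatever the log actually contains. The c flag is left out because confirmation prompts are interactive.

```vim
" commands.vim (hypothetical name): open data.txt, then run
"   :source commands.vim
" Leading colons are omitted, as is usual in script files.

" 1. Reorder "Last, First" into "First Last".
%s/\v(\w+), (\w+)/\2 \1/e

" 2. On lines containing ERROR only, rename timeout to deadline.
g/ERROR/s/timeout/deadline/ge

" 3. Strip trailing whitespace from every line.
%s/\s\+$//e

" Save the result under a new name, leaving data.txt untouched on disk.
write! data_after.txt
```

The e flag keeps the replay from stopping with an error when a pattern has no match.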

5.11 Key Implementation Decisions

Decision   | Options              | Recommendation       | Rationale
-----------+----------------------+----------------------+-----------------
Regex mode | Default / very magic | Very magic           | Simpler patterns
Safety     | Without c / with c   | Use c on risky edits | Prevent mistakes
Scope      | Entire file / range  | Range when possible  | Safer

6. Testing Strategy

6.1 Test Categories

Category     | Purpose                  | Examples
-------------+--------------------------+----------------------------
Range tests  | Ensure correct scope     | Replace only selected lines
Regex tests  | Confirm pattern accuracy | /pattern before :s
Safety tests | Confirm c works          | Reject one replacement

6.2 Critical Test Cases

  1. Capture group reorder works correctly.
  2. :g affects only matching lines.
  3. Substitution without g changes only first match per line.

6.3 Test Data

Doe, John
Smith, Jane

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall         | Symptom                   | Solution
----------------+---------------------------+---------------------------------------
Wrong range     | Too many lines changed    | Use explicit range or visual selection
Regex too broad | Unexpected replacements   | Add boundaries or use c flag
Missing g flag  | Only first match replaced | Add g when needed

7.2 Debugging Strategies

  • Test patterns with / before :s.
  • Use c flag to review each replacement.

7.3 Performance Traps

  • Running substitutions repeatedly without confirmation.
  • Using complex regex when a simple pattern works.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Replace trailing whitespace across a file.
  • Use & to reuse the match in replacements.

8.2 Intermediate Extensions

  • Use :g with :s for targeted changes.
  • Swap first/last names across a file.

8.3 Advanced Extensions

  • Apply multiple substitutions via a macro or :argdo.
  • Use :global with complex regex patterns.
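
For the :argdo extension, a minimal sketch might look like this. The *.csv glob is an assumption about how the files are named, and the name-swap pattern is reused from earlier sections.

```vim
" Load the files to process into the argument list.
:args *.csv

" Run the substitution in each file. The e flag skips the error for
" files with no match; update writes a file only if it was changed.
:argdo %s/\v(\w+), (\w+)/\2 \1/ge | update
```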

9. Real-World Connections

9.1 Industry Applications

  • Bulk refactoring in codebases.
  • Cleaning data exports and logs.

9.2 Related Tools and Documentation

  • Vim regex docs and examples.
  • Text processing tools like sed and awk.

9.3 Interview Relevance

  • Demonstrates ability to apply programmatic text transformations.

10. Resources

10.1 Essential Reading

  • “Practical Vim” Ch. 12, 14, 15

10.2 Video Resources

  • Vim substitution tutorials

10.3 Tools & Documentation

  • :help :substitute
  • :help pattern

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain range selection in :s.
  • I can use capture groups and backreferences.
  • I can apply :global confidently.

11.2 Implementation

  • Completed three substitutions with logs.
  • Used confirmation for at least one change.
  • Output file is correct and repeatable.

11.3 Growth

  • I can apply these techniques to real data.
  • I can teach substitution syntax to someone else.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Performed three substitutions with logs.
  • Used one capture group replacement.
  • Used :g for targeted changes.

Full Completion:

  • All minimum criteria plus:
  • Used confirmation on at least one risky change.
  • Demonstrated a safe range-limited substitution.

Excellence (Going Above & Beyond):

  • Applied the workflow to a real dataset or code file.
  • Documented a reusable substitution “playbook”.