Project 1: The Config File Updater

Build a safe, portable sed-based CLI that updates exactly one configuration key without touching anything else.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Level 1: Beginner |
| Time Estimate | 6-10 hours |
| Main Programming Language | sed (shell wrapper for CLI) |
| Alternative Programming Languages | awk, Python, Perl |
| Coolness Level | Level 2: Practical but Forgettable |
| Business Potential | 3: The “Service & Support” Model |
| Prerequisites | Basic shell usage, simple regex, file I/O |
| Key Topics | s command, addresses, anchors, -i editing, backups |

1. Learning Objectives

By completing this project, you will:

  1. Design a single-purpose CLI that performs precise, safe configuration edits.
  2. Use sed substitution with anchored regex so only the intended line changes.
  3. Apply sed addresses to scope edits and avoid accidental global replacements.
  4. Implement portable in-place editing with backups across GNU and BSD sed.
  5. Validate changes with diffs and deterministic tests to prevent regressions.
  6. Explain how pattern space and command order affect the safety of edits.

2. All Theory Needed (Per-Concept Breakdown)

2.1 The s Command and Regex Matching

Fundamentals

The s (substitute) command is the core of sed. It finds text that matches a regular expression and replaces it with new text. The command has three parts: a pattern, a replacement, and optional flags. Even when you only plan to change a single setting like DEBUG=true, you are still using the full power of regex matching. The most important idea is that sed does not search for literal strings by default; it matches a regular expression. That means characters like . and * are special and can widen your match if you are not careful. Substitution also happens on the current pattern space, which is one line by default. This makes s fast and predictable, but it also means it only edits one line at a time unless you explicitly create a multi-line pattern space. When you control the pattern and replacement carefully, s becomes a surgical tool for safe config edits.

Deep Dive into the Concept

The s command takes the form s/pattern/replacement/flags. The first challenge is understanding what the pattern really matches. In sed, the pattern is a regular expression. In Basic Regular Expressions (BRE), some metacharacters like + and ? are not special unless escaped, while . and * are always special. That means s/true/false/ will replace the first true on each line, but s/true*/false/ will also match tru plus any number of e characters because * applies to the preceding token. The second challenge is replacement syntax. In the replacement, & means “the entire matched text” and \1, \2, etc. refer to capture groups from the pattern. That matters even for a simple config update, because you may want to preserve the key and only replace the value, e.g., s/^DEBUG=.*/DEBUG=false/ or s/^(DEBUG=).*/\1false/ (with ERE). The replacement is not a regex, but it still has special characters like & and \ that need escaping.
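The replacement rules above can be tried on a single line. This is a minimal sketch, assuming a sed that supports -E (GNU sed and recent BSD sed both do); the sample line and the variable names are illustrative only.

```shell
# Hypothetical sample line for illustration.
line='DEBUG=true'

# Restate the key in the replacement and overwrite the rest of the line.
whole=$(printf '%s\n' "$line" | sed 's/^DEBUG=.*/DEBUG=false/')

# ERE (-E): capture the key once and reuse it as \1.
grouped=$(printf '%s\n' "$line" | sed -E 's/^(DEBUG=).*/\1false/')

# & expands to the entire matched text, here the word "true".
amp=$(printf '%s\n' "$line" | sed 's/true/[&]/')

# A backslash makes & literal in the replacement.
literal=$(printf '%s\n' "$line" | sed 's/true/\&/')

printf '%s\n' "$whole" "$grouped" "$amp" "$literal"
```

The first two commands produce the same line; the capture-group form simply avoids typing the key twice.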

Substitution is also influenced by flags. The most common is g (global), which repeats the substitution for all non-overlapping matches on the line. Without g, only the first match per line is replaced. The p flag prints lines where a substitution occurred (useful with -n), and the i flag enables case-insensitive matching in GNU sed (not POSIX). When doing config edits, you usually do not want g unless the line contains multiple values you intend to change. For example, if a line is PATH=/bin:/usr/bin, and you meant to replace /bin with /sbin, using g might change both parts and surprise you. This is why understanding flags matters even in small tasks.

Another subtlety is delimiter choice. The slash / is the default delimiter, but you can choose any delimiter that does not appear in the pattern or replacement. For paths or URLs, switching to | or # avoids heavy escaping and makes commands readable: s#^ROOT=/var/www#ROOT=/srv/www#. This matters in scripts because readability reduces mistakes and makes review easier.
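A short sketch of the g flag and of delimiter choice, using the PATH line from the flags discussion. The variable names are made up for the example.

```shell
# Sample line from the discussion of the g flag.
line='PATH=/bin:/usr/bin'

# Without g: only the first /bin on the line is replaced.
first=$(printf '%s\n' "$line" | sed 's#/bin#/sbin#')

# With g: the /bin inside /usr/bin is replaced as well.
all=$(printf '%s\n' "$line" | sed 's#/bin#/sbin#g')

# Same edit as "first" with the default delimiter: every / must be escaped.
slashes=$(printf '%s\n' "$line" | sed 's/\/bin/\/sbin/')

printf '%s\n' "$first" "$all" "$slashes"
```

The # delimiter keeps the path readable, and the comparison between the first two results shows why g is rarely what a config edit wants.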

Finally, substitution interacts with the execution cycle. sed reads a line into the pattern space, executes commands, and then prints by default. If you run multiple s commands in a row, each one sees the output of the previous. That means command order matters: if you first normalize whitespace and then perform a key substitution, you may end up matching a different line than if you substitute first. In configuration files, order can prevent bugs. A safe approach is to target the line by address first (e.g., /^DEBUG=/) and then apply a single s inside that address. This keeps the scope tight and reduces unintended edits.
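To see why command order matters, the sketch below runs the same three commands in two different orders on a deliberately untidy line. The input line is an assumption chosen for the demonstration.

```shell
# Hypothetical untidy line: leading spaces and spaces around "=".
line='  DEBUG = true'

# Order A: normalize whitespace first, then update the key.
a=$(printf '%s\n' "$line" | sed \
  -e 's/^[[:space:]]*//' \
  -e 's/[[:space:]]*=[[:space:]]*/=/' \
  -e '/^DEBUG=/s/=.*/=false/')

# Order B: try to update the key first. The address /^DEBUG=/ does not match
# the untidy line, so the value is never changed; only whitespace is cleaned.
b=$(printf '%s\n' "$line" | sed \
  -e '/^DEBUG=/s/=.*/=false/' \
  -e 's/^[[:space:]]*//' \
  -e 's/[[:space:]]*=[[:space:]]*/=/')

printf 'A: %s\nB: %s\n' "$a" "$b"
```

Each command sees the pattern space as the previous command left it, which is why order A updates the value and order B does not.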

Another subtle point is match priority. sed uses leftmost-longest matching, so if your pattern can match in multiple ways, the engine chooses the leftmost position and then the longest match at that position. This matters when you use .* near the end of a pattern because it will happily consume more than you expect. A safer pattern uses explicit character classes and anchors to avoid ambiguity. When you build a config updater, you want the smallest valid match, not the largest. Thinking about match priority prevents accidental edits on lines that only partially resemble your key.
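A small sketch of leftmost-longest matching, assuming a quoted value followed by a comment that also contains quotes. The sample line is invented for the example.

```shell
# Hypothetical line with a quoted value and quotes inside the comment.
line='NAME="a" # "quoted" note'

# ".*" starts at the first quote and extends to the LAST quote on the line.
greedy=$(printf '%s\n' "$line" | sed 's/".*"/"b"/')

# "[^"]*" cannot cross a quote, so it stops at the end of the value.
tight=$(printf '%s\n' "$line" | sed 's/"[^"]*"/"b"/')

printf '%s\n' "$greedy" "$tight"
```

The bracket expression gives the small, predictable match that a config updater needs; the .* version silently eats part of the comment.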

Also remember that sed matches are not overlapping. If you rely on the g flag, sed resumes searching after the end of the previous match, which can skip nested patterns. For config edits, this is usually fine, but it reinforces the idea that you should design patterns for one safe match rather than many ambiguous ones.

How This Fits on Projects

This project uses the s command to replace a value while preserving the key. You will use the basic syntax, choose safe delimiters, and avoid global replacements. In later projects, you will reuse substitution with capture groups and multiple commands to reformat log lines and generate HTML.

Definitions & Key Terms

  • Substitution (s): A sed command that replaces matched text with a replacement.
  • Pattern: The regex used to find text to replace.
  • Replacement: The text inserted in place of the match.
  • Flags: Optional modifiers like g, p, and i.
  • Delimiter: The character that separates s command parts.
  • Capture group: A parenthesized subpattern used for backreferences.

Mental Model Diagram (ASCII)

Input line -> [pattern space] -> s/pattern/replacement/ -> output line

Example:
"DEBUG=true" -> match /^DEBUG=.*/ -> replace -> "DEBUG=false"

How It Works (Step-by-Step)

  1. sed reads a line into the pattern space.
  2. The s command tests the line against the regex pattern.
  3. If it matches, sed replaces the matched portion with the replacement text.
  4. If g is set, sed continues searching for more matches on the same line.
  5. The modified pattern space is printed (unless suppressed) and the cycle repeats.

Minimal Concrete Example

# Replace DEBUG value on the matching line only
sed '/^DEBUG=/s/=true/=false/' app.conf

Common Misconceptions

  • Misconception: s/true/false/ only affects the DEBUG line.
    • Correction: It affects the first true on every line.
  • Misconception: The replacement part is also a regex.
    • Correction: Replacement is literal text with & and backreferences as special cases.
  • Misconception: g is always safer.
    • Correction: g can over-edit lines with multiple matches.

Check-Your-Understanding Questions

  1. What does s/true/false/ change if a line is ENABLE=true; DEBUG=true?
  2. Why might you choose s#pattern#replacement# instead of using /?
  3. What does the g flag do, and why can it be dangerous in config files?
  4. How does sed treat & in the replacement?
  5. If you run two s commands in sequence, which version of the line does the second command see?

Check-Your-Understanding Answers

  1. It changes only the first true on that line, leaving the second untouched.
  2. It avoids escaping / characters in paths and makes the command easier to read.
  3. g replaces all matches on a line, which can unintentionally alter multiple values.
  4. & expands to the entire matched text from the pattern.
  5. The second command sees the line after the first substitution has been applied.

Real-World Applications

  • Toggling feature flags in .env files during deployments.
  • Updating service endpoints in .conf files during migrations.
  • Sanitizing or reformatting configuration templates in CI pipelines.

Where You’ll Apply It

References

  • “sed & awk” (Dougherty, Robbins) – Chapters 2-3
  • POSIX sed specification – substitution command
  • GNU sed manual – s command flags

Key Insight

A safe configuration edit is a precise regex plus a minimal replacement applied only where intended.

Summary

The s command is powerful because it is generic. Your job is to make it precise with careful patterns, safe delimiters, and narrow scope.

Homework/Exercises to Practice the Concept

  1. Replace PORT=8080 with PORT=9090 using a single s command.
  2. Replace only the first occurrence of true on each line in a sample file.
  3. Use a non-slash delimiter to replace /var/www with /srv/www.

Solutions to the Homework/Exercises

  1. sed 's/^PORT=8080/PORT=9090/' app.conf
  2. sed 's/true/false/' file.txt
  3. sed 's#/var/www#/srv/www#' paths.conf

2.2 Addressing and Anchors for Precision

Fundamentals

sed addresses decide where a command applies. Without an address, a command runs on every line. With an address, you can target only specific lines by number or by pattern. The most reliable way to target a config setting is to match the line that starts with a key name. Regex anchors are essential: ^ matches the start of a line and $ matches the end. Anchors prevent partial matches, such as changing DEBUG=true inside a comment or a description. Addressing also supports negation with !, which lets you invert a match and operate on all lines except the target. In a config updater, addressing ensures your s command does not accidentally modify unrelated lines.

Deep Dive into the Concept

Addressing is the difference between a safe edit and a dangerous one. A pattern address like /^DEBUG=/ is evaluated for every input line, and only lines that match will have the associated command executed. This may sound simple, but there are subtle issues. First, regexes are unanchored by default: a pattern like /DEBUG/ matches any line where DEBUG appears anywhere, including comments or other keys like ENABLE_DEBUGGING. Anchors (^ and $) convert that into a positional match. In configuration files, the “key” is almost always at the beginning of the line, so ^KEY= is the safest anchor. If your file allows whitespace, you might need ^[[:space:]]*KEY= to ignore leading spaces.
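The difference between these addresses is easy to measure by counting the lines each one selects. This is a sketch on a made-up file; the file contents and variable names are assumptions for illustration.

```shell
# Throwaway sample file with a comment, a look-alike key, and an indented key.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# DEBUG=true enables verbose logs
ENABLE_DEBUGGING=true
DEBUG=true
  DEBUG=true
EOF

# -n with p prints only the lines an address selects; wc -l counts them.
loose=$(sed -n '/DEBUG/p' "$conf" | wc -l | tr -d ' ')
anchored=$(sed -n '/^DEBUG=/p' "$conf" | wc -l | tr -d ' ')
indented=$(sed -n '/^[[:space:]]*DEBUG=/p' "$conf" | wc -l | tr -d ' ')
rm -f "$conf"

printf 'loose=%s anchored=%s indented=%s\n' "$loose" "$anchored" "$indented"
```

The loose address selects all four lines, the anchored one selects only the real key, and the whitespace-tolerant one also accepts the indented copy.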

Line-number addresses are fragile. They are fine for fixed-format files, but in real configs, line numbers change frequently. A script that depends on line 12 being DEBUG will break silently the moment someone inserts a new comment line. Pattern addresses are resilient because they bind to meaning, not to position. This is one of the core reasons sed is useful for automation.

Address ranges (start,end) have internal state: once the start address matches, the range remains active until the end address matches. In this project you do not need ranges, but understanding them helps you avoid accidental matches if you ever expand the updater to handle sections (like [server] ... [/server]). Blocks deserve the same care. In ADDR{ /DEBUG/s/true/false/; /PORT/s/8080/9090/; }, every line selected by ADDR is tested against both inner addresses, so the unanchored patterns /DEBUG/ and /PORT/ can still match a comment or a similarly named key among the selected lines.

Another critical detail is negation with !. The syntax /pattern/!command or address!command applies the command to all lines not matching the address. This is useful for filters (keep only matching lines) but can be hazardous in config edits because it changes almost everything. In an updater, you want positive, narrow addresses, not negation. The presence of ! in a script should trigger caution in code review because it can invert your intention.
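A sketch of how a single ! inverts an edit, using three invented keys that all hold the value true.

```shell
# Hypothetical three-line config.
input=$(printf 'DEBUG=true\nCACHE=true\nVERBOSE=true')

# Positive address: only the DEBUG line is edited.
positive=$(printf '%s\n' "$input" | sed '/^DEBUG=/s/true/false/')

# Negated address: every line EXCEPT the DEBUG line is edited.
negated=$(printf '%s\n' "$input" | sed '/^DEBUG=/!s/true/false/')

printf 'positive:\n%s\nnegated:\n%s\n' "$positive" "$negated"
```

One extra character flips the result from one changed line to every other line changed, which is why negation deserves scrutiny in review.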

Finally, multi-line pattern spaces break naive anchors. If you use N to append the next line, ^ and $ match the beginning and end of the entire pattern space, not each line. That means an anchored address can stop working once you enter multi-line logic. This project remains single-line, which is an advantage: your anchors are reliable and behave exactly as you expect.
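The effect of N on anchors can be observed with two lines of input. This is a sketch with an invented two-line config; it relies on the default behavior where ^ matches only at the start of the pattern space.

```shell
# Hypothetical two-line input; the target key is on the second line.
input=$(printf 'SERVER_PORT=8080\nDEBUG=true')

# Single-line cycle: ^ matches at the start of the DEBUG line.
single=$(printf '%s\n' "$input" | sed '/^DEBUG=/s/true/false/')

# After N the pattern space holds both lines joined by a newline, so ^ refers
# to the start of "SERVER_PORT=..." and the address no longer matches.
joined=$(printf '%s\n' "$input" | sed -e 'N' -e '/^DEBUG=/s/true/false/')

printf 'single:\n%s\njoined:\n%s\n' "$single" "$joined"
```

With N in place the output is identical to the input, which is the silent failure the paragraph above warns about.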

Addresses are evaluated per command, not per script. If you have three commands, each with its own address, each command independently decides whether to run. This means you can mix a broad address for one command and a narrow address for another, as long as you understand the order. Also note that address matching happens against the current pattern space, so if you modify the line earlier in the script, later address checks can change their behavior. This is one more reason to keep your updater script simple and to avoid cascading substitutions.

If you need to match lines that may include inline comments, replace only the value and stop before the comment: s/^(DEBUG=)[^[:space:]#]*/\1false/ (ERE). This assumes the value itself contains no whitespace or #; everything after the value, including the comment, stays in place. That kind of pattern shows how addressing and substitution combine to keep non-target text intact.
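One way to update a value while keeping a trailing inline comment is to match only up to the first whitespace or # character. The sketch assumes values never contain spaces or #, and that sed supports -E; the sample lines are invented.

```shell
# Hypothetical lines: one with an inline comment, one without.
with_comment='DEBUG=true # enable verbose logs'
plain='DEBUG=true'

# Replace only the run of characters after "=" that are not whitespace or "#".
expr='s/^(DEBUG=)[^[:space:]#]*/\1false/'
out1=$(printf '%s\n' "$with_comment" | sed -E "$expr")
out2=$(printf '%s\n' "$plain" | sed -E "$expr")

printf '%s\n' "$out1" "$out2"
```

Because the match ends where the value ends, the spacing and the comment after it are carried over untouched.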

How This Fits on Projects

Addressing is how you guarantee the updater changes only the intended key. You will use pattern addresses in the core substitution logic. In Project 4, you will extend addressing to ranges and multi-line blocks; in Project 2, you will use addresses to select only certain log levels.

Definitions & Key Terms

  • Address: A selector that determines whether a sed command runs on a given line.
  • Anchor: A regex position marker (^ or $) used to enforce line boundaries.
  • Range: A start and end address that select blocks of lines.
  • Negation (!): Inverts an address selection.
  • Character class: A bracket expression like [[:space:]] used for portable whitespace matching.

Mental Model Diagram (ASCII)

Line -> address? -> command executes? -> output

Address check:
^KEY= matches only lines that START with KEY=

How It Works (Step-by-Step)

  1. sed loads a line into the pattern space.
  2. It evaluates the address associated with a command.
  3. If the address matches, the command runs on the pattern space.
  4. If it does not match, the command is skipped for that line.
  5. The cycle continues with the next command and line.

Minimal Concrete Example

# Only change the DEBUG line, ignore everything else
sed '/^DEBUG=/s/true/false/' app.conf

Common Misconceptions

  • Misconception: /DEBUG/ is specific enough.
    • Correction: It can match comments and other keys; use ^DEBUG=.
  • Misconception: Line numbers are reliable in configs.
    • Correction: Config files change frequently; pattern addresses are safer.
  • Misconception: ! is a convenient shortcut.
    • Correction: Negation can accidentally modify nearly every line.

Check-Your-Understanding Questions

  1. Why is ^DEBUG= safer than /DEBUG/?
  2. How would you match DEBUG= with optional leading spaces?
  3. What does 3,5s/a/b/ do if line 4 does not contain a?
  4. When should you avoid using ! in an updater script?

Check-Your-Understanding Answers

  1. It anchors the match to the start of the line, preventing partial matches in comments.
  2. Use ^[[:space:]]*DEBUG= to allow leading spaces.
  3. It still applies only to lines 3-5; on line 4 it runs but makes no change.
  4. When you need to limit changes to a single line; negation is too broad.

Real-World Applications

  • Updating key/value pairs in .env files during deployments.
  • Selecting specific sections in INI or systemd unit files.
  • Filtering log lines by level or tag before transformation.

Where You’ll Apply It

References

  • “sed & awk” (Dougherty, Robbins) – Addressing chapter
  • POSIX sed specification – address forms and ranges
  • The Grymoire SED tutorial – addressing and ranges

Key Insight

The safest sed command is a correct address plus a minimal change.

Summary

Addressing turns a blunt substitution into a precise operation by deciding exactly which lines are eligible for change.

Homework/Exercises to Practice the Concept

  1. Delete the last line of a file, but only if it is blank.
  2. Replace HOST=... only if the line starts with HOST= and not LOCALHOST=.
  3. Print only lines 10 through 15 in a file.

Solutions to the Homework/Exercises

  1. sed '${/^$/d;}' file.txt
  2. sed '/^HOST=/s/=.*/=127.0.0.1/' app.conf
  3. sed -n '10,15p' file.txt

2.3 In-Place Editing, Backups, and Portability

Fundamentals

In-place editing (-i) is what turns sed from a preview tool into a real file editor. With -i, sed writes changes directly back to the file instead of printing to standard output. This is powerful and risky. Different sed implementations handle -i differently: GNU sed allows -i without a suffix, while BSD sed (macOS) requires an explicit empty string or suffix. Backups are critical because a single wrong regex can permanently corrupt a file. The safest workflow is to preview output without -i, then run -i.bak so you can revert if necessary. In-place editing is also not guaranteed to be atomic; understanding how temporary files are used matters when editing critical configurations.

Deep Dive into the Concept

When you pass -i to sed, the implementation usually creates a temporary file, writes the transformed output to it, and then replaces the original file. GNU sed documents this behavior, and BSD sed does something similar. This means the edit is not purely “in-place”; it is a rewrite. The rename step is typically atomic on the same filesystem, but it can still fail if permissions or disk space are insufficient. For critical configs, you should check for errors and ensure you have a backup. If your script runs on multiple systems, you must also handle the fact that -i behaves differently across GNU and BSD. On macOS, sed -i requires a suffix argument: -i '' means no backup, -i '.bak' writes a backup. If you forget the empty string, BSD sed treats the next token as the suffix and your script breaks.

Portability is a key design constraint for any CLI tool that will run on developer laptops and CI servers. One approach is to detect the platform and use the correct -i variant. Another is to avoid -i entirely and manage temporary files yourself (e.g., sed ... file > file.tmp && mv file.tmp file). This gives you explicit control over backups and atomic moves, and it avoids sed dialect issues. The trade-off is that you now own the temp file logic and must handle errors carefully. This is often the safer choice for scripts that must be portable and auditable.
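The temp-file approach described above might look like the following sketch. The file names and the working directory are assumptions for the demo; cp -p is used for the backup so that mode and timestamps are kept.

```shell
# Throwaway working directory and sample config.
workdir=$(mktemp -d)
conf="$workdir/app.conf"
printf 'SERVER_PORT=8080\nDEBUG=true\n' > "$conf"

# Temp file in the same directory, so mv is a rename on one filesystem.
tmp="$conf.tmp.$$"

if sed '/^DEBUG=/s/=.*/=false/' "$conf" > "$tmp"; then
  # Back up first, then move the new content into place.
  cp -p "$conf" "$conf.bak" && mv "$tmp" "$conf"
else
  # On failure, remove the partial output and leave the original alone.
  rm -f "$tmp"
  echo "sed failed; $conf left untouched" >&2
fi

result=$(cat "$conf")
backup=$(cat "$conf.bak")
rm -rf "$workdir"
printf 'new:\n%s\nbackup:\n%s\n' "$result" "$backup"
```

No -i flag is involved, so the same commands run unchanged on GNU and BSD sed. Note that the replaced file takes on the temp file's default permissions unless they are restored explicitly.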

Backups are not just about safety; they enable deterministic debugging. If a script misbehaves, you can compare file and file.bak with diff to see exactly what changed. A good CLI should have a --dry-run option that prints the would-be changes and exits without modifying the file. This is the same philosophy as sed -n ... p: explicit output and explicit side effects.
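A dry run can be as simple as piping the would-be output into diff. This sketch uses a throwaway file and plain diff reading the new content from standard input; the exact diff layout depends on the diff implementation.

```shell
# Throwaway sample config.
workdir=$(mktemp -d)
conf="$workdir/app.conf"
printf 'SERVER_PORT=8080\nDEBUG=true\n' > "$conf"

# Compare the file on disk with what sed WOULD write ("-" means stdin).
# diff exits 1 when the inputs differ, so "|| true" keeps scripts that use
# "set -e" from aborting on an expected difference.
preview=$(sed '/^DEBUG=/s/=.*/=false/' "$conf" | diff "$conf" - || true)

# The file itself has not been modified.
after=$(cat "$conf")
rm -rf "$workdir"

printf '%s\n' "$preview"
```

An empty preview means the edit would change nothing, which is also a cheap idempotence check.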

Security and integrity considerations also matter. If your script runs with elevated privileges, in-place edits can change ownership or permissions depending on how the temp file is created. GNU sed tries to preserve ownership and permissions, but not all variants behave the same. If you must preserve file metadata, consider using cp -p to create a backup or using install to copy permissions. For a learning project, the important lesson is: editing configuration files is a system-level operation; treat it with care.

In-place editing is also affected by file metadata and symlinks. On many systems, sed -i writes to a temporary file and then renames it over the original. That can break symlinks (the new file is a regular file, not a link) and can change inode numbers, which matters for tools that watch files by inode. If your config is a symlink, you may want to resolve it before editing or use a temp file and mv carefully. These are advanced concerns, but they are exactly why careful tooling matters in real environments.

One more portability caveat: sed -i behaves poorly when reading from stdin or when the input is a symlink. Some implementations refuse -i with stdin, and some replace symlinks with regular files. A robust updater avoids these pitfalls by writing to a temp file and moving it into place, preserving metadata explicitly when needed.

How This Fits on Projects

This project requires in-place editing with safe backups and cross-platform compatibility. Projects 2 and 3 can run as pure pipelines, but the updater must change a real file. The decisions you make here become your safety pattern for all future sed automation.

Definitions & Key Terms

  • In-place editing (-i): Modify the input file directly by rewriting it.
  • Backup suffix: The extension used to save the original file (e.g., .bak).
  • Atomic rename: Replacing a file by renaming a temp file in a single filesystem operation.
  • Dry run: A mode that shows output without writing changes.
  • Portability: Writing scripts that behave the same on GNU and BSD sed.

Mental Model Diagram (ASCII)

Original file -> sed transforms -> temp file -> rename -> new original
           \-> backup file (optional)

How It Works (Step-by-Step)

  1. Run a dry run without -i to preview output.
  2. Choose a backup suffix (e.g., .bak).
  3. Execute sed -i.bak (GNU) or sed -i '' / -i '.bak' (BSD).
  4. sed writes the transformed content to a temp file.
  5. The temp file replaces the original, and the backup is preserved.
  6. Verify the change with diff or grep.

Minimal Concrete Example

# GNU sed (Linux)
sed -i.bak '/^DEBUG=/s/true/false/' app.conf

# BSD sed (macOS)
sed -i '.bak' '/^DEBUG=/s/true/false/' app.conf

Common Misconceptions

  • Misconception: -i edits the file byte-for-byte in place.
    • Correction: sed usually rewrites to a temp file and renames it.
  • Misconception: -i works the same on all systems.
    • Correction: BSD sed requires a suffix argument.
  • Misconception: Backups are optional when learning.
    • Correction: Backups are essential to avoid accidental data loss.

Check-Your-Understanding Questions

  1. Why is sed -i not fully portable between Linux and macOS?
  2. What is the safest way to preview changes before editing?
  3. Why can in-place editing be risky on read-only filesystems?
  4. How can a backup help you debug a faulty sed script?

Check-Your-Understanding Answers

  1. BSD sed requires an explicit suffix, while GNU sed allows -i without one.
  2. Run the command without -i and inspect stdout or use -n with p.
  3. The temp file creation or rename can fail, leaving the file unchanged or partially replaced.
  4. You can compare the backup and modified file with diff to see exact changes.

Real-World Applications

  • Automating config updates across fleets of servers.
  • Updating .env files in CI pipelines with safe rollback.
  • Applying consistent configuration edits in deployment scripts.

Where You’ll Apply It

  • In this project: See §3.7 Real World Outcome and §5.10 Implementation Phases.
  • Also used in: P02-log-file-cleaner.md for safe file outputs.

References

  • GNU sed manual – -i and backup behavior
  • BSD sed man page – suffix requirement
  • “Classic Shell Scripting” – safe file editing patterns

Key Insight

In-place editing is powerful, but the safest scripts treat it as a controlled, testable side effect.

Summary

Portability and backups are what make a sed script reliable in the real world.

Homework/Exercises to Practice the Concept

  1. Write a command that edits a file in place and leaves a .bak copy.
  2. Implement a tiny shell script that chooses GNU or BSD -i based on sed --version.
  3. Compare a file and its backup with diff to confirm the change.

Solutions to the Homework/Exercises

  1. sed -i.bak 's/^DEBUG=true/DEBUG=false/' app.conf
  2. Use if sed --version >/dev/null 2>&1; then SED_INPLACE='-i.bak'; else SED_INPLACE='-i .bak'; fi and expand $SED_INPLACE unquoted so the BSD form splits into two arguments.
  3. diff -u app.conf.bak app.conf
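Solution 2 can also be wrapped in a function so that no unquoted variable is needed. This is a sketch: sed_inplace_bak is a hypothetical helper name, and it assumes that a sed which accepts --version takes the attached-suffix form.

```shell
# Hypothetical helper: run sed in place and keep a .bak copy.
sed_inplace_bak() {
  if sed --version >/dev/null 2>&1; then
    sed -i.bak "$@"     # GNU sed: suffix attached to -i
  else
    sed -i .bak "$@"    # BSD sed: suffix passed as its own argument
  fi
}

# Demo on a throwaway file.
workdir=$(mktemp -d)
conf="$workdir/app.conf"
printf 'DEBUG=true\n' > "$conf"

sed_inplace_bak '/^DEBUG=/s/=.*/=false/' "$conf"

edited=$(cat "$conf")
saved=$(cat "$conf.bak")
rm -rf "$workdir"
printf 'edited: %s\nbackup: %s\n' "$edited" "$saved"
```

In practice the attached form -i.bak is generally accepted by BSD sed as well, so detection matters most for the no-backup case, where GNU takes a bare -i and BSD needs -i ''.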

3. Project Specification

3.1 What You Will Build

You will build a small CLI tool named config-update that changes exactly one configuration key in a flat key/value file. The tool reads a file of KEY=VALUE lines, updates the requested key to a new value, and writes the result back safely with a backup. The tool supports a dry-run mode that prints the updated file to stdout without modifying disk. It does not attempt to parse nested formats (JSON/YAML) or INI sections; it operates strictly on flat lines.

3.2 Functional Requirements

  1. Update a single key: Given --key and --value, update only the matching line that starts with KEY=.
  2. Dry run mode: With --dry-run, print the modified file to stdout and do not change the file.
  3. Backup creation: With --backup .bak, create a backup file before replacing the original.
  4. Portability: Work on GNU and BSD sed by handling -i differences.
  5. Exit codes: Return distinct exit codes for success, usage error, key not found, and file errors.
  6. Idempotence: Running the tool twice with the same key/value should not change the file after the first run.

3.3 Non-Functional Requirements

  • Performance: Must handle files up to 10 MB in under 1 second on a typical laptop.
  • Reliability: Never partially write a file; failure must leave the original intact or backed up.
  • Usability: CLI usage must be clear and produce helpful error messages.

3.4 Example Usage / Output

# Dry run
$ ./config-update --file app.conf --key DEBUG --value false --dry-run
# Application Settings
SERVER_HOST=127.0.0.1
SERVER_PORT=8080
DEBUG=false

# Apply in-place with backup
$ ./config-update --file app.conf --key DEBUG --value false --backup .bak
Updated DEBUG in app.conf (backup: app.conf.bak)

3.5 Data Formats / Schemas / Protocols

Input file format (flat key/value):

# Comment lines begin with # and are ignored by the tool
KEY=VALUE
ANOTHER_KEY=ANOTHER_VALUE

Constraints:

  • Keys are uppercase letters, digits, or underscores: [A-Z0-9_]+
  • Values are any characters except newlines
  • Leading/trailing spaces are not significant and may be preserved

3.6 Edge Cases

  • Key not found in the file.
  • Key appears multiple times; only the first occurrence should be changed (explicitly define behavior).
  • Lines beginning with # that mention the key should not be modified.
  • Lines with leading whitespace before the key.
  • File not writable or missing.
  • Empty file.

3.7 Real World Outcome

You will have a working config-update CLI and a real config file updated safely with a backup.

3.7.1 How to Run (Copy/Paste)

# Create a sample config
cat > app.conf <<'EOF'
# Application Settings
SERVER_HOST=127.0.0.1
SERVER_PORT=8080
DEBUG=true
EOF

# Dry run
./config-update --file app.conf --key DEBUG --value false --dry-run

# In-place edit with backup
./config-update --file app.conf --key DEBUG --value false --backup .bak

3.7.2 Golden Path Demo (Deterministic)

Expected behavior (no timestamps, deterministic output):

  • The file changes only on the DEBUG line.
  • A backup app.conf.bak is created.
  • Exit code is 0.

3.7.3 CLI Transcript (Success)

$ ./config-update --file app.conf --key DEBUG --value false --backup .bak
Updated DEBUG in app.conf (backup: app.conf.bak)
$ echo $?
0
$ cat app.conf
# Application Settings
SERVER_HOST=127.0.0.1
SERVER_PORT=8080
DEBUG=false
$ cat app.conf.bak
# Application Settings
SERVER_HOST=127.0.0.1
SERVER_PORT=8080
DEBUG=true

3.7.4 CLI Transcript (Failure: Key Missing)

$ ./config-update --file app.conf --key LOG_LEVEL --value info --backup .bak
Error: key LOG_LEVEL not found in app.conf
$ echo $?
2

Exit codes:

  • 0 success
  • 1 usage error
  • 2 key not found
  • 3 file read/write error

4. Solution Architecture

4.1 High-Level Design

+-------------+     +------------------+     +------------------+
| CLI Parser  | --> | sed Command      | --> | Safe File Update |
+-------------+     +------------------+     +------------------+
        |                         |                      |
        |                         v                      v
        |                 Dry-run output            Backup + replace
        v
  Error handling

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| CLI parser | Parse flags, validate inputs | Use explicit --file, --key, --value flags |
| sed engine | Perform substitution with address | Use anchored /^KEY=/ pattern |
| file updater | Handle backup and in-place edit | Use portable -i logic or temp file |
| validator | Detect missing key | Use grep -q before edit or check sed output |

4.3 Data Structures (No Full Code)

# Logical representation of a config line
KEY=VALUE

# Internal representation (conceptual)
struct ConfigLine {
  key: string
  value: string
  raw: string
}

4.4 Algorithm Overview

Key Algorithm: Safe Key Replacement

  1. Validate arguments and ensure file exists.
  2. Build a safe address: ^KEY= (with optional whitespace support).
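  3. Check whether any line matches the address (for example with grep -q).
  4. If no line matches, return exit code 2 (key not found). A matching line that already holds the requested value still counts as success, which keeps the tool idempotent.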
  5. If --dry-run, print output and exit.
  6. Otherwise, run sed with backup to update file.

Complexity Analysis:

  • Time: O(n) over file length
  • Space: O(1) streaming (plus temp file)

5. Implementation Guide

5.1 Development Environment Setup

# Check sed availability (GNU sed answers --version; BSD sed prints a usage line instead)
{ sed --version 2>/dev/null || sed -h 2>&1; } | head -n 1

# Create a sample config for testing
printf '# Application Settings\nSERVER_HOST=127.0.0.1\nSERVER_PORT=8080\nDEBUG=true\n' > app.conf

5.2 Project Structure

config-update/
├── bin/
│   └── config-update
├── tests/
│   └── test-config-update.sh
└── README.md

5.3 The Core Question You’re Answering

“How do I surgically modify a single line in a file without opening an editor or risking collateral changes?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. The s command anatomy
    • How s/pattern/replacement/flags is parsed
    • The difference between pattern regex and replacement text
  2. Anchored addresses
    • Why ^KEY= is safer than /KEY/
  3. In-place editing portability
    • GNU vs BSD -i behavior
  4. Dry-run discipline
    • How to use stdout preview to avoid destructive edits

5.5 Questions to Guide Your Design

  1. How will you detect if the key does not exist?
  2. What happens if the file contains multiple KEY= lines?
  3. How will you prevent editing commented lines (# KEY=...)?
  4. How will your script behave on macOS vs Linux?
  5. How will you ensure the tool is idempotent?

5.6 Thinking Exercise

Before coding, trace the script on paper:

Input:

# Config
DEBUG=true
TIMEOUT=30

Command:

/^DEBUG=/s/true/false/

Questions:

  • Which line matches the address?
  • What is the exact output line after substitution?
  • What lines remain unchanged?

5.7 The Interview Questions They’ll Ask

  1. “How would you safely edit a config file from a script?”
  2. “Why is ^KEY= safer than /KEY/?”
  3. “What is the difference between GNU and BSD sed -i?”
  4. “How do you verify a sed command before editing in place?”

5.8 Hints in Layers

Hint 1: Start with dry-run. Use sed without -i first and inspect output.

Hint 2: Address the line. Use /^KEY=/ to restrict the substitution.

Hint 3: Portable in-place editing. Use a wrapper that sets SED_INPLACE depending on GNU vs BSD.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| sed substitution basics | “sed & awk” (Dougherty, Robbins) | Ch. 3 |
| Regex fundamentals | “Mastering Regular Expressions” (Friedl) | Ch. 1-3 |
| Shell scripting discipline | “Classic Shell Scripting” | Ch. 2-4 |

5.10 Implementation Phases

Phase 1: Foundation (2-3 hours)

Goals:

  • Build a minimal dry-run substitution.
  • Validate key matching with anchors.

Tasks:

  1. Write a sed command that replaces a known key.
  2. Add a wrapper script that accepts --file, --key, --value.

Checkpoint: Dry run prints the expected output and does not modify the file.
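
The Phase 1 tasks could be sketched as a single shell function. This is a dry-run-only sketch under stated assumptions: the function name config_preview and the exit code 1 for usage errors are hypothetical, KEY is assumed to contain only letters, digits, and underscores, and VALUE is assumed to contain no "|", "&", backslash, or newline (those would need escaping before being placed in a sed script).

```shell
# Prints the edited config to stdout and never writes to the file.
config_preview() {
  file='' key='' value=''
  # Consume arguments in flag/value pairs.
  while [ $# -ge 2 ]; do
    case $1 in
      --file)  file=$2 ;;
      --key)   key=$2 ;;
      --value) value=$2 ;;
      *) echo "unknown argument: $1" >&2; return 1 ;;
    esac
    shift 2
  done
  # A leftover argument means a flag was given without its value.
  if [ $# -ne 0 ] || [ -z "$file" ] || [ -z "$key" ]; then
    echo "usage: config_preview --file FILE --key KEY --value VALUE" >&2
    return 1
  fi
  # "|" as the delimiter keeps values containing "/" (paths, URLs) working.
  sed "s|^${key}=.*|${key}=${value}|" "$file"
}
```

For example, config_preview --file app.env --key DEBUG --value false prints the edited text, while app.env itself stays as it was, which is what the checkpoint asks for.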

Phase 2: Core Functionality (2-4 hours)

Goals:

  • Implement backup and in-place editing.
  • Handle missing key detection.

Tasks:

  1. Add a backup suffix argument.
  2. Check for key existence before editing.

Checkpoint: Tool updates the file and creates a backup.
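
A possible shape for Phase 2 is shown below. It takes the temp-file route rather than sed -i, so it behaves the same on GNU and BSD. The function name update_key and the uk_ variable names are hypothetical; the exit code 2 for a missing key follows the critical test cases in this guide. The same input assumptions apply as for any value placed in a sed script: KEY is alphanumeric or underscore, and VALUE contains no "|", "&", or backslash.

```shell
# update_key FILE KEY VALUE [SUFFIX]
update_key() {
  uk_file=$1 uk_key=$2 uk_value=$3 uk_suffix=${4:-.bak}

  # Refuse to touch the file if there is no active KEY= line.
  if ! grep -q "^${uk_key}=" "$uk_file"; then
    echo "error: key '${uk_key}' not found in ${uk_file}" >&2
    return 2
  fi

  # Back up first, so a failed edit can always be undone.
  cp -p "$uk_file" "${uk_file}${uk_suffix}" || return 1

  # Write the edited text to a temp file, then copy it over the original.
  # Using cat (not mv) keeps the original file's permissions and owner.
  uk_tmp=$(mktemp) || return 1
  uk_status=0
  sed "s|^${uk_key}=.*|${uk_key}=${uk_value}|" "$uk_file" > "$uk_tmp" \
    && cat "$uk_tmp" > "$uk_file" || uk_status=1
  rm -f "$uk_tmp"
  return "$uk_status"
}
```

Copying with cat is not atomic; if atomic replacement matters more than preserving permissions, a temp file in the same directory followed by mv is the usual alternative.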

Phase 3: Polish & Edge Cases (2-3 hours)

Goals:

  • Handle whitespace, comments, and idempotence.
  • Provide clear error messages and exit codes.

Tasks:

  1. Support optional leading whitespace.
  2. Ensure comments are never modified.

Checkpoint: Tests pass for all edge cases.
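
The two Phase 3 tasks can be handled by one pattern, sketched here with an illustrative DEBUG key. The capture group keeps whatever indentation was there, and comment lines never match because "#" is not whitespace.

```shell
cfg=$(mktemp)
printf '%s\n' '# DEBUG=true' '  DEBUG=true' '#DEBUG=true' 'PORT=8080' > "$cfg"

# \([[:space:]]*\) captures optional leading whitespace; \1 puts it back.
# A line such as "# DEBUG=true" starts with "#", so it cannot match
# "optional whitespace followed by DEBUG=" and is left alone.
polished=$(sed 's/^\([[:space:]]*\)DEBUG=.*/\1DEBUG=false/' "$cfg")
printf '%s\n' "$polished"

rm -f "$cfg"
```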

5.11 Key Implementation Decisions

Decision | Options | Recommendation | Rationale
In-place editing | sed -i vs temp file | Temp file or portable -i wrapper | Avoids GNU/BSD differences
Key matching | /KEY/ vs /^KEY=/ | /^KEY=/ | Prevents accidental matches
Multiple keys | Replace all vs first | First only | Safer default for configs
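
The "First only" recommendation needs a little extra work, because an addressed s command runs on every line that matches the address. One portable way to stop after the first match is sketched below: after the substitution, a small loop prints the remaining lines unchanged. The label name rest is arbitrary.

```shell
cfg=$(mktemp)
printf '%s\n' 'DEBUG=true' 'PORT=8080' 'DEBUG=true' > "$cfg"

# On the first line that matches, substitute, then loop:
#   n      print the current line and read the next one
#   b rest jump back to the label, so no later line reaches the s command
# The script is split across lines because BSD sed expects a newline
# after a label.
first_only=$(sed '/^DEBUG=/{
s/=.*/=false/
:rest
n
b rest
}' "$cfg")
printf '%s\n' "$first_only"

rm -f "$cfg"
```

GNU sed also offers the 0,/regex/ address form for the same purpose, but that is an extension and is not available in BSD sed.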

6. Testing Strategy

6.1 Test Categories

Category | Purpose | Examples
Unit Tests | Validate CLI parsing | Missing flags, invalid key
Integration Tests | Verify file edits | Change DEBUG value, backup exists
Edge Case Tests | Catch tricky inputs | Comments, whitespace, missing key

6.2 Critical Test Cases

  1. Basic replacement: DEBUG=true -> DEBUG=false.
  2. Missing key: Exit code 2 and no file changes.
  3. Commented key: # DEBUG=true remains unchanged.
  4. Whitespace: ` DEBUG=true` is updated, with its indentation preserved, if the tool allows leading whitespace.

6.3 Test Data

# Config
DEBUG=true
# DEBUG=true (commented)
PORT=8080
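
A small harness over this test data might look like the sketch below. The check helper is hypothetical, and the plain sed command stands in for your CLI; in a real test suite you would call the tool itself at that point.

```shell
failures=0

# check NAME EXPECTED ACTUAL: report PASS or FAIL and count failures.
check() {
  if [ "$2" = "$3" ]; then
    echo "PASS: $1"
  else
    echo "FAIL: $1 (expected '$2', got '$3')"
    failures=$((failures + 1))
  fi
}

fixture=$(mktemp)
printf '%s\n' '# Config' 'DEBUG=true' '# DEBUG=true (commented)' 'PORT=8080' > "$fixture"

# Stand-in for the tool under test (dry run, no -i).
result=$(sed 's/^DEBUG=.*/DEBUG=false/' "$fixture")

check 'basic replacement' 'DEBUG=false' "$(printf '%s\n' "$result" | sed -n '2p')"
check 'commented key untouched' '# DEBUG=true (commented)' "$(printf '%s\n' "$result" | sed -n '3p')"
check 'dry run leaves file alone' 'DEBUG=true' "$(sed -n '2p' "$fixture")"

rm -f "$fixture"
echo "failures: $failures"
```

Because the fixture is rebuilt from fixed text on every run, the tests are deterministic and do not depend on any file already on the machine.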

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall | Symptom | Solution
Missing anchors | Multiple lines changed | Use ^KEY= in the address
Wrong -i usage on macOS | Error: invalid command code | Use -i '' (no backup) or an attached suffix such as -i.bak
Editing without backup | Irreversible changes | Always pass a backup suffix such as .bak

7.2 Debugging Strategies

  • Use -n with p: Print only lines you expect to change.
  • Compare with diff: Validate file changes quickly.
  • Echo commands: Log the exact sed command used.
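
The first two strategies can be sketched as follows, using an illustrative three-line file. Both commands are read-only with respect to the config file.

```shell
cfg=$(mktemp)
printf '%s\n' '# Config' 'DEBUG=true' 'PORT=8080' > "$cfg"

# -n suppresses the default output; the p flag on s prints only the lines
# where the substitution actually happened.
changed_lines=$(sed -n '/^DEBUG=/s/true/false/p' "$cfg")
printf '%s\n' "$changed_lines"

# Pipe the preview into diff; "-" tells diff to read stdin.
# diff exits 1 when the inputs differ, which is expected here, so
# "|| true" keeps that from being treated as an error.
preview_diff=$(sed '/^DEBUG=/s/true/false/' "$cfg" | diff "$cfg" - || true)
printf '%s\n' "$preview_diff"

rm -f "$cfg"
```

If the -n preview prints nothing, the address or pattern did not match, and an in-place run would have changed nothing.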

7.3 Performance Traps

  • Running multiple passes over a large file is slower; combine operations when possible.
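
As a small illustration, two separate sed invocations can usually be merged into one by passing several -e expressions. The keys and values below are placeholders.

```shell
cfg=$(mktemp)
printf '%s\n' 'DEBUG=true' 'PORT=8080' 'TIMEOUT=30' > "$cfg"

# Two invocations: every line passes through two separate sed processes.
two_step=$(sed 's/^DEBUG=.*/DEBUG=false/' "$cfg" | sed 's/^PORT=.*/PORT=9090/')

# One invocation: both substitutions are applied while the file is read once.
one_step=$(sed -e 's/^DEBUG=.*/DEBUG=false/' -e 's/^PORT=.*/PORT=9090/' "$cfg")

[ "$two_step" = "$one_step" ] && echo "same result"
rm -f "$cfg"
```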

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a diff output format to --dry-run (for example, a unified diff of the original and edited text).
  • Support lowercase keys by normalizing input.

8.2 Intermediate Extensions

  • Support INI sections (only change key inside a named section).
  • Add a --create flag to append the key if missing.
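
Both intermediate extensions can be prototyped with a few lines, sketched here under assumptions: the section and key names are invented for illustration, and the append step assumes the file already ends with a newline.

```shell
# INI sections: a range address limits the edit to one section.
ini=$(mktemp)
printf '%s\n' '[web]' 'port=8080' '[database]' 'port=5432' '[cache]' 'port=6379' > "$ini"

# The range runs from the [database] header to the next line starting
# with "[" (or to end of file if there is none).
scoped=$(sed '/^\[database\]/,/^\[/s/^port=.*/port=5433/' "$ini")
printf '%s\n' "$scoped"

# --create behaviour: append KEY=VALUE only when no active line exists.
env_file=$(mktemp)
printf '%s\n' 'DEBUG=true' > "$env_file"
grep -q '^TIMEOUT=' "$env_file" || printf '%s\n' 'TIMEOUT=30' >> "$env_file"
# Running the same line again is a no-op, which keeps the tool idempotent.
grep -q '^TIMEOUT=' "$env_file" || printf '%s\n' 'TIMEOUT=30' >> "$env_file"
created=$(cat "$env_file")
printf '%s\n' "$created"

rm -f "$ini" "$env_file"
```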

8.3 Advanced Extensions

  • Build a small library to perform safe config updates for multiple files.
  • Add a JSON output mode summarizing changes.

9. Real-World Connections

9.1 Industry Applications

  • CI/CD pipelines: toggle flags in environment files before deployment.
  • Infrastructure scripts: update ports and endpoints during migrations.

9.2 Related Open Source Projects

  • Ansible: uses idempotent config edits conceptually similar to this tool.
  • Chef: manages config changes safely with backups and diffs.

9.3 Interview Relevance

  • Text processing: shows mastery of regex and stream editing.
  • DevOps automation: demonstrates safe system changes under automation.

10. Resources

10.1 Essential Reading

  • “sed & awk” by Dougherty & Robbins – substitution and addressing
  • “Classic Shell Scripting” by Robbins & Beebe – safe file editing

10.2 Video Resources

  • SED basics walkthrough (YouTube or equivalent)

10.3 Tools & Documentation

  • GNU sed manual: authoritative reference for flags and behavior
  • BSD sed man page: portability differences

10.4 Related Projects in This Series

  • Project 2: Log File Cleaner – uses capture groups and reformatting.
  • Project 3: Markdown to HTML – uses multiple sed commands in sequence.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain how s/pattern/replacement/ works.
  • I can explain why ^KEY= is safer than /KEY/.
  • I know the GNU vs BSD -i difference.

11.2 Implementation

  • All functional requirements are met.
  • Dry-run mode works correctly.
  • Backup files are created and verified.

11.3 Growth

  • I can describe one improvement I would make next time.
  • I can explain this project in a job interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • A working config-update CLI with --file, --key, --value.
  • Dry-run mode prints output without editing files.
  • Backup creation works on your platform.

Full Completion:

  • All minimum criteria plus:
  • Distinct exit codes for error cases.
  • Tests covering edge cases and comments.

Excellence (Going Above & Beyond):

  • Portable implementation that runs on GNU and BSD.
  • Clear documentation with examples and troubleshooting tips.