LEARN SED COMMAND

Learn The sed Command: From Stream Editor to Text Manipulation Wizard

Goal: Master the sed (Stream Editor) command, one of the most powerful and ubiquitous text-processing tools in the Unix world. You will progress from simple substitutions to writing complex scripts that can perform sophisticated transformations on any text stream.


Why Learn sed?

sed is a command-line workhorse that has been part of Unix-like operating systems for decades. It processes text files line by line, making it incredibly efficient for a huge range of tasks.

  • Automation & Scripting: sed is a cornerstone of shell scripting. Automate log file analysis, code refactoring, and configuration changes.
  • Surgical Precision: Make targeted changes to huge files without ever opening them in an editor.
  • Universal Availability: sed is everywhere, from ancient servers to modern Docker containers. It’s a tool you can always rely on.
  • Deepen Your Command-Line Fu: Understanding sed is a rite of passage that fundamentally improves your ability to work with text on the command line.

After completing these projects, you will:

  • Confidently use sed for find-and-replace tasks in your daily workflow.
  • Understand the difference between the pattern space and the hold space.
  • Write sed scripts that perform complex, multi-line transformations.
  • Read and deconstruct advanced sed one-liners found in the wild.
  • Appreciate the power and elegance of stream-based text processing.

Core Concept Analysis

sed’s Mental Model: The Assembly Line

Imagine an assembly line. Each line of a file is a “part” placed on a conveyor belt. sed is a worker (or a series of workers) who can perform an action on that part as it passes by.

                  Input Stream (file.txt)
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                       PATTERN SPACE                         │
│ (The conveyor belt. Holds ONE line at a time)               │
│                                                             │
│   Line 1 -> | Modify Line 1 | -> Output Line 1              │
│   Line 2 -> | Modify Line 2 | -> Output Line 2              │
│   Line 3 -> | Delete Line 3 | -> (No Output)                │
│   ...                                                       │
└─────────────────────────────────────────────────────────────┘
                          │
                          ▼
                 Output Stream (stdout)
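
The diagram can be reproduced in a few lines. This is only a minimal sketch: the three sample lines and the replacement word "Modified" are made up for illustration.

```shell
# Feed three lines through sed: modify lines 1 and 2, delete line 3.
# The sample text and the word "Modified" are illustrative only.
output=$(printf 'Line 1\nLine 2\nLine 3\n' | sed -e '3d' -e 's/^Line/Modified/')
printf '%s\n' "$output"
# Modified 1
# Modified 2
```

Each line enters the pattern space, runs through both commands in order, and is printed at the end of its cycle unless a d command removed it first.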

Key Concepts Explained

  1. Command Structure: sed [options] 'script' [input-file]
    • script: A set of [address][command] instructions.
    • address: (Optional) Specifies which line(s) the command should apply to.
    • command: The action to perform (e.g., s for substitute, d for delete).
    • If no file is given, sed reads from standard input.
  2. Addresses (The “Where”):
    • By Number: sed '3d' file.txt (delete the 3rd line).
    • By Range: sed '5,10d' file.txt (delete lines 5 through 10).
    • By Regex Pattern: sed '/ERROR/d' file.txt (delete any line containing “ERROR”).
    • Combinations: sed '/START/,/END/d' (delete from a line matching START to a line matching END).
  3. The s (substitute) Command: The most common command. s/regexp/replacement/flags
    • regexp: A regular expression to match.
    • replacement: The string to replace the match with.
    • flags:
      • g (global): Replace ALL occurrences on the line, not just the first.
      • I or i (case-insensitive): Ignore case when matching the regex. This flag is a GNU extension, not part of POSIX sed.
      • p (print): If a substitution occurs, print the new pattern space. Often used with -n.
  4. Regular Expressions in sed:
    • sed uses Basic Regular Expressions (BRE) by default. In BRE, characters like +, ?, |, (), and {} are literal; their special forms need a backslash. \( \) and \{ \} are standard BRE, while \+, \? and \| are GNU extensions.
    • Use the -E (or -r on some systems) flag for Extended Regular Expressions (ERE), which is more modern and often easier to read.
    • Key Patterns:
      • . : Matches any single character.
      • * : Matches the preceding character zero or more times.
      • ^ : Matches the beginning of the line.
      • $ : Matches the end of the line.
      • [...] : Matches any one character inside the brackets.
      • \(...\) (BRE) or (...) (ERE): Capturing group. The matched text can be referenced in the replacement with \1, \2, \3, etc.
  5. The Hold Space (The “Workbench”): This is sed’s secret weapon. It’s a secondary, temporary buffer. You can copy or append the pattern space to the hold space, and vice-versa. This allows sed to “remember” previous lines, which is essential for multi-line operations.
    • h: Copy pattern space to hold space.
    • H: Append pattern space to hold space.
    • g: Copy hold space to pattern space.
    • G: Append hold space to pattern space.
    • x: Exchange pattern space and hold space.
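
The concepts above can be tried out with a handful of one-liners. This is a hedged sketch rather than a reference: the sample text and variable names are invented for illustration, and the expected results in the comments assume exactly this input.

```shell
# Sample input; the contents are made up for illustration.
sample=$(printf 'alpha one\nERROR two\nbeta beta\nSTART\ninside\nEND\nlast\n')

# Addresses: by number, by regex, and by regex range.
by_number=$(printf '%s\n' "$sample" | sed '3d')             # drops "beta beta"
by_regex=$(printf '%s\n' "$sample" | sed '/ERROR/d')        # drops "ERROR two"
by_range=$(printf '%s\n' "$sample" | sed '/START/,/END/d')  # drops START..END inclusive

# s flags: without g, only the first match on a line is replaced.
first_only=$(printf 'beta beta\n' | sed 's/beta/B/')   # B beta
all=$(printf 'beta beta\n' | sed 's/beta/B/g')         # B B

# -n plus the p flag prints only the lines where a substitution happened.
changed=$(printf '%s\n' "$sample" | sed -n 's/ERROR/WARN/p')   # WARN two

# Capturing groups: BRE needs \( \), ERE (-E) uses plain ( ).
swapped_bre=$(printf 'alpha one\n' | sed 's/\(alpha\) \(one\)/\2 \1/')   # one alpha
swapped_ere=$(printf 'alpha one\n' | sed -E 's/(alpha) (one)/\2 \1/')    # one alpha

# Hold space: G appends the (still empty) hold space to every line,
# which double-spaces the input.
double=$(printf 'a\nb\n' | sed 'G')

printf '%s\n' "$by_range" "$first_only" "$all" "$changed" "$swapped_ere"
```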

Project List

These projects are designed to build your sed skills sequentially, from simple one-liners to complex scripts that use the hold space.


Project 1: The Config File Updater

  • File: LEARN_SED_COMMAND.md
  • Main Programming Language: sed (Bash/Shell)
  • Alternative Programming Languages: Python, Perl, Awk
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Text Substitution / In-place Editing
  • Software or Tool: sed
  • Main Book: sed & awk, 2nd Edition by Dale Dougherty & Arnold Robbins

What you’ll build: A sed command that finds and replaces a specific setting in a configuration file. For example, changing DEBUG=true to DEBUG=false.

Why it teaches sed: This is the most common use case for sed. It teaches you the fundamentals of the s command, using regex to anchor your search (^DEBUG=), and how to edit files in-place safely with the -i option.

Core challenges you’ll face:

  • Constructing the s command → maps to understanding the s/find/replace/ syntax
  • Matching the whole line vs. just the value → maps to using ^ and $ to make your regex more specific
  • Handling different file permissions and backups → maps to understanding the -i (in-place) and -i.bak flags
  • Applying the change only to specific lines → maps to using an address pattern like /^DEBUG=/

Key Concepts:

  • Substitution: “sed & awk” Ch. 3 - Dougherty & Robbins
  • In-place Editing: man sed (look for the -i option)
  • Regular Expressions: “sed & awk” Ch. 2

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic command-line navigation.

Real world outcome: You will have a config file, app.conf:

# Application Settings
SERVER_HOST=127.0.0.1
SERVER_PORT=8080
DEBUG=true

You will run a single sed command, and the file will be instantly modified:

$ sed -i.bak 's/^DEBUG=true/DEBUG=false/' app.conf

The file app.conf now contains DEBUG=false, and a backup of the original file, app.conf.bak, has been created.

Implementation Hints:

  1. Create a sample app.conf file to work with.
  2. Start by just printing the output to the terminal (don’t use -i yet). sed 's/true/false/' app.conf
  3. Notice that this might change other lines if they contain “true”. How can you make it more specific?
    • Anchor the search to the beginning of the line. What character does that?
    • Only apply the s command to lines that match a certain pattern. The syntax is /pattern/s/find/replace/.
  4. Once your command correctly isolates and changes only the DEBUG=true line, you can add the -i flag to perform the edit on the file itself. It’s good practice to use -i.bak to create a backup, especially when learning.
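
The hints above can be walked through end to end. The following is a minimal sketch that works in a scratch directory created with mktemp; the directory and the variable names are assumptions made for illustration, while app.conf and its contents come from the example above.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
conf="$workdir/app.conf"
printf '# Application Settings\nSERVER_HOST=127.0.0.1\nSERVER_PORT=8080\nDEBUG=true\n' > "$conf"

# Step 1: preview on stdout. The address /^DEBUG=/ limits the s command
# to the DEBUG line; the file itself is not modified.
preview=$(sed '/^DEBUG=/s/true$/false/' "$conf")
printf '%s\n' "$preview"

# Step 2: same script, now editing in place and keeping app.conf.bak.
sed -i.bak '/^DEBUG=/s/true$/false/' "$conf"

new_value=$(grep '^DEBUG=' "$conf")       # DEBUG=false
old_value=$(grep '^DEBUG=' "$conf.bak")   # DEBUG=true
printf 'now: %s, backup: %s\n' "$new_value" "$old_value"
```

Running the preview first is cheap insurance: the in-place edit uses the identical script, so what you saw on the terminal is what ends up in the file.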

Learning milestones:

  1. You can replace a simple word in a file → You understand the basic s command.
  2. You can replace a word on a specific line number → You understand numeric addressing.
  3. You can replace a word only on lines that match a pattern → You understand regex addressing.
  4. You confidently use sed -i to modify a file → You’ve unlocked sed for scripting and automation.

Project 2: The Log File Cleaner

  • File: LEARN_SED_COMMAND.md
  • Main Programming Language: sed (Bash/Shell)
  • Alternative Programming Languages: Awk, Python
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Regex / Capturing Groups
  • Software or Tool: sed
  • Main Book: Mastering Regular Expressions, 3rd Edition by Jeffrey E.F. Friedl

What you’ll build: A sed script that processes a messy log file, removing unnecessary information (like log level and timestamps) and reformatting it into a cleaner, more readable format.

Why it teaches sed: This project forces you to learn about capturing groups. You’ll match parts of a line and then reference those parts in your replacement string, which is the key to reformatting text instead of just replacing it.

Core challenges you’ll face:

  • Matching a complex line structure → maps to writing a regular expression that describes the entire log line format
  • Capturing parts of the line → maps to using \( and \) (or ( and ) with -E) to create groups
  • Referencing captured groups → maps to using \1, \2, \3, etc. in the replacement part of the s command
  • Deleting non-matching lines → maps to using the d command with pattern addressing

Key Concepts:

  • Capturing Groups: “sed & awk” Ch. 3
  • Extended Regular Expressions: man sed (the -E or -r option)
  • Combining Commands: Using multiple -e expressions or semicolons.

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Project 1.

Real world outcome: You’ll start with a log file app.log like this:

[2025-12-20 10:00:15] [INFO] User 'admin' logged in from 192.168.1.100.
[2025-12-20 10:01:02] [DEBUG] Caching mechanism triggered for key 'user:123'.
[2025-12-20 10:01:30] [ERROR] Failed to connect to database: Connection refused.

Your sed script will transform it into this:

ERROR: Failed to connect to database: Connection refused.

It extracts only the message from ERROR lines.

Implementation Hints:

  1. Create your sample app.log.
  2. Your goal is to match an entire ERROR line and extract only the part after the log level.
  3. Think about the structure: [timestamp] [ERROR] message.
  4. Write a regex to match this. With -E, it might look something like ^\[.*\] \[ERROR\] (.*)$.
    • ^\[.*\]: Matches the timestamp part.
    • \[ERROR\]: Matches the log level part (with a single space on each side).
    • (.*)$: This is the key! It’s a capturing group that matches the rest of the line (the message) to the end.
  5. Now, construct your s command. You want to replace the entire line with the label ERROR: followed by the part you captured. How do you reference the first captured group? (\1). s/^\[.*\] \[ERROR\] (.*)$/ERROR: \1/
  6. This command will only change the ERROR lines. What about the INFO and DEBUG lines? You want to delete them. You can use a separate d command. How can you apply a command only to lines that don’t match a pattern? (Hint: !). '/ERROR/!d'
  7. You can combine these two commands using the -e flag: sed -E -e '/ERROR/!d' -e 's/.../.../' app.log.
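
Putting the hints together, one possible version of the full command looks like this. It is a sketch under a few assumptions: the scratch directory and variable names are illustrative, and the timestamp is matched with \[[^]]*\] (anything up to the first closing bracket) instead of \[.*\], which is a slightly stricter variant of the pattern in the hints.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
log="$workdir/app.log"
cat > "$log" <<'EOF'
[2025-12-20 10:00:15] [INFO] User 'admin' logged in from 192.168.1.100.
[2025-12-20 10:01:02] [DEBUG] Caching mechanism triggered for key 'user:123'.
[2025-12-20 10:01:30] [ERROR] Failed to connect to database: Connection refused.
EOF

# 1) Delete every line that is not an ERROR entry.
# 2) Replace the whole line with "ERROR: " plus the captured message.
cleaned=$(sed -E -e '/\[ERROR\]/!d' \
                 -e 's/^\[[^]]*\] \[ERROR\] (.*)$/ERROR: \1/' "$log")
printf '%s\n' "$cleaned"
# ERROR: Failed to connect to database: Connection refused.
```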

Learning milestones:

  1. You can write a regex that matches an entire structured line → You understand how to model text patterns.
  2. You can extract a substring from a line using a capturing group → You’ve learned the key to reformatting.
  3. You can re-order parts of a line → e.g., s/(part1)(part2)/\2\1/.
  4. You can chain multiple commands to perform a multi-step transformation → You are starting to think like a sed scripter.

Project 3: Basic Markdown to HTML Converter

  • File: LEARN_SED_COMMAND.md
  • Main Programming Language: sed (Bash/Shell)
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Scripting / Multiple Transformations
  • Software or Tool: sed
  • Main Book: Classic Shell Scripting by Arnold Robbins & Nelson H.F. Beebe

What you’ll build: A sed script that reads a simple Markdown file and converts its syntax (headings, bold, italics) into basic HTML tags.

Why it teaches sed: This project teaches you how to structure a sed script with multiple, ordered commands. You’ll learn that the order of substitutions matters and how to handle patterns that occur at the beginning, middle, or end of a line.

Core challenges you’ll face:

  • Handling multiple patterns in one script → maps to writing a .sed script file or using multiple -e flags
  • Order of operations → maps to realizing you must handle bold (**) before italics (*) to avoid conflicts
  • Matching patterns at the beginning of a line → maps to using ^ for headings like ## Title
  • Handling greediness in regex → maps to understanding how .* can sometimes match more than you want

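Key Concepts:

  • Script Files: man sed (the -f option)
  • Alternate Delimiters: using a character other than / in the s command
  • Greedy Matching: negated bracket expressions like [^*]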

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 2.

Real world outcome: You’ll have a file notes.md:

# My Document

This is a paragraph with some *italic* and **bold** text.

## A Subheading

- A list item
- Another list item

Your script will produce this HTML output:

<h1>My Document</h1>

<p>This is a paragraph with some <em>italic</em> and <strong>bold</strong> text.</p>

<h2>A Subheading</h2>

<ul>
<li>A list item</li>
<li>Another list item</li>
</ul>

(Note: for p and ul tags, you might need a more advanced script using the hold space, but headings and emphasis are very achievable).

Implementation Hints:

  1. Create a file converter.sed to hold your script. You will run it with sed -E -f converter.sed notes.md.
  2. Start with the simplest transformation. How do you convert ## A Subheading to <h2>A Subheading</h2>?
    • Your address should match lines starting with ## .
    • Your s command needs to capture the text after the ## .
    • s/^## (.*)$/<h2>\1<\/h2>/. Note the escaped / in the closing tag. sed lets you use other delimiters to avoid this, e.g., s#^## (.*)$#<h2>\1</h2>#.
  3. Now, add a rule for <h1> headings. Does the order of the h1 and h2 rules in your script file matter? (No, because their patterns are distinct).
  4. Next, tackle bold text: **bold**. The command will look like s/\*\*(.*)\*\*/<strong>\1<\/strong>/g.
  5. What happens if you have **bold** and **more bold** on one line? The (.*) is “greedy” and might match from the first ** to the last **. You need to match characters that are not asterisks. A pattern like [^*] can help. s/\*\*([^*]+)\*\*/<strong>\1<\/strong>/g.
  6. Add a rule for italics (*italic*). Does the order of the bold and italic rules matter? (Yes! If you do italics first, **bold** might become <em>*bold*</em>, which is wrong).
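
A possible converter.sed covering only headings and emphasis might look like the sketch below. Paragraph and list tags are deliberately left out, so those lines pass through unchanged. The scratch directory and variable names are assumptions for illustration.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/notes.md" <<'EOF'
# My Document

This is a paragraph with some *italic* and **bold** text.

## A Subheading

- A list item
- Another list item
EOF

cat > "$workdir/converter.sed" <<'EOF'
# Headings: the two patterns are distinct, so their order does not matter.
s/^## (.*)$/<h2>\1<\/h2>/
s/^# (.*)$/<h1>\1<\/h1>/
# Bold must run before italics; [^*]+ stops at the next asterisk.
s/\*\*([^*]+)\*\*/<strong>\1<\/strong>/g
s/\*([^*]+)\*/<em>\1<\/em>/g
EOF

html=$(sed -E -f "$workdir/converter.sed" "$workdir/notes.md")
printf '%s\n' "$html"

# What goes wrong if the italics rule sees **bold** first:
wrong_order=$(printf '**bold**\n' | sed -E 's/\*([^*]+)\*/<em>\1<\/em>/g')
printf '%s\n' "$wrong_order"
# *<em>bold</em>*
```

With the [^*]+ form, running italics first leaves stray asterisks around the <em> tag; with a greedy (.*) it would instead swallow the inner asterisks. Either way, bold has to be handled first.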

Learning milestones:

  1. Your script can convert one type of Markdown syntax → You can write a self-contained rule.
  2. Your script handles multiple heading levels correctly → You understand how to use multiple rules.
  3. You can convert bold and italic text on the same line → You understand the importance of command order and how to keep a greedy match from running too far.
  4. You use an alternate delimiter like # or | in your s command to handle file paths or HTML → You’ve learned a key trick for sed readability.

Project 4: The Multi-Line Address Parser

  • File: LEARN_SED_COMMAND.md
  • Main Programming Language: sed (Bash/Shell)
  • Alternative Programming Languages: Awk, Perl
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Advanced sed / Hold Space / Multi-line processing
  • Software or Tool: sed
  • Main Book: The Grymoire - SED (An excellent online tutorial)

What you’ll build: A sed script that transforms a multi-line address block into a single, comma-separated line.

Why it teaches sed: This is your first “real” multi-line problem. It cannot be solved one line at a time: you need either the hold space or the multi-line commands (N, P, D). This project forces you to leave the line-by-line assembly line model and start thinking about how to store and combine information across multiple lines.

Core challenges you’ll face:

  • “Remembering” previous lines → maps to using H to append lines to the hold space
  • Knowing when you’re at the end of a block → maps to using patterns (like a blank line) to trigger an action
  • Processing the combined block → maps to using g or x to bring the collected lines back into the pattern space for a final substitution
  • Handling newlines in the pattern space → maps to recognizing that the pattern space will now contain \n characters

Key Concepts:

  • Hold Space: The Grymoire - SED (Advanced section)
  • Multi-line Commands: N, P, D commands.
  • Advanced Flow Control: “sed & awk” Ch. 5

Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Project 3. Be prepared to be confused; this is a big conceptual leap.

Real world outcome: You will have a file addresses.txt:

123 Fake St.
Anytown, ST 12345
USA

456 Main Ave.
Otherville, CA 67890
USA

Your sed script will transform it into:

123 Fake St., Anytown, ST 12345, USA
456 Main Ave., Otherville, CA 67890, USA

Implementation Hints:

This is a classic sed pattern. Here’s the logic broken down:

  1. The Goal: Read lines and append them together, replacing newlines with “, ”. When we see a blank line, we print the result and start over.
  2. The sed Script Logic (in English):
    • For every line…
    • If this is the last line of the file ($), jump to a special block of code to handle it.
    • Read the next line from the input and append it to the pattern space. The two lines are now separated by a \n. This is the N command.
    • If the pattern space now contains a blank line (\n$), it means we’ve read the line after an address block.
      • We need to process the block (which is everything before the final \n).
      • Print the processed part, then delete it, leaving the blank line to be handled in the next cycle. This is what the P and D commands do.
    • If it’s not the end of a block, just branch back to the beginning to append another line.
    • This creates a loop that “slurps” lines into the pattern space. Once you have the whole block, you can do substitutions.
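
One way to sketch such a slurp loop uses N together with a branch label. Note that this is a simplified variant, not the exact P/D flow described above: it strips the trailing newline and lets sed print the joined line automatically at the end of the cycle. The label name a, the file name join.sed, and the scratch directory are assumptions for illustration.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/addresses.txt" <<'EOF'
123 Fake St.
Anytown, ST 12345
USA

456 Main Ave.
Otherville, CA 67890
USA
EOF

cat > "$workdir/join.sed" <<'EOF'
# Loop: keep appending the next input line until the appended line is
# blank (pattern space ends in \n) or there is no more input.
:a
$!{
  N
  /\n$/!ba
}
# Drop the trailing newline left by the blank separator, then join.
s/\n$//
s/\n/, /g
EOF

joined=$(sed -f "$workdir/join.sed" "$workdir/addresses.txt")
printf '%s\n' "$joined"
# 123 Fake St., Anytown, ST 12345, USA
# 456 Main Ave., Otherville, CA 67890, USA
```

The $! guard matters: calling N when no next line exists would end the script early, so the loop only appends while more input is available.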

A simpler Hold Space approach:

  1. Create a script.sed file. You will run it with sed -f script.sed addresses.txt.
  2. For lines that are NOT blank:
    • Append the line to the hold space. Use the H command.
    • Delete the line from the pattern space so it’s not printed. Use d. Make one exception: the sample file does not end with a blank line, so on the last line ($) skip the delete and fall through to the flush step below. $!d does exactly that.
  3. For lines that ARE blank (this is our trigger), and for the last line of the file:
    • First, use x to swap the hold space (which contains \nline1\nline2\nline3) and the pattern space.
    • Now the pattern space has your collected address block.
    • Perform substitutions to replace the newlines with “, ”. Because H adds a newline before each line it appends, the first \n will be at the beginning. s/^\n// removes that one. s/\n/, /g replaces the rest.
    • The line is now formatted and will be printed automatically.
    • After the swap, the hold space contains the blank line, i.e. it is empty again, so the next block starts accumulating from a clean state.
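
A hedged sketch of the hold-space approach follows. One detail goes beyond the bare outline: because the sample addresses.txt does not end with a blank line, the script uses $!d so that the last line also triggers the flush. The file name hold.sed and the scratch directory are assumptions for illustration.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/addresses.txt" <<'EOF'
123 Fake St.
Anytown, ST 12345
USA

456 Main Ave.
Otherville, CA 67890
USA
EOF

cat > "$workdir/hold.sed" <<'EOF'
# Non-blank line: stash it in the hold space. Unless it is the last
# line of the file, delete it and start the next cycle.
/./{
  H
  $!d
}
# Reached on a blank line or on the last line: fetch the collected block.
x
# H put a newline in front of every stored line: drop the first, join the rest.
s/^\n//
s/\n/, /g
EOF

result=$(sed -f "$workdir/hold.sed" "$workdir/addresses.txt")
printf '%s\n' "$result"
# 123 Fake St., Anytown, ST 12345, USA
# 456 Main Ave., Otherville, CA 67890, USA
```

Consecutive blank lines in the input would each print an empty line with this sketch; handling that is left as an exercise.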

Learning milestones:

  1. You can use H and g to append a line and print the entire buffer → You understand the basics of storing state.
  2. You can trigger an action on a blank line → You know how to use patterns to control script flow.
  3. You successfully replace \n characters in the pattern space → You’ve mastered multi-line substitution.
  4. You can explain the difference between h and H, and g and G → You have a solid mental model of the hold space.

Project 5: Reversing a File (Line by Line)

  • File: LEARN_SED_COMMAND.md
  • Main Programming Language: sed (Bash/Shell)
  • Alternative Programming Languages: tac (the real tool for this), Python, Perl
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Advanced sed / Hold Space Mastery
  • Software or Tool: sed
  • Main Book: N/A, this is a classic puzzle found in online forums.

What you’ll build: A sed script that reverses the order of lines in a file, printing the last line first and the first line last.

Why it teaches sed: This is the canonical “expert sed” problem. It is impossible without a complete understanding of how the pattern space, hold space, and command flow interact. It forces you to think about how to accumulate the entire file in a buffer and only print it at the very end.

Core challenges you’ll face:

  • How to avoid printing each line as it’s read → maps to using -n and controlling all output with p
  • How to accumulate the entire file → maps to repeatedly appending to the hold space
  • How to reverse the order → maps to a clever trick of prepending, not appending
  • When to print the final result → maps to using the $ address to trigger a final action on the last line

Key Concepts:

  • Suppressing Output: The -n flag.
  • Hold Space Manipulation: G, h, g.
  • End-of-file Address: The $ address.

Difficulty: Advanced
Time estimate: Weekend
Prerequisites: Project 4. You should be comfortable with the hold space.

Real world outcome: Given a file file.txt:

A
B
C

Your script sed -n -f reverse.sed file.txt will output:

C
B
A

Implementation Hints:

This is a puzzle. Think about the state at each step.

  1. The Goal: At the end of the script ($ line), we want the pattern space to contain C\nB\nA so it can be printed.
  2. Line 1 (“A”):
    • The pattern space contains “A”.
    • We need to store it. h will copy “A” to the hold space.
    • Hold space: “A”
  3. Line 2 (“B”):
    • The pattern space contains “B”.
    • We need to add this to the hold space, but before “A”.
    • G appends the hold space to the pattern space. Pattern space: “B\nA”.
    • h then copies this combined buffer back to the hold space. Hold space: “B\nA”.
  4. Line 3 (“C”, the last line $):
    • The pattern space contains “C”.
    • G appends the hold space. Pattern space: “C\nB\nA”.
    • Now we have our final reversed buffer, but it’s in the pattern space.
    • We need to print it. The p command will do this.
  5. Putting it together: The script looks surprisingly simple.
    • For every line except the last ($!): Append to hold space in reverse order (G;h).
    • For the last line ($): Append the buffer (G) and then print (p).
    • Remember to use -n to prevent printing on every line.
    • A more elegant solution needs only three commands: 1!G;h;$p. Can you figure out why that works? (Hint: what happens on line 1?). An equivalent form that does not need -n is 1!G;h;$!d.
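
The one-liner is easy to check against the A/B/C example. In this small sketch the variable names are illustrative; the last command shows what the 1! guard is for.

```shell
# Classic reversal: prepend each line to the accumulated buffer,
# save it, and print only once the last line has been reached.
reversed=$(printf 'A\nB\nC\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
# C
# B
# A

# Same idea without -n: delete everything except the final buffer.
reversed_no_n=$(printf 'A\nB\nC\n' | sed '1!G;h;$!d')

# Without the 1! guard, G on line 1 appends the empty hold space,
# so the output ends with an extra blank line (4 lines instead of 3).
line_count=$(printf 'A\nB\nC\n' | sed -n 'G;h;$p' | wc -l)
printf 'lines without the guard: %s\n' "$line_count"
```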

Learning milestones:

  1. You can write a script to collect the whole file into the hold space → You understand accumulation.
  2. You figure out the G;h trick to prepend lines → You’ve had the “aha!” moment of advanced sed.
  3. You can control printing to only happen on the very last line → You master the -n option and the p command combined with the $ address.
  4. You can write the classic sed -n '1!G;h;$p' one-liner from memory → You are now a sed wizard.

Summary

Project                       | Difficulty   | Key Learning
1. Config File Updater        | Beginner     | Basic substitution (s), in-place editing (-i).
2. Log File Cleaner           | Beginner     | Regex capturing groups (\1) for reformatting.
3. Markdown to HTML           | Intermediate | Writing multi-command scripts, command order.
4. Multi-Line Address Parser  | Advanced     | The Hold Space (H, g, x) for multi-line logic.
5. Reversing a File           | Advanced     | Mastery of the hold space and advanced flow control.