macOS Automation Mastery: Learning Through Building
Goal: Master the macOS automation ecosystem by peeling back the layers of the operating system. You will move from simple scripts to complex system extensions, understanding how macOS manages processes, events, input, and inter-application communication (IPC). By the end, you will not just use tools like Alfred or Rectangle—you will know how to build them.
Core Concept Analysis
macOS is unique in how it exposes its internal machinery to users. Unlike Windows (registry-heavy) or Linux (file-heavy), macOS relies heavily on Apple Events and the Accessibility API for automation.
To truly understand what you are doing, you must visualize the system as a stack of layers that you can hook into:
┌─────────────────────────────────────────────────────────────┐
│ User Session (Aqua) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Apps │ │ Scripts │ │ Background │ │
│ │ (GUI/Cocoa) │ │ (JXA/Bash) │ │ Daemons │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
├──────────┼──────────────────┼──────────────────┼────────────┤
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Accessibility│ │ Apple Events │ │ launchd │ │
│ │ API │◀──│ (OSA) │ │ (Scheduler) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
├─────────┼──────────────────┼──────────────────┼─────────────┤
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Quartz Events│ │ Kernel │ │ File System │ │
│ │ (Input Taps) │ │ (XNU) │ │ (APFS) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
The Automation Hierarchy
- Quartz Event Taps (Low Level): The lowest layer most automation tools can reach. When you press a key, the kernel’s HID stack sees it and hands it to the window server as a Quartz event; an event tap can observe, modify, or drop that event before any app sees it. Karabiner hooks in even earlier, at the HID layer through a virtual keyboard device, which is why it can remap “Caps Lock” to “Hyper Key” globally.
- Accessibility API (UI Level): This is where Hammerspoon (mostly) and your “UI Inspector” project live. Apps publish a tree of their UI elements (Buttons, Windows, Text Fields). Automation tools “crawl” this tree to click buttons that don’t have keyboard shortcuts.
- Apple Events / OSA (Application Level): This is where AppleScript and JXA live. It’s a formal IPC (Inter-Process Communication) protocol. You don’t “simulate a click” on “Save”; you send a `save` command object to the application. It’s more robust than UI clicking but requires the app to support it.
- launchd (System Level): This is macOS’s `systemd` or `init`. It starts everything, from your Dock to your background scripts.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Event Taps (CGEventTap) | How to intercept, modify, and suppress raw keyboard/mouse events before the OS processes them. |
| The Accessibility Tree | UI is a tree structure (Window -> SplitGroup -> Button). Automating means traversing this tree. |
| Open Scripting Architecture (OSA) | The bridge that allows languages (JavaScript, AppleScript) to send “Events” (Objects/Verbs) to applications. |
| Process Management (launchd) | How the OS manages background tasks, keeps them alive, and triggers them based on paths or time. |
| Coordinate Systems | Screen geometry (0,0 is usually bottom-left in Cocoa, top-left in Quartz/Carbon). Handling multiple displays. |
| The Pasteboard (Clipboard) | It’s not just text. It’s a buffer that holds multiple data types (RTF, String, FileURL) simultaneously. |
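The coordinate-system row is the one that most often causes off-by-a-screen bugs, so here is a minimal sketch of the conversion. It assumes a single display and a hypothetical `flipRectY` helper; `screenHeight` is the display height in points, which your own code would have to supply.

```javascript
// Convert a rectangle between bottom-left-origin (Cocoa-style) and
// top-left-origin (Quartz-style) coordinates on one display.
// flipRectY and its parameter names are hypothetical, for illustration only.
function flipRectY(rect, screenHeight) {
  return {
    x: rect.x,
    // The rectangle's near edge in one system is its far edge in the other,
    // so subtract both the y offset and the rectangle's own height.
    y: screenHeight - rect.y - rect.h,
    w: rect.w,
    h: rect.h,
  };
}
```

The same function converts in both directions: applying it twice returns the original rectangle. With several displays the arithmetic is usually done relative to the primary display, which is an extra assumption to verify on your setup.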
Deep Dive Reading by Concept
| Concept | Book | Chapter |
|---|---|---|
| Apple Events & OSA | AppleScript: The Definitive Guide (Matt Neuburg) | Ch. 2 “The AppleScript Model”, Ch. 19 “Scripting Applications” |
| System Services & launchd | macOS Internals, Vol I: User Mode (Jonathan Levin) | Section on launchd and XPC (Advanced) |
| Shell & CLI Integration | Wicked Cool Shell Scripts (Dave Taylor) | Ch. 1 “The Missing Code Library”, Ch. 8 “OS X Scripts” |
| Lua Scripting (Hammerspoon) | Programming in Lua (Roberto Ierusalimschy) | Ch. 1-6 (Basics), Ch. 24 (C API - to understand how it binds) |
| UI & Event Handling | macOS Programming for Absolute Beginners (Wallace Wang) | Ch. 5 “Handling Events” (for the native perspective) |
| Input & Vim Philosophy | Practical Vim (Drew Neil) | Ch. 1 “The Vim Way” (Conceptual basis for Project 5) |
Project 1: Window Tiling Manager with Hammerspoon
- Main Programming Language: Lua
- Software or Tool: Hammerspoon
- Difficulty: Intermediate
What you’ll build: A complete window tiling system that responds to hotkeys to snap windows to halves, thirds, quarters, and custom grid positions—like Rectangle or Magnet, but built from scratch.
Real World Outcome
You press Ctrl + Alt + Left, and your chaotic window instantly snaps to fill exactly the left 50% of your screen. You press Ctrl + Alt + Right on another window, and it fills the right 50%. You have created order from chaos with a single keystroke.
```lua
-- How it looks in your config:
hs.hotkey.bind({"ctrl", "alt"}, "Left", function()
  local win = hs.window.focusedWindow()
  if not win then return end        -- nothing focused (e.g. empty desktop)
  local f = win:frame()
  local max = win:screen():frame()  -- usable area, excluding menu bar and Dock

  f.x = max.x
  f.y = max.y
  f.w = max.w / 2
  f.h = max.h
  win:setFrame(f)
end)
```
The Core Question You’re Answering
“How can one program control the geometry and placement of another, completely unrelated program’s window?”
Concepts You Must Understand First
- The Window Object Model: Windows aren’t just pixels; they are objects with properties (`x`, `y`, `width`, `height`, `id`).
- Screen Coordinate Systems: macOS handles multiple monitors by placing them on a virtual coordinate plane. One monitor is at `(0,0)`, another might be at `(-1920, 0)`.
- Event Loops: Hammerspoon sits in a loop, waiting for specific keystrokes to trigger your functions.
Questions to Guide Your Design
- Multi-Monitor Logic: If I press “Right” on a window that is already on the right edge of Monitor 1, should it wrap to the left edge of Monitor 2?
- State Management: How do I implement “Restore” (undoing the last snap)? I need to save the window’s previous frame before changing it.
- Animation: Do I want the window to “slide” (animate) or “teleport” (instant)? What are the performance implications?
Thinking Exercise
Draw a rectangle on a piece of paper representing a 1920x1080 screen.
- Coordinate `(0,0)` is top-left.
- Draw a window at `x: 200, y: 200, w: 800, h: 600`.
- Calculate the new `x, y, w, h` to make it occupy the “Top Right Quadrant”.
- Hint: `x` will be `1920 / 2`, `y` will be `0`.
The Interview Questions They’ll Ask
- “How does macOS handle the Menu Bar’s height when calculating available screen space?” (Hint: `screen:frame()` vs `screen:fullFrame()`)
- “What is the difference between Accessibility APIs and standard Window APIs?”
- “How would you debounce a hotkey event to prevent accidental double-triggering?”
Hints in Layers
- Layer 1: Make a window go to the top-left corner (0,0).
- Layer 2: Make it take up exactly 50% width.
- Layer 3: Abstract the math. Write a function `moveWindow(x_ratio, y_ratio, w_ratio, h_ratio)`.
- Layer 4: Handle multiple screens. Use `win:screen()` to get the reference frame.
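The ratio math from Layers 3 and 4 can be sketched independently of Hammerspoon. The block below uses JavaScript only to show the arithmetic; `frameFromRatios` and the plain `{x, y, w, h}` objects are hypothetical stand-ins, and in a real config the same four lines would run in Lua against the screen frame.

```javascript
// Compute a target window frame from ratios of a screen's usable frame.
// `screen` is a plain {x, y, w, h} object standing in for the screen frame.
function frameFromRatios(screen, xRatio, yRatio, wRatio, hRatio) {
  return {
    // Offsetting by screen.x / screen.y is what makes this work on a
    // secondary monitor whose origin is not (0, 0).
    x: screen.x + screen.w * xRatio,
    y: screen.y + screen.h * yRatio,
    w: screen.w * wRatio,
    h: screen.h * hRatio,
  };
}
```

For the thinking exercise, `frameFromRatios({x: 0, y: 0, w: 1920, h: 1080}, 0.5, 0, 0.5, 0.5)` gives the top-right quadrant: `x: 960, y: 0, w: 960, h: 540`. Passing a screen at `x: -1920` shows why the screen origin has to be added rather than assumed to be zero.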
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Lua Syntax | Programming in Lua | Ch. 1-5 |
| API Reference | Hammerspoon Docs | hs.window, hs.screen |
Project 2: Application Launcher with Fuzzy Search (AppleScript + JXA)
- Main Programming Language: JavaScript (JXA)
- Software or Tool: Script Editor, osascript
- Difficulty: Intermediate
What you’ll build: A Spotlight-alternative launcher that indexes your applications and custom shortcuts, providing fuzzy search through a minimal UI.
Real World Outcome
You run a script from your terminal or a hotkey. A dialog box pops up asking “Run what?”. You type “pho”. From those three letters alone, the script identifies “Adobe Photoshop” and “iPhone Simulator”. You pick Photoshop, hit Enter, and it launches.
$ osascript launcher.js
> [Dialog: "Launch app..."]
> User types: "code"
> [Launching Visual Studio Code...]
The Core Question You’re Answering
“How does the OS find applications, and how can I construct a ‘search engine’ for my local file system?”
Concepts You Must Understand First
- The Application Bundle (`.app`): An app isn’t a file; it’s a directory masquerading as a file.
- JXA (JavaScript for Automation): It’s JavaScript, but with bindings to macOS internal objects like `Application` and `Path`.
- Fuzzy Matching: A string algorithm that accepts a query whose characters appear in order in the target, even when they are not adjacent, and ranks tighter matches higher (e.g., “vs” matches “Visual Studio”).
Questions to Guide Your Design
- Indexing Speed: Iterating through `/Applications` every time you press the hotkey is slow. How can you cache the list of apps?
- Handling Aliases: Some items in `/Applications` are symlinks. How do you resolve them?
- UI Limitations: Standard AppleScript dialogs are ugly and limited. Can you use a list selection dialog (`choose from list`) instead?
Thinking Exercise
Write a pseudo-code function for “Fuzzy Match”:
- Input: `query` (“term”), `target` (“Terminal”)
- Algorithm: Does “T” exist in “Terminal”? Yes, at index 0. Does “e” exist after index 0? Yes. Does “r” exist after that?
- Score it based on how close the letters are.
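One possible answer to the exercise, as a minimal sketch rather than a reference algorithm. The scoring rule (2 points for a letter that directly follows the previous match, 1 point otherwise) is an arbitrary assumption, and `fuzzyScore` and `rankApps` are hypothetical names.

```javascript
// Score how well `query` matches `target` as an in-order subsequence.
// Returns null when some query letter cannot be found; higher is better.
function fuzzyScore(query, target) {
  const q = query.toLowerCase();
  const t = target.toLowerCase();
  let score = 0;
  let lastIndex = -1;
  for (const ch of q) {
    const index = t.indexOf(ch, lastIndex + 1);
    if (index === -1) return null; // a query letter is missing: no match
    // Letters right after the previous match (or at the very start of the
    // target) earn 2 points; letters reached across a gap earn 1.
    score += index === lastIndex + 1 ? 2 : 1;
    lastIndex = index;
  }
  return score;
}

// Keep only the names that match, best score first.
function rankApps(query, names) {
  return names
    .map((name) => ({ name, score: fuzzyScore(query, name) }))
    .filter((entry) => entry.score !== null)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.name);
}
```

Because the function takes the first available occurrence of each letter, it can miss a tighter match later in the string. That is an acceptable trade-off for a launcher with a few hundred entries, but worth knowing before you tune the scores.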
The Interview Questions They’ll Ask
- “What is the Scripting Bridge in macOS?”
- “Why might JXA be preferred over AppleScript for complex logic like string parsing?”
- “How does macOS determine the default application to open a file?”
Hints in Layers
- Layer 1: Use `NSFileManager` (through JXA’s Objective-C bridge) to list the names of files in `/Applications`.
- Layer 2: Filter that list using `string.includes()`.
- Layer 3: Use `app.includeStandardAdditions = true` to show a `chooseFromList` dialog.
- Layer 4: Implement a cache (save the list to a JSON file in `/tmp`) to make it instant.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| JXA Fundamentals | JXA Cookbook (Wiki) | Basics, File System |
| String Algos | Grokking Algorithms | (General search concepts) |
Project 3: Daily Standup Automator with Shortcuts + Shell Scripts
- Main Programming Language: Shell (Bash/Zsh)
- Software or Tool: Shortcuts, launchd
- Difficulty: Beginner
What you’ll build: An automated workflow that prepares your digital environment for work every morning—opening specific apps, positioning them, and posting a status update.
Real World Outcome
It’s 8:55 AM. You sit down with your coffee. Suddenly, without you touching anything:
- Slack opens and switches to `#daily-standup`.
- Terminals open to your project directories.
- Browser opens `localhost:3000` and Jira.
- A notification says: “Work mode engaged. DND on.”
The Core Question You’re Answering
“How do I bridge the gap between user-friendly ‘Shortcuts’ and powerful ‘Shell Scripts’ to orchestrate multiple applications?”
Concepts You Must Understand First
- URL Schemes: How to open an app to a specific page using `slack://` or `jira://`.
- The `open` command: The Swiss Army knife of macOS CLI (`open -a "Google Chrome" "http://google.com"`).
- Shortcuts Actions: How to run a shell script block inside a visual Shortcut.
Questions to Guide Your Design
- Context Awareness: How can I check if it’s a weekend or a holiday before running?
- Wait Times: Apps take time to launch. If I try to arrange windows before the app is open, it will fail. How do I add smart delays?
- Idempotency: If Slack is already open, what happens? (The `open` command handles this, but your logic might need to adjust.)
Thinking Exercise
Map out your perfect morning. Write down every click you make.
- Click Docker icon -> Wait -> Click VS Code -> File -> Open Recent…
- Now translate each “Click” into a CLI command: `open -a Docker`, `code ~/Projects/MyApp`.
The Interview Questions They’ll Ask
- “What is the difference between `open` in macOS and executing a binary directly?”
- “How do you pass arguments from a Shortcuts workflow into a shell script?”
- “How would you schedule this to run only on weekdays?”
Hints in Layers
- Layer 1: Create a shell script that opens 3 apps.
- Layer 2: Use `osascript -e 'tell application "System Events"...'` inside your shell script to arrange the windows.
- Layer 3: Create a macOS Shortcut that runs this script (“Run Shell Script” action).
- Layer 4: Add a “Date” check in the Shortcut to skip weekends.
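If you would rather keep the weekend and holiday check in code than in the Shortcut, the logic is small. This is a sketch under assumptions: `shouldRunStandup` is a hypothetical name, and `holidays` is a list of `YYYY-MM-DD` strings that you would maintain yourself.

```javascript
// Return true when the morning routine should run for the given date.
// `holidays` is a hypothetical, hand-maintained list of "YYYY-MM-DD" strings.
function shouldRunStandup(date, holidays = []) {
  const day = date.getDay(); // 0 = Sunday, 6 = Saturday, in local time
  if (day === 0 || day === 6) return false;
  const pad = (n) => String(n).padStart(2, "0");
  // Build the key from local date parts so it matches the local weekday.
  const key = `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
  return !holidays.includes(key);
}
```

A script would call `shouldRunStandup(new Date(), holidays)` first and exit early when it returns `false`, which keeps the rest of the routine free of date logic.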
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Shell Automation | Wicked Cool Shell Scripts | Ch. 8 “OS X Scripts” |
| CLI Basics | The Linux Command Line | Ch. 1-5 |
Project 4: Clipboard History Manager with JXA
- Main Programming Language: JavaScript (JXA)
- Software or Tool: osascript, Script Editor
- Difficulty: Intermediate
What you’ll build: A background daemon that watches your clipboard. Every time you copy something, it saves it to a list. You can then recall the last 50 items.
Real World Outcome
You are coding. You copy a function name. Then you copy a variable. Then you copy a URL. You realize you need that function name again. Instead of Cmd-Tabbing back, you press Cmd+Shift+V. A list appears with your last 10 copies. You arrow down to the function name, hit Enter, and it pastes into your code.
The Core Question You’re Answering
“How does the OS handle data transfer between apps (The Pasteboard), and how can I persist that volatile data?”
Concepts You Must Understand First
- NSPasteboard: The macOS class that manages the clipboard. It has a “change count” that increments every time the clipboard changes.
- Polling vs Events: macOS doesn’t easily push “Clipboard Changed” events to scripts. You often have to “poll” (check repeatedly) the change count.
- Serialization: Storing the history in a way that survives a reboot (JSON/Plain Text).
Questions to Guide Your Design
- Performance: If I check the clipboard every 0.1 seconds, will it drain the battery? (Polling frequency matters).
- Security: I just copied my password from 1Password. Does my script save it to a plain text file? (How to detect “Sensitive” data).
- Data Types: What happens if I copy an image? Does my script crash trying to save binary data as text?
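The security and data-type questions both come down to looking at the pasteboard's type list before touching its contents. The sketch below assumes the community convention documented at nspasteboard.org, where password managers may tag sensitive items with marker types. Treat those identifiers as an assumption to verify against the apps you use, and `classifyClip` as a hypothetical helper.

```javascript
// Marker types that (by community convention, not an Apple API) ask
// clipboard managers not to record the current item. Verify for your apps.
const SKIP_MARKERS = [
  "org.nspasteboard.ConcealedType",
  "org.nspasteboard.TransientType",
];

// `types` is the list of type identifiers on the pasteboard for the current
// item; how you obtain that list is up to your script.
function classifyClip(types) {
  if (types.some((t) => SKIP_MARKERS.includes(t))) return "skip";
  if (types.includes("public.utf8-plain-text")) return "text";
  return "non-text"; // images, files, etc.: store a placeholder, not raw bytes
}
```

Checking markers first means a concealed password is skipped even though it is also plain text. Not every app sets these markers, so this reduces the risk rather than removing it.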
Thinking Exercise
Design the loop:
- `lastChangeCount = 0`
- Loop forever:
  - `currentChangeCount = getPasteboardChangeCount()`
  - If `current != last`:
    - `content = getClipboard()`
    - `save(content)`
    - `last = current`
  - `sleep(1)`
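The loop above can be sketched with the pasteboard access injected as functions, which keeps the logic testable without a Mac. `makeClipboardWatcher`, `getChangeCount`, and `getContent` are hypothetical names; in a real script the two getters would wrap the pasteboard calls.

```javascript
// Build a watcher around two injected functions:
//   getChangeCount() -> number that increases when the clipboard changes
//   getContent()     -> the current clipboard content
function makeClipboardWatcher(getChangeCount, getContent, maxItems = 50) {
  // Start from the current count so whatever is already on the clipboard
  // at launch is not recorded as a new copy.
  let lastChangeCount = getChangeCount();
  const history = [];
  return {
    history,
    // Call once per polling interval; returns true when an item was saved.
    tick() {
      const current = getChangeCount();
      if (current === lastChangeCount) return false;
      lastChangeCount = current;
      history.unshift(getContent()); // newest first
      if (history.length > maxItems) history.pop(); // drop the oldest
      return true;
    },
  };
}
```

The driver is then a plain loop that calls `tick()` and sleeps. One deliberate difference from the pseudocode: the counter starts at the current change count rather than `0`, so starting the watcher does not log a stale item.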
The Interview Questions They’ll Ask
- “What is a daemon vs an agent in macOS?”
- “Why is polling generally considered a bad pattern, and why might we be forced to use it here?”
- “How do you handle ‘Transient’ data types in the pasteboard?”
Hints in Layers
- Layer 1: Write a JXA script that prints the clipboard content to console.
- Layer 2: Wrap it in a `while` loop with `delay(1)`. Check if content changed.
- Layer 3: Save the items to an array in memory.
- Layer 4: Write the array to `~/.clipboard_history.json`.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| JXA System Access | JXA Cookbook | “System Events”, “Clipboard” |
| File I/O | JavaScript: The Definitive Guide | Ch. 11 (Standard Libs) |
Project 5: Hyper Key System with Karabiner + Goku
- Main Programming Language: EDN (Goku DSL) / JSON
- Software or Tool: Karabiner Elements, Goku
- Difficulty: Intermediate
What you’ll build: You will transform the useless “Caps Lock” key into a “Hyper Key” (Cmd+Ctrl+Opt+Shift). This opens up a new layer of keyboard shortcuts that never conflict with system defaults.
Real World Outcome
- Caps Lock (tapped): Functions as `Escape` (great for Vim).
- Caps Lock (held): Functions as `Hyper`.
- Hyper + J: Down arrow (your hands never leave the home row).
- Hyper + C: Launches Chrome.
- Hyper + T: Launches Terminal.
The Core Question You’re Answering
“How can I intercept hardware signals from the keyboard and rewrite them before the OS interprets them?”
Concepts You Must Understand First
- Key Codes: Every key has a numerical ID.
- Modifiers: Shift, Ctrl, Alt, Cmd are flags that modify a key code.
- Complex Modifications: Karabiner’s term for “If A is pressed while B is held, output C”.
- Configuration as Code: Instead of clicking checkboxes, you define your layout in a file (EDN/JSON), allowing version control of your muscle memory.
Questions to Guide Your Design
- Ergonomics: Which keys are easiest to reach while holding Caps Lock with your pinky?
- Layers: Can I have a “Media Layer” where H/J/K/L become VolUp/VolDown?
- Latency: Does intercepting keys introduce lag? (Karabiner is highly optimized, but complex rules can add overhead).
Thinking Exercise
Visualizing the pipeline:
Hardware Keyboard -> USB Signal -> Kernel (IOHID) -> Karabiner (Virtual Device) -> OS (Quartz Event) -> Active App.
You are injecting logic right in the middle.
The Interview Questions They’ll Ask
- “What is the difference between remapping at the hardware level (QMK firmware) vs OS level (Karabiner)?”
- “How does Karabiner distinguish between a ‘tap’ and a ‘hold’?” (The concept of `to_if_alone`.)
- “Why do we need a ‘Virtual Keyboard’ device?”
Hints in Layers
- Layer 1: Install Karabiner. Remap “Caps Lock” to “Right Control” via the UI.
- Layer 2: Install Goku. Create `karabiner.edn`. Define the Caps Lock -> Hyper rule.
- Layer 3: Add “Navigation Layer” (HJKL -> Arrows).
- Layer 4: Add “App Launching Layer”.
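It helps to see the shape of the JSON rule that the Caps Lock -> Hyper definition ultimately becomes. The sketch below builds one such rule as a JavaScript object; `hyperKeyRule` is a hypothetical helper, and the exact JSON that Goku emits from your EDN may differ in detail.

```javascript
// Build a Karabiner "complex modification" rule as a plain object.
function hyperKeyRule() {
  return {
    description: "Caps Lock: Escape when tapped, Hyper when held",
    manipulators: [
      {
        type: "basic",
        from: { key_code: "caps_lock", modifiers: { optional: ["any"] } },
        // Held: Shift plus Cmd+Ctrl+Opt, i.e. the four-modifier "Hyper" chord.
        to: [
          {
            key_code: "left_shift",
            modifiers: ["left_command", "left_control", "left_option"],
          },
        ],
        // Pressed and released on its own: send Escape instead.
        to_if_alone: [{ key_code: "escape" }],
      },
    ],
  };
}
```

`JSON.stringify(hyperKeyRule(), null, 2)` produces text you can compare against the rule in your generated `karabiner.json`, which is a quick way to check what your EDN actually expands to.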
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Modal Editing | Learning the vi and Vim Editors | Ch. 2 “Simple Editing” |
| Config DSLs | Goku Documentation | (Online GitHub Repo) |
Project 6: UI Element Inspector and Automator (Accessibility API)
- Main Programming Language: Swift
- Software or Tool: Xcode
- Difficulty: Advanced
What you’ll build: A tool that explores the hierarchy of UI elements of any running application. It’s an X-Ray for apps. You hover over a button, and your tool tells you “This is Button X, nested inside View Y”. You can then script a click on it.
Real World Outcome
You hover your mouse over the “Post” button in a stubborn app that has no shortcut. Your terminal outputs:
[AXButton] Title: "Post" - Position: (400, 300) - Action: AXPress.
You press a key, and your tool programmatically clicks that button via the Accessibility API, bypassing the mouse driver entirely.
The Core Question You’re Answering
“How do screen readers (for the visually impaired) ‘read’ an application’s interface, and how can I use that same channel for automation?”
Concepts You Must Understand First
- AXUIElement: The fundamental object type in the Accessibility API.
- Process IDs (PID): To talk to an app, you first need to find its process ID.
- Permissions: macOS requires explicit “Accessibility” permission in System Settings for any app attempting this.
Questions to Guide Your Design
- Traversal Strategy: The UI is a tree. Do you use Depth-First Search (DFS) or Breadth-First Search (BFS) to find a button named “Save”?
- Performance: Querying the AX API is expensive. How do you avoid freezing the UI while searching?
- Robustness: What if the app updates and the button moves? (Search by Title vs Search by Hierarchy).
Thinking Exercise
Imagine a Window as a DOM (like HTML).
Window -> SplitView -> [Sidebar, ContentArea].
ContentArea -> ScrollView -> Button.
How do you write a path to that button? /Window/SplitView/ContentArea/ScrollView/Button.
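The path question can be explored on a mock tree before touching the real API. In this sketch the nodes are plain objects standing in for accessibility elements, and `findPath` is a hypothetical helper. The Swift tool would walk real elements instead, but the depth-first traversal is the same.

```javascript
// Depth-first search over a mock accessibility tree.
// Nodes are plain objects: { role, title?, children? }.
// Returns the slash-separated path to the first node matching `predicate`,
// or null when nothing matches.
function findPath(node, predicate, prefix = "") {
  const path = `${prefix}/${node.role}`;
  if (predicate(node)) return path;
  for (const child of node.children || []) {
    const found = findPath(child, predicate, path);
    if (found) return found; // stop at the first match
  }
  return null;
}
```

Searching by `title` rather than by a fixed path is what keeps the automation working when an app update inserts an extra container into the hierarchy. Note that a role-only path is ambiguous when siblings share a role, so a real tool would likely add an index per level.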
The Interview Questions They’ll Ask
- “What is the role of `AXUIElementCreateApplication`?”
- “Why are Accessibility calls often slow, and how does the `AXTimeout` come into play?”
- “How do you inspect an element under the mouse cursor?” (`AXUIElementCreateSystemWide` -> `AXUIElementCopyElementAtPosition`)
Hints in Layers
- Layer 1: Write a Swift CLI that takes a PID and prints the Title of the main window.
- Layer 2: Implement a function to print the children of a given element.
- Layer 3: Make it recursive to print the whole tree.
- Layer 4: Use `CGEvent` to get mouse position and find the element under it.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Mac Accessibility | macOS Programming for Absolute Beginners | (Search for Accessibility) |
| Tree Algorithms | Algorithms, 4th Edition | Ch. 5 (Trees) |
Project 7: File Organization Daemon with launchd + AppleScript
- Main Programming Language: AppleScript
- Software or Tool: launchd, Script Editor
- Difficulty: Intermediate
What you’ll build: A “Hazel” clone. A background service that watches your Downloads folder. When a file lands there, it checks the extension. PDFs go to `~/Documents`, JPGs go to `~/Pictures`, and DMGs are mounted automatically.
Real World Outcome
You download invoice.pdf. You do nothing. Three seconds later, the file disappears from Downloads and appears in Documents/Invoices/2024/. A notification slides in: “Filed invoice.pdf”.
The Core Question You’re Answering
“How can I set up ‘event listeners’ on the file system itself, so scripts run automatically when files are modified?”
Concepts You Must Understand First
- Folder Actions: The legacy way macOS handles this (easy but limited).
- launchd WatchPaths: The modern system way. `launchd` monitors a path and starts your job when it changes.
- MIME Types vs Extensions: Should you trust `.jpg`, or check the file header?
Questions to Guide Your Design
- Race Conditions: The browser is still downloading the file (partially written). If you move it now, you corrupt it. How do you wait for the download to finish?
- Name Collisions: `invoice.pdf` already exists in the destination. Do you overwrite? Rename (`invoice-1.pdf`)?
- Logging: Since this runs in the background, how do you debug when it fails? (Standard Out/Error logs).
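For the name-collision question, the rename strategy fits in a few lines. This is a sketch with a hypothetical `uniqueName` helper; `exists` is an injected function, so the same logic works whether the check is done through Finder, System Events, or a shell call.

```javascript
// Return `fileName` if it is free, otherwise the first "name-N.ext" that is.
// `exists(name)` must return true when `name` is already taken.
function uniqueName(fileName, exists) {
  if (!exists(fileName)) return fileName;
  const dot = fileName.lastIndexOf(".");
  const hasExt = dot > 0; // a leading dot (".bashrc") is not an extension
  const base = hasExt ? fileName.slice(0, dot) : fileName;
  const ext = hasExt ? fileName.slice(dot) : "";
  for (let n = 1; ; n++) {
    const candidate = `${base}-${n}${ext}`;
    if (!exists(candidate)) return candidate;
  }
}
```

There is still a small window between choosing the name and moving the file in which another process could take it, so the move itself should also be prepared to fail and retry.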
Thinking Exercise
Write the logic for “Waiting for download”:
- Check file size.
- Sleep 1 second.
- Check file size again.
- If Size1 == Size2, download is done. (Or check for a `.download` extension.)
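The steps above can be sketched as one function with the file-size lookup and the sleep injected. `waitUntilStable`, `getSize`, and `sleep` are hypothetical names; in JXA the sleep could be the built-in `delay`, and the size lookup whatever file API you prefer.

```javascript
// Poll a file's size until two consecutive readings agree.
//   getSize()      -> current size in bytes
//   sleep(seconds) -> blocks for that long
// Returns true once the size is stable and non-zero, false after maxChecks.
function waitUntilStable(getSize, sleep, { intervalSec = 1, maxChecks = 60 } = {}) {
  let previous = getSize();
  for (let i = 0; i < maxChecks; i++) {
    sleep(intervalSec);
    const current = getSize();
    if (current === previous && current > 0) return true; // stopped growing
    previous = current;
  }
  return false; // still changing (or still empty): give up for now
}
```

A stalled download also has a stable size, so this check is a heuristic. Combining it with the temporary-extension check from the last step makes false positives less likely.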
The Interview Questions They’ll Ask
- “What is the structure of a `launchd` plist file?”
- “What is the difference between `RunAtLoad` and `WatchPaths`?”
- “How do you debug a launch agent that refuses to load?” (`launchctl list`, `launchctl error`)
Hints in Layers
- Layer 1: Write an AppleScript that organizes a specific folder once when run.
- Layer 2: Create a `.plist` file in `~/Library/LaunchAgents` that points to your script and watches `~/Downloads`.
- Layer 3: Load it with `launchctl load`.
- Layer 4: Add logic to handle duplicates and “file in use” errors.
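To make the Layer 2 plist less abstract, here is a sketch that generates one. `launchAgentPlist` and all argument values are hypothetical, and the paths should be absolute, on the assumption that launchd does not expand `~` for you. The keys used (`Label`, `ProgramArguments`, `WatchPaths`, `StandardOutPath`, `StandardErrorPath`) are standard launchd keys.

```javascript
// Build the XML text of a launch agent that runs a program when a path changes.
function launchAgentPlist({ label, programArguments, watchPath, logPath }) {
  // Escape the characters that are special in XML text nodes.
  const esc = (s) =>
    s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  const args = programArguments
    .map((a) => `    <string>${esc(a)}</string>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">',
    '<plist version="1.0">',
    "<dict>",
    "  <key>Label</key>",
    `  <string>${esc(label)}</string>`,
    "  <key>ProgramArguments</key>",
    "  <array>",
    args,
    "  </array>",
    "  <key>WatchPaths</key>",
    "  <array>",
    `    <string>${esc(watchPath)}</string>`,
    "  </array>",
    "  <key>StandardOutPath</key>",
    `  <string>${esc(logPath)}</string>`,
    "  <key>StandardErrorPath</key>",
    `  <string>${esc(logPath)}</string>`,
    "</dict>",
    "</plist>",
    "",
  ].join("\n");
}
```

Sending both output streams to a log file is what makes the logging question answerable later: when the agent misbehaves, that file is the first place to look.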
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| launchd | macOS Internals | Section on Daemons |
| Scripting Files | AppleScript: The Definitive Guide | Ch. 22 “The Finder” |
Project 8: Menu Bar Status App with Hammerspoon
- Main Programming Language: Lua
- Software or Tool: Hammerspoon
- Difficulty: Intermediate
What you’ll build: A custom item in your menu bar (top right of screen) that displays exactly what you care about—Crypto prices, next meeting time, or CPU temperature—and reveals a menu of actions when clicked.
Real World Outcome
In your menu bar, you see: BTC: $95k | 🌡️ 65°C.
You click it. A dropdown appears:
- “Refresh Now”
- “Open Exchange”
- “Kill High CPU Process”
The Core Question You’re Answering
“How can I create native-feeling UI elements (Status Items) using a scripting language?”
Concepts You Must Understand First
- hs.menubar: The Hammerspoon module that creates menu items.
- Asynchronous Callbacks: You can’t block the main thread to fetch Bitcoin prices (HTTP request), or Hammerspoon will freeze and stop responding to your hotkeys and menu clicks. You need async HTTP.
- Styled Text: Creating attributed strings (bold, colors) for the menu title.
Questions to Guide Your Design
- Real Estate: The menu bar is crowded (especially on MacBooks with notches). How do you keep your info compact?
- Polling Frequency: Updating CPU temp every second is fine. Updating Bitcoin price every second is an API rate limit ban. How do you manage different timers?
- Interactivity: What happens when I click? Can I modify the menu dynamically based on the shift key?
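One way to think about the polling-frequency question is a single tick that asks which data sources are due. The sketch below shows only that bookkeeping; `makeRefreshPlan` and the source names are hypothetical. In Hammerspoon you might simply create one timer per source, which is an equally valid design.

```javascript
// Track when each source last refreshed and report which ones are due.
// `sources` is a list of { name, everySec } objects.
function makeRefreshPlan(sources) {
  const lastRun = new Map();
  // Returns the names of sources whose interval has elapsed at `nowSec`,
  // and records them as refreshed at that time.
  return function due(nowSec) {
    const names = [];
    for (const { name, everySec } of sources) {
      const last = lastRun.get(name);
      if (last === undefined || nowSec - last >= everySec) {
        lastRun.set(name, nowSec);
        names.push(name);
      }
    }
    return names;
  };
}
```

With a 1-second CPU source and a 60-second price source, the price API is hit once a minute no matter how often the tick runs, which is the property that avoids the rate-limit ban.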
Thinking Exercise
Design the data structure for the menu:
menuData = {
{ title = "Refresh", fn = refreshData },
{ title = "-" }, -- Separator
{ title = "Details", fn = openDetails }
}
The Interview Questions They’ll Ask
- “Why must UI updates happen on the main thread?”
- “How does `hs.styledtext` differ from standard strings?”
- “What is the impact of excessive polling on CPU wake cycles and battery life?”
Hints in Layers
- Layer 1: Create a static menu item that says “Hello”.
- Layer 2: Use `hs.timer` to update the text to the current time every minute.
- Layer 3: Use `hs.http` to fetch an API and update the text with the result.
- Layer 4: Add a click menu to trigger actions.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Async Logic | Programming in Lua | Ch. 9 (Coroutines/Async concepts) |
| Hammerspoon | Official Docs | hs.menubar, hs.http |
Project 9: Text Expansion Engine with Karabiner + JXA
- Main Programming Language: JavaScript (JXA) + EDN
- Software or Tool: Karabiner, Script Editor
- Difficulty: Advanced
What you’ll build: A system where typing `;;em` automatically backspaces and replaces it with your email address. Typing `;;date` inserts today’s date (for example, `2025-01-15`).
Real World Outcome
You are in a text field. You type ;;sig.
The system detects the trigger.
It simulates Backspace x 5.
It types out:
“Best Regards,
Douglas
Sent from my custom automation engine.”
The Core Question You’re Answering
“How can I track a buffer of recently pressed keys globally and inject keystrokes in response?”
Concepts You Must Understand First
- Input buffer: You need to remember the last N keys pressed.
- Keystroke Simulation: Generating synthetic key events (pressing ‘a’ via code).
- Race Conditions: You type fast. If you type `;;em` and then immediately `Space`, will the expander delete the `Space` too?
Questions to Guide Your Design
- Trigger Prefix: Why `;;`? To avoid accidental triggers (you don’t want to expand “the” every time you type “them”).
- Clipboard method vs Typing method: For long text, typing character-by-character is slow. It’s faster to set the Clipboard and simulate `Cmd+V`.
- Cursor Restoration: If you paste text, the cursor ends up at the end. What if you want to expand a snippet and place the cursor in the middle? (e.g., `<div>|</div>`).
Thinking Exercise
Trace the logic:
- User types `;`. Buffer: `[;]`
- User types `;`. Buffer: `[;, ;]`
- User types `d`. Buffer: `[;, ;, d]` -> Check matches? No.
- User types `a`. Buffer: `[;, ;, d, a]`
- User types `t`. Buffer: `[;, ;, d, a, t]`
- User types `e`. Buffer: `[;, ;, d, a, t, e]` -> Match found! `;;date`.
- Action: Send `Delete` x 6. Send “2024…”.
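The trace above maps directly onto a small state machine. This is a sketch with hypothetical names (`makeExpander`, `feed`), and it covers only the matching logic. Capturing real keystrokes and sending the deletes is the part that the event-tap layer has to provide.

```javascript
// Build an expander from a map of trigger -> function returning the text.
// feed(char) returns null, or { deleteCount, text } when a trigger completes.
function makeExpander(snippets, maxBuffer = 32) {
  let buffer = "";
  return function feed(char) {
    buffer = (buffer + char).slice(-maxBuffer); // remember only the last N keys
    for (const [trigger, expand] of Object.entries(snippets)) {
      if (buffer.endsWith(trigger)) {
        buffer = ""; // reset so the expansion cannot re-trigger itself
        // The caller deletes the typed trigger, then inserts the text.
        return { deleteCount: trigger.length, text: expand() };
      }
    }
    return null;
  };
}
```

Storing each expansion as a function rather than a string lets dynamic snippets like the date be computed at the moment of expansion. One design caveat: if one trigger is a prefix of another (`;;d` and `;;date`), the shorter one always fires first, so triggers need to be chosen with that in mind.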
The Interview Questions They’ll Ask
- “What are the security implications of a global key logger (which this essentially is)?”
- “How do you handle ‘Secure Input’ fields (password boxes) where event taps are disabled?”
- “Why is the Clipboard paste method faster than keystroke simulation?”
Hints in Layers
- Layer 1: Use Karabiner to map a specific key (F6) to paste a string.
- Layer 2: Use a Karabiner “Complex Modification” to detect a sequence (`semicolon`, `d`, `a`, `t`, `e`).
- Layer 3: Trigger a shell command that runs a JXA script to calculate the date and paste it.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Key Events | macOS Internals | Input Processing |
| Text Processing | Regular Expressions Cookbook | (For pattern matching) |
Project 10: Browser Automation Suite with JXA + Chrome DevTools
- Main Programming Language: JavaScript (JXA)
- Software or Tool: Google Chrome / Safari
- Difficulty: Advanced
What you’ll build: Scripts that drive your web browser. Open specific tabs, scrape data from a page, fill out forms, and click buttons—all without using Selenium or Puppeteer, just native JXA.
Real World Outcome
You run login-bank.js. Chrome opens. It navigates to your bank. It waits for the load. It fills in your username (fetched securely). It focuses the password field. You just type your password and hit enter. You saved 15 seconds of clicking.
The Core Question You’re Answering
“How can desktop scripts interact with the DOM (Document Object Model) inside a web browser?”
Concepts You Must Understand First
- The `do JavaScript` and `execute` commands: The bridge that allows AppleScript/JXA to execute raw JS inside the context of a browser tab (`do JavaScript` in Safari, `execute` in Chrome).
- DOM Selectors: `document.querySelector('#id')`.
- Async Loading: The script runs instantly, but the page takes 2 seconds to load. How do you wait?
Questions to Guide Your Design
- Browser Specifics: Safari and Chrome handle AppleScript differently. Safari exposes `do JavaScript`, targeted at a `document` or tab; Chrome exposes `execute` on a `tab`. How do you abstract this?
- Security: Browsers try to prevent external scripts from hijacking sessions. You might need to enable “Allow JavaScript from Apple Events” in the Developer menu.
- Data Extraction: How do you get the result of the JS execution (e.g., the text of a div) back into your JXA variable?
Thinking Exercise
Write the nested JS:
JXA: `tab.execute({javascript: "document.title"})` -> Chrome evaluates `document.title` -> Returns “Welcome” -> JXA receives “Welcome”. (The injected code is an expression, not a function body, so it needs no `return`.)
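Because the inner script travels as a string, quoting is where this kind of nesting usually breaks. The sketch below builds the injected code safely; `buildFillScript` is a hypothetical helper, and the string it returns is what you would hand to the browser's scripting command.

```javascript
// Build a self-contained JavaScript snippet that fills one form field.
// The snippet evaluates to "ok" or "not-found" so the caller gets a result.
function buildFillScript(selector, value) {
  // JSON.stringify yields valid JavaScript string literals, so quotes and
  // backslashes in the inputs cannot break out of the injected code.
  const sel = JSON.stringify(selector);
  const val = JSON.stringify(value);
  return (
    `(function () {` +
    ` var el = document.querySelector(${sel});` +
    ` if (!el) return "not-found";` +
    ` el.value = ${val};` +
    ` return "ok";` +
    ` })()`
  );
}
```

Wrapping the code in an immediately invoked function makes the whole snippet a single expression, so its value is what comes back to JXA. Returning a status string instead of throwing keeps the desktop side simple: it can poll until the answer is `"ok"`, which also handles pages that are still loading.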
The Interview Questions They’ll Ask
- “Why is `do JavaScript` safer/lighter than running a full Selenium webdriver?”
- “How do you handle pages that are Single Page Applications (SPAs) and load content dynamically?”
- “What is the ‘Same Origin Policy’ and does it affect JXA injection?”
Hints in Layers
- Layer 1: Write a script to open a URL in a new tab.
- Layer 2: Use `execute` to run `alert('hello')` in that tab.
- Layer 3: Use `execute` to find a form field and set its `.value`.
- Layer 4: Build a “read-it-later” script that grabs the page URL and Title and appends it to a markdown file.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| DOM Scripting | JavaScript: The Definitive Guide | Ch. 15 “Scripting Documents” |
| Chrome Scripting | Chrome Dictionary | (Open in Script Editor) |
Project 11: Voice Command System with Shortcuts + Dictation
- Main Programming Language: AppleScript / Shortcuts
- Software or Tool: Voice Control, Shortcuts
- Difficulty: Intermediate
What you’ll build: A personal voice assistant that actually works. You say “Computer, Work Mode”, and it triggers your Project 3 automation.
Real World Outcome
You walk into the room. You say “Wake up”. The Mac wakes from sleep. You say “Set scene coding”. Spotify starts a lo-fi playlist, VS Code opens, and your phone goes to DND.
The Core Question You’re Answering
“How can I hook into the OS’s accessibility voice recognition engine to trigger arbitrary code?”
Concepts You Must Understand First
- Voice Control Commands: macOS has a built-in “Voice Control” feature (Accessibility). You can define custom vocabulary.
- Dictation vs Command: Distinguishing between “Type what I say” and “Do what I say”.
- Shortcuts CLI: Running shortcuts from the terminal (`shortcuts run "My Shortcut"`).
Questions to Guide Your Design
- False Positives: You don’t want it to trigger when you’re on a Zoom call saying “I need to work”. (Trigger phrases/Wake words).
- Feedback: How do you know it heard you? (Audio chirp or visual flash).
- Latency: Voice processing takes time. How do you make it feel snappy?
Thinking Exercise
Design the command grammar:
- “[Computer | Mac] [Please] [Open | Start] [App Name | Workflow Name]”
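If you ever receive the spoken phrase as free text (for example from dictation) rather than as a predefined Voice Control command, a grammar of this shape can be checked with one regular expression. This is a sketch; `COMMAND_PATTERN` and `parseCommand` are hypothetical names, and the wake words and verbs are just the ones from the exercise.

```javascript
// Wake word, optional "please", a verb, then the app or workflow name.
const COMMAND_PATTERN =
  /^(?:computer|mac)[,\s]+(?:please\s+)?(open|start)\s+(.+)$/i;

// Returns { verb, target } for a command, or null for ordinary speech.
function parseCommand(utterance) {
  const match = COMMAND_PATTERN.exec(utterance.trim());
  if (!match) return null; // no wake word or no known verb: ignore it
  return { verb: match[1].toLowerCase(), target: match[2].trim() };
}
```

Requiring the wake word at the very start of the utterance is the simplest defence against the false-positive problem: “I need to work” on a call never matches, while “Computer, please open Work Mode” does.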
The Interview Questions They’ll Ask
- “How does on-device voice processing differ from cloud-based (Siri) in terms of privacy and latency?”
- “How can you chain a voice command to a shell script?”
Hints in Layers
- Layer 1: Enable Voice Control in System Settings.
- Layer 2: Create a new Command “Work Mode” that runs a specific Shortcut.
- Layer 3: Create a Shortcut that runs a shell script to perform complex actions.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Voice Interaction | Designing Voice User Interfaces | (General Principles) |
| Shortcuts | Take Control of Shortcuts | (Ebook) |
Project 12: Notification Center Automation with JXA
- Main Programming Language: JavaScript (JXA)
- Software or Tool: osascript
- Difficulty: Intermediate
What you’ll build: Scripts that can send rich notifications (with buttons and sounds) and scripts that can toggle “Do Not Disturb” programmatically based on your calendar or CPU usage.
Real World Outcome
Your script finishes a long backup. A notification appears: “Backup Complete. Size: 4GB”. It has two buttons: “View Log” and “Dismiss”. You click “View Log”, and Console.app opens to the specific log file.
The Core Question You’re Answering
“How do I provide user feedback from background scripts and interact with system-wide notification settings?”
Concepts You Must Understand First
- UserNotifications Framework: The modern API for notifications (requires Swift/ObjC usually, but JXA has wrappers).
- Display Notification: The basic AppleScript command (no buttons).
- Do Not Disturb / Focus Modes: These are managed by system state, often accessible via defaults or specific APIs.
Questions to Guide Your Design
- Rich Content: Standard display notification is text-only. How do I get buttons? (You might need a small helper binary like terminal-notifier or write a Swift helper).
- Urgency: How to ensure a “Server Down” alert breaks through DND?
- Persistence: If the user is away, does the notification stay on screen? (alert vs banner style).
Thinking Exercise
Compare display alert (blocking, steals focus) vs display notification (non-blocking, passive). When should you use which?
The Interview Questions They’ll Ask
- “How do you handle user interaction (button clicks) from a notification in a stateless script?”
- “What is the XPC mechanism for notification delivery?”
Hints in Layers
- Layer 1: app.displayNotification("Hello") (set app.includeStandardAdditions = true first).
- Layer 2: Use app.displayAlert to get “OK/Cancel” buttons and handle the response.
- Layer 3: Install terminal-notifier (brew) to send notifications with custom actions and URLs.
- Layer 4: Script System Settings via UI Scripting to toggle Focus Modes.
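If your background job is a shell script rather than JXA, the same basic banner is reachable through osascript. The sketch below builds the AppleScript source separately so the quoting can be checked without posting anything; the helper names as_quote, notification_script, and notify are hypothetical.

```shell
#!/bin/sh
# as_quote: wrap text in double quotes for AppleScript, escaping \ and ".
as_quote() {
    printf '"%s"' "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

# notification_script: build AppleScript source for a passive banner.
# Arguments: title, message.
notification_script() {
    printf 'display notification %s with title %s' \
        "$(as_quote "$2")" "$(as_quote "$1")"
}

# notify: post the banner; returns 1 when osascript is missing (not macOS).
notify() {
    command -v osascript >/dev/null 2>&1 || return 1
    osascript -e "$(notification_script "$1" "$2")"
}
```

For example, notify "Backup Complete" "Size: 4GB" at the end of a backup script. This still gives you a text-only banner; buttons need one of the helpers mentioned above.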
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| User Interaction | AppleScript: The Definitive Guide | Ch. 15 “User Interaction” |
Project 13: Git Workflow Automator with Shell + Hammerspoon
- Main Programming Language: Shell / Lua
- Software or Tool: Git, Hammerspoon
- Difficulty: Intermediate
What you’ll build: A pervasive Git tool. Your menu bar shows the current branch of the active Finder window or Terminal. A global hotkey opens a “Quick Commit” dialog that auto-stages changes and pushes.
Real World Outcome
You are working in VS Code. You switch tasks. You press Cmd+Opt+G. A popup asks “Message?”. You type “Fix bug”. The tool runs git add ., git commit -m "Fix bug", git push. A notification says “Pushed to main”.
The Core Question You’re Answering
“How can I wrap command-line tools (git) in a GUI to streamline repetitive development tasks?”
Concepts You Must Understand First
- Context Detection: How to know which repo you are in. (If Finder is active -> get folder path. If VS Code is active -> get project path).
- Shell Execution from Lua: hs.execute("git status").
- Parsing Output: Reading the text output of git status --porcelain to know if there are changes.
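To make the parsing step concrete, here is a minimal sketch that summarizes porcelain output. It assumes the version 1 format, where the first column is the staged state, the second is the working-tree state, and "??" marks untracked paths; the function name summarize_porcelain is hypothetical.

```shell
#!/bin/sh
# summarize_porcelain: read `git status --porcelain` text on stdin and print
# "staged=N unstaged=N untracked=N".
summarize_porcelain() {
    awk '
        length($0) < 3           { next }                # skip blank or short lines
        /^\?\?/                  { untracked++; next }   # "??" marks untracked paths
        substr($0, 1, 1) != " "  { staged++ }            # column 1: staged state
        substr($0, 2, 1) != " "  { unstaged++ }          # column 2: working-tree state
        END {
            printf "staged=%d unstaged=%d untracked=%d\n", staged, unstaged, untracked
        }
    '
}
```

From Hammerspoon you could call something like git status --porcelain | summarize_porcelain through hs.execute and show the counts in the menu bar.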
Questions to Guide Your Design
- Safety: git add . is dangerous. What if you added a secret key by accident? How can your tool warn you?
- Authentication: git push might need a password/key. How does your script handle SSH agent or prompts?
- Speed: Running git status on a huge repo can be slow. Don’t block the UI thread.
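For the safety question, one option is to scan the pending paths before staging anything. The sketch below is a filter, not a complete secret scanner; the pattern list is an illustrative assumption and the name risky_paths is hypothetical.

```shell
#!/bin/sh
# risky_paths: read file paths on stdin and print the ones that look like
# secrets (.env files, common private key names, .pem/.key/.p12 files).
# Exit status is 1 when nothing matches, as with plain grep.
risky_paths() {
    grep -E -i '(^|/)(\.env(\..*)?|id_(rsa|ed25519)|.*\.(pem|key|p12))$'
}
```

A Quick Commit tool could run git status --porcelain | cut -c4- | risky_paths and show a confirmation dialog when the output is non-empty. Note that cut -c4- is a simplification that does not handle renamed entries.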
Thinking Exercise
Logic flow:
- Get Active Window Title.
- Resolve to Path.
- Check if .git exists.
- If yes, run git branch --show-current.
- Display result.
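The last three steps of this flow can be sketched in shell once you have a path. This version walks up the directory tree, since the active folder is often a subdirectory of the repo; the function names are hypothetical, and git branch --show-current needs Git 2.22 or newer.

```shell
#!/bin/sh
# find_git_root: walk up from a directory until a .git entry is found.
# Uses -e rather than -d because .git is a file in worktrees and submodules.
find_git_root() {
    dir=$1
    while [ -n "$dir" ] && [ "$dir" != "/" ]; do
        if [ -e "$dir/.git" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        parent=$(dirname "$dir")
        # Stop when dirname no longer changes the path (relative paths).
        if [ "$parent" = "$dir" ]; then
            break
        fi
        dir=$parent
    done
    return 1
}

# current_branch: print the branch of the repo that contains the given path.
current_branch() {
    root=$(find_git_root "$1") || return 1
    git -C "$root" branch --show-current
}
```

Calling current_branch "$HOME/code/project/src" would print the branch name, or return 1 when the folder is not inside a repository (the path here is only an example).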
The Interview Questions They’ll Ask
- “How do you effectively parse CLI output that is designed for humans vs output designed for machines (plumbing vs porcelain)?”
- “How does the PATH environment variable differ between your interactive shell and a background GUI app like Hammerspoon?”
Hints in Layers
- Layer 1: Create a function that runs git status in a fixed directory and prints the result.
- Layer 2: Make it dynamic based on the current Finder folder.
- Layer 3: Add a global hotkey to pop up an input box (hs.dialog.textPrompt) for the commit message.
- Layer 4: Parse the branch name and show it in the menu bar.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Git Internals | Pro Git | Ch. 10 “Git Internals” |
| Text Processing | Wicked Cool Shell Scripts | Ch. 2 “User Creation/Management” (Parsing concepts) |
Project 14: PDF Annotation Automator with AppleScript + Preview
- Main Programming Language: AppleScript
- Software or Tool: Preview.app
- Difficulty: Intermediate
What you’ll build: A batch processor for PDFs. Select 10 files, run the script, and it adds a watermark “CONFIDENTIAL”, merges them into one file, and saves it.
Real World Outcome
You have 50 invoices. You need to stamp them “PAID” and email them. You drag them onto your “Stamp & Send” droplet. The script opens each in Preview, adds the annotation, saves, and attaches them to a new email draft.
The Core Question You’re Answering
“How can I automate document manipulation using the native capabilities of default apps?”
Concepts You Must Understand First
- Preview’s Scripting Dictionary: Preview has limited scripting support. You often have to combine it with System Events (UI scripting) or use Python/shell tools (like cpdf or sips) for the heavy lifting.
- Automator Workflows: Sometimes it’s easier to call an Automator workflow from AppleScript.
- File Iteration: Looping through a list of file aliases.
Questions to Guide Your Design
- Native vs External: Preview might not support “Add Text Annotation” via AppleScript directly. Do you simulate clicks (fragile) or use a CLI tool like imagemagick/python-pdfrw (robust)? Note: This project encourages exploring the limits of native scripting.
- PDF Coordinates: Where is “Bottom Right” on a page that might be rotated?
Thinking Exercise
If Preview doesn’t have an API command for “Watermark”, how do you do it?
- Open File.
- Menu Bar -> Tools -> Annotate -> Text.
- Type “Confidential”.
- Menu Bar -> File -> Save. (This is “UI Scripting”).
The Interview Questions They’ll Ask
- “What are the pros and cons of UI Scripting (simulating menus) vs direct API calls?”
- “How do you handle error recovery if the ‘Save’ dialog pops up unexpectedly?”
Hints in Layers
- Layer 1: Script opening a file in Preview.
- Layer 2: Use System Events to click the “Tools” menu.
- Layer 3: Combine PDFs using a Python script (/System/Library/Automator/.../combine_pdfs.py). It shipped with older macOS releases; macOS 12.3 removed the bundled Python 2, so confirm the script and an interpreter exist on your machine.
- Layer 4: Wrap it in a “Droplet” (an app you can drag files onto).
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| UI Scripting | AppleScript: The Definitive Guide | Ch. 23 “System Events” |
Project 15: macOS Theming Engine with AppleScript + Shell
- Main Programming Language: Shell / AppleScript
- Software or Tool: defaults, System Events
- Difficulty: Advanced
What you’ll build: A “Day/Night” switch on steroids. When triggered, it changes the wallpaper, toggles Dark Mode, changes the system accent color to Blue (Day) or Orange (Night), and changes the font size in Terminal.
Real World Outcome
Sunset triggers. The screen dims. The wallpaper fades from a bright mountain to a starry sky. The harsh blue UI accents turn to a soft amber. Your Terminal theme switches from “Solarized Light” to “Solarized Dark”.
The Core Question You’re Answering
“How does macOS store user preferences (defaults) and how can I modify them on the fly?”
Concepts You Must Understand First
- The defaults system: macOS stores settings in .plist files in ~/Library/Preferences.
- Notification Center Signals: Simply changing the plist often isn’t enough; you have to tell the app to reload its config.
- JXA Application('System Events').appearancePreferences: The high-level API for some settings.
Questions to Guide Your Design
- Discovery: How do you find the secret key for “Highlight Color”? (Hint: defaults read -g before and after changing it).
- Atomic Changes: You want all changes to happen at once, not one by one over 10 seconds.
- Restoration: How do you back up current settings before applying a theme?
Thinking Exercise
Run defaults read com.apple.finder in terminal. Look at the output. Change a setting in Finder Preferences. Run it again. Find the diff. That’s the key you need to script.
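The before/after comparison is easy to script. A minimal sketch, assuming you are happy to write snapshots to temporary files; the function names snapshot_defaults and changed_lines are hypothetical.

```shell
#!/bin/sh
# snapshot_defaults: dump a preferences domain to a file (macOS only).
# Arguments: domain (for example com.apple.finder, or -g), output file.
snapshot_defaults() {
    defaults read "$1" > "$2"
}

# changed_lines: print only the lines that differ between two snapshots.
# "<" lines come from the first file, ">" lines from the second.
changed_lines() {
    diff "$1" "$2" | grep '^[<>]'
}
```

Take a snapshot with snapshot_defaults com.apple.finder /tmp/before, change the setting, snapshot again to /tmp/after, then run changed_lines /tmp/before /tmp/after to see the key that moved.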
The Interview Questions They’ll Ask
- “What is the global domain (-g) in defaults?”
- “Why do some changes require a logout/restart while others are instant?”
- “How does the cfprefsd process relate to preference caching?”
Hints in Layers
- Layer 1: Toggle Dark Mode via JXA.
- Layer 2: Change Wallpaper via System Events.
- Layer 3: Change Highlight Color via defaults write -g AppleHighlightColor ...
- Layer 4: Script Terminal.app to change its specific profile.
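Layers 1 and 2 can also be driven from a shell script through osascript, which is convenient when the theme switch is triggered by a scheduled job. This is a sketch under assumptions: the theme names day and night, the helper names, and the wallpaper folder are all hypothetical.

```shell
#!/bin/sh
# dark_mode_script: AppleScript source that turns Dark Mode on (night) or off (day).
dark_mode_script() {
    case "$1" in
        night) value=true ;;
        day)   value=false ;;
        *)     return 1 ;;
    esac
    printf 'tell application "System Events" to tell appearance preferences to set dark mode to %s' "$value"
}

# wallpaper_script: AppleScript source that sets the picture on every desktop.
wallpaper_script() {
    printf 'tell application "System Events" to tell every desktop to set picture to "%s"' "$1"
}

# apply_theme: run both steps; returns 1 when osascript is missing (not macOS).
# The wallpaper location below is a placeholder, not a system path.
apply_theme() {
    command -v osascript >/dev/null 2>&1 || return 1
    osascript -e "$(dark_mode_script "$1")" || return 1
    osascript -e "$(wallpaper_script "$HOME/Pictures/themes/$1.jpg")"
}
```

Running apply_theme night flips both settings in quick succession. The first run will likely prompt for permission to control System Events.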
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Defaults System | macOS Internals | Configuration |
| System Events | AppleScript: The Definitive Guide | Ch. 23 |
Project 16: Meeting Automator with Calendar + AppleScript
- Main Programming Language: AppleScript
- Software or Tool: Calendar.app
- Difficulty: Intermediate
What you’ll build: A script that checks your calendar. If a meeting starts in 5 minutes, it finds the Zoom/Meet link in the notes and opens it. It creates a seamless “Just in Time” join experience.
Real World Outcome
A notification pops up: “Standup in 2 mins”. You click “Join”. The script parses the calendar event description, finds the Google Meet URL (ignoring the rest of the text), and opens it in Chrome.
The Core Question You’re Answering
“How do I extract structured data (links) from unstructured text (event notes) and act on time-based triggers?”
Concepts You Must Understand First
- Calendar Scripting: Accessing events of calendars.
- Regex (Regular Expressions): Finding https://zoom.us/j/... inside a wall of text.
- Date Math: Calculating startDate - currentDate < 5 minutes.
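The date math is simplest when both times are Unix timestamps in seconds. The sketch below assumes you have already converted the event start to a timestamp (how you get it out of Calendar is up to your script); the function names are hypothetical.

```shell
#!/bin/sh
# minutes_until: whole minutes from "now" until "start".
# Arguments: start timestamp, current timestamp (both in seconds).
minutes_until() {
    echo $(( ($1 - $2) / 60 ))
}

# starts_soon: succeed when the event begins within the next N minutes
# (default 5) and has not started yet.
# Arguments: start timestamp, current timestamp, optional window in minutes.
starts_soon() {
    remaining=$(( $1 - $2 ))
    [ "$remaining" -ge 0 ] && [ "$remaining" -le $(( ${3:-5} * 60 )) ]
}
```

In a real check you would pass "$(date +%s)" as the current timestamp, for example starts_soon "$event_start" "$(date +%s)" && open "$url".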
Questions to Guide Your Design
- Overlapping Meetings: What if you have two meetings at 10 AM? Which one does it pick?
- Zoom vs Google Meet vs Teams: They all have different URL patterns.
- Recurring Events: Does the script handle instances of a recurring event correctly?
Thinking Exercise
Write the Regex for a Zoom link: https:\/\/.*zoom\.us\/j\/\d+.
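To try the pattern from a shell script, grep -E -o can pull the first link out of the event notes. Two adjustments are needed in this sketch: grep -E does not understand \d, so it uses [0-9], and the greedy .* is narrowed to subdomain characters so it cannot swallow surrounding text. The Google Meet pattern (three letter groups separated by hyphens) is an assumption about typical links, and the function name is hypothetical.

```shell
#!/bin/sh
# extract_meeting_url: print the first Zoom or Google Meet link found on stdin.
extract_meeting_url() {
    grep -E -o \
        'https://([A-Za-z0-9-]+\.)*zoom\.us/j/[0-9]+(\?[^[:space:]]*)?|https://meet\.google\.com/[a-z]+-[a-z]+-[a-z]+' \
        | head -n 1
}
```

The optional query string keeps a passcode parameter attached to Zoom links. Teams links would need a third alternative.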
The Interview Questions They’ll Ask
- “How does AppleScript handle date objects compared to JavaScript?”
- “What are the privacy implications of a script accessing your entire calendar?”
Hints in Layers
- Layer 1: Get the list of today’s events from Calendar.
- Layer 2: Loop through and print the description of the next event.
- Layer 3: Apply Regex to extract the URL.
- Layer 4: Set up a launchd agent to run this check every 5 minutes.
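For Layer 4, the agent is a small property list with a StartInterval in seconds. The sketch below generates one from shell; the label, script path, and function name are placeholders you would replace with your own.

```shell
#!/bin/sh
# write_launch_agent: write a LaunchAgent plist that runs a script on a timer.
# Arguments: label, absolute script path, interval in seconds, output file.
write_launch_agent() {
    cat > "$4" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>$1</string>
    <key>ProgramArguments</key>
    <array>
        <string>$2</string>
    </array>
    <key>StartInterval</key>
    <integer>$3</integer>
</dict>
</plist>
EOF
}
```

For a five-minute check you might call write_launch_agent com.example.meeting-check "$HOME/bin/meeting-check.sh" 300 "$HOME/Library/LaunchAgents/com.example.meeting-check.plist" and then load it with launchctl load on that file.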
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Calendar API | AppleScript: The Definitive Guide | Ch. 20 “Scripting System Apps” |
| Regex | Regular Expressions Cookbook | URL Patterns |
Project 17: Screenshot Workflow Automator
- Main Programming Language: Shell / AppleScript
- Software or Tool: screencapture, sips
- Difficulty: Intermediate
What you’ll build: A replacement for the default screenshot behavior. When you take a screenshot, instead of just saving to Desktop, it prompts you to name it, converts it to a specific format, and copies the file path to clipboard.
Real World Outcome
Cmd+Shift+4. You select an area. A dialog asks: “Name?”. You type “login-bug”. The file is saved as ~/Screenshots/2025-01-15-login-bug.png. The path is in your clipboard, ready to paste into Slack.
The Core Question You’re Answering
“How can I intercept and enhance system-level functions like screen capture?”
Concepts You Must Understand First
- screencapture CLI: The powerful command line tool behind the GUI.
- sips: Scriptable Image Processing System. Built-in image manipulation (resize, convert).
- Folder Actions vs Custom Hotkey: You can either watch the Desktop for new files (Folder Action) or unbind the system screenshot key and bind it to your script (Custom Hotkey).
Questions to Guide Your Design
- Latency: If you use a Folder Action, there is a delay between the file appearing and your script running.
- Retina Displays: Handling standard vs high-DPI screenshots (resizing logic).
- Cleanup: Automatically deleting old screenshots after 30 days.
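The cleanup question has a compact answer with find. A minimal sketch, assuming your screenshots are PNG files in a single folder; the function name prune_screenshots is hypothetical.

```shell
#!/bin/sh
# prune_screenshots: delete PNG files older than N days (default 30).
# Arguments: folder, optional age in days.
prune_screenshots() {
    find "$1" -type f -name '*.png' -mtime +"${2:-30}" -delete
}
```

Running prune_screenshots "$HOME/Screenshots" from a daily launchd agent keeps the folder tidy. Test it on a copy first, since -delete does not use the Trash.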
Thinking Exercise
Command chain:
screencapture -i /tmp/temp.png (Interactive mode) -> mv /tmp/temp.png ~/Screenshots/name.png -> echo path | pbcopy.
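The chain above can be fleshed out into a small script. This sketch takes the name as an argument instead of showing a dialog, and the helper names slugify, screenshot_path, and capture are hypothetical.

```shell
#!/bin/sh
# slugify: lowercase a name and collapse other characters into single hyphens.
slugify() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# screenshot_path: build <dir>/<date>-<slug>.png
# Arguments: directory, date string, human-readable name.
screenshot_path() {
    printf '%s/%s-%s.png\n' "$1" "$2" "$(slugify "$3")"
}

# capture: interactive capture, rename, copy the path to the clipboard.
# macOS only, since it relies on screencapture and pbcopy.
capture() {
    tmp="${TMPDIR:-/tmp}/shot-$$.png"
    screencapture -i "$tmp" || return 1
    # An empty or missing file is treated as a cancelled selection.
    [ -s "$tmp" ] || return 1
    dest=$(screenshot_path "$HOME/Screenshots" "$(date +%Y-%m-%d)" "$1")
    mkdir -p "$HOME/Screenshots" || return 1
    mv "$tmp" "$dest" || return 1
    printf '%s' "$dest" | pbcopy
}
```

Calling capture "Login Bug" saves a file such as ~/Screenshots/2025-01-15-login-bug.png, with the date taken from the day you run it. Slugifying the name keeps the path safe to paste into chat tools and shells.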
The Interview Questions They’ll Ask
- “What is the sips tool used for?”
- “How do you remove the default macOS drop shadow from window screenshots via defaults?” (defaults write com.apple.screencapture disable-shadow -bool true)
Hints in Layers
- Layer 1: Run screencapture -c (clipboard) from terminal.
- Layer 2: Write a script that captures to a file, then opens that file in Preview.
- Layer 3: Use sips to resize the image to 50% width.
- Layer 4: Bind the script to Cmd+Shift+4 in Hammerspoon/Karabiner.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| Image CLI | Wicked Cool Shell Scripts | Ch. 11 “Image Management” |
Project 18: Personal Raycast Clone (Complete Launcher)
- Main Programming Language: Swift
- Software or Tool: Xcode, SwiftUI
- Difficulty: Expert
What you’ll build: A full-featured launcher application. A spotlight-like search bar that pops up in the center of the screen, allowing you to search files, run scripts, and calculate math.
Real World Outcome
You press Cmd+Space (mapping over Spotlight). Your custom UI appears. It’s beautiful, translucent, and fast. You type “calc 50*20”. It shows “1000”. You type “scr”. It suggests your “Screenshot Script”. You hit Enter. It runs.
The Core Question You’re Answering
“How do I build a modern, performant macOS desktop application that integrates all the automation concepts I’ve learned?”
Concepts You Must Understand First
- NSPanel: Creating windows that float above others and don’t activate the dock icon (HUD style).
- Global Hotkey Registration: Using Carbon APIs or libraries like HotKey to listen for Cmd+Space even when your app is in the background.
- Plugin Architecture: Designing your app so it can run external scripts (Projects 1-17) as plugins.
Questions to Guide Your Design
- State: The app needs to be “always running” but “usually hidden”.
- Focus Stealing: When invoked, it must steal input focus immediately. When dismissed, it must return focus to the previous app.
- Sandboxing: If you distribute via App Store, you can’t run shell scripts easily. (You will likely build this as a non-sandboxed app).
Thinking Exercise
Architecture:
- Frontend: SwiftUI View (List of results).
- Backend: Search Controller (matches query to FileSystem, Calculator, ScriptRegistry).
- Trigger: AppDelegate listens for Hotkey -> window.makeKeyAndOrderFront(nil).
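Before writing the Search Controller in Swift, it can help to prototype the matching rule on its own. The sketch below is in shell rather than Swift, to stay consistent with the other examples, and implements only the simplest form of fuzzy matching: the query letters must appear in order somewhere in the candidate. It does no ranking, and the function names are hypothetical.

```shell
#!/bin/sh
# fuzzy_pattern: turn a query such as "scr" into the glob "*s*c*r*".
# Non-alphanumeric characters are dropped so they cannot act as glob syntax.
fuzzy_pattern() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -cd 'a-z0-9\n' \
        | sed 's/./*&/g; s/$/*/'
}

# fuzzy_filter: read candidate names on stdin, print those matching the query.
# Relies on sh/bash treating an unquoted variable in "case" as a pattern.
fuzzy_filter() {
    pattern=$(fuzzy_pattern "$1")
    while IFS= read -r name; do
        lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        case "$lower" in
            $pattern) printf '%s\n' "$name" ;;
        esac
    done
}
```

Feeding it a list of script names with the query scr keeps every name that contains s, c, and r in that order. That breadth is why a real launcher also needs a scoring step to sort the best match first.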
The Interview Questions They’ll Ask
- “What is the difference between an NSPanel and an NSWindow?”
- “How do you implement a fuzzy search algorithm efficiently in Swift?”
- “How do you execute a shell command from a Swift app and capture the output?” (Process class).
Hints in Layers
- Layer 1: Build a standard macOS app that prints “Hello”.
- Layer 2: Make the window style “Borderless” and “Floating”.
- Layer 3: Register a global hotkey to toggle visibility.
- Layer 4: Implement a TextField that filters a hardcoded array of strings.
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| App Development | macOS Programming for Absolute Beginners | Full Book |
| Swift Language | The Swift Programming Language | Language Guide |
Final Capstone Project: Complete macOS Productivity Suite
- Main Programming Language: Swift + Lua + AppleScript + Shell
- Difficulty: Master
What you’ll build: A unified productivity suite. You won’t just run 18 separate scripts. You will build a “Command Center” (Project 18) that orchestrates them all.
Real World Outcome
Your Mac is now a specialized tool tailored to your brain.
- Morning: “Wake up” voice command triggers the Standup Automator.
- Work: Hyper-keys snap windows and launch dev environments.
- Context: The Menu Bar shows only what matters for this project.
- Maintenance: Files organize themselves, clipboard remembers history.
You have built your own Operating System layer on top of macOS.
The Core Question You’re Answering
“How do I architect a cohesive system where multiple disparate tools (Swift, Lua, Bash) share state and communicate?”
Concepts You Must Understand First
- IPC (Inter-Process Communication): How does your Swift Launcher tell your Lua Window Manager to “Move Left”? (URL Schemes? Local Socket? AppleScript?).
- Unified Config: Can you have a single JSON config file that all your tools read?
- Deployment: How do you set this up on a new Mac? (A dotfiles installation script).
Questions to Guide Your Design
- Single Source of Truth: Who decides what “Dark Mode” means? The Launcher? The Theming Script?
- Error Handling: If the Lua script crashes, does the whole system go down?
- Modularity: Can I swap out the Launcher (Swift) for Alfred later without breaking the Window Manager?
Thinking Exercise
Draw the dependency graph.
Launcher (Swift) --> calls --> Shell Scripts
Launcher --> calls --> Hammerspoon (Lua) (via open hammerspoon://...)
Hammerspoon --> reads --> Config.json
Shell Scripts --> read --> Config.json
The Interview Questions They’ll Ask
- “Describe the architecture of a plugin-based system.”
- “How do you manage dependencies and versions for a suite of internal tools?”
- “How would you package this suite for distribution to other developers?”
Hints in Layers
- Layer 1: Define a URL scheme for Hammerspoon (hammerspoon://moveLeft).
- Layer 2: Make your Swift Launcher call these URLs.
- Layer 3: Centralize configuration (colors, paths) in ~/.dotfiles/config.json.
- Layer 4: Write an install.sh that symlinks everything and installs dependencies.
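The symlinking half of install.sh mostly comes down to one careful helper. This is a minimal sketch, assuming you want any existing real file preserved as a .bak copy; the function name link_file and the repository layout in the usage note are hypothetical.

```shell
#!/bin/sh
# link_file: symlink src to dest, keeping a backup of any real file already there.
# Re-running it is safe: an existing symlink is simply replaced.
link_file() {
    src=$1
    dest=$2
    mkdir -p "$(dirname "$dest")" || return 1
    # Back up only real files or folders; never back up an old symlink.
    if [ -e "$dest" ] && [ ! -L "$dest" ]; then
        mv "$dest" "$dest.bak" || return 1
    fi
    ln -sfn "$src" "$dest"
}
```

An install.sh could then contain lines such as link_file "$HOME/.dotfiles/hammerspoon/init.lua" "$HOME/.hammerspoon/init.lua", one per tool. Once Hammerspoon is linked and running, open -g "hammerspoon://moveLeft" is a quick way to confirm the URL scheme from Layer 1 is wired up.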
Books That Will Help
| Topic | Book | Chapter |
| :--- | :--- | :--- |
| System Design | Clean Architecture (Robert C. Martin) | Component Coupling |
| Distribution | Homebrew Documentation | Creating Taps |