NEOVIM DEEP DIVE LEARNING PROJECTS
Deeply Understanding Neovim from Scratch
Goal: Deeply understand the architecture of modern modal text editors—from low-level terminal control and data structures to high-level plugin architectures, RPC interfaces, and incremental parsing. You will move beyond “using” Neovim to understanding exactly how it works under the hood.
Core Concept Analysis
Neovim is more than just a text editor; it is a high-performance text processing engine with a client-server architecture. To build it (or parts of it), you need to master several distinct computer science domains.
1. The Terminal Interface: Raw Mode & Escape Sequences
Terminals default to “Canonical Mode” (Cooked Mode). When you type, the kernel buffers input until you hit ENTER. It handles backspace, signals (Ctrl+C), and echoing.
Text editors cannot work this way. They need “Raw Mode”:
- No Echo: The editor decides what to draw.
- Byte-by-Byte Input: `getch()` returns immediately.
- No Processing: Ctrl+C is just a byte (`0x03`), not a signal.
You communicate with the terminal using ANSI Escape Sequences:
\x1b[2J (Clear screen), \x1b[12;40H (Move cursor to row 12, col 40).
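A minimal Python sketch of both ideas, assuming a Unix terminal. `termios` and `tty` are standard-library modules; the helper names (`clear_screen`, `move_cursor`, `raw_mode`, `demo`) are illustrative only and not part of any library.

```python
import contextlib
import os
import sys
import termios
import tty

ESC = "\x1b"


def clear_screen() -> str:
    """Escape sequence that clears the whole screen."""
    return f"{ESC}[2J"


def move_cursor(row: int, col: int) -> str:
    """Escape sequence that moves the cursor; rows and columns are 1-based."""
    return f"{ESC}[{row};{col}H"


@contextlib.contextmanager
def raw_mode(fd: int):
    """Switch the terminal behind fd to raw mode and restore it on exit."""
    saved = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)  # turns off ECHO, ICANON and ISIG, among other flags
        yield
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)


def demo() -> None:
    """Clear the screen, then print each key's byte value until 'q' is pressed."""
    fd = sys.stdin.fileno()
    with raw_mode(fd):
        os.write(1, (clear_screen() + move_cursor(1, 1)).encode())
        while True:
            byte = os.read(fd, 1)  # returns as soon as one byte is available
            if not byte or byte == b"q":
                break
            # Output processing is off in raw mode, so "\r\n" is written explicitly.
            os.write(1, f"{byte[0]:#04x}\r\n".encode())
```

Calling `demo()` from an interactive terminal and pressing Ctrl+C should print `0x03` rather than kill the program.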
2. Text Data Structures: The Gap Buffer
Storing text as a single contiguous array of characters (`char*`) scales badly. Inserting a character at the start requires shifting every character after it in memory (O(N)).
Editors use specialized structures. The most common for simple editors is the Gap Buffer:
Memory: [ T | h | i | s |   |   |   |   |   |   | i | s |   | t | e | x | t ]
Index:    0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
                          ^-------------------^
                                 The GAP
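- Insertion: Write to the gap at the cursor, shrink the gap. O(1), amortized, because a full gap must occasionally be regrown.
- Deletion: Expand the gap over the deleted characters. O(1).
- Cursor Move: Move the gap to the new cursor position. This copies the characters between the old and new positions, so it costs O(distance moved).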
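The three operations above can be sketched in a few lines of Python. This is an illustrative toy, not how any particular editor implements it; the class and method names (`GapBuffer`, `move_cursor`, `insert`, `delete_before`) are made up for the example.

```python
class GapBuffer:
    """Text stored in one list with a movable gap at the cursor."""

    def __init__(self, text: str = "", gap_size: int = 16) -> None:
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)      # cursor position
        self.gap_end = len(self.buf)    # first index after the gap

    def __len__(self) -> int:
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_cursor(self, pos: int) -> None:
        """Slide the gap so it starts at pos, copying one character per step."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:     # gap moves left
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:     # gap moves right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, ch: str) -> None:
        """Write one character into the gap and shrink the gap."""
        if self.gap_start == self.gap_end:
            self._grow()
        self.buf[self.gap_start] = ch
        self.gap_start += 1

    def delete_before(self) -> None:
        """Backspace: widen the gap over the character left of the cursor."""
        if self.gap_start > 0:
            self.gap_start -= 1

    def _grow(self) -> None:
        """Open a fresh gap at the cursor when the old one is used up."""
        extra = max(16, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def text(self) -> str:
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Stale characters left inside the gap are harmless because `text()` never reads between `gap_start` and `gap_end`.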
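Neovim itself uses neither a gap buffer nor a rope. Like Vim, it stores each buffer as a sequence of lines kept in blocks that are indexed by a tree (the "memline" structure in `memline.c`). Ropes (balanced binary trees of strings) and piece tables are the usual choices in other editors when large files need efficient splitting and concatenation.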
3. Modal Editing: The Finite State Machine
Vim is a state machine. The meaning of the d key depends entirely on the current state.
State: NORMAL
Input: 'd' --> State: OPERATOR_PENDING (Operator = Delete)
Input: 'w' --> Action: delete_word() --> State: NORMAL
This composability (operator + motion) is what gives Vim its power. You don’t implement “delete word”; you implement “delete” and “word motion” separately, and the state machine combines them.
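A small Python sketch of that composition, under simplifying assumptions: the text is a single line, `w` and `$` are reduced to their simplest form, and all names (`ModalEditor`, `MOTIONS`, `OPERATORS`) are invented for the example.

```python
from typing import Callable, Dict, Optional

Motion = Callable[[str, int], int]           # (text, cursor) -> new cursor
Operator = Callable[[str, int, int], str]    # (text, start, end) -> new text


def motion_word(text: str, pos: int) -> int:
    """Simplified 'w': skip the current word, then the whitespace after it."""
    while pos < len(text) and not text[pos].isspace():
        pos += 1
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def motion_end(text: str, pos: int) -> int:
    """Simplified '$': jump to the end of the (single-line) text."""
    return len(text)


def op_delete(text: str, start: int, end: int) -> str:
    return text[:start] + text[end:]


MOTIONS: Dict[str, Motion] = {"w": motion_word, "$": motion_end}
OPERATORS: Dict[str, Operator] = {"d": op_delete}


class ModalEditor:
    def __init__(self, text: str = "") -> None:
        self.text = text
        self.cursor = 0
        self.mode = "NORMAL"
        self.pending: Optional[Operator] = None

    def feed(self, key: str) -> None:
        if self.mode == "INSERT":
            if key == "\x1b":                       # Esc leaves insert mode
                self.mode = "NORMAL"
            else:
                self.text = self.text[:self.cursor] + key + self.text[self.cursor:]
                self.cursor += len(key)
        elif self.mode == "OPERATOR_PENDING":
            motion = MOTIONS.get(key)
            if motion is not None and self.pending is not None:
                target = motion(self.text, self.cursor)
                start, end = sorted((self.cursor, target))
                self.text = self.pending(self.text, start, end)
                self.cursor = start
            self.pending = None                     # any other key cancels
            self.mode = "NORMAL"
        elif key == "i":
            self.mode = "INSERT"
        elif key in OPERATORS:
            self.pending = OPERATORS[key]
            self.mode = "OPERATOR_PENDING"
        elif key in MOTIONS:
            self.cursor = MOTIONS[key](self.text, self.cursor)
```

Adding a new operator or a new motion is one dictionary entry; every existing combination keeps working without extra code.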
4. Client-Server Architecture (RPC)
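Neovim's core can run headless (`nvim --headless`, or `nvim --embed` when a UI spawns it). Every UI is a client that speaks MessagePack-RPC to that core; in recent releases this includes the built-in terminal UI.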
+------------+                         +------------------+
| GUI Client | <---- MSGPACK-RPC ----> |   Neovim Core    |
| (Renderer) |      (over Stdio/       |  (Text, Logic,   |
|            |       TCP Socket)       |  Plugins, Lua)   |
+------------+                         +------------------+
When you type ‘i’ in a GUI while in Insert mode:
- Client sends `nvim_input('i')` RPC.
- Neovim updates internal buffer state.
- Neovim sends `redraw` event RPC to client.
- Client draws the character ‘i’.
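To make the first step concrete, here is a hedged sketch of the bytes a client would put on the wire. It hand-encodes only the tiny MessagePack subset this one message needs (nil, booleans, small integers, short strings, short arrays); a real client should use a MessagePack library. The function names `pack`, `request` and `notification` are illustrative.

```python
def pack(obj) -> bytes:
    """Encode a very small MessagePack subset; raise TypeError otherwise."""
    if obj is None:
        return b"\xc0"                               # nil
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])                          # positive fixint
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) < 32:
            return bytes([0xA0 | len(raw)]) + raw    # fixstr
    if isinstance(obj, list) and len(obj) < 16:
        items = b"".join(pack(item) for item in obj)
        return bytes([0x90 | len(obj)]) + items      # fixarray
    raise TypeError(f"outside the supported subset: {obj!r}")


def request(msgid: int, method: str, params: list) -> list:
    """MessagePack-RPC request: [type=0, msgid, method, params]."""
    return [0, msgid, method, params]


def notification(method: str, params: list) -> list:
    """MessagePack-RPC notification: [type=2, method, params] (no reply)."""
    return [2, method, params]


wire = pack(request(1, "nvim_input", ["i"]))
print(wire.hex(" "))
```

The matching response has the shape `[1, msgid, error, result]`, while `redraw` arrives as a notification, which is why the client never has to answer it.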
5. Incremental Parsing (Tree-sitter)
Regex-based syntax highlighting is fragile. Neovim uses Tree-sitter to parse code into a concrete syntax tree (CST).
Incremental Parsing: When you edit a file, Tree-sitter doesn’t re-parse the whole file. It reuses the subtrees the edit did not touch and re-parses only the changed region. This keeps tree-based features (highlighting, folding, selection) responsive even in very large files.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Termios & Raw Mode | How to take control of the terminal driver from the OS. |
| Gap Buffers / Ropes | Efficiently managing dynamic text in memory without O(N) penalties. |
| State Machines | Managing complex input logic where context determines action. |
| RPC & Serialization | Decoupling logic from presentation using binary protocols (MsgPack). |
| Incremental Parsing | How to maintain a valid syntax tree while the code is broken during editing. |
| Event Loops | Handling user input, RPC requests, and file I/O asynchronously (Libuv). |
Deep Dive Reading by Concept
| Concept | Book / Resource | Chapter |
|---|---|---|
| Terminal I/O | “The Linux Programming Interface” by Michael Kerrisk | Ch. 62: Terminals |
| Raw Mode Details | “Build Your Own Text Editor” (viewsourcecode.org) | Steps 1-20 |
| Text Buffers | “Data Structures the Fun Way” by Jeremy Kubica | Ch. 8 (Gap Buffers) |
| RPC Architecture | “Designing Data-Intensive Applications” by Martin Kleppmann | Ch. 4 (Encoding & RPC) |
| Parsing | “Compilers: Principles, Techniques, and Tools” (Dragon Book) | Ch. 4 (Syntax Analysis) |
| LSP | “Language Server Protocol and Implementation” by Gunasinghe & Marcus | Ch. 1-3 |
| Event Loops | “The Linux Programming Interface” by Michael Kerrisk | Ch. 63 (Alternative I/O) |
Project 1: Build a Modal Text Editor in C
Goal: Create a terminal-based text editor from scratch that handles raw input, renders to the screen, and implements Vim-like modal editing (Normal/Insert modes).
Real World Outcome
You will have a compiled binary, let’s call it kilo-vim, that you can run in your terminal.
$ ./kilo-vim my_file.txt
The screen clears. You see the file contents.
- Normal Mode: You press `j`, `k`, `l`, `h` and the cursor moves. You press `d` then `w`, and a word disappears.
- Insert Mode: You press `i`. The status bar changes to `-- INSERT --`. You type text, and it appears.
- Command Mode: You press `:` and a prompt appears at the bottom. You type `w` and Enter, and the file is saved to disk.
This is not a toy—it is a functional tool that modifies files on your disk.
The Core Question You’re Answering
“How does a program take over the terminal to create an interactive interface, and how do we efficiently manage memory for a document that is constantly changing size?”
Concepts You Must Understand First
Stop and research these before coding:
- Standard Input/Output (Stdio): How file descriptors 0, 1, and 2 work.
- VT100 Escape Sequences: How to tell the terminal to “move cursor here” or “change color to red” using just `printf`.
- Struct Termios: The C structure that controls terminal flags (`ECHO`, `ICANON`).
- Gap Buffers: The algorithm for storing text that allows fast insertion/deletion at the cursor.
Questions to Guide Your Design
- The Event Loop: How will your program wait for a keypress? Will it block?
- Screen Refresh: If I type one letter, do I redraw the whole screen? (Start with “yes”, optimize later).
- Coordinate Systems: How do you map the file’s `(row, col)` (which might be line 100, char 500) to the screen’s `(row, col)` (which is only 24x80)?
- State Management: Where do you store the “current mode”? How does the keypress handler know if `j` means “move down” or “type the letter j”?
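For the coordinate question, one common answer is to keep a scroll offset and subtract it. The sketch below assumes 0-based file coordinates and a fixed-width screen; `scroll` and `to_screen` are hypothetical helper names.

```python
from typing import Tuple


def scroll(cursor_row: int, cursor_col: int,
           row_off: int, col_off: int,
           screen_rows: int, screen_cols: int) -> Tuple[int, int]:
    """Return new (row_off, col_off) so the cursor lies inside the viewport."""
    if cursor_row < row_off:                        # cursor above the window
        row_off = cursor_row
    if cursor_row >= row_off + screen_rows:         # cursor below the window
        row_off = cursor_row - screen_rows + 1
    if cursor_col < col_off:                        # cursor left of the window
        col_off = cursor_col
    if cursor_col >= col_off + screen_cols:         # cursor right of the window
        col_off = cursor_col - screen_cols + 1
    return row_off, col_off


def to_screen(cursor_row: int, cursor_col: int,
              row_off: int, col_off: int) -> Tuple[int, int]:
    """Translate file coordinates into 0-based screen coordinates."""
    return cursor_row - row_off, cursor_col - col_off


# File position (line 100, char 500) on a 24x80 screen that starts at the top-left.
offsets = scroll(100, 500, 0, 0, 24, 80)
print(offsets, to_screen(100, 500, *offsets))
```

Calling `scroll` once per refresh, before drawing, keeps the offsets unchanged while the cursor stays inside the viewport and moves the window by the minimum amount otherwise.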
Thinking Exercise
Manual State Machine Trace:
Draw a state diagram on paper with circles for NORMAL, INSERT, and COMMAND.
- Draw arrows for keypresses (e.g., `i` moves `NORMAL` -> `INSERT`).
- What happens if you press `Esc` in `NORMAL`? (Nothing/Beep).
- What happens if you press `d` in `NORMAL`? Does it go to a temporary `DELETE_PENDING` state?
The Interview Questions They’ll Ask
- “Why did you use a Gap Buffer instead of a linked list of lines?”
- “What is the difference between canonical and raw terminal modes?”
- “How would you handle a file larger than RAM?”
- “Explain how the Undo/Redo stack works in your implementation.”
Hints in Layers
- Hint 1: Start by just getting the terminal into raw mode so `Ctrl-C` doesn’t kill the program and you can read byte-by-byte.
- Hint 2: Create a struct `EditorConfig` that holds global state (cursor x, cursor y, screen rows, screen cols).
- Hint 3: Use a dynamic array (pointer to pointer `char**`) to store rows of text initially. Move to a Gap Buffer only when that becomes slow.
- Hint 4: Implement `editorRefreshScreen()` that builds a single huge string (buffer) and writes it all at once (`write()`) to avoid flickering.
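Hint 4 can be prototyped before writing any C. The sketch below builds one frame as a single string; `render_frame` and `draw` are hypothetical names, and the `~` filler for rows past the end of the file is borrowed from Vim's look rather than required.

```python
import os
from typing import List, Tuple


def render_frame(rows: List[str], screen_rows: int, screen_cols: int,
                 cursor: Tuple[int, int] = (0, 0)) -> str:
    """Build a full-screen redraw as one string (cursor is 0-based row, col)."""
    parts = ["\x1b[?25l",          # hide the cursor while drawing
             "\x1b[H"]             # move to the top-left corner
    for y in range(screen_rows):
        parts.append(rows[y][:screen_cols] if y < len(rows) else "~")
        parts.append("\x1b[K")     # erase the rest of this line
        if y < screen_rows - 1:
            parts.append("\r\n")
    parts.append(f"\x1b[{cursor[0] + 1};{cursor[1] + 1}H")  # 1-based on the wire
    parts.append("\x1b[?25h")      # show the cursor again
    return "".join(parts)


def draw(frame: str) -> None:
    """Send the whole frame to the terminal with a single write call."""
    os.write(1, frame.encode("utf-8"))
```

Because the terminal receives the frame in one `write`, it never displays a half-drawn screen, which is what causes visible flicker with many small writes.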
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Step-by-step Guide | “Build Your Own Text Editor” (Online) | All |
| Terminal Control | “The Linux Programming Interface” | Ch. 62 |
| Data Structures | “Data Structures the Fun Way” | Ch. 8 |
Project 2: Build a Neovim Plugin from Scratch (Lua)
Goal: Master the Neovim Lua API, the event loop, and the plugin architecture by building a “Focus Mode” plugin that dynamically manipulates windows, highlights, and text decorations.
Real World Outcome
You will have a directory nvim-focus/ that you can add to your Neovim config.
When you run :FocusMode, your editor transforms:
- The active window stays bright.
- All split windows dim (change background color).
- A floating window appears in the corner showing live stats (WPM, time elapsed).
- The current paragraph is highlighted, while others are greyed out.
The Core Question You’re Answering
“How do I hook into the editor’s internal event loop to execute code automatically when things change (cursor moves, text changes, focus changes)?”
Concepts You Must Understand First
- Lua Tables & Functions: Lua is the configuration language.
- Neovim API (`vim.api`): The bridge between Lua and C.
- Autocommands: The event listeners of Neovim.
- Extmarks & Namespaces: How to decorate text without changing the text itself.
Questions to Guide Your Design
- Event Triggers: Which events do you need (`CursorMoved`, `WinEnter`, `BufWrite`)?
- State Persistence: Where do you store the “is focus mode on” boolean? (Global var? Module-local var?).
- Window Management: How do you identify “other” windows? (Loop through `vim.api.nvim_list_wins()`).
- Floating Windows: How do you calculate the position of the stats window so it stays in the corner even if the editor resizes?
Thinking Exercise
API Exploration:
Open Neovim and type :lua print(vim.inspect(vim.api.nvim_get_current_win())).
- What does it return? (A handle ID).
- Now try to get the configuration of that window.
- Write a one-line Lua loop in the command line to print the ID of every open window.
The Interview Questions They’ll Ask
- “Explain the difference between `vim.cmd`, `vim.fn`, and `vim.api`.”
- “How do namespaces work with Extmarks, and why are they useful?”
- “How would you optimize a plugin that runs code on every `CursorMoved` event to avoid lag?”
Hints in Layers
- Hint 1: Create a `lua/focus/init.lua` file. This allows `require('focus').setup()` usage.
- Hint 2: Use `vim.api.nvim_create_autocmd` to listen for `WinEnter` and `WinLeave`.
- Hint 3: To “dim” a window, don’t recolor the text itself. Use the `winblend` option for floating windows, or change the `NormalNC` (Normal Non-Current) highlight group.
- Hint 4: For the stats window, look up `nvim_open_win` with `relative='editor'`.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Lua Language | “Programming in Lua” | Ch. 1-4 |
| Neovim API | :help lua-guide | All |
| Vim Concepts | “Practical Vim” | Ch. 12 (Macros/Auto) |
Project 3: Build a Neovim GUI Client
Goal: Build a standalone graphical application (in C, Rust, or Python) that embeds Neovim and acts as its frontend, communicating strictly via MessagePack-RPC.
Real World Outcome
You run your application: ./my-neovim-gui.
A window opens (using Qt, GTK, or SDL). It renders a grid of characters.
- You type into this window.
- Neovim (running invisibly in the background) receives the input.
- Neovim processes it and tells your GUI “Put character ‘A’ at row 5, col 10, in green”.
- You see the result. You have effectively built your own terminal emulator specialized for Neovim.
The Core Question You’re Answering
“How can we decouple the ‘brain’ of the editor (logic) from its ‘face’ (UI), and how do we synchronize them efficiently over a network socket?”
Concepts You Must Understand First
- Inter-Process Communication (IPC): Spawning a subprocess and talking to its Stdin/Stdout.
- Serialization: Encoding objects to binary (MessagePack).
- The Actor Model: Treating the GUI and Neovim as independent actors exchanging messages.
- Event Loops (GUI): How GUI frameworks handle drawing vs. input processing.
Questions to Guide Your Design
- The Handshake: How do you tell Neovim “I am a UI, please send me draw commands”? (`nvim_ui_attach`).
- Grid Management: Neovim views the screen as a grid of cells. How will you represent this in memory?
- Font Rendering: Monospace fonts are easy, but what about bold, italic, and emojis?
- Latency: How do you ensure the UI feels responsive?
Thinking Exercise
RPC Mocking:
Don’t write the GUI yet. Write a script that spawns nvim --embed, sends the RPC call to attach, and just prints the messages Neovim sends back.
- See the `grid_resize` event?
- See the `grid_line` events?
- Decode them manually to understand the structure.
The Interview Questions They’ll Ask
- “What are the benefits of MessagePack over JSON for this application?” (Binary, smaller, faster).
- “How would you handle the user resizing the window?” (Send `nvim_ui_try_resize` RPC).
- “Explain how you handle asynchronous notifications vs. synchronous requests.”
Hints in Layers
- Hint 1: Use a library for MessagePack. Do not write your own serializer initially.
- Hint 2: Start with `nvim --embed`. Connect your program to the child process’s standard input/output.
- Hint 3: The most important event is `grid_line`. It tells you to draw text. Neovim sends it when you attach with the `ext_linegrid` option enabled.
- Hint 4: Ignore fancy features (mouse, highlights) first. Just get white text on a black background appearing.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| RPC Concepts | “Designing Data-Intensive Applications” | Ch. 4 |
| Neovim UI Protocol | :help ui | All |
| MsgPack | MsgPack Spec (msgpack.org) | All |
Project 4: Build a Tree-sitter Grammar
Goal: Create a parser for a custom file format or language that integrates with Neovim to provide semantic syntax highlighting, indentation, and folding.
Real World Outcome
You create a file example.xyl (your custom language).
Previously, it was plain white text.
Now:
- Keywords are bold and red.
- Functions are blue.
- Strings are green.
- If you make a syntax error, the highlighting doesn’t break the rest of the file (error recovery).
- You can press `zc` to fold a code block perfectly.
The Core Question You’re Answering
“How do we formally define the structure of a language so a computer can understand it, and how do we parse it efficiently enough to run on every keystroke?”
Concepts You Must Understand First
- Context-Free Grammars (CFG): Rules like `IfStatement -> "if" "(" Expression ")" Block`.
- LR Parsing: Left-to-right scan that builds a Rightmost derivation in reverse (bottom-up parsing).
- Abstract Syntax Trees (AST): The tree representation of code.
- S-Expressions: The Lisp-like format used for Tree-sitter queries.
Questions to Guide Your Design
- Ambiguity: Does `x * y` mean “x multiplied by y” or “pointer x declared as y”? How do you resolve conflicts?
- Tokenization: What are your keywords? What are your identifiers?
- Hierarchy: How do you nest structures? (A function contains statements, a statement contains expressions).
Thinking Exercise
Grammar Design: Write a simple grammar for a “todo list” file format on paper.
- [x] Task done @urgent
- [ ] Task todo
- What is a `task`?
- What is a `status_box`?
- What is a `tag`?
- Write the rule: `Task -> "- " StatusBox " " Description Tag*`
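Before turning the rule into a Tree-sitter grammar, it can help to check it with a throwaway hand-written parser. The Python sketch below is not Tree-sitter; it only mirrors the rule above, and it assumes that a tag is `@` followed by word characters and that tags appear only at the end of the line. `Task` and `parse_task` are names made up for the exercise.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    done: bool
    description: str
    tags: List[str] = field(default_factory=list)


# "- " StatusBox " " rest-of-line, where StatusBox is "[ ]" or "[x]"
TASK_RE = re.compile(r"^- \[( |x)\] (.*)$")
# One trailing tag: whitespace, "@", then word characters, at end of text
TAG_RE = re.compile(r"\s+@(\w+)$")


def parse_task(line: str) -> Task:
    """Parse one todo line; raise ValueError if it does not match the rule."""
    match = TASK_RE.match(line)
    if match is None:
        raise ValueError(f"not a task line: {line!r}")
    done = match.group(1) == "x"
    rest = match.group(2)
    tags: List[str] = []
    while True:                      # peel tags off the end, right to left
        tag = TAG_RE.search(rest)
        if tag is None:
            break
        tags.insert(0, tag.group(1))
        rest = rest[:tag.start()]
    return Task(done=done, description=rest, tags=tags)


print(parse_task("- [x] Task done @urgent"))
print(parse_task("- [ ] Task todo"))
```

Deciding what counts as the description versus a tag is exactly the kind of boundary a real grammar has to state explicitly, for example whether `@` may appear in the middle of a description.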
The Interview Questions They’ll Ask
- “What is the difference between Lexical Analysis and Semantic Analysis?”
- “Why is Tree-sitter faster than a standard Regex highlighter?”
- “How does GLR parsing handle ambiguity?”
Hints in Layers
- Hint 1: Use the `tree-sitter-cli` tool. It generates the C parser code for you.
- Hint 2: Start with just one rule (e.g., `program: $ => repeat($.statement)`).
- Hint 3: Use `tree-sitter parse` to debug your grammar before trying it in Neovim.
- Hint 4: In Neovim, you need a `queries/highlights.scm` file to map your AST nodes (like `(function_name)`) to highlight groups (`@function`).
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Parsing Theory | “Compilers: Principles, Techniques, and Tools” | Ch. 4 |
| Tree-sitter | Official Documentation | All |
| Query Syntax | :help treesitter-query | All |
Project 5: Build an LSP Server
Goal: Implement the Language Server Protocol for a language. You will build a server that Neovim starts, and which provides intelligent features like “Go to Definition” and “Autocomplete”.
Real World Outcome
You open a file in your custom language.
- You type `MyFunc`. You see a red squiggle. Hovering says “Undefined function”.
- You define `MyFunc`. The squiggle vanishes.
- You type `My`. A popup menu appears suggesting `MyFunc`.
- You Ctrl-Click `MyFunc`. The cursor jumps to the definition.
The Core Question You’re Answering
“How do we standardize developer tools so that every editor (VS Code, Neovim, Emacs) doesn’t need to write its own plugin for every language (Python, Go, Rust)?”
Concepts You Must Understand First
- JSON-RPC: The protocol LSP uses.
- LSP Lifecycle: `initialize` -> `initialized` -> `textDocument/didOpen` -> … -> `shutdown` -> `exit`.
- Synchronization: Keeping the server’s version of the text in sync with the editor’s version.
Questions to Guide Your Design
- State Management: The server is a long-running process. It needs to remember the content of every open file.
- Performance: If the user types fast, do you re-analyze on every keystroke? (Debouncing).
- Analysis: How do you find a “definition”? (You need a symbol table).
Thinking Exercise
JSON-RPC Handshake:
Write down the JSON objects for an LSP handshake.
Request: {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {...}}
Response: {"jsonrpc": "2.0", "id": 1, "result": {"capabilities": {"textDocumentSync": 1, ...}}}
Why are “capabilities” important?
The Interview Questions They’ll Ask
- “Explain the M x N problem that LSP solves.”
- “How does Incremental Text Synchronization work in LSP?”
- “What is the difference between `textDocument/completion` and `completionItem/resolve`?”
Hints in Layers
- Hint 1: Use a library for JSON-RPC if possible, but writing a simple parser for `Content-Length: ...\r\n\r\n{json}` is a good exercise.
- Hint 2: Start with just diagnostics. Hardcode a rule “if line contains ‘ERROR’, send a diagnostic”.
- Hint 3: Use `tail -f` on a log file to debug your server, because Stdout is used for RPC communication.
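Hints 1 and 2 fit in one short Python sketch: a reader and writer for the `Content-Length` framing, plus the hardcoded ‘ERROR’ rule. The function names are illustrative, the diagnostic message text is made up, and character offsets are computed as Python string indices, which equal the protocol's default UTF-16 offsets only for ASCII text.

```python
import io
import json
from typing import BinaryIO, Optional


def write_message(stream: BinaryIO, payload: dict) -> None:
    """Frame one JSON-RPC message with a Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body)
    stream.flush()


def read_message(stream: BinaryIO) -> Optional[dict]:
    """Read one framed message; return None on end of input."""
    length = None
    while True:
        line = stream.readline()
        if not line:
            return None                      # peer closed the stream
        if line == b"\r\n":
            break                            # blank line ends the headers
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(stream.read(length).decode("utf-8"))


def diagnostics_for(uri: str, text: str) -> dict:
    """Build a publishDiagnostics notification for lines containing 'ERROR'."""
    items = []
    for line_no, line in enumerate(text.splitlines()):
        col = line.find("ERROR")
        if col != -1:
            items.append({
                "range": {
                    "start": {"line": line_no, "character": col},
                    "end": {"line": line_no, "character": col + len("ERROR")},
                },
                "severity": 1,               # 1 means Error in the LSP spec
                "message": "line contains 'ERROR'",
            })
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/publishDiagnostics",
        "params": {"uri": uri, "diagnostics": items},
    }
```

In a real server the streams would be `sys.stdin.buffer` and `sys.stdout.buffer`, which is why debug output has to go to a log file or stderr instead of `print`.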
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| LSP Spec | Microsoft LSP Specification | All |
| Protocol Design | “Language Server Protocol and Implementation” | Ch. 1-3 |
| JSON-RPC | JSON-RPC 2.0 Spec | All |
Final Comprehensive Project: Build “NeoVim Lite”
What you’ll build: A complete modal text editor that implements Vim’s core feature set: multiple buffers, split windows, a command-line mode with Ex commands, a plugin system (Lua-based), syntax highlighting via Tree-sitter, and an RPC API for external control.
The Core Question You’re Answering
“Can you synthesize all the isolated concepts—memory management, state machines, parsing, network protocols, and API design—into a single, cohesive, robust system?”
Concepts You Must Understand First
- All of the above. This is the capstone.
Questions to Guide Your Design
- Architecture: Will you have a single thread? A separate thread for the plugin host?
- Plugin API: How much power do you give plugins? Can they crash the editor?
- Testing: How do you test a text editor automatically? (Headless mode).
Thinking Exercise
System Architecture Diagram:
Draw the boxes.
[Input Loop] -> [Keymap Resolver] -> [Command Executor] -> [Buffer Manager] -> [Renderer]
Where does the Lua Host fit in? Where does the RPC Server fit in?
The Interview Questions They’ll Ask
- “Describe the most difficult bug you encountered while integrating the Lua runtime.”
- “How did you handle the complexity of nested split windows?”
- “If you had to rewrite the buffer data structure for concurrency, what would you choose?”
Hints in Layers
- Hint 1: Build the core first (Buffer + Window).
- Hint 2: Add the TUI layer next.
- Hint 3: Add the Lua engine early, so you can write your own standard library in Lua instead of C.
- Hint 4: Don’t try to be compatible with Vimscript. It’s too hard. Stick to Lua.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| System Design | “The Craft of Text Editing” (Craig Finseth) | All |
| Architecture | “Beautiful Code” (Chapter on Emacs/Vim) | Relevant Ch. |
| Lua C API | “Programming in Lua” | Part IV |