Learn DOM and Browser Engines: From Zero to Rendering Master
Goal: Deeply understand how a web browser takes HTML and CSS and turns it into pixels on the screen by building your own simple browser engine from scratch.
Why Learn Browser Internals?
The Document Object Model (DOM) and rendering engine are the heart of the web, yet for most developers, they are a complete black box. We manipulate the DOM with JavaScript and style it with CSS, but we don’t understand how it works. Building your own engine demystifies this process entirely.
After completing these projects, you will:
- Understand how HTML is parsed into a tree structure.
- Know how CSS is parsed and applied to elements.
- Grasp the core concepts of layout and the box model.
- Be able to build a program that renders a simple webpage to an image.
- Never look at a div the same way again.
Core Concept Analysis
The Browser Rendering Pipeline
┌──────────────────┐    ┌──────────────────┐
│    HTML File     │    │    CSS Files     │
└──────────────────┘    └──────────────────┘
         │                       │
         ▼                       ▼
┌──────────────────┐    ┌──────────────────┐
│   HTML Parser    │    │    CSS Parser    │
└──────────────────┘    └──────────────────┘
         │                       │
         ▼                       ▼
┌──────────────────┐    ┌──────────────────┐
│     DOM Tree     │    │    CSSOM Tree    │
│ (The "Content")  │    │  (The "Styles")  │
└──────────────────┘    └──────────────────┘
         │                       │
         └───────────▼───────────┘
             Style Calculation
                     │
                     ▼
    ┌──────────────────────────────────┐
    │            Style Tree            │
    │ (DOM Nodes with Computed Styles) │
    └──────────────────────────────────┘
                     │
                     ▼
              Layout / Reflow
                     │
                     ▼
    ┌──────────────────────────────────┐
    │           Layout Tree            │
    │ (Nodes with Geometry & Position) │
    └──────────────────────────────────┘
                     │
                     ▼
                   Paint
                     │
                     ▼
    ┌──────────────────────────────────┐
    │          Screen Pixels           │
    └──────────────────────────────────┘
Key Concepts Explained
1. The DOM Tree
The DOM is a tree of Node objects, and parsing an HTML document produces this structure. Element nodes can have children; Text nodes hold the character content.
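// Simplified Node structure
typedef enum { ELEMENT_NODE, TEXT_NODE } NodeType;

typedef struct {
    char* tag_name;   // attributes omitted here for brevity
} ElementData;

struct Node {
    NodeType type;              // selects which union member is valid
    union {
        ElementData element;    // valid when type == ELEMENT_NODE
        char* text;             // valid when type == TEXT_NODE
    } data;
    struct Node* parent;
    struct Node* first_child;
    struct Node* next_sibling;
};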
2. CSSOM (CSS Object Model)
A tree structure representing the parsed CSS. It contains rules, selectors, and properties.
/* style.css */
body { font-size: 14px; }
h1 { font-size: 32px; color: #333; }
.note { background: #ffffd0; }
This becomes a data structure that can be queried to find the styles for a given element.
3. The Style Tree
This is a tree parallel to the DOM, where each node contains the computed style for the corresponding DOM element. This is the result of applying CSS rules from the CSSOM, considering specificity and the cascade.
4. The Layout Tree (or Render Tree)
This tree contains only the elements that will be rendered. It calculates the geometry of each element—its position, width, height, padding, borders, and margins. This process is called layout or reflow. It’s where the box model comes to life.
5. The Box Model
Every element in the layout tree is a box.
┌──────────────────────────┐
│ Margin │
│ ┌────────────────────┐ │
│ │ Border │ │
│ │ ┌──────────────┐ │ │
│ │ │ Padding │ │ │
│ │ │ ┌──────────┐ │ │ │
│ │ │ │ Content │ │ │ │
│ │ │ └──────────┘ │ │ │
│ │ └──────────────┘ │ │
│ └────────────────────┘ │
└──────────────────────────┘
Project List
The following 10 projects will guide you through building a simple browser engine.
Project 1: Simple HTML Parser & DOM Tree
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Python, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Parsing / Data Structures
- Software or Tool: A simple text editor
- Main Book: “The C Programming Language” by Kernighan & Ritchie
What you’ll build: A command-line tool that reads a file with simple HTML and constructs an in-memory DOM tree. It will then print a visual representation of this tree to the console.
Why it teaches DOM fundamentals: This is the first, most crucial step. It forces you to think about the web as a structured document, not just text. You’ll implement the core Node and Element data structures that every browser uses.
Core challenges you’ll face:
- Designing the tree data structure → maps to understanding nodes, children, and siblings
- Parsing tags and attributes → maps to basic string manipulation and tokenization
- Handling nested elements → maps to recursive descent parsing
- Representing text content → maps to distinguishing element nodes from text nodes
Resources for key challenges:
- Let’s build a browser engine! Part 2: HTML (Matt Brubeck) - A fantastic practical guide.
Key Concepts:
- Tree Data Structures: “Algorithms, Fourth Edition” Ch. 4 - Sedgewick & Wayne
- Recursive Descent Parsing: “Compilers: Principles, Techniques, and Tools” (Dragon Book) Ch. 4
- C Structs and Pointers: “The C Programming Language” Ch. 6
Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Solid C programming, especially pointers, structs, and dynamic memory allocation.
Real world outcome:
$ ./html_parser test.html
# Input: <html><body><h1>Title</h1><p>Hello</p></body></html>
# Output:
html
  body
    h1
      "Title"
    p
      "Hello"
Implementation Hints:
Start with a very simple HTML subset: no self-closing tags, no comments, no malformed syntax.
// Data structures to consider
typedef struct Node { ... } Node;
typedef struct ElementData {
char* tag_name;
// You'll need a dynamic array or linked list for attributes
} ElementData;
// Your main parser function might look like this:
Node* parse(char* input);
Questions to guide your implementation:
- How do you represent a node that can be either an element (with a tag) or text? (Hint: union inside a struct).
- How do you manage memory for all the nodes and strings you’re creating?
- When you see an opening tag like <p>, how do you parse all its children until you find the corresponding </p>?
- How do you link nodes together using parent/child/sibling pointers?
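One possible answer to the last two questions is a recursive function that parses one element and calls itself for each nested element. The sketch below is a simplification under stated assumptions: it supports no attributes, comments, or self-closing tags, it does not check that close tags match, and it stores the tag name or the text in a single name_or_text field instead of the union suggested above. All names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { ELEMENT_NODE, TEXT_NODE } NodeType;

typedef struct Node {
    NodeType type;
    char *name_or_text;   /* tag name for elements, content for text */
    struct Node *parent, *first_child, *last_child, *next_sibling;
} Node;

/* Allocate a node whose string is a copy of the first n bytes of s. */
static Node *new_node(NodeType type, const char *s, size_t n) {
    Node *node = calloc(1, sizeof *node);
    assert(node != NULL);
    node->type = type;
    node->name_or_text = malloc(n + 1);
    assert(node->name_or_text != NULL);
    memcpy(node->name_or_text, s, n);
    node->name_or_text[n] = '\0';
    return node;
}

/* Link child as the last child of parent. */
static void append_child(Node *parent, Node *child) {
    child->parent = parent;
    if (parent->last_child) parent->last_child->next_sibling = child;
    else parent->first_child = child;
    parent->last_child = child;
}

/* Parse one element starting at *pos, which must point at '<'.
   Advances *pos past the element's close tag. */
Node *parse_element(const char **pos) {
    const char *p = *pos + 1;                  /* skip '<' */
    size_t len = strcspn(p, ">");
    Node *elem = new_node(ELEMENT_NODE, p, len);
    p += len;
    if (*p == '>') p++;

    /* Children: loop until we reach this element's close tag "</". */
    while (*p && !(p[0] == '<' && p[1] == '/')) {
        if (*p == '<') {
            append_child(elem, parse_element(&p));   /* nested element */
        } else {
            size_t n = strcspn(p, "<");              /* run of text */
            append_child(elem, new_node(TEXT_NODE, p, n));
            p += n;
        }
    }

    /* Skip "</name>" without validating the name. */
    p += strcspn(p, ">");
    if (*p == '>') p++;
    *pos = p;
    return elem;
}
```

Keeping a last_child pointer makes appending O(1); it is an addition to the parent/first_child/next_sibling layout shown earlier, and you can drop it if you prefer to walk the sibling list.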
Learning milestones:
- Parse a single element → <h1>Title</h1> becomes a single element node with one text child.
- Parse nested elements → The tree structure correctly reflects nesting.
- Parse attributes → <p class="note"> stores the class attribute.
- Pretty-print the tree → You can traverse your data structure correctly.
Project 2: Simple CSS Parser
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Python
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Parsing
- Software or Tool: A simple text editor
- Main Book: “The C Programming Language” by Kernighan & Ritchie
What you’ll build: A tool that reads a simple CSS file and parses it into a data structure representing a stylesheet—a list of rules. Each rule consists of selectors and a list of declarations (property-value pairs).
Why it teaches DOM fundamentals: It’s the other half of the core equation: content (HTML) and presentation (CSS). This project teaches you how styling information is structured before it gets applied to the DOM.
Core challenges you’ll face:
- Parsing selectors → maps to handling tag names, classes, and IDs
- Parsing declarations → maps to splitting property: value; pairs
- Structuring the data → maps to designing structs for rules, selectors, and declarations
- Handling whitespace and comments → maps to robust parsing
Resources for key challenges:
- Let’s build a browser engine! Part 3: CSS - Continues the excellent series.
Key Concepts:
- CSS Syntax: MDN CSS Syntax
- Tokenization: “Language Implementation Patterns” Ch. 2 - Terence Parr
Difficulty: Intermediate
Time estimate: 1 week
Prerequisites: Project 1, C programming.
Real world outcome:
$ ./css_parser style.css
# Input:
# h1, h2 { color: #333; }
# .note { background: yellow; }
# Console output of your data structure:
Rule:
  Selectors:
    - h1
    - h2
  Declarations:
    - color: #333
Rule:
  Selectors:
    - .note
  Declarations:
    - background: yellow
Implementation Hints:
Focus on simple selectors first: tag names (h1), classes (.note), and IDs (#main). Don’t worry about descendant or child selectors yet.
Your data structures might look like this:
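typedef struct { char* property; char* value; } Declaration;
typedef struct { char* tag; char* id; char* class_name; } Selector;  // one possible shape; NULL = not specified
typedef struct { Selector* selectors; int selector_count; Declaration* declarations; int declaration_count; } Rule;
typedef struct { Rule* rules; int rule_count; } Stylesheet;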
The parsing process is a state machine:
- Read until you find a {. Everything before it is a selector list.
- Inside the braces, read until you find a :. That’s a property.
- Read until you find a ;. That’s the value.
- Repeat until you find a }.
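The inner part of that state machine, the declarations between the braces, can be sketched as follows. This is a minimal illustration that assumes fixed-size buffers (MAX_DECLS, MAX_LEN) and no comments in the input; the names are hypothetical, and a real parser would allocate strings dynamically.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_DECLS 16
#define MAX_LEN   64

typedef struct { char property[MAX_LEN]; char value[MAX_LEN]; } Declaration;

/* Copy [start, end) into dst with surrounding whitespace trimmed. */
static void copy_trimmed(char *dst, const char *start, const char *end) {
    while (start < end && isspace((unsigned char)*start)) start++;
    while (end > start && isspace((unsigned char)end[-1])) end--;
    size_t n = (size_t)(end - start);
    if (n >= MAX_LEN) n = MAX_LEN - 1;    /* truncate rather than overflow */
    memcpy(dst, start, n);
    dst[n] = '\0';
}

/* Parse the text after '{' into declarations, stopping at '}' or end of
   input. Returns the number of declarations stored in out. */
int parse_declarations(const char *body, Declaration *out) {
    assert(body != NULL && out != NULL);
    int count = 0;
    const char *p = body;
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (*p == '\0' || *p == '}' || count == MAX_DECLS) break;

        /* State 1: read up to ':' to get the property name. */
        size_t prop_len = strcspn(p, ":;}");
        if (p[prop_len] != ':') break;    /* malformed input: give up */
        const char *colon = p + prop_len;

        /* State 2: read up to ';' (or the closing '}') to get the value. */
        const char *end = colon + 1 + strcspn(colon + 1, ";}");

        copy_trimmed(out[count].property, p, colon);
        copy_trimmed(out[count].value, colon + 1, end);
        count++;

        p = (*end == ';') ? end + 1 : end;   /* step over the ';' */
    }
    return count;
}
```

Stopping the value at either ; or } lets the last declaration in a block omit its trailing semicolon, which CSS allows.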
Learning milestones:
- Parse a single rule → p { color: red; } is parsed correctly.
- Parse multiple declarations → Handles multiple property-value pairs in a rule.
- Parse multiple selectors → h1, h2, h3 { ... } is parsed into three selectors for one rule.
- Handle classes and IDs → Your selector parser distinguishes between p, .note, and #main.
Project 3: The Style Tree
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Data Structures / Algorithms
- Software or Tool: Your HTML and CSS parsers
- Main Book: “Introduction to Algorithms” (CLRS)
What you’ll build: A program that takes a DOM tree and a stylesheet and produces a style tree. This new tree has a one-to-one correspondence with the DOM tree, but each node contains the computed CSS properties for that element.
Why it teaches DOM fundamentals: This is where content and style meet. You’ll implement the logic that decides which CSS rules apply to which DOM nodes, the very essence of the “Cascade” in Cascading Style Sheets.
Core challenges you’ll face:
- Matching selectors to nodes → maps to checking an element’s tag name, classes, and ID
- Computing the final style → maps to applying multiple rules to the same node
- Handling specificity (simple version) → maps to ID > class > tag
- Creating a parallel data structure → maps to traversing the DOM and building the style tree
Key Concepts:
- CSS Specificity: MDN Specificity
- Tree Traversal: “Grokking Algorithms” Ch. 6 - Aditya Bhargava
Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Projects 1 & 2.
Real world outcome:
Given an HTML node for <p class="note"> and CSS rules for p and .note, your program will output the combined style.
Styled Node for: p.note
- color: blue (from 'p' rule)
- background: yellow (from '.note' rule)
- display: block (default)
Implementation Hints:
For each node in the DOM tree, you need to find all the CSS rules that match it.
// Pseudo-code for styling a node
function style_node(dom_node):
    create styled_node
    styled_node.dom_node = dom_node
    // For simplicity, you can just use a hash map for properties
    create property_map
    for rule in stylesheet:
        if any selector in rule matches dom_node:
            for declaration in rule:
                // For now, let's just overwrite. Later, you'll add specificity.
                property_map[declaration.property] = declaration.value
    styled_node.properties = property_map
    return styled_node
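The "selector matches dom_node" test in the pseudo-code above could look something like this in C. The Element and SimpleSelector shapes are assumptions made for this sketch (one class per selector, attributes flattened into fields); adapt them to whatever your parsers actually produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified shapes: NULL means "not specified". */
typedef struct {
    const char *tag;
    const char *id;
    const char *class_name;   /* a single class, for brevity */
} SimpleSelector;

typedef struct {
    const char *tag;
    const char *id;           /* value of the id attribute, or NULL */
    const char *class_attr;   /* raw class attribute, e.g. "note big", or NULL */
} Element;

/* True if the space-separated list contains word as a whole token. */
static bool has_class(const char *list, const char *word) {
    size_t wlen = strlen(word);
    while (list && *list) {
        size_t n = strcspn(list, " ");
        if (n == wlen && strncmp(list, word, n) == 0) return true;
        list += n;
        while (*list == ' ') list++;
    }
    return false;
}

/* Every component the selector specifies must match the element. */
bool matches(const Element *e, const SimpleSelector *s) {
    assert(e != NULL && s != NULL);
    if (s->tag && strcmp(s->tag, e->tag) != 0) return false;
    if (s->id && (!e->id || strcmp(s->id, e->id) != 0)) return false;
    if (s->class_name && !has_class(e->class_attr, s->class_name)) return false;
    return true;
}
```

Comparing whole tokens matters: a plain substring search would wrongly match the selector .not against class="note".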
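Specificity can be implemented by giving each selector a score: 100 points per ID, 10 per class, 1 per tag name. The declaration from the highest-scoring matching selector wins, and when two selectors tie, the rule that appears later in the stylesheet wins. Treat the weights as an approximation: real CSS compares the ID, class, and tag counts as a tuple, so no number of classes can outrank a single ID, whereas with these weights eleven classes (110) would beat one ID (100).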
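If you would rather not rely on point weights, one alternative is to keep the three counts separate and compare them in order. The Specificity struct and specificity_cmp function below are hypothetical names for this sketch.

```c
#include <assert.h>

/* Counts of each selector component, most significant first. */
typedef struct { int ids, classes, tags; } Specificity;

/* Returns <0, 0, or >0 like strcmp: IDs are compared first, then
   classes, then tag names. A lower field only matters on a tie. */
int specificity_cmp(Specificity a, Specificity b) {
    if (a.ids != b.ids) return a.ids - b.ids;
    if (a.classes != b.classes) return a.classes - b.classes;
    return a.tags - b.tags;
}
```

With this comparison a selector with one ID outranks a selector with eleven classes, which a single summed score with weights of 100 and 10 would get wrong.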
Learning milestones:
- Match tag selectors → p elements get styles from p {}.
- Match class and ID selectors → .note and #main work.
- Apply multiple rules → An element gets styles from all matching rules.
- Implement simple specificity → A rule with an ID selector correctly overrides a rule with a class selector.
Project 4: The Layout Tree & Box Model
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Layout Engines / Computer Graphics
- Software or Tool: Your Style Tree builder
- Main Book: “Computer Graphics from Scratch” by Gabriel Gambetta
What you’ll build: A program that traverses the style tree and generates a layout tree. Each node in this tree is a box with concrete dimensions—x, y coordinates, width, height, margin, border, and padding. For now, you will only implement display: block.
Why it teaches DOM fundamentals: This is the magic of “reflow”. You’re turning abstract style properties into concrete geometry. This project forces you to understand the CSS Box Model from first principles.
Core challenges you’ll face:
- Implementing the box model → maps to calculating content, padding, border, and margin boxes
- Block layout logic → maps to stacking boxes vertically
- Calculating automatic height → maps to a box’s height is the sum of its children’s heights
- Passing down width constraints → maps to child boxes are constrained by their parent’s content width
Key Concepts:
- The Box Model: MDN Box Model
- Visual Formatting Model: MDN Visual Formatting Model
Difficulty: Advanced
Time estimate: 2 weeks
Prerequisites: Project 3.
Real world outcome: Your program will take a style tree and print a textual representation of the layout tree.
# For a <body> with a <p> inside
LayoutBox for body (block) at (0,0) size 800x600
  - Margin: 8, Border: 0, Padding: 0
  LayoutBox for p (block) at (8,8) size 784x50
    - Margin: 16, Border: 0, Padding: 0
Implementation Hints:
This is a recursive process. The layout function for a parent node is responsible for calling the layout functions for its children.
// Pseudo-code
function layout(styled_node, containing_block):
    create layout_box
    layout_box.type = styled_node.display_type()
    // Width calculation
    calculate layout_box.width based on containing_block.width
    // Position calculation
    layout_box.x = containing_block.x + margin/border/padding
    layout_box.y = containing_block.y + margin/border/padding
    // Recursive step for children
    current_y = layout_box.y
    for child in styled_node.children:
        child_box = layout(child, layout_box)
        child_box.y = current_y
        current_y += child_box.height + child_box.margins
    // Height calculation
    calculate layout_box.height based on children's heights
    return layout_box
Start with fixed-size fonts and don’t worry about text rendering yet. A block of text can have a fixed height for now.
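Here is one way the recursion could look in C, as a sketch under simplifying assumptions: margin and padding are a single value applied to all four sides, borders are omitted, vertical margins do not collapse, and leaf boxes use a fixed_height as suggested above. The LayoutBox fields and layout_block signature are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

typedef struct LayoutBox {
    float x, y, width, height;   /* content box, filled in by layout */
    float margin, padding;       /* same on all four sides, for brevity */
    float fixed_height;          /* used for leaf boxes (e.g. text) */
    struct LayoutBox **children;
    size_t child_count;
} LayoutBox;

/* Lay out box inside a containing block whose content area starts at
   (cx, cy) and is cw wide. Returns the height of the box's margin box. */
float layout_block(LayoutBox *box, float cx, float cy, float cw) {
    assert(box != NULL);
    float inset = box->margin + box->padding;

    /* Width flows down: fill the container minus our own margin and padding. */
    box->width = cw - 2 * inset;
    box->x = cx + inset;
    box->y = cy + inset;

    /* Height flows up: stack children vertically and sum their heights. */
    float used = 0;
    for (size_t i = 0; i < box->child_count; i++) {
        used += layout_block(box->children[i], box->x, box->y + used, box->width);
    }
    box->height = (box->child_count > 0) ? used : box->fixed_height;

    return box->height + 2 * inset;
}
```

Passing the child's starting y into the recursive call, rather than assigning it afterwards, means grandchildren are positioned from the correct origin on the first pass.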
Learning milestones:
- A single block box is laid out → A div fills its container’s width.
- Padding, border, and margin are applied → The content box is correctly inset.
- Nested block boxes are stacked vertically → Children appear below one another.
- Automatic height works → A parent’s height grows to contain its children.
Project 5: Painting to Pixels
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Computer Graphics / Rendering
- Software or Tool: A PNG writing library (e.g., stb_image_write)
- Main Book: “Fundamentals of Computer Graphics”
What you’ll build: A function that traverses the layout tree and “paints” the boxes to a pixel buffer, which is then saved as a PNG image.
Why it teaches DOM fundamentals: This is the final step: rasterization. You see the tangible result of all the previous abstract data structures. You are making a “screenshot” with your own engine.
Core challenges you’ll face:
- Creating a pixel buffer → maps to a simple array of color values
- Drawing rectangles → maps to filling a region of the pixel buffer with a color
- Traversing the layout tree → maps to a recursive painting function
- Handling colors → maps to parsing hex codes (#RRGGBB) into RGB values
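For the color-handling challenge, a small hand-rolled parser is enough. This sketch accepts #RRGGBB and also the three-digit #RGB shorthand, since the sample stylesheets in this guide use values like #333; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t r, g, b, a; } Color;

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse "#RRGGBB" or "#RGB". Returns false on malformed input and
   leaves *out untouched in that case. */
bool parse_hex_color(const char *s, Color *out) {
    assert(s != NULL && out != NULL);
    if (s[0] != '#') return false;
    size_t len = strlen(s + 1);
    if (len != 3 && len != 6) return false;

    int v[6];
    for (size_t i = 0; i < len; i++) {
        v[i] = hex_digit(s[1 + i]);
        if (v[i] < 0) return false;
    }

    if (len == 3) {
        /* "#abc" means "#aabbcc": 0xa * 17 == 0xaa */
        out->r = (uint8_t)(v[0] * 17);
        out->g = (uint8_t)(v[1] * 17);
        out->b = (uint8_t)(v[2] * 17);
    } else {
        out->r = (uint8_t)(v[0] * 16 + v[1]);
        out->g = (uint8_t)(v[2] * 16 + v[3]);
        out->b = (uint8_t)(v[4] * 16 + v[5]);
    }
    out->a = 255;   /* hex colors without alpha are fully opaque */
    return true;
}
```

Named colors such as yellow are not handled here; a lookup table mapping names to Color values would be a natural next step.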
Resources for key challenges:
- Let’s build a browser engine! Part 7: Painting 101
- stb_image_write.h - A simple, single-file C library for writing PNGs.
Key Concepts:
- Framebuffers: “Computer Graphics from Scratch” Ch. 2
- Color Models (RGB): “Fundamentals of Computer Graphics”
Difficulty: Advanced
Time estimate: 1 week
Prerequisites: Project 4.
Real world outcome:
A PNG file (output.png) showing a series of nested, colored rectangles that represent your HTML document.
Implementation Hints:
Your canvas is just an array of Color values, width * height entries long, stored row by row.
typedef struct { uint8_t r, g, b, a; } Color;   // uint8_t comes from <stdint.h>
typedef struct { Color* pixels; int width; int height; } Canvas;
// Pseudo-code
function paint(layout_box, canvas):
    // Paint background
    fill_rect(canvas, layout_box.x, layout_box.y, layout_box.width, layout_box.height, layout_box.background_color)
    // Paint borders
    // ...
    for child in layout_box.children:
        paint(child, canvas)
The painting order matters. You must paint parents before their children so that child backgrounds are drawn on top of parent backgrounds.
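A possible fill_rect, sketched with clipping so that a box extending past the canvas edge does not write out of bounds. The create_canvas_filled helper is a hypothetical convenience for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint8_t r, g, b, a; } Color;
typedef struct { Color *pixels; int width; int height; } Canvas;

/* Allocate a canvas and clear it to a single color. */
Canvas *create_canvas_filled(int width, int height, Color fill) {
    Canvas *c = malloc(sizeof *c);
    assert(c != NULL);
    c->pixels = malloc(sizeof(Color) * (size_t)width * (size_t)height);
    assert(c->pixels != NULL);
    c->width = width;
    c->height = height;
    for (int i = 0; i < width * height; i++) c->pixels[i] = fill;
    return c;
}

/* Fill a rectangle, clipped to the canvas bounds. */
void fill_rect(Canvas *c, int x, int y, int w, int h, Color color) {
    int x0 = x < 0 ? 0 : x;
    int y0 = y < 0 ? 0 : y;
    int x1 = x + w > c->width ? c->width : x + w;
    int y1 = y + h > c->height ? c->height : y + h;
    for (int row = y0; row < y1; row++) {
        for (int col = x0; col < x1; col++) {
            c->pixels[row * c->width + col] = color;   /* row-major index */
        }
    }
}
```

Calling fill_rect for a parent and then for its child overwrites the overlapping pixels, which is exactly the parent-before-child ordering described above.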
Learning milestones:
- A blank canvas is created → You can create a PNG of a single color.
- A single box is painted → A div with a background color appears.
- Nested boxes are painted correctly → Children are drawn on top of their parents.
- Borders are visible → You can draw rectangles for borders around the content+padding box.
Project 6: Adding Text Rendering
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Typography / Computer Graphics
- Software or Tool: A font loading library (e.g., stb_truetype)
- Main Book: “Code: The Hidden Language of Computer Hardware and Software” by Charles Petzold
What you’ll build: Extend your engine to parse, lay out, and render text. This involves integrating a font library, calculating text dimensions, and blitting glyphs to the canvas.
Why it teaches DOM fundamentals: Text is the most fundamental content on the web. Handling it forces you to deal with font metrics, glyphs, and the interaction between text nodes and element boxes.
Core challenges you’ll face:
- Loading a font file → maps to using a library like FreeType or stb_truetype
- Laying out glyphs → maps to getting character dimensions and advancing the “pen” position
- Modifying block layout → maps to calculating height based on text lines
- Painting glyphs → maps to drawing individual character bitmaps onto your canvas
Resources for key challenges:
- stb_truetype.h - A simple C library for loading fonts and rendering glyphs.
Key Concepts:
- Font Metrics: Glyphs, ascent, descent, kerning.
- Bitmap Blitting: Drawing a small bitmap onto a larger one.
Difficulty: Expert
Time estimate: 2-3 weeks
Prerequisites: Project 5.
Real world outcome:
Your output.png now shows rendered text inside the boxes. The “Hello” from your <p>Hello</p> tag is now visible.
Implementation Hints:
This is a major step.
- Parsing: Modify your HTML parser to create Text nodes.
- Styling: The Style Tree should now have Text nodes, but they will inherit styles from their parent element.
- Layout: This is the hardest part. When a LayoutBox contains text, its height is no longer just the sum of its children’s heights. It’s determined by the number of lines of text. For now, don’t implement line-wrapping. Assume a single line of text.
- Painting: When painting a LayoutBox that contains text, use your font library to render the glyphs one by one at the correct position on the canvas.
Learning milestones:
- Load a .ttf font file.
- Calculate the width of a string.
- A block’s height is determined by its text content.
- Text is rendered to the PNG.
Project 7: Implementing Inline Layout
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Layout Engines
- Software or Tool: Your existing engine
- Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann (for thinking about complex data flows)
What you’ll build: Extend your layout engine to support display: inline. This allows multiple elements to sit next to each other on the same line.
Why it teaches DOM fundamentals: This is the other major layout mode besides block. Understanding the interaction between block and inline is key to understanding CSS layout. You’ll be implementing “line boxes” and text wrapping.
Core challenges you’ll face:
- Creating an “inline formatting context” → maps to managing a horizontal flow of items
- Generating line boxes → maps to grouping inline elements into lines that fit the container width
- Splitting inline boxes → maps to a single <span> might be split across multiple lines
- Handling text and elements together → maps to mixing Text nodes and Element nodes on the same line
Key Concepts:
- Inline Formatting Contexts: MDN Block and Inline Layout
- Line Breaking Algorithms: Knuth-Plass line-breaking algorithm (though you’ll implement a much simpler greedy version).
Difficulty: Expert
Time estimate: 3-4 weeks
Prerequisites: Project 6.
Real world outcome:
Your engine can now render HTML like <h1>Hello <em>World</em>!</h1> with “World” styled differently but remaining on the same line as “Hello”. Text will also wrap to the next line if it exceeds the container’s width.
Implementation Hints:
The layout algorithm for a block box needs to change. Instead of just stacking child boxes, it must:
- Create an InlineFormattingContext.
- Iterate through its children. If a child is block, process it as before.
- If a child is inline, add it to the current LineBox.
- Keep adding inline items to the LineBox until it’s full.
- When the line is full, “break” it, add it to the block’s list of lines, and start a new LineBox.
- The block’s height is now the sum of the heights of its LineBoxes and block children.
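The "keep adding until full, then break" steps amount to a greedy line breaker. Here is a minimal sketch that works on pre-measured item widths (for example, one width per word or inline box); break_lines and its parameters are hypothetical, and it ignores the spaces between words.

```c
#include <assert.h>
#include <stddef.h>

/* Greedy line breaking: place each item on the current line if it fits,
   otherwise start a new line. line_of[i] receives the line index of
   item i. Returns the number of lines used. */
size_t break_lines(const float *widths, size_t count, float max_width,
                   size_t *line_of) {
    assert(widths != NULL && line_of != NULL);
    if (count == 0) return 0;

    size_t line = 0;
    float used = 0;   /* width consumed on the current line */
    for (size_t i = 0; i < count; i++) {
        /* Break only if the line already holds something, so an item
           wider than the container still gets a line of its own. */
        if (used > 0 && used + widths[i] > max_width) {
            line++;
            used = 0;
        }
        line_of[i] = line;
        used += widths[i];
    }
    return line + 1;
}
```

With widths {40, 30, 50, 120, 10} and a 100px container, the first two items share line 0 and each of the remaining three lands on its own line, for four lines in total.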
Learning milestones:
- Two span elements appear on the same line.
- Text next to an element is on the same line.
- Long lines of text and inline elements wrap correctly.
- An inline element can be split across two lines.
Project 8: A querySelector Clone
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Python, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Algorithms / API Design
- Software or Tool: Your DOM Tree
- Main Book: “Data Structures and Algorithms in C++” by Michael T. Goodrich
What you’ll build: Add a function to your DOM API, find_nodes(root, selector), that takes a selector string (e.g., div#main p.note) and returns a list of DOM nodes that match.
Why it teaches DOM fundamentals: It bridges the gap between the static DOM structure and the dynamic way it’s manipulated by JavaScript. This is the core of DOM traversal APIs.
Core challenges you’ll face:
- Parsing selector chains → maps to splitting div p.note into div, p, and .note
- Implementing descendant combinators → maps to searching the entire subtree for a match
- Combining checks → maps to an element must match the tag, ID, and all classes in a simple selector
- Recursive searching → maps to depth-first traversal of the DOM
Key Concepts:
- Depth-First Search (DFS): “Introduction to Algorithms” (CLRS) Ch. 22.3
- String Parsing: “The C Programming Language” Ch. 5
Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Project 1.
Real world outcome:
A function you can call from a main test program that correctly finds and prints the tags of nodes matching a complex selector.
// Example Usage
Node* body = find_nodes(root, "body")[0];
Node** paragraphs = find_nodes(body, "p.highlight");
// ... do something with the found nodes
Implementation Hints:
Start by supporting only a single simple selector (p, .note, #main).
Then, add support for chained simple selectors (p.note#main-para).
Finally, add the descendant combinator (the space).
A search for div p starting from the html node would work like this:
- Find all div elements in the document.
- For each div found, start a new search within its subtree for all p elements.
- Collect all the results.
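Those three steps could be sketched like this for tag-only selectors. One detail the steps gloss over: when matching ancestors are nested (a div inside a div), the same p is reachable from both, so the collector below skips nodes it has already stored. The Node shape, the fixed limit of 64 ancestors, and the function names are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Node {
    const char *tag;                         /* NULL for text nodes */
    struct Node *first_child, *next_sibling;
} Node;

/* Depth-first search: append every descendant of root whose tag matches,
   skipping nodes already present in out. root itself is not tested. */
static size_t collect(Node *root, const char *tag,
                      Node **out, size_t count, size_t cap) {
    for (Node *c = root->first_child; c; c = c->next_sibling) {
        if (c->tag && strcmp(c->tag, tag) == 0) {
            int seen = 0;
            for (size_t i = 0; i < count; i++) {
                if (out[i] == c) seen = 1;
            }
            if (!seen && count < cap) out[count++] = c;
        }
        count = collect(c, tag, out, count, cap);
    }
    return count;
}

/* "ancestor descendant": find every ancestor match, then search each
   one's subtree for the descendant tag. Returns the number of results. */
size_t find_descendants(Node *root, const char *ancestor,
                        const char *descendant, Node **out, size_t cap) {
    assert(root != NULL && out != NULL);
    Node *ancestors[64];
    size_t n = collect(root, ancestor, ancestors, 0, 64);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count = collect(ancestors[i], descendant, out, count, cap);
    }
    return count;
}
```

The linear duplicate check keeps the sketch short; for large documents you would want a visited flag on the node or a hash set instead.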
Learning milestones:
- Find nodes by tag name.
- Find nodes by class or ID.
- Find nodes matching a tag AND a class (e.g., p.note).
- Find nodes matching a descendant selector (e.g., div p).
Project 9: The Toy Browser
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Systems Integration
- Software or Tool: All your previous projects
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A single, command-line executable that takes an HTML file and a CSS file as input, runs them through your entire pipeline (parsing, styling, layout, painting), and outputs a single output.png file.
Why it teaches DOM fundamentals: This project integrates everything. It forces you to think about the entire pipeline, memory management, and how data flows from one stage to the next. You’ve built a browser engine.
Core challenges you’ll face:
- Creating a pipeline → maps to chaining all your previous projects together
- Memory management → maps to making sure you free all the trees you create (DOM, CSSOM, Style, Layout)
- Error handling → maps to what happens if the HTML or CSS is invalid?
- Command-line interface → maps to parsing arguments and reading files
Key Concepts:
- Systems Design: “Designing Data-Intensive Applications” Ch. 1
- Memory Management in C: malloc, free, and tools like Valgrind.
Difficulty: Expert
Time estimate: 1 week
Prerequisites: All previous projects.
Real world outcome:
$ ./browser index.html style.css
Rendering page to output.png...
Done.
And output.png is a recognizable rendering of your simple webpage.
Implementation Hints:
Your main function will be a sequence of calls:
int main(int argc, char** argv) {
    // 1. Read HTML file and CSS file contents into strings.
    // 2. Parse HTML to create DOM tree.
    Node* dom_tree = parse_html(html_string);
    // 3. Parse CSS to create stylesheet.
    Stylesheet* stylesheet = parse_css(css_string);
    // 4. Create style tree.
    StyledNode* style_tree = create_style_tree(dom_tree, stylesheet);
    // 5. Create layout tree.
    LayoutBox* layout_tree = create_layout_tree(style_tree, initial_containing_block);
    // 6. Paint to canvas.
    Canvas* canvas = create_canvas(width, height);
    paint(layout_tree, canvas);
    // 7. Save canvas to PNG.
    save_png(canvas, "output.png");
    // 8. Free ALL allocated memory!
    // ...
}
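Step 8 is the part most often left undone. For a tree linked through first_child and next_sibling pointers, a post-order walk frees everything. The sketch below assumes each node owns one heap-allocated string; the Node shape and the free_tree name are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    char *text;                              /* heap-allocated, may be NULL */
    struct Node *first_child, *next_sibling;
} Node;

/* Free node, its following siblings, and all of their descendants.
   Children are released before the node that points at them.
   Returns the number of nodes freed, which is handy for sanity checks. */
size_t free_tree(Node *node) {
    size_t freed = 0;
    while (node) {
        Node *next = node->next_sibling;     /* save before freeing node */
        freed += free_tree(node->first_child);
        free(node->text);                    /* free(NULL) is a no-op */
        free(node);
        freed++;
        node = next;
    }
    return freed;
}
```

Iterating over siblings and recursing only into children keeps the recursion depth equal to the tree depth rather than the total node count. You would write a similar function for each of the style and layout trees, then confirm with Valgrind that nothing is left behind.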
Learning milestones:
- The program compiles and links successfully.
- It correctly renders your test page from file inputs.
- It runs without memory leaks (checked with Valgrind).
- You can change the HTML/CSS and see the output image change.
Project 10: The Grand Finale - Basic JavaScript Integration
- File: LEARN_DOM_AND_BROWSER_ENGINES.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 5: Master
- Knowledge Area: API Bindings / Interpreters
- Software or Tool: An embeddable JS engine (QuickJS, Duktape)
- Main Book: “Language Implementation Patterns” by Terence Parr
What you’ll build: Integrate a lightweight JavaScript engine into your browser. Expose your DOM tree to the JavaScript environment, allowing scripts in <script> tags to run. Implement one function, document.getElementById, and the ability to change a style property (element.style.color = 'red').
Why it teaches DOM fundamentals: This completes the trifecta: HTML, CSS, and JavaScript. You’ll understand that the DOM is not just a static structure but a live API, and how scripting makes the web dynamic.
Core challenges you’ll face:
- Embedding a JS engine → maps to linking the library and initializing a VM
- Creating C-to-JS bindings → maps to exposing your C functions to the JS world
- Wrapping your C structs → maps to representing a Node* as a JavaScript object
- Triggering a re-render → maps to after a script changes a style, you must re-run layout and paint
Key Concepts:
- Foreign Function Interface (FFI): How one language calls another.
- Event Loop (conceptual): The idea that scripts run, make changes, and then the browser re-renders.
Difficulty: Master
Time estimate: 1 month+
Prerequisites: Project 9, patience.
Real world outcome: Your browser can now render an HTML file containing JavaScript that dynamically changes the color of an element after the initial render.
<!-- test.html -->
<html>
<head><link rel="stylesheet" href="style.css"></head>
<body>
<p id="p1">Hello World</p>
<script>
var p = document.getElementById('p1');
p.style.color = 'red';
</script>
</body>
</html>
The output PNG should show red text, even if the CSS originally made it blue.
Implementation Hints:
- Extend your HTML parser to recognize <script> tags. Don’t execute them yet, just store their content.
- After the initial layout and paint, initialize the JS VM.
- Write a C function js_getElementById(id). This function will use your existing DOM traversal logic and return a “wrapped” Node* object to JS.
- Expose this C function to the JS global scope as document.getElementById.
- When a script modifies a property like element.style.color, this should update the C struct for the StyledNode.
- After the script finishes, trigger a new call to create_layout_tree and paint to render the changes.
Learning milestones:
- A <script> tag’s code is executed.
- document.getElementById returns a usable object to JS.
- Changing a style property in JS is reflected in your C data structures.
- The page is re-rendered with the dynamic style changes.
Summary
| Project | Main Language | Difficulty |
|---|---|---|
| 1. Simple HTML Parser & DOM Tree | C | Intermediate |
| 2. Simple CSS Parser | C | Intermediate |
| 3. The Style Tree | C | Advanced |
| 4. The Layout Tree & Box Model | C | Advanced |
| 5. Painting to Pixels | C | Advanced |
| 6. Adding Text Rendering | C | Expert |
| 7. Implementing Inline Layout | C | Expert |
| 8. A querySelector Clone | C | Advanced |
| 9. The Toy Browser | C | Expert |
| 10. Basic JavaScript Integration | C | Master |