LEARN PYTHON DEEP DIVE
Learn Python: From Zero to Python Master
Goal: Deeply understand Python—from basic syntax and data structures to advanced topics like metaprogramming, concurrency, web development, data science, and building production-ready applications.
Why Learn Python?
Python is one of the most versatile and widely-used programming languages in the world. It’s the language of choice for:
- Data Science & Machine Learning: NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch
- Web Development: Django, Flask, FastAPI
- Automation & Scripting: System administration, DevOps, task automation
- Scientific Computing: Research, simulations, data analysis
- Backend Development: APIs, microservices, serverless functions
- Cybersecurity: Penetration testing, exploit development, forensics
After completing these projects, you will:
- Write clean, idiomatic Python code (Pythonic code)
- Understand Python’s object model and memory management
- Build web applications and APIs
- Process and analyze data effectively
- Create command-line tools and automation scripts
- Understand concurrency patterns (asyncio, threading, multiprocessing)
- Package and distribute your own Python libraries
Core Concept Analysis
The Python Ecosystem
┌─────────────────────────────────────────────────────────────────────────┐
│ PYTHON INTERPRETER │
│ │
│ CPython (reference) │ PyPy (JIT) │ MicroPython (embedded) │
└─────────────────────────────────────────────────────────────────────────┘
│
┌──────────────────────┼──────────────────────┐
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ STANDARD LIB │ │ THIRD-PARTY │ │ YOUR CODE │
│ │ │ │ │ │
│ • collections │ │ • requests │ │ • modules │
│ • itertools │ │ • pandas │ │ • packages │
│ • functools │ │ • django │ │ • scripts │
│ • pathlib │ │ • numpy │ │ • applications │
│ • asyncio │ │ • pytest │ │ │
└──────────────────┘ └──────────────────┘ └──────────────────┘
Key Concepts Explained
1. Python’s Object Model
Everything in Python is an object. Understanding this is fundamental.
┌─────────────────────────────────────────────────────────────────────────┐
│ OBJECT HIERARCHY │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ type ──────────────────┐ │
│ │ │ │
│ ▼ ▼ │
│ object metaclass │
│ │ │
│ ├── int │
│ ├── str │
│ ├── list │
│ ├── dict │
│ ├── function │
│ └── YourClass │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key concepts:
- Identity: `id(obj)` - unique identifier (memory address in CPython)
- Type: `type(obj)` - the object’s class
- Value: The data the object holds
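All three properties can be observed directly in the interpreter, a quick sketch:

```python
x = [1, 2, 3]
y = x                            # same object, not a copy

print(id(x) == id(y))            # True: one object, one identity
print(type(x))                   # <class 'list'>
print(x == [1, 2, 3])            # True: equal value...
print(x is [1, 2, 3])            # False: ...but a different identity
print(isinstance(x, object))     # True: everything inherits from object
```

Note the `==` vs `is` distinction: `==` compares values, `is` compares identities.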
2. Data Structures
┌─────────────────────────────────────────────────────────────────────────┐
│ BUILT-IN DATA STRUCTURES │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ SEQUENCES (ordered, indexed): │
│ ┌──────────┬─────────────────────────────────────────┐ │
│ │ list │ Mutable, heterogeneous [1, "a", 3.14]│ │
│ │ tuple │ Immutable (1, 2, 3) │ │
│ │ str │ Immutable, text "hello" │ │
│ │ bytes │ Immutable, binary b"\x00\x01" │ │
│ └──────────┴─────────────────────────────────────────┘ │
│ │
│ MAPPINGS (key-value): │
│ ┌──────────┬─────────────────────────────────────────┐ │
│ │ dict │ Mutable, ordered (3.7+) {"a": 1} │ │
│ └──────────┴─────────────────────────────────────────┘ │
│ │
│ SETS (unordered, unique): │
│ ┌──────────┬─────────────────────────────────────────┐ │
│ │ set │ Mutable {1, 2, 3} │ │
│ │ frozenset│ Immutable frozenset([1])│ │
│ └──────────┴─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
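A quick tour of these structures and their mutability rules:

```python
nums = [1, "a", 3.14]        # list: mutable, heterogeneous
point = (1, 2, 3)            # tuple: immutable
text = "hello"               # str: immutable text
raw = b"\x00\x01"            # bytes: immutable binary
d = {"a": 1}                 # dict: insertion-ordered since 3.7
s = {1, 2, 3}                # set: unique elements, unordered
fs = frozenset([1])          # frozenset: immutable, therefore hashable

nums.append(4)               # mutable: OK
try:
    point[0] = 9             # immutable: raises TypeError
except TypeError:
    print("tuples are immutable")

print(fs in {fs: "ok"})      # True: only immutable objects can be dict keys
```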
3. Functions and Callables
┌─────────────────────────────────────────────────────────────────────────┐
│ CALLABLE OBJECTS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ def my_function(arg): # Regular function │
│ return arg * 2 │
│ │
│ lambda x: x * 2 # Anonymous function │
│ │
│ class MyClass: # Class with __call__ │
│ def __call__(self): │
│ return "called" │
│ │
│ # Closures │
│ def outer(x): │
│ def inner(y): │
│ return x + y # x is captured │
│ return inner │
│ │
│ # Decorators │
│ @decorator │
│ def func(): # func = decorator(func) │
│ pass │
│ │
└─────────────────────────────────────────────────────────────────────────┘
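The diagram’s callables in runnable form, with a concrete (illustrative) `log_calls` decorator:

```python
import functools

def outer(x):
    def inner(y):
        return x + y          # x is captured by the closure
    return inner

add5 = outer(5)
print(add5(3))                # 8

def log_calls(func):
    @functools.wraps(func)    # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@log_calls                    # equivalent to: double = log_calls(double)
def double(n):
    return n * 2

print(double(4))              # 8, after printing "calling double"

class Doubler:
    def __call__(self, n):    # instances become callable
        return n * 2

print(Doubler()(7))           # 14
```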
4. Object-Oriented Programming
┌─────────────────────────────────────────────────────────────────────────┐
│ OOP IN PYTHON │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ class Animal: # Base class │
│ def __init__(self, name): # Constructor │
│ self.name = name # Instance attribute │
│ │
│ def speak(self): # Instance method │
│ raise NotImplementedError │
│ │
│ @classmethod # Class method │
│ def create(cls, name): │
│ return cls(name) │
│ │
│ @staticmethod # Static method │
│ def info(): │
│ return "Animals are living beings" │
│ │
│ @property # Property (getter) │
│ def display_name(self): │
│ return f"Animal: {self.name}" │
│ │
│ class Dog(Animal): # Inheritance │
│ def speak(self): # Method override │
│ return "Woof!" │
│ │
└─────────────────────────────────────────────────────────────────────────┘
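The same hierarchy in use, a small sketch exercising each kind of method from the diagram:

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        raise NotImplementedError

    @classmethod
    def create(cls, name):
        return cls(name)             # cls is Dog when called as Dog.create

    @property
    def display_name(self):
        return f"Animal: {self.name}"

class Dog(Animal):
    def speak(self):
        return "Woof!"

rex = Dog.create("Rex")              # the classmethod respects the subclass
print(type(rex).__name__)            # Dog
print(rex.display_name)              # Animal: Rex (property: no parentheses)
print(rex.speak())                   # Woof!
```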
5. Iterators and Generators
┌─────────────────────────────────────────────────────────────────────────┐
│ ITERATION PROTOCOL │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Iterable ──────────▶ Iterator ──────────▶ StopIteration │
│ __iter__() __next__() │
│ │
│ # Generator function (lazy evaluation) │
│ def count_up(n): │
│ i = 0 │
│ while i < n: │
│ yield i # Pause and return value │
│ i += 1 │
│ │
│ # Generator expression │
│ squares = (x**2 for x in range(10)) │
│ │
│ # Memory comparison: │
│ [x**2 for x in range(1_000_000)] # Creates entire list in memory │
│ (x**2 for x in range(1_000_000)) # Generates one at a time │
│ │
└─────────────────────────────────────────────────────────────────────────┘
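The memory difference is easy to observe with `sys.getsizeof`, a minimal sketch:

```python
import sys

def count_up(n):
    i = 0
    while i < n:
        yield i          # pause here; resume on the next next() call
        i += 1

gen = count_up(3)
print(next(gen))         # 0
print(list(gen))         # [1, 2]: the generator resumes where it paused

big_list = [x**2 for x in range(1_000_000)]
big_gen = (x**2 for x in range(1_000_000))
print(sys.getsizeof(big_list) > 1_000_000)   # True: the whole list is in memory
print(sys.getsizeof(big_gen) < 1_000)        # True: a small fixed-size object
```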
6. Context Managers
┌─────────────────────────────────────────────────────────────────────────┐
│ CONTEXT MANAGERS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ # Using with statement │
│ with open("file.txt") as f: │
│ data = f.read() │
│ # File automatically closed, even if exception occurs │
│ │
│ # Custom context manager (class-based) │
│ class Timer: │
│ def __enter__(self): │
│ self.start = time.time() │
│ return self │
│ │
│ def __exit__(self, exc_type, exc_val, exc_tb): │
│ self.elapsed = time.time() - self.start │
│ return False # Don't suppress exceptions │
│ │
│ # Using contextlib │
│ @contextmanager │
│ def timer(): │
│ start = time.time() │
│ yield │
│ print(f"Elapsed: {time.time() - start}") │
│ │
└─────────────────────────────────────────────────────────────────────────┘
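The `Timer` from the diagram, completed and runnable. This sketch uses `time.perf_counter` instead of `time.time` for more precise timing, and adds a `try/finally` in the `contextlib` version so the elapsed time prints even if the body raises:

```python
import time
from contextlib import contextmanager

class Timer:
    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.elapsed = time.perf_counter() - self.start
        return False              # don't suppress exceptions

with Timer() as t:
    sum(range(100_000))
print(f"class-based: {t.elapsed:.4f}s")

@contextmanager
def timer():
    start = time.perf_counter()
    try:
        yield
    finally:                      # runs even if the body raises
        print(f"generator-based: {time.perf_counter() - start:.4f}s")

with timer():
    sum(range(100_000))
```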
7. Concurrency Models
┌─────────────────────────────────────────────────────────────────────────┐
│ PYTHON CONCURRENCY │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Model │ Use Case │ GIL Impact │
│ ───────────────────────────────────────────────────────────────────── │
│ threading │ I/O-bound tasks │ Limited by GIL │
│ multiprocessing │ CPU-bound tasks │ Bypasses GIL (processes) │
│ asyncio │ High I/O concurrency │ Single thread, event loop │
│ │
│ # Threading (shared memory, GIL) │
│ thread = threading.Thread(target=func, args=(arg,)) │
│ thread.start() │
│ │
│ # Multiprocessing (separate memory) │
│ process = multiprocessing.Process(target=func, args=(arg,)) │
│ process.start() │
│ │
│ # Asyncio (coroutines) │
│ async def main(): │
│ await some_async_operation() │
│ asyncio.run(main()) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
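The asyncio row of the table can be demonstrated with a tiny sketch, where `asyncio.sleep` stands in for real I/O:

```python
import asyncio
import time

async def fetch(name, delay):
    await asyncio.sleep(delay)     # stands in for a real network call
    return f"{name} done"

async def main():
    # gather runs all three coroutines concurrently on one thread
    return await asyncio.gather(
        fetch("a", 0.1), fetch("b", 0.1), fetch("c", 0.1)
    )

start = time.perf_counter()
print(asyncio.run(main()))         # ['a done', 'b done', 'c done']
elapsed = time.perf_counter() - start
print(elapsed < 0.3)               # True: ~0.1s total, not 0.3s sequential
```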
Project List
The following 16 projects will teach you Python from fundamentals to mastery.
Project 1: Python REPL from Scratch
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: C, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Interpreters / REPL
- Software or Tool: Python standard library
- Main Book: “Fluent Python” by Luciano Ramalho
What you’ll build: A custom Read-Eval-Print-Loop (REPL) that evaluates Python expressions, supports command history, multi-line input, and provides helpful error messages with syntax highlighting.
Why it teaches Python: Building a REPL forces you to understand Python’s eval, exec, compile, exception handling, and introspection capabilities. You’ll learn how Python parses and executes code.
Core challenges you’ll face:
- Parsing multi-line statements → maps to understanding Python syntax and code objects
- Handling exceptions gracefully → maps to exception hierarchy and traceback manipulation
- Implementing command history → maps to readline integration and state management
- Syntax highlighting → maps to tokenization and the `tokenize` module
Resources for key challenges:
- Python Code Objects - Understanding compiled code
- “Fluent Python” Chapter 24 - Class Metaprogramming
- GNU Readline - Command history
Key Concepts:
- Code Objects: Python Data Model documentation
- eval vs exec: “Fluent Python” Ch. 9 - Luciano Ramalho
- Exception Handling: “Effective Python” Item 65 - Brett Slatkin
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic Python syntax, understanding of functions and exceptions
Real world outcome:
$ python my_repl.py
>>> 2 + 2
4
>>> def greet(name):
... return f"Hello, {name}!"
...
>>> greet("World")
'Hello, World!'
>>> undefined_var
NameError: name 'undefined_var' is not defined
File "<repl>", line 1
>>> history
1: 2 + 2
2: def greet(name):
3: greet("World")
>>> exit()
Implementation Hints:
Start with the basic REPL loop:
- Read input with `input()` or `readline`
- Check if statement is complete (handle `...` for multi-line)
- Compile with `compile()` to check syntax
- Execute with `exec()` or evaluate with `eval()`
- Print the result
- Handle exceptions and show tracebacks
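Those steps can be sketched with the standard library’s `codeop` module, whose `compile_command` returns `None` while a statement is still incomplete, a minimal, illustrative loop:

```python
import codeop
import traceback

def repl():
    namespace = {}                      # shared scope persists across inputs
    buffer = []
    while True:
        prompt = "... " if buffer else ">>> "
        try:
            buffer.append(input(prompt))
        except EOFError:
            break
        source = "\n".join(buffer)
        try:
            # Returns None while the statement is still incomplete
            code = codeop.compile_command(source, "<repl>", "single")
        except (SyntaxError, ValueError, OverflowError):
            traceback.print_exc()
            buffer = []
            continue
        if code is None:
            continue                    # keep reading lines
        buffer = []
        try:
            exec(code, namespace)       # "single" mode echoes expression values
        except SystemExit:
            break
        except BaseException:
            traceback.print_exc()
```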
Questions to guide your implementation:
- How do you detect if a statement needs more lines (like `def` or `if`)?
- What’s the difference between `eval()` and `exec()`?
- How do you maintain variable scope between inputs?
- How do you implement tab completion?
Learning milestones:
- Basic REPL works → Can evaluate single expressions
- Multi-line support → Handle function definitions
- Error handling → Show helpful tracebacks
- History and completion → Professional UX
Project 2: File Synchronization Tool
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Filesystem / Hashing
- Software or Tool: pathlib, hashlib, watchdog
- Main Book: “Automate the Boring Stuff” by Al Sweigart
What you’ll build: A tool that synchronizes files between two directories, detecting changes using hashes, handling conflicts, and providing dry-run and verbose modes.
Why it teaches Python: You’ll master file I/O, hashing, recursive directory traversal, and command-line argument parsing. This is practical Python at its best.
Core challenges you’ll face:
- Efficient file comparison → maps to hashing with hashlib, chunk reading
- Recursive traversal → maps to pathlib, os.walk, generators
- Conflict resolution → maps to timestamps, user interaction
- CLI design → maps to argparse or click
Resources for key challenges:
- pathlib documentation
- “Automate the Boring Stuff” Chapters 9-10
- watchdog library - File system events
Key Concepts:
- Path Operations: pathlib documentation
- Hashing: hashlib documentation
- CLI Arguments: argparse tutorial
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic file I/O, command-line basics
Real world outcome:
$ python filesync.py ~/source ~/backup --verbose
Scanning source directory...
Found 1,247 files (523 MB)
Scanning destination...
Found 1,198 files (498 MB)
Changes detected:
[NEW] documents/report.pdf
[MODIFIED] photos/vacation/img_001.jpg
[DELETED] old_notes.txt (exists only in backup)
Proceed with sync? [y/N] y
Copying documents/report.pdf... done
Copying photos/vacation/img_001.jpg... done
Sync complete! 2 files copied, 25 MB transferred.
$ python filesync.py ~/source ~/backup --dry-run
Would copy: documents/new_file.txt (1.2 MB)
Would skip: photos/same_file.jpg (unchanged)
Dry run complete. 1 file would be copied.
Implementation Hints:
Design your tool in phases:
- Scan both directories, build file manifests (path → hash, size, mtime)
- Compare manifests to find new, modified, deleted files
- Present changes to user
- Execute synchronization with progress reporting
Key implementation questions:
- How do you efficiently hash large files without loading them into memory?
- How do you handle symbolic links?
- What’s your strategy for conflict resolution?
- How do you make the sync resumable if interrupted?
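For the first question, a chunked-hashing sketch (the function names here are illustrative, not part of the project spec):

```python
import hashlib
from pathlib import Path

def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks so memory use stays constant."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root: Path) -> dict:
    # Map relative path -> content hash for every regular file under root
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }
```

Comparing two manifests then becomes set arithmetic on the keys: new files appear only in the source, deleted files only in the destination, and modified files share a key but differ in hash.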
Learning milestones:
- Scan and hash files → Understand pathlib and hashlib
- Compare directories → Set operations on paths
- Copy with progress → shutil and progress bars
- Add watch mode → Real-time sync with watchdog
Project 3: Web Scraper with Rate Limiting
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript (Node.js), Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Web / HTTP / Parsing
- Software or Tool: requests, BeautifulSoup, aiohttp
- Main Book: “Web Scraping with Python” by Ryan Mitchell
What you’ll build: A robust web scraper that respects robots.txt, implements rate limiting, handles retries, and extracts structured data from websites.
Why it teaches Python: Web scraping combines HTTP requests, HTML parsing, async programming, and data persistence. It’s a complete Python project.
Core challenges you’ll face:
- Making HTTP requests → maps to requests library, sessions, headers
- Parsing HTML → maps to BeautifulSoup, CSS selectors, XPath
- Rate limiting → maps to time.sleep, token bucket, async
- Data persistence → maps to JSON, CSV, SQLite
Resources for key challenges:
- requests documentation
- Beautiful Soup documentation
- “Web Scraping with Python” by Ryan Mitchell
Key Concepts:
- HTTP Protocol: requests documentation
- HTML Parsing: BeautifulSoup documentation
- Async Scraping: aiohttp documentation
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic Python, understanding of HTTP and HTML
Real world outcome:
$ python scraper.py --url "https://news.ycombinator.com" --output stories.json
Checking robots.txt... OK
Fetching page 1... 200 OK
Parsed 30 stories
Fetching page 2... 200 OK (rate limited: 1s delay)
Parsed 30 stories
...
Saved 150 stories to stories.json
$ cat stories.json | head -20
[
{
"title": "Show HN: I built a Python scraper",
"url": "https://example.com/article",
"points": 142,
"comments": 45,
"author": "username",
"timestamp": "2024-01-15T10:30:00Z"
},
...
]
Implementation Hints:
Architecture for a robust scraper:
- URL queue (deque or priority queue)
- Fetcher with rate limiting and retries
- Parser that extracts structured data
- Storage layer (JSON, CSV, or database)
- Respect robots.txt (use `urllib.robotparser`)
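Two of those pieces sketched with the standard library. The `RateLimiter` class here is illustrative and enforces a fixed gap between requests (simpler than a full token bucket, which would also allow short bursts):

```python
import time
import urllib.robotparser

class RateLimiter:
    """Enforce a minimum gap between consecutive requests."""
    def __init__(self, rate: float):
        self.interval = 1.0 / rate       # seconds between requests
        self.last = 0.0

    def wait(self):
        now = time.monotonic()
        delay = self.last + self.interval - now
        if delay > 0:
            time.sleep(delay)
        self.last = time.monotonic()

def allowed(robots_url: str, agent: str, url: str) -> bool:
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(robots_url)
    rp.read()                            # fetches and parses robots.txt
    return rp.can_fetch(agent, url)
```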
Key questions:
- How do you handle pagination?
- How do you avoid getting blocked (headers, delays, proxies)?
- How do you handle JavaScript-rendered content?
- How do you resume after interruption?
Learning milestones:
- Basic fetching → Get pages with requests
- Parse HTML → Extract data with BeautifulSoup
- Rate limiting → Polite scraping
- Async scraping → Concurrent with aiohttp
Project 4: REST API with Authentication
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript (Node.js), Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Web Development / APIs
- Software or Tool: FastAPI, SQLAlchemy, JWT
- Main Book: “Building APIs with FastAPI” (FastAPI documentation)
What you’ll build: A complete REST API with user authentication (JWT), CRUD operations, input validation, database persistence, and automatic OpenAPI documentation.
Why it teaches Python: Modern Python web development with type hints, async/await, dependency injection, and ORMs. FastAPI has become one of the leading frameworks for modern Python APIs.
Core challenges you’ll face:
- Routing and endpoints → maps to FastAPI decorators, path parameters
- Input validation → maps to Pydantic models, type hints
- Authentication → maps to JWT tokens, password hashing
- Database operations → maps to SQLAlchemy ORM, migrations
Resources for key challenges:
Key Concepts:
- REST Principles: FastAPI tutorial
- Type Hints: “Fluent Python” Ch. 8
- Async Python: “Python Concurrency with asyncio” by Matthew Fowler
Difficulty: Intermediate Time estimate: 2-3 weeks Prerequisites: Basic Python, understanding of HTTP and databases
Real world outcome:
$ uvicorn main:app --reload
INFO: Uvicorn running on http://127.0.0.1:8000
# In another terminal:
$ curl http://localhost:8000/docs
# Opens Swagger UI with interactive API documentation
$ curl -X POST http://localhost:8000/auth/register \
-H "Content-Type: application/json" \
-d '{"email": "user@example.com", "password": "secure123"}'
{"id": 1, "email": "user@example.com", "created_at": "2024-01-15T10:00:00Z"}
$ curl -X POST http://localhost:8000/auth/login \
-d "username=user@example.com&password=secure123"
{"access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...", "token_type": "bearer"}
$ curl http://localhost:8000/items \
-H "Authorization: Bearer eyJ0eXAiOiJKV1Q..."
[{"id": 1, "name": "Item 1", "price": 9.99}]
Implementation Hints:
Project structure:
api/
├── main.py # FastAPI app, routes
├── models.py # SQLAlchemy models
├── schemas.py # Pydantic schemas
├── database.py # DB connection
├── auth.py # JWT handling
└── crud.py # Database operations
Key implementation questions:
- How do you hash passwords securely (bcrypt)?
- How do you create and verify JWT tokens?
- How do you protect routes with authentication?
- How do you handle database migrations (Alembic)?
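For the password question, a stdlib-only sketch using PBKDF2. The project suggests bcrypt; the core ideas shown here, a random salt per password and a constant-time comparison, are the same either way:

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> bytes:
    # Salted PBKDF2 from the stdlib; the salt is stored alongside the digest
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt + digest

def verify_password(password: str, stored: bytes) -> bool:
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)   # constant-time compare
```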
Learning milestones:
- Basic CRUD API → Create, read, update, delete items
- Input validation → Pydantic models
- Database integration → SQLAlchemy ORM
- Authentication → JWT tokens and protected routes
Project 5: Command-Line Task Manager (TODO App)
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 1: Beginner
- Knowledge Area: CLI Applications / Data Persistence
- Software or Tool: Typer or Click, Rich, SQLite
- Main Book: “Automate the Boring Stuff” by Al Sweigart
What you’ll build: A beautiful command-line task manager with colored output, due dates, priorities, tags, and persistent storage.
Why it teaches Python: Perfect for learning CLI frameworks, data persistence, and creating user-friendly terminal applications with rich output.
Core challenges you’ll face:
- CLI argument parsing → maps to Typer or Click, subcommands
- Pretty terminal output → maps to Rich library, colors, tables
- Data persistence → maps to SQLite, JSON, or YAML
- Date handling → maps to datetime, parsing, formatting
Resources for key challenges:
Key Concepts:
- CLI Design: Click/Typer documentation
- Rich Output: Rich library documentation
- Data Persistence: SQLite documentation
Difficulty: Beginner Time estimate: 1 week Prerequisites: Basic Python syntax
Real world outcome:
$ todo add "Learn Python" --priority high --due "2024-01-20"
✓ Task added: "Learn Python" (due Jan 20)
$ todo add "Build web app" --tag work --tag python
✓ Task added: "Build web app"
$ todo list
┏━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ ID ┃ Task ┃ Priority ┃ Due ┃ Tags ┃
┡━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ 1 │ Learn Python │ 🔴 HIGH │ Jan 20 │ │
│ 2 │ Build web app │ ⚪ NORMAL│ │ work,python │
└────┴─────────────────┴──────────┴───────────┴─────────────┘
$ todo complete 1
✓ Marked "Learn Python" as complete!
$ todo list --completed
┏━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ ID ┃ Task ┃ Completed ┃
┡━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ 1 │ Learn Python │ 2024-01-15 10:30 │
└────┴─────────────────┴───────────────────┘
Implementation Hints:
Use Typer for modern CLI design:
# Typer provides automatic --help, type validation, and tab completion
from typing import Optional

import typer

app = typer.Typer()

@app.command()
def add(task: str, priority: str = "normal", due: Optional[str] = None):
    ...

@app.command()
def list(completed: bool = False):
    ...
Key questions:
- Where do you store the database (XDG_DATA_HOME)?
- How do you handle different date formats?
- How do you implement filters (by tag, priority, date)?
- How do you add shell completion?
Learning milestones:
- Basic add/list → CRUD operations
- Rich output → Tables and colors
- Filters and sorting → Query capabilities
- Persistent storage → SQLite database
Project 6: Data Analysis Pipeline
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: R, Julia
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Data Science / ETL
- Software or Tool: Pandas, NumPy, Matplotlib
- Main Book: “Python for Data Analysis” by Wes McKinney
What you’ll build: A complete data pipeline that ingests CSV/JSON data, cleans it, transforms it, performs analysis, and generates visualizations and reports.
Why it teaches Python: Data analysis is one of Python’s strengths. You’ll master Pandas DataFrames, vectorized operations, and data visualization.
Core challenges you’ll face:
- Data loading and cleaning → maps to Pandas `read_*` functions, handling missing values
- Data transformation → maps to groupby, merge, pivot, apply
- Statistical analysis → maps to aggregations, correlations
- Visualization → maps to Matplotlib, Seaborn plots
Resources for key challenges:
- Pandas documentation
- “Python for Data Analysis” by Wes McKinney
- Seaborn tutorial
Key Concepts:
- DataFrame Operations: “Python for Data Analysis” Ch. 5-8
- Vectorization: NumPy and Pandas docs
- Visualization: Matplotlib documentation
Difficulty: Intermediate Time estimate: 2 weeks Prerequisites: Basic Python, understanding of data concepts
Real world outcome:
$ python analyze.py sales_data.csv --output report/
Loading data... 50,000 rows loaded
Cleaning data...
- Removed 127 duplicates
- Fixed 45 missing values
- Converted date columns
Analyzing...
Revenue by Region:
North America: $1,234,567
Europe: $987,654
Asia Pacific: $756,432
Top Products:
1. Widget Pro $234,567 (12,345 units)
2. Gadget Plus $198,765 (9,876 units)
3. Tool Basic $156,789 (15,678 units)
Generating visualizations...
- report/revenue_by_month.png
- report/top_products.png
- report/region_comparison.png
Report saved to report/summary.html
Implementation Hints:
Pipeline stages:
- Extract: Load data from CSV, JSON, Excel, or API
- Transform: Clean, normalize, aggregate
- Load: Save to database or files
- Visualize: Generate charts and reports
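A toy version of the Transform stage, using hypothetical columns standing in for the sales data above:

```python
import pandas as pd

# Hypothetical columns: region, product, units, price
df = pd.DataFrame({
    "region": ["NA", "EU", "NA", "EU"],
    "product": ["Widget", "Widget", "Gadget", "Gadget"],
    "units": [10, 5, None, 8],
    "price": [9.99, 9.99, 19.99, 19.99],
})

df = df.drop_duplicates()
df["units"] = df["units"].fillna(0)           # fix missing values
df["revenue"] = df["units"] * df["price"]     # vectorized: no Python loop

by_region = df.groupby("region")["revenue"].sum().sort_values(ascending=False)
print(by_region)
```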
Key questions:
- How do you handle large datasets that don’t fit in memory (chunking)?
- How do you deal with messy real-world data?
- How do you make your pipeline reproducible?
- How do you create interactive visualizations?
Learning milestones:
- Load and explore → Understand your data
- Clean and transform → Prepare for analysis
- Analyze → Extract insights
- Visualize and report → Communicate findings
Project 7: Testing Framework from Scratch
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Ruby
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Testing / Metaprogramming
- Software or Tool: Python introspection, AST
- Main Book: “Python Testing with pytest” by Brian Okken
What you’ll build: A minimal testing framework like pytest, with test discovery, assertions, fixtures, parameterization, and colored output.
Why it teaches Python: Building a test framework requires deep understanding of Python’s introspection, decorators, context managers, and module system.
Core challenges you’ll face:
- Test discovery → maps to importlib, inspecting modules for test functions
- Assertion introspection → maps to AST manipulation, better error messages
- Fixtures → maps to decorators, dependency injection, scope
- Parameterization → maps to decorators that generate multiple tests
Resources for key challenges:
- pytest internals
- “Fluent Python” Chapter 24 - Class Metaprogramming
- Python AST documentation
Key Concepts:
- Introspection: “Fluent Python” Ch. 23
- Decorators: “Fluent Python” Ch. 9
- Context Managers: “Fluent Python” Ch. 18
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Good understanding of Python functions, decorators, classes
Real world outcome:
# test_example.py
from mytest import fixture, parametrize

@fixture
def database():
    db = connect()
    yield db
    db.close()

def test_addition():
    assert 2 + 2 == 4

def test_with_fixture(database):
    assert database.query("SELECT 1") == 1

@parametrize("a,b,expected", [(1, 1, 2), (2, 3, 5), (0, 0, 0)])
def test_add(a, b, expected):
    assert a + b == expected

def test_failure():
    assert [1, 2, 3] == [1, 2, 4]  # Should show diff
$ python -m mytest test_example.py
test_example.py
✓ test_addition (0.001s)
✓ test_with_fixture (0.023s)
✓ test_add[1-1-2] (0.001s)
✓ test_add[2-3-5] (0.001s)
✓ test_add[0-0-0] (0.001s)
✗ test_failure (0.001s)
AssertionError: Lists differ:
- [1, 2, 3]
+ [1, 2, 4]
^--- Index 2
4 passed, 1 failed in 0.027s
Implementation Hints:
Core components:
- Test Collector: Find all `test_*.py` files and `test_*` functions
- Test Runner: Execute tests, capture results
- Assertions: Rewrite or intercept for better messages
- Fixtures: Decorator that registers setup/teardown
- Reporter: Output results with colors
Key questions:
- How do you discover test functions in a module?
- How do you inject fixtures into test function arguments?
- How do you provide helpful assertion messages without writing them?
- How do you handle exceptions in tests?
Learning milestones:
- Basic discovery and run → Find and run test_* functions
- Better assertions → Show diffs for failures
- Fixtures → Setup/teardown with dependency injection
- Parameterization → Generate multiple tests
Project 8: Async Web Crawler
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, Node.js
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Async Programming / Networking
- Software or Tool: aiohttp, asyncio, BeautifulSoup
- Main Book: “Using Asyncio in Python” by Caleb Hattingh
What you’ll build: A concurrent web crawler that can crawl thousands of pages efficiently using asyncio, with configurable depth, domain filtering, and data extraction.
Why it teaches Python: Mastering asyncio is essential for high-performance Python. This project teaches you coroutines, event loops, semaphores, and concurrent patterns.
Core challenges you’ll face:
- Concurrent HTTP requests → maps to aiohttp sessions, connection pooling
- Controlling concurrency → maps to semaphores, limiting requests
- URL frontier management → maps to async queues, deduplication
- Graceful shutdown → maps to signal handling, task cancellation
Resources for key challenges:
- asyncio documentation
- aiohttp documentation
- “Using Asyncio in Python” by Caleb Hattingh
Key Concepts:
- Coroutines: “Using Asyncio in Python” Ch. 3
- Event Loop: asyncio documentation
- Async Patterns: “Fluent Python” Ch. 21
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Solid Python, understanding of async/await
Real world outcome:
$ python crawler.py --url "https://docs.python.org" --depth 2 --concurrency 20
Starting crawl of https://docs.python.org
Concurrency: 20 workers
Max depth: 2
Crawling...
[00:01] Crawled: 50 | Queue: 234 | Errors: 0 | 45.2 pages/sec
[00:02] Crawled: 152 | Queue: 567 | Errors: 2 | 51.0 pages/sec
[00:03] Crawled: 298 | Queue: 432 | Errors: 3 | 48.7 pages/sec
...
[00:15] Crawled: 1,247 | Queue: 0 | Errors: 12 | 83.1 pages/sec
Crawl complete!
Total pages: 1,247
Unique URLs: 1,235
Errors: 12 (0.96%)
Time: 15.2s
Average: 82.0 pages/sec
Results saved to crawl_results.json
Implementation Hints:
Async crawler architecture:
async def main():
    queue = asyncio.Queue()
    seen = set()
    semaphore = asyncio.Semaphore(concurrency)
    await queue.put(start_url)
    workers = [
        asyncio.create_task(worker(queue, seen, semaphore))
        for _ in range(concurrency)
    ]
    await queue.join()  # Wait until queue is empty
    for w in workers:
        w.cancel()
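A matching `worker` sketch. The `session` parameter is an assumption here: a shared HTTP session (e.g. an `aiohttp.ClientSession`) created in `main()` and passed to every worker:

```python
import asyncio

async def worker(queue, seen, semaphore, session):
    while True:
        url = await queue.get()
        try:
            if url not in seen:
                seen.add(url)                  # deduplicate before fetching
                async with semaphore:          # cap concurrent requests
                    async with session.get(url) as resp:
                        html = await resp.text()
                # parse links out of html and queue.put_nowait() new URLs here
        except Exception:
            pass                               # count/log errors in practice
        finally:
            queue.task_done()                  # lets queue.join() complete
```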
Key questions:
- How do you prevent visiting the same URL twice?
- How do you respect robots.txt asynchronously?
- How do you handle timeouts and retries?
- How do you cleanly shut down when interrupted?
Learning milestones:
- Basic async fetching → Understand coroutines
- Concurrent workers → Multiple tasks
- Queue-based architecture → Producer-consumer pattern
- Production features → Rate limiting, error handling
Project 9: Python Package Creator
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: N/A
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Packaging / Distribution
- Software or Tool: setuptools, poetry, twine
- Main Book: Python Packaging User Guide
What you’ll build: A complete Python package with proper structure, dependencies, tests, documentation, and published to PyPI (test or real).
Why it teaches Python: Understanding Python packaging is essential for professional development. You’ll learn about modules, packages, entry points, and distribution.
Core challenges you’ll face:
- Package structure → maps to `__init__.py`, imports, namespaces
- Dependency management → maps to pyproject.toml, requirements
- Build system → maps to setuptools, poetry, flit
- Distribution → maps to PyPI, wheels, twine
Resources for key challenges:
Key Concepts:
- Package Structure: Packaging User Guide
- pyproject.toml: PEP 517/518/621
- Entry Points: setuptools documentation
Difficulty: Intermediate Time estimate: 1 week Prerequisites: Basic Python modules
Real world outcome:
# Package structure
mypackage/
├── pyproject.toml
├── README.md
├── LICENSE
├── src/
│ └── mypackage/
│ ├── __init__.py
│ ├── core.py
│ └── cli.py
├── tests/
│ ├── test_core.py
│ └── test_cli.py
└── docs/
└── index.md
# Build and test
$ poetry build
Building mypackage (0.1.0)
- Building sdist
- Built mypackage-0.1.0.tar.gz
- Building wheel
- Built mypackage-0.1.0-py3-none-any.whl
$ poetry publish --repository testpypi
Publishing mypackage (0.1.0) to testpypi
- Uploading mypackage-0.1.0-py3-none-any.whl 100%
- Uploading mypackage-0.1.0.tar.gz 100%
# Now anyone can install it!
$ pip install -i https://test.pypi.org/simple/ mypackage
$ mypackage --help # CLI works!
Implementation Hints:
Modern pyproject.toml:
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

[tool.poetry]
name = "mypackage"
version = "0.1.0"
description = "A useful package"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
requests = "^2.28"

[tool.poetry.scripts]
mypackage = "mypackage.cli:main"
Key questions:
- What’s the difference between src/ layout and flat layout?
- How do you handle optional dependencies?
- How do you create console scripts (entry points)?
- How do you version your package?
Learning milestones:
- Create structure → Proper package layout
- Add dependencies → pyproject.toml
- Build package → Wheel and sdist
- Publish → Upload to PyPI
Project 10: ORM from Scratch
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Ruby, TypeScript
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Databases / Metaprogramming
- Software or Tool: Python descriptors, metaclasses, sqlite3
- Main Book: “Fluent Python” by Luciano Ramalho
What you’ll build: A simple Object-Relational Mapper that maps Python classes to database tables, supports CRUD operations, queries, and relationships.
Why it teaches Python: Building an ORM requires mastery of descriptors, metaclasses, and the Python data model. It’s one of the most instructive advanced projects.
Core challenges you’ll face:
- Model definition → maps to metaclasses, descriptors
- SQL generation → maps to string formatting, escaping
- Query building → maps to method chaining, lazy evaluation
- Relationships → maps to foreign keys, lazy loading
Resources for key challenges:
- “Fluent Python” Chapters 23-24 (Descriptors and Metaclasses)
- SQLAlchemy architecture
- Python Data Model
Key Concepts:
- Descriptors: “Fluent Python” Ch. 23
- Metaclasses: “Fluent Python” Ch. 24
- SQL Basics: Any SQL tutorial
Difficulty: Expert Time estimate: 3-4 weeks Prerequisites: Solid Python OOP, understanding of databases
Real world outcome:
# Define models
from myorm import Model, Field, ForeignKey

class User(Model):
    table_name = "users"
    name = Field(str)
    email = Field(str, unique=True)
    age = Field(int, nullable=True)

class Post(Model):
    table_name = "posts"
    title = Field(str)
    content = Field(str)
    author = ForeignKey(User)

# Create tables
User.create_table()
Post.create_table()

# CRUD operations
user = User(name="Alice", email="alice@example.com")
user.save()
print(user.id)  # Auto-generated

# Queries
users = User.filter(age__gt=18).order_by("name").limit(10)
for user in users:
    print(user.name)

# Relationships
post = Post(title="Hello", content="World", author=user)
post.save()
print(post.author.name)  # Lazy loads the user
Implementation Hints:
ORM architecture:
- Field Descriptor: Manages attribute access and validation
- ModelMeta: Metaclass that registers fields and creates table schema
- Model Base: Provides save(), delete(), and query methods
- Query Builder: Chainable methods that build SQL
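The first two pieces of that architecture can be sketched with a minimal `Field` descriptor and a metaclass that collects fields; validation only, SQL generation omitted, and all names are illustrative:

```python
# Sketch of the ORM building blocks: a Field descriptor plus a metaclass
# that collects declared fields. Validation only; SQL generation omitted.
class Field:
    def __init__(self, py_type, nullable=False, unique=False):
        self.py_type = py_type
        self.nullable = nullable
        self.unique = unique

    def __set_name__(self, owner, name):
        self.name = name  # the attribute name on the model class

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class, not an instance
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        if value is None and not self.nullable:
            raise ValueError(f"{self.name} is not nullable")
        if value is not None and not isinstance(value, self.py_type):
            raise TypeError(f"{self.name} expects {self.py_type.__name__}")
        instance.__dict__[self.name] = value

class ModelMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Register every Field declared in the class body.
        cls._fields = {k: v for k, v in namespace.items() if isinstance(v, Field)}
        return cls

class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)  # routes through Field.__set__

class User(Model):
    table_name = "users"
    name = Field(str)
    age = Field(int, nullable=True)
```

From `cls._fields` the metaclass has everything it needs to generate a `CREATE TABLE` statement; `save()` and the query builder build on the same registry.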
Key questions:
- How do descriptors work (__get__, __set__)?
- How do metaclasses customize class creation?
- How do you prevent SQL injection?
- How do you implement lazy loading for relationships?
Learning milestones:
- Basic model and save → Descriptors for fields
- Query building → Filter, order_by, limit
- Relationships → ForeignKey with lazy loading
- Migrations → Schema changes
Project 11: Static Site Generator
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Ruby, JavaScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Templates / Markdown / Web
- Software or Tool: Jinja2, Markdown, pathlib
- Main Book: “Flask Web Development” (Jinja2 chapters) by Miguel Grinberg
What you’ll build: A static site generator like Hugo or Jekyll that converts Markdown files to HTML, supports templates, and generates a complete website.
Why it teaches Python: Combines template engines, Markdown parsing, file system operations, and web concepts. Great for understanding how web frameworks work under the hood.
Core challenges you’ll face:
- Markdown parsing → maps to markdown library, frontmatter
- Template rendering → maps to Jinja2 templates, inheritance
- Asset handling → maps to copying files, paths
- Live reload → maps to watchdog, websockets
Resources for key challenges:
Key Concepts:
- Template Engines: Jinja2 documentation
- Markdown: Python-Markdown documentation
- File Operations: pathlib documentation
Difficulty: Intermediate Time estimate: 2 weeks Prerequisites: Basic Python, HTML/CSS knowledge
Real world outcome:
# Project structure
site/
├── content/
│ ├── index.md
│ ├── about.md
│ └── posts/
│ ├── first-post.md
│ └── second-post.md
├── templates/
│ ├── base.html
│ ├── page.html
│ └── post.html
├── static/
│ ├── css/style.css
│ └── images/
└── config.yaml
$ python ssg.py build
Loading configuration...
Parsing content...
Found 4 pages, 2 posts
Rendering templates...
Copying static files...
Build complete! Output in ./dist/
$ python ssg.py serve
Starting development server...
Watching for changes...
Server running at http://localhost:8000
# Edit a file...
[12:34:56] Detected change in content/about.md
[12:34:56] Rebuilding... done (0.23s)
[12:34:56] Browser refreshed
Implementation Hints:
Generator pipeline:
- Parse config file
- Scan content directory for Markdown files
- Parse frontmatter (title, date, tags) and content
- Render each page through Jinja2 templates
- Copy static files
- Write output to dist/
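Step 3 of the pipeline can be sketched with the standard library alone. Real generators typically use PyYAML for the frontmatter and python-markdown for the body; this naive parser assumes flat `key: value` pairs between the `---` fences:

```python
# Stdlib-only sketch of frontmatter parsing. Real generators use PyYAML
# plus python-markdown; this assumes flat "key: value" pairs.
def split_frontmatter(text: str):
    """Return (metadata dict, markdown body)."""
    if not text.startswith("---\n"):
        return {}, text  # no frontmatter block
    header, _, body = text[4:].partition("\n---\n")
    meta = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, body

page = """---
title: First Post
date: 2024-01-15
---
# Hello

Welcome to the blog.
"""
meta, body = split_frontmatter(page)
```

The returned `meta` dict feeds the Jinja2 template context, and `body` goes through the Markdown renderer.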
Key questions:
- How do you parse YAML frontmatter from Markdown?
- How do you implement template inheritance?
- How do you generate navigation and post lists?
- How do you implement live reload?
Learning milestones:
- Parse and convert → Markdown to HTML
- Templates → Jinja2 rendering
- Full site → Navigation, posts, assets
- Dev server → Live reload
Project 12: Chat Application with WebSockets
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Real-time / WebSockets
- Software or Tool: FastAPI, websockets, asyncio
- Main Book: FastAPI WebSocket documentation
What you’ll build: A real-time chat application with multiple rooms, user presence, message history, and a web frontend.
Why it teaches Python: Real-time applications require understanding of WebSockets, async programming, and state management. This combines frontend and backend.
Core challenges you’ll face:
- WebSocket connections → maps to bidirectional communication
- Room management → maps to connection pools, broadcasting
- Message persistence → maps to database, history
- User presence → maps to heartbeats, online status
Resources for key challenges:
Key Concepts:
- WebSocket Protocol: RFC 6455
- Async Connection Handling: asyncio documentation
- Broadcasting: Connection manager patterns
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Async Python, basic frontend
Real world outcome:
# Terminal 1 - Server
$ python chat_server.py
INFO: Chat server running on ws://localhost:8000/ws
# Browser at http://localhost:8000
┌────────────────────────────────────────────────────────┐
│ 💬 Python Chat - Room: general │
├────────────────────────────────────────────────────────┤
│ Online: Alice, Bob, Charlie (3) │
├────────────────────────────────────────────────────────┤
│ [10:30] Alice: Hello everyone! │
│ [10:31] Bob: Hi Alice! How's it going? │
│ [10:31] Charlie: 👋 │
│ [10:32] Alice: Working on a Python project! │
│ [10:32] System: Dave joined the room │
│ │
├────────────────────────────────────────────────────────┤
│ Type a message... [Send] │
└────────────────────────────────────────────────────────┘
Implementation Hints:
Server architecture:
class ConnectionManager:
    def __init__(self):
        self.rooms: dict[str, set[WebSocket]] = {}

    async def connect(self, websocket: WebSocket, room: str):
        await websocket.accept()
        self.rooms.setdefault(room, set()).add(websocket)

    async def broadcast(self, room: str, message: str):
        for connection in self.rooms.get(room, []):
            await connection.send_text(message)
Key questions:
- How do you handle disconnections gracefully?
- How do you implement user authentication for WebSockets?
- How do you scale to multiple server instances (Redis pub/sub)?
- How do you handle reconnection on the client side?
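One way to answer the first question: make removal idempotent and snapshot the room before broadcasting, since a failed send can trigger a disconnect mid-loop. The `FakeSocket` stand-in below is an assumption so the sketch runs without a real server:

```python
import asyncio

# Sketch: graceful disconnects for a ConnectionManager. FakeSocket is a
# stand-in (an assumption) so the logic runs without a real server.
class FakeSocket:
    def __init__(self):
        self.sent = []

    async def send_text(self, msg):
        self.sent.append(msg)

class ConnectionManager:
    def __init__(self):
        self.rooms: dict[str, set] = {}

    def disconnect(self, ws, room: str):
        conns = self.rooms.get(room)
        if conns is not None:
            conns.discard(ws)          # discard() tolerates double-removal
            if not conns:
                del self.rooms[room]   # drop empty rooms

    async def broadcast(self, room: str, message: str):
        # Snapshot the set: a failed send may disconnect a socket mid-loop.
        for ws in list(self.rooms.get(room, ())):
            try:
                await ws.send_text(message)
            except Exception:
                self.disconnect(ws, room)

manager = ConnectionManager()
alice, bob = FakeSocket(), FakeSocket()
manager.rooms.setdefault("general", set()).update({alice, bob})
asyncio.run(manager.broadcast("general", "hi"))
```

In FastAPI you would call `disconnect()` from the `except WebSocketDisconnect:` branch of the receive loop.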
Learning milestones:
- Basic echo → Understand WebSocket lifecycle
- Broadcasting → Send to multiple clients
- Rooms → Organize connections
- Full chat → Users, history, presence
Project 13: Machine Learning Pipeline
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: R, Julia
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: Machine Learning / Data Science
- Software or Tool: scikit-learn, pandas, joblib
- Main Book: “Hands-On Machine Learning” by Aurélien Géron
What you’ll build: A complete ML pipeline with data preprocessing, feature engineering, model training, evaluation, and deployment as an API.
Why it teaches Python: Machine learning is one of Python’s killer applications. This project covers the full ML workflow from data to production.
Core challenges you’ll face:
- Data preprocessing → maps to pandas, handling missing values
- Feature engineering → maps to transformers, pipelines
- Model training → maps to scikit-learn estimators, cross-validation
- Model deployment → maps to serialization, API serving
Resources for key challenges:
- scikit-learn documentation
- “Hands-On Machine Learning” by Aurélien Géron
- MLflow for tracking
Key Concepts:
- ML Pipelines: scikit-learn Pipeline documentation
- Cross-Validation: “Hands-On ML” Ch. 2
- Model Persistence: joblib documentation
Difficulty: Advanced Time estimate: 3-4 weeks Prerequisites: Python, basic statistics, pandas
Real world outcome:
$ python train.py --data housing.csv --model random_forest
Loading data... 20,640 samples
Preprocessing...
- Handling 207 missing values
- Encoding 2 categorical features
- Scaling 8 numerical features
Training with 5-fold cross-validation...
Fold 1: RMSE = 48,234
Fold 2: RMSE = 47,891
Fold 3: RMSE = 49,102
Fold 4: RMSE = 48,567
Fold 5: RMSE = 47,999
Mean RMSE: 48,359 (+/- 456)
Training final model on full dataset...
Model saved to models/housing_rf_20240115.joblib
$ python serve.py --model models/housing_rf_20240115.joblib
Model loaded successfully
API running at http://localhost:8000
$ curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"longitude": -122.23, "latitude": 37.88, "rooms": 5, ...}'
{"prediction": 352000, "confidence": 0.87}
Implementation Hints:
ML pipeline structure:
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
import joblib

preprocessing = ColumnTransformer([
    ('num', StandardScaler(), numerical_cols),
    ('cat', OneHotEncoder(), categorical_cols),
])

pipeline = Pipeline([
    ('preprocess', preprocessing),
    ('model', RandomForestRegressor()),
])

# Train
pipeline.fit(X_train, y_train)

# Save
joblib.dump(pipeline, 'model.joblib')
Key questions:
- How do you handle data leakage in preprocessing?
- How do you select features?
- How do you tune hyperparameters?
- How do you version and track experiments?
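To demystify the cross-validation step in the output above, here is what a k-fold split does under the hood, in plain Python (scikit-learn's `KFold` implements this for you, plus shuffling):

```python
# Sketch: what k-fold cross-validation does under the hood. scikit-learn's
# KFold handles this (plus shuffling); plain Python here for clarity.
def kfold_indices(n_samples: int, k: int = 5):
    """Split sample indices into k (train, validation) pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, val))
        start += size
    return folds

folds = kfold_indices(10, k=5)
```

Each fold trains a fresh model on the train indices and scores it on the held-out validation indices; the five RMSE values in the output come from exactly this rotation.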
Learning milestones:
- Data exploration → Understand your data
- Preprocessing pipeline → Repeatable transformations
- Model training → Cross-validation, tuning
- Deployment → Serve predictions via API
Project 14: Debugger/Profiler Tool
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: C (for extension)
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Debugging / Profiling / Internals
- Software or Tool: sys.settrace, cProfile, memory_profiler
- Main Book: “High Performance Python” by Gorelick & Ozsvald
What you’ll build: A debugging/profiling tool that traces function calls, measures execution time, tracks memory usage, and provides insights into code behavior.
Why it teaches Python: Understanding Python’s tracing and profiling APIs gives you deep insight into how the interpreter works.
Core challenges you’ll face:
- Tracing execution → maps to sys.settrace, frame objects
- Timing code → maps to cProfile, time.perf_counter
- Memory tracking → maps to tracemalloc, gc
- Visualization → maps to flame graphs, call trees
Resources for key challenges:
- sys.settrace documentation
- “High Performance Python” by Gorelick & Ozsvald
- tracemalloc documentation
Key Concepts:
- Frame Objects: Python Data Model
- Profiling: “High Performance Python” Ch. 2
- Memory: “High Performance Python” Ch. 11
Difficulty: Expert Time estimate: 3-4 weeks Prerequisites: Strong Python, understanding of execution model
Real world outcome:
$ python -m mytrace slow_script.py
Tracing execution...
Call Tree:
main (slow_script.py:45)
├── load_data (slow_script.py:12) - 2.34s, 3 calls
│ └── parse_json (slow_script.py:8) - 1.89s, 150 calls
├── process_data (slow_script.py:28) - 5.67s, 1 call
│ ├── transform (slow_script.py:20) - 3.45s, 150 calls
│ └── validate (slow_script.py:25) - 2.22s, 150 calls
└── save_results (slow_script.py:40) - 0.45s, 1 call
Hotspots:
1. process_data 5.67s (66.8%)
2. load_data 2.34s (27.6%)
3. save_results 0.45s ( 5.3%)
Memory Profile:
Peak usage: 234.5 MB
Top allocations:
1. list at process_data:32 - 89.2 MB
2. dict at load_data:15 - 45.6 MB
3. str at parse_json:10 - 23.4 MB
Implementation Hints:
Basic tracer:
import sys

def trace_calls(frame, event, arg):
    if event == 'call':
        code = frame.f_code
        print(f"Calling {code.co_name} in {code.co_filename}:{frame.f_lineno}")
    return trace_calls

sys.settrace(trace_calls)
Key questions:
- How do you measure time accurately (perf_counter)?
- How do you track memory without affecting performance too much?
- How do you generate useful visualizations?
- How do you profile only specific functions?
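Building on the basic tracer, here is a sketch of per-function timing with `time.perf_counter()`; the `totals`/`stack` bookkeeping is illustrative, not a stdlib API:

```python
import sys
import time
from collections import defaultdict

# Sketch: extend the basic tracer to accumulate per-function wall time.
# 'totals' and 'stack' are illustrative bookkeeping, not a stdlib API.
totals = defaultdict(float)
stack = []

def timing_tracer(frame, event, arg):
    if event == "call":
        stack.append((frame.f_code.co_name, time.perf_counter()))
    elif event == "return" and stack:
        name, start = stack.pop()
        totals[name] += time.perf_counter() - start
    return timing_tracer  # keep tracing nested frames and return events

def busy_work():
    total = 0
    for i in range(50_000):
        total += i * i
    return total

sys.settrace(timing_tracer)
busy_work()
sys.settrace(None)
```

Note that tracing itself adds overhead to every event, so these timings are relative, not absolute; statistical profilers like `py-spy` avoid this cost by sampling instead.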
Learning milestones:
- Basic tracing → See function calls
- Timing → Measure function duration
- Memory → Track allocations
- Visualization → Flame graphs, reports
Project 15: Plugin System
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Java, C#
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Architecture / Extensibility
- Software or Tool: importlib, entry_points, ABC
- Main Book: “Architecture Patterns with Python” by Percival & Gregory
What you’ll build: A plugin system that allows dynamically loading, registering, and executing plugins, with dependency management and configuration.
Why it teaches Python: Plugin systems require understanding of dynamic imports, abstract base classes, and software architecture. Many real projects need this pattern.
Core challenges you’ll face:
- Dynamic imports → maps to importlib, import
- Plugin discovery → maps to entry_points, directory scanning
- Plugin contracts → maps to ABC, protocols
- Lifecycle management → maps to initialization, cleanup, events
Resources for key challenges:
- importlib documentation
- Entry points
- “Architecture Patterns with Python” by Percival & Gregory
Key Concepts:
- Abstract Base Classes: “Fluent Python” Ch. 13
- Dynamic Loading: importlib documentation
- Entry Points: setuptools documentation
Difficulty: Advanced Time estimate: 2 weeks Prerequisites: Solid Python OOP, understanding of packages
Real world outcome:
# Core application
from myplugins import PluginManager

manager = PluginManager()
manager.discover()  # Find installed plugins

# List available plugins
for plugin in manager.plugins:
    print(f"{plugin.name} v{plugin.version}: {plugin.description}")
# Output:
# markdown-extra v1.0: Extended Markdown syntax
# code-highlight v2.1: Syntax highlighting for code blocks
# toc-generator v1.2: Auto-generate table of contents

# Use a plugin
content = manager.get("markdown-extra").process(raw_markdown)

# Plugin definition (in a separate package)
from myplugins import Plugin, hook

class MarkdownExtra(Plugin):
    name = "markdown-extra"
    version = "1.0"
    description = "Extended Markdown syntax"

    @hook("process")
    def process(self, content: str) -> str:
        # Add extra processing
        return processed_content
Implementation Hints:
Plugin architecture:
- Plugin Base Class: Define the contract (methods, hooks)
- Registry: Keep track of loaded plugins
- Discovery: Find plugins via entry points or directories
- Loader: Import and instantiate plugins
- Event System: Allow plugins to hook into events
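The Registry and Loader pieces can be sketched with `__init_subclass__` for self-registration; in a real system `discover()` would read entry points or scan a plugin directory, and all names here are illustrative:

```python
import importlib

# Sketch: Registry + Loader via __init_subclass__ self-registration. The
# 'name' attribute contract and discover() source are illustrative; real
# discovery would use entry points or a plugin directory.
class Plugin:
    registry: dict = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry[cls.name] = cls  # every subclass registers itself

class PluginManager:
    def discover(self, module_names=()):
        for mod in module_names:
            importlib.import_module(mod)  # importing a module registers its plugins

    def get(self, name: str):
        return Plugin.registry[name]()   # instantiate on demand

# Example plugin -- would normally live in its own installed package
class Upper(Plugin):
    name = "upper"

    def process(self, text: str) -> str:
        return text.upper()
```

Because registration happens at class-definition time, merely importing a plugin module is enough to make its plugins available.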
Key questions:
- How do you define a plugin interface (ABC vs Protocol)?
- How do you handle plugin dependencies?
- How do you sandbox untrusted plugins?
- How do you handle plugin conflicts?
Learning milestones:
- Basic loading → Import plugins dynamically
- Discovery → Find plugins automatically
- Hooks → Event-based extension points
- Configuration → Per-plugin settings
Project 16: Complete Application: Personal Finance Tracker
- File: LEARN_PYTHON_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Full-Stack / Complete Application
- Software or Tool: FastAPI, SQLAlchemy, React or HTMX, Chart.js
- Main Book: All previous books combined
What you’ll build: A complete personal finance application with bank import, categorization, budgeting, reports, and visualizations—combining CLI, API, and web interface.
Why it teaches Python: This capstone project integrates everything: data processing, web development, databases, APIs, testing, and deployment.
Core challenges you’ll face:
- Data import → maps to CSV parsing, bank format handling
- Categorization → maps to rules engine, possibly ML
- API design → maps to REST endpoints, authentication
- Visualization → maps to charts, reports
- Testing → maps to unit, integration, e2e tests
Time estimate: 4-6 weeks Prerequisites: All previous projects or equivalent experience
Real world outcome:
# CLI for quick access
$ finance import statements/bank_2024.csv
Imported 234 transactions from Jan 1 to Jan 31
$ finance report monthly
January 2024 Summary
────────────────────────────────
Income: $5,234.56
Expenses: $3,456.78
Savings: $1,777.78 (34%)
Top Categories:
Housing $1,200.00 (35%)
Food $567.89 (16%)
Transport $234.56 (7%)
...
# Web interface
$ finance serve
Server running at http://localhost:8000
# Beautiful dashboard with:
# - Transaction list with search/filter
# - Spending pie charts
# - Monthly trends line graph
# - Budget progress bars
# - Category management
Implementation Hints:
Architecture:
finance/
├── cli/              # Click-based CLI
├── api/              # FastAPI backend
├── web/              # Frontend (templates or React)
├── core/             # Business logic
│   ├── models.py     # SQLAlchemy models
│   ├── importers.py  # Bank import parsers ("import" is a reserved word)
│   ├── categorize.py # Categorization rules
│   └── reports.py    # Report generation
├── tests/
└── main.py
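The rules-engine approach to categorization (the `categorize.py` module above) can start as simply as first-match substring rules; the rule list here is illustrative, and a real engine would load rules from config and support regex or ML fallback:

```python
# Sketch: first-match substring rules for transaction categorization.
# The rule list is illustrative; a real engine loads rules from config.
RULES = [
    ("rent", "Housing"),
    ("grocery", "Food"),
    ("uber", "Transport"),
]

def categorize(description: str, rules=RULES, default="Uncategorized") -> str:
    desc = description.lower()
    for keyword, category in rules:
        if keyword in desc:
            return category
    return default
```

Starting with an explicit, ordered rule list keeps categorization debuggable; an ML classifier can later handle whatever the rules leave uncategorized.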
This project combines:
- Project 4: REST API with FastAPI
- Project 5: CLI with Typer
- Project 6: Data analysis with Pandas
- Project 10: Database with ORM
- Project 7: Testing
Learning milestones:
- Data model → Transactions, categories, budgets
- Import pipeline → Parse various bank formats
- API + CLI → Multiple interfaces
- Web dashboard → Visualizations and UX
- Deployment → Docker, production-ready
Project Comparison Table
| # | Project | Difficulty | Time | Key Skill | Fun |
|---|---|---|---|---|---|
| 1 | Python REPL | ⭐⭐ | 1-2 weeks | Introspection | ⭐⭐⭐ |
| 2 | File Sync Tool | ⭐⭐ | 1-2 weeks | File I/O | ⭐⭐⭐ |
| 3 | Web Scraper | ⭐⭐ | 1-2 weeks | HTTP/Parsing | ⭐⭐⭐⭐ |
| 4 | REST API | ⭐⭐ | 2-3 weeks | Web Dev | ⭐⭐⭐⭐ |
| 5 | CLI Task Manager | ⭐ | 1 week | CLI Design | ⭐⭐⭐ |
| 6 | Data Pipeline | ⭐⭐ | 2 weeks | Data Analysis | ⭐⭐⭐⭐ |
| 7 | Testing Framework | ⭐⭐⭐ | 2-3 weeks | Metaprogramming | ⭐⭐⭐⭐ |
| 8 | Async Web Crawler | ⭐⭐⭐ | 2-3 weeks | Async | ⭐⭐⭐⭐⭐ |
| 9 | Package Creator | ⭐⭐ | 1 week | Packaging | ⭐⭐ |
| 10 | ORM from Scratch | ⭐⭐⭐⭐ | 3-4 weeks | Metaprogramming | ⭐⭐⭐⭐ |
| 11 | Static Site Generator | ⭐⭐ | 2 weeks | Templates | ⭐⭐⭐⭐ |
| 12 | Chat with WebSockets | ⭐⭐⭐ | 2-3 weeks | Real-time | ⭐⭐⭐⭐⭐ |
| 13 | ML Pipeline | ⭐⭐⭐ | 3-4 weeks | Machine Learning | ⭐⭐⭐⭐⭐ |
| 14 | Debugger/Profiler | ⭐⭐⭐⭐ | 3-4 weeks | Internals | ⭐⭐⭐⭐ |
| 15 | Plugin System | ⭐⭐⭐ | 2 weeks | Architecture | ⭐⭐⭐ |
| 16 | Finance Tracker | ⭐⭐⭐ | 4-6 weeks | Full-Stack | ⭐⭐⭐⭐⭐ |
Recommended Learning Path
Phase 1: Foundations (3-4 weeks)
Build a solid foundation with practical projects:
- Project 5: CLI Task Manager - Learn CLI basics
- Project 2: File Synchronization Tool - Master file operations
- Project 3: Web Scraper - Understand HTTP and parsing
Phase 2: Web Development (4-5 weeks)
Learn to build web applications:
- Project 4: REST API - Modern API development
- Project 11: Static Site Generator - Templates and content
- Project 8: Async Web Crawler - Master async Python
Phase 3: Data & Analysis (3-4 weeks)
Data processing and machine learning:
- Project 6: Data Analysis Pipeline - Pandas mastery
- Project 13: Machine Learning Pipeline - Full ML workflow
Phase 4: Advanced Python (4-6 weeks)
Deep Python knowledge:
- Project 1: Python REPL - Introspection and execution
- Project 7: Testing Framework - Advanced metaprogramming
- Project 10: ORM from Scratch - Descriptors and metaclasses
Phase 5: Architecture & Integration (4-6 weeks)
Building production systems:
- Project 9: Package Creator - Distribution
- Project 12: Chat with WebSockets - Real-time apps
- Project 15: Plugin System - Extensible architecture
- Project 14: Debugger/Profiler - Python internals
Phase 6: Capstone (4-6 weeks)
Put it all together:
- Project 16: Personal Finance Tracker - Complete application
Summary
| # | Project | Main Language |
|---|---|---|
| 1 | Python REPL from Scratch | Python |
| 2 | File Synchronization Tool | Python |
| 3 | Web Scraper with Rate Limiting | Python |
| 4 | REST API with Authentication | Python |
| 5 | Command-Line Task Manager | Python |
| 6 | Data Analysis Pipeline | Python |
| 7 | Testing Framework from Scratch | Python |
| 8 | Async Web Crawler | Python |
| 9 | Python Package Creator | Python |
| 10 | ORM from Scratch | Python |
| 11 | Static Site Generator | Python |
| 12 | Chat Application with WebSockets | Python |
| 13 | Machine Learning Pipeline | Python |
| 14 | Debugger/Profiler Tool | Python |
| 15 | Plugin System | Python |
| 16 | Personal Finance Tracker (Capstone) | Python |
Resources
Essential Books
- “Fluent Python, 2nd Edition” by Luciano Ramalho - The definitive advanced Python book
- “Effective Python, 2nd Edition” by Brett Slatkin - 90 specific ways to write better Python
- “Python Cookbook, 3rd Edition” by David Beazley & Brian K. Jones - Recipes for common tasks
- “Automate the Boring Stuff” by Al Sweigart - Practical automation
- “High Performance Python” by Gorelick & Ozsvald - Optimization techniques
- “Architecture Patterns with Python” by Percival & Gregory - Production patterns
Online Resources
- Python Documentation: https://docs.python.org/3/
- Real Python: https://realpython.com/
- Python Weekly: https://www.pythonweekly.com/
- Talk Python to Me: https://talkpython.fm/
Tools
- PyCharm/VS Code: IDEs with excellent Python support
- pytest: Testing framework
- black/ruff: Code formatters and linters
- mypy: Static type checker
- poetry: Dependency management
Practice Platforms
- Exercism Python Track: https://exercism.org/tracks/python
- LeetCode: https://leetcode.com/
- Project Euler: https://projecteuler.net/
- Advent of Code: https://adventofcode.com/
Total Estimated Time: 6-9 months of dedicated study
After completion: You’ll be able to build any type of Python application—web services, data pipelines, automation tools, ML systems, and more. You’ll understand Python deeply, from its object model to its concurrency patterns, and you’ll be ready for professional Python development.