Project 2: KV Client Library

Build a Redis-like key-value client library with explicit ownership and error contracts.

Quick Reference

Attribute     | Value
--------------|---------------------------------------
Difficulty    | Intermediate
Time Estimate | 1-2 weeks
Language      | C
Prerequisites | Socket basics, string handling
Key Topics    | Ownership, opaque handles, error APIs

1. Learning Objectives

By completing this project, you will:

  1. Design an opaque handle API for network clients.
  2. Make ownership rules explicit for returned data.
  3. Handle partial reads/writes over TCP.
  4. Provide consistent error reporting.

2. Theoretical Foundation

2.1 Core Concepts

  • Opaque handles: Hide internal socket state.
  • Ownership contracts: Explicitly define who frees memory.
  • Protocol framing: Read full responses over a stream.

2.2 Why This Matters

Network client libraries cause leaks, double frees, and silent failures when their interfaces leave ownership or error reporting implicit. A clean C API makes misuse difficult and debugging predictable.

2.3 Historical Context / Background

Redis clients such as hiredis are built around a connection context object and explicit memory ownership (for example, replies are released with freeReplyObject) to prevent leaks and misuse.

2.4 Common Misconceptions

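  • “recv returns all data”: TCP is a byte stream, so a single recv call may return only part of a response; keep reading until the reply is complete.
  • “Returning char* is enough”: The type does not say who frees the memory; the API must state ownership explicitly.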

3. Project Specification

3.1 What You Will Build

A kvclient library exposing:

  • kv_connect, kv_disconnect
  • kv_set, kv_get, kv_delete
  • kv_get_error
  • kv_free_string for owned strings
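One way the public header could declare this surface is sketched below. The parameter lists and return types are assumptions inferred from the usage in section 3.4, not a fixed specification; the comments carry the ownership contract.

```c
/* kvclient.h: hypothetical public header (all signatures are assumptions). */
#ifndef KVCLIENT_H
#define KVCLIENT_H

typedef struct kv_handle kv_handle;   /* opaque: defined only in kvclient.c */

typedef enum {
    KV_OK = 0,
    KV_ERR_CONN = -1,
    KV_ERR_PROTO = -2,
    KV_ERR_MEM = -3
} kv_status;

/* Returns a new handle owned by the caller, or NULL on failure.
   Release it with kv_disconnect. */
kv_handle *kv_connect(const char *host, int port);
void kv_disconnect(kv_handle *h);

kv_status kv_set(kv_handle *h, const char *key, const char *value);
kv_status kv_delete(kv_handle *h, const char *key);

/* Returns a heap string owned by the caller; release it with kv_free_string. */
char *kv_get(kv_handle *h, const char *key);
void kv_free_string(char *s);

/* Returns a string borrowed from the handle; the caller must not free it. */
const char *kv_get_error(const kv_handle *h);

#endif /* KVCLIENT_H */
```

Pairing every allocating function with a named release function (kv_connect with kv_disconnect, kv_get with kv_free_string) lets callers read the ownership rule from the header alone.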

3.2 Functional Requirements

  1. Connect to a server over TCP.
  2. Implement a simple text protocol (or Redis protocol).
  3. Return owned strings from kv_get.
  4. Provide per-connection error messages.

3.3 Non-Functional Requirements

  • Safety: No ambiguous ownership.
  • Reliability: Handle reconnect and timeouts gracefully.
  • Usability: Clear error messages.

3.4 Example Usage / Output

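kv_handle *db = kv_connect("localhost", 6379);
if (db == NULL)
    return 1;                          /* connection failed */
kv_set(db, "user:1", "Ada");
char *name = kv_get(db, "user:1");     /* caller owns the returned string */
if (name != NULL) {
    printf("%s\n", name);
    kv_free_string(name);              /* release it through the library */
}
kv_disconnect(db);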

3.5 Real World Outcome

Users can write client code with zero ambiguity about who owns returned strings. The API behaves predictably under network failures.


4. Solution Architecture

4.1 High-Level Design

client API -> socket I/O -> protocol parser -> responses

4.2 Key Components

Component        | Responsibility         | Key Decisions
-----------------|------------------------|---------------------
Opaque handle    | Store socket and error | Hidden struct
Protocol encoder | Build requests         | Simple text protocol
Protocol parser  | Parse replies          | Robust framing
Error store      | Report last error      | Per-handle buffer

4.3 Data Structures

typedef struct kv_handle kv_handle;

typedef enum {
    KV_OK = 0,
    KV_ERR_CONN = -1,
    KV_ERR_PROTO = -2,
    KV_ERR_MEM = -3
} kv_status;
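The private side of the opaque handle might look like the sketch below, which would live in kvclient.c. The field names, the 256-byte buffer size, and the helper kv_set_errorf are hypothetical; only kv_get_error is named in the specification, and its exact signature is assumed here.

```c
#include <stdarg.h>
#include <stdio.h>

typedef struct kv_handle kv_handle;   /* public view: an incomplete type */

/* Private definition; callers never see these (hypothetical) fields. */
struct kv_handle {
    int  fd;            /* connected socket, or -1 when closed */
    char errbuf[256];   /* last error message, owned by the handle */
};

/* Hypothetical internal helper: store a printf-style error message.
   vsnprintf truncates safely if the message exceeds the buffer. */
static void kv_set_errorf(kv_handle *h, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(h->errbuf, sizeof h->errbuf, fmt, ap);
    va_end(ap);
}

/* Returns a borrowed pointer into the handle. It stays valid until the
   next library call on the same handle; the caller must not free it. */
const char *kv_get_error(const kv_handle *h)
{
    return h != NULL ? h->errbuf : "invalid handle";
}
```

Keeping the buffer inside the handle means two connections never overwrite each other's messages, which a single global error string could not guarantee.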

4.4 Algorithm Overview

Key Algorithm: Request/response

  1. Serialize command.
  2. Send with loop handling partial writes.
  3. Read until delimiter (e.g., \n).
  4. Parse response, allocate copy for caller.

Complexity Analysis:

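  • Time: O(n) per command, where n is the combined length of the request and the response in bytes
  • Space: O(n) for the response buffer and the copy returned to the caller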

5. Implementation Guide

5.1 Development Environment Setup

cc -Wall -Wextra -O2 -g -Isrc -o test_kv tests/test_kv.c src/kvclient.c

5.2 Project Structure

kvclient/
├── src/
│   ├── kvclient.c
│   └── kvclient.h
├── tests/
│   └── test_kv.c
└── README.md

5.3 The Core Question You’re Answering

“When a function returns a pointer, who owns it, and how do you make that obvious?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Ownership conventions
    • How do you signal ownership in C APIs?
  2. Partial I/O
    • Why must send/recv be looped?
  3. Error reporting
    • Return codes vs error strings.

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. Will kv_get return NULL on not-found or only on error?
  2. Should kv_get_error return borrowed memory?
  3. Will kv_disconnect be idempotent?
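For question 1, one possible answer is to report three distinct outcomes so that "not found" is never confused with a failure. The sketch below assumes a hypothetical reply format ("VALUE <data>", "NOT_FOUND", or "ERR <message>") and a hypothetical helper parse_get_reply; neither is part of the project specification.

```c
#include <stdlib.h>
#include <string.h>

enum { GET_ERROR = -1, GET_NOT_FOUND = 0, GET_FOUND = 1 };

/* Parse one reply line (already stripped of its newline).
   On GET_FOUND, *out is a fresh heap copy that the caller must free.
   On any other result, *out is NULL. */
int parse_get_reply(const char *line, char **out)
{
    *out = NULL;
    if (strcmp(line, "NOT_FOUND") == 0)
        return GET_NOT_FOUND;
    if (strncmp(line, "VALUE ", 6) == 0) {
        size_t len = strlen(line + 6);
        char *copy = malloc(len + 1);
        if (copy == NULL)
            return GET_ERROR;            /* out of memory */
        memcpy(copy, line + 6, len + 1); /* includes the terminating NUL */
        *out = copy;
        return GET_FOUND;
    }
    return GET_ERROR;                    /* "ERR ..." or malformed reply */
}
```

If kv_get keeps the char * return type from section 3.4, a NULL result would need a companion check (for example, an empty kv_get_error string) to tell the two cases apart; that trade-off is the point of the question.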

5.6 Thinking Exercise

Ownership Trace

Draw the heap after two kv_get calls. Are both results valid simultaneously?
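To check your drawing, the two stand-ins below (hypothetical names, no networking) contrast a borrowed result kept in a shared buffer with an owned result allocated per call.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Borrowed: every call overwrites the same internal buffer, so an
   earlier result changes underneath the caller. */
const char *lookup_borrowed(const char *value)
{
    static char shared[64];
    snprintf(shared, sizeof shared, "%s", value);
    return shared;
}

/* Owned: every call returns a separate allocation that the caller
   frees, so earlier results stay valid. Returns NULL if malloc fails. */
char *lookup_owned(const char *value)
{
    size_t len = strlen(value);
    char *copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, value, len + 1);
    return copy;
}
```

Calling lookup_borrowed twice yields the same pointer both times, and the first result now reads as the second value. Calling lookup_owned twice yields two independent heap blocks, which is the picture the owned-string design of kv_get should produce.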

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why use opaque handles?”
  2. “How do you handle partial reads?”
  3. “How do you communicate ownership to users?”

5.8 Hints in Layers

Hint 1: Use a simple text protocol. Start with SET key value and GET key.

Hint 2: Allocate new strings for results. Avoid shared internal buffers.

Hint 3: Add an error buffer per handle. Store the last error in the handle.
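Hint 1 could start from an encoder like the sketch below. The newline-terminated wire format and the names encode_set and encode_get are assumptions for illustration. Keys or values that contain spaces or newlines would break this framing, so a real protocol needs escaping or length prefixes.

```c
#include <stddef.h>
#include <stdio.h>

/* Write "SET <key> <value>\n" into buf.
   Returns the command length, or -1 if it does not fit in cap bytes. */
int encode_set(char *buf, size_t cap, const char *key, const char *value)
{
    int n = snprintf(buf, cap, "SET %s %s\n", key, value);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Write "GET <key>\n" into buf, with the same return convention. */
int encode_get(char *buf, size_t cap, const char *key)
{
    int n = snprintf(buf, cap, "GET %s\n", key);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

Checking the snprintf return value against the capacity turns silent truncation into an explicit error the caller can report.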

5.9 Books That Will Help

Topic     | Book                                | Chapter
----------|-------------------------------------|----------
Sockets   | “The Linux Programming Interface”   | Ch. 56-59
Ownership | “Effective C”                       | Ch. 6

5.10 Implementation Phases

Phase 1: Foundation (3-4 days)

Goals:

  • Connect and send commands

Tasks:

  1. Implement kv_connect.
  2. Send raw command strings.

Checkpoint: Server receives data.
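A minimal sketch of the connection step that kv_connect could wrap, assuming POSIX sockets. The helper name tcp_connect and its int port parameter are hypothetical. It uses getaddrinfo so that each resolved address (IPv4 or IPv6) is tried in turn.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns a connected socket descriptor, or -1 on failure. */
int tcp_connect(const char *host, int port)
{
    char service[16];
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    snprintf(service, sizeof service, "%d", port);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* allow IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Try each candidate address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                     /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

kv_connect would then allocate the handle, store the descriptor in it, and return NULL (after closing the descriptor) if the allocation fails.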

Phase 2: Core Functionality (4-6 days)

Goals:

  • Implement get/set/delete

Tasks:

  1. Parse responses.
  2. Allocate returned strings.

Checkpoint: Round-trip key-value works.

Phase 3: Polish & Edge Cases (2-4 days)

Goals:

  • Error handling and timeouts

Tasks:

  1. Add error storage.
  2. Handle disconnects and timeouts.

Checkpoint: Clean error messages.

5.11 Key Implementation Decisions

Decision  | Options                     | Recommendation | Rationale
----------|-----------------------------|----------------|------------------
Ownership | Borrowed vs owned           | Owned results  | Prevent misuse
Error API | Return codes + error string | Yes            | Clear diagnostics

6. Testing Strategy

6.1 Test Categories

Category          | Purpose     | Examples
------------------|-------------|----------------
Integration Tests | Live server | redis-server
Error Tests       | Server down | Connection fail
Memory Tests      | Leak check  | Valgrind

6.2 Critical Test Cases

  1. Server down: kv_connect fails with error.
  2. Not found: Distinct behavior from error.
  3. Multiple gets: Both results valid.

6.3 Test Data

user:1 -> "Ada"

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall               | Symptom           | Solution
----------------------|-------------------|-----------------------
Reusing result buffer | Corrupted results | Allocate new string
Not looping recv      | Truncated replies | Loop until delimiter
Ownership ambiguity   | Leaks             | Provide kv_free_string

7.2 Debugging Strategies

  • Use strace/dtruss to inspect socket calls.
  • Log raw protocol messages.

7.3 Performance Traps

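Building a command with repeated strcat calls can be O(n^2), because each call rescans the destination from the start to find its end. Use a fixed buffer with snprintf or a builder that tracks the current length.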
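A builder that remembers its current length avoids the rescan. The sketch below is one possible shape, using caller-provided storage; the type cmd_builder and the functions cb_init and cb_append are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    char  *buf;       /* caller-provided storage */
    size_t cap;       /* total capacity in bytes */
    size_t len;       /* bytes used, excluding the terminating NUL */
    int    overflow;  /* set once an append does not fit */
} cmd_builder;

void cb_init(cmd_builder *b, char *storage, size_t cap)
{
    b->buf = storage;
    b->cap = cap;
    b->len = 0;
    b->overflow = (cap == 0);
    if (cap > 0)
        storage[0] = '\0';
}

/* Append s at the tracked end: cost is proportional to strlen(s) only,
   not to the length already in the buffer. */
void cb_append(cmd_builder *b, const char *s)
{
    size_t n = strlen(s);
    if (b->overflow || n >= b->cap - b->len) {
        b->overflow = 1;              /* refuse instead of truncating */
        return;
    }
    memcpy(b->buf + b->len, s, n + 1);  /* copy including the NUL */
    b->len += n;
}
```

The caller checks overflow once after the last append, then sends buf with length len.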


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add kv_exists.
  • Add integer values.

8.2 Intermediate Extensions

  • Add connection pooling.
  • Add timeouts with select.

8.3 Advanced Extensions

  • Add async interface with callbacks.
  • Add TLS support.

9. Real-World Connections

9.1 Industry Applications

  • Caching layers: Redis/memcached clients.
  • Microservices: Shared API boundaries.

9.2 Related Open Source Projects

  • hiredis: Redis client library.

9.3 Interview Relevance

Ownership and network I/O are classic systems interview topics.


10. Resources

10.1 Essential Reading

  • “The Linux Programming Interface” - Ch. 56-59
  • “Effective C” - Ch. 6

10.2 Video Resources

  • Socket programming lectures

10.3 Tools & Documentation

  • man 2 connect, man 2 send, man 2 recv

10.4 Related Projects

  • Plugin System: ABI boundaries.
  • libhttp-lite: Network API design.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain ownership rules.
  • I can handle partial I/O.
  • I can design opaque handles.

11.2 Implementation

  • Commands work reliably.
  • Errors are clear and consistent.
  • Memory is leak-free.

11.3 Growth

  • I can add async support.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Connect and perform get/set.

Full Completion:

  • Errors, ownership, and timeouts handled.

Excellence (Going Above & Beyond):

  • Async interface and TLS support.

This guide was generated from SPRINT_4_BOUNDARIES_INTERFACES_PROJECTS.md. For the complete learning path, see the parent directory.