Project 5: Simple Linked List Database

Project 5: Simple Linked List Database

Sprint: 2 - Data & Invariants Difficulty: Intermediate Time Estimate: 1-2 weeks Prerequisites: Project 1, basic linked list concept


Overview

What youโ€™ll build: An in-memory database storing records as linked list nodes with indexes (also linked lists), supporting insert, delete, find, and iterationโ€”with bulletproof invariant checking.

Why it teaches Data & Invariants: Linked lists get a bad reputation because people implement them without invariants. A proper doubly-linked list has:

  • head->prev == NULL
  • tail->next == NULL
  • For all nodes: node->next->prev == node (if next exists)
  • Count matches actual traversal length

Your database adds more: index consistency, no dangling references, transaction-like semantics.

The Core Question Youโ€™re Answering:

โ€œHow do you make a data structure thatโ€™s impossible to corrupt?โ€


Learning Objectives

By the end of this project, you will be able to:

  1. Implement a doubly-linked list with complete invariants
  2. Design invariant checking that runs in debug mode
  3. Handle iterator invalidation correctly
  4. Maintain index consistency when data changes
  5. Serialize data structures to disk safely
  6. Debug complex pointer relationships

Theoretical Foundation

Doubly-Linked List Invariants

INVARIANT SET for Doubly-Linked List:
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ 1. head == NULL โŸบ tail == NULL                                โ”‚
โ”‚    (Empty list has both NULL, non-empty has both non-NULL)    โ”‚
โ”‚                                                                โ”‚
โ”‚ 2. head != NULL โŸน head->prev == NULL                          โ”‚
โ”‚    (First node has no predecessor)                            โ”‚
โ”‚                                                                โ”‚
โ”‚ 3. tail != NULL โŸน tail->next == NULL                          โ”‚
โ”‚    (Last node has no successor)                               โ”‚
โ”‚                                                                โ”‚
โ”‚ 4. For all nodes n: n->next != NULL โŸน n->next->prev == n     โ”‚
โ”‚    (Bidirectional consistency)                                โ”‚
โ”‚                                                                โ”‚
โ”‚ 5. For all nodes n: n->prev != NULL โŸน n->prev->next == n     โ”‚
โ”‚    (Bidirectional consistency, reverse direction)             โ”‚
โ”‚                                                                โ”‚
โ”‚ 6. count == number of nodes reachable from head               โ”‚
โ”‚    (Count is accurate)                                        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Visual Representation

Doubly-Linked List with 3 nodes:

NULL โ†โ”€โ”€ [Alice] โ†โ†’ [Bob] โ†โ†’ [Carol] โ”€โ”€โ†’ NULL
           โ†‘                    โ†‘
          head                 tail

Invariants visible:
โ€ข head->prev == NULL โœ“
โ€ข tail->next == NULL โœ“
โ€ข Alice->next == Bob โœ“
โ€ข Bob->prev == Alice โœ“
โ€ข Bob->next == Carol โœ“
โ€ข Carol->prev == Bob โœ“

Iterator Invalidation

PROBLEM: Deleting while iterating

for (Node* n = list->head; n != NULL; n = n->next) {
    if (should_delete(n)) {
        delete_node(list, n);  // n->next is now garbage!
    }
}

n = n->next reads freed memory โ†’ CRASH or corruption

SOLUTION 1: Save next before delete
for (Node* n = list->head; n != NULL; ) {
    Node* next = n->next;  // Save before delete
    if (should_delete(n)) {
        delete_node(list, n);
    }
    n = next;  // Use saved pointer
}

SOLUTION 2: Collect then delete
Node* to_delete[MAX];
int count = 0;
for (Node* n = list->head; n != NULL; n = n->next) {
    if (should_delete(n)) to_delete[count++] = n;
}
for (int i = 0; i < count; i++) {
    delete_node(list, to_delete[i]);
}

Project Specification

Core API

// List management
ContactList* list_create(void);
void list_destroy(ContactList* list);

// CRUD operations
Record* list_add(ContactList* list, const char* name, const char* email,
                 const char* phone, const char* title);
Record* list_find_by_id(ContactList* list, int id);
Record* list_find_by_email(ContactList* list, const char* email);
bool list_delete(ContactList* list, int id);
bool list_update(ContactList* list, int id, const char* field, const char* value);

// Iteration
Record* list_first(ContactList* list);
Record* list_next(Record* current);
size_t list_count(ContactList* list);

// Invariant checking
bool list_check_invariants(ContactList* list);

// Persistence
bool list_save(ContactList* list, const char* filename);
ContactList* list_load(const char* filename);

Expected Output

$ ./contactdb
ContactDB v1.0 - Type 'help' for commands
Invariant checking: ENABLED

> add "Alice Johnson" "alice@techcorp.com" "555-0123" "Engineer"
[INVARIANT CHECK] After insert: list_length=1 โœ“
Added record #1: Alice Johnson

> add "Bob Smith" "bob@startup.io" "555-0456" "Designer"
[INVARIANT CHECK] After insert: list_length=2 โœ“ bidirectional consistency โœ“
Added record #2: Bob Smith

> list
ID  | Name            | Email                | Phone      | Title
----|-----------------|----------------------|------------|----------
1   | Alice Johnson   | alice@techcorp.com   | 555-0123   | Engineer
2   | Bob Smith       | bob@startup.io       | 555-0456   | Designer

> check-invariants
Running comprehensive invariant check...
  โœ“ Main list: length=2, forward=2, backward=2
  โœ“ Head invariant: head->prev == NULL
  โœ“ Tail invariant: tail->next == NULL
  โœ“ Bidirectional: all nodes satisfy node->next->prev == node
  โœ“ Email index: 2 entries, all valid
All invariants satisfied โœ“

> delete 1
[INVARIANT CHECK] After delete: list_length=1 โœ“
Deleted record #1

> save contacts.db
Saved 1 records to contacts.db

> exit
$ valgrind --leak-check=full ./contactdb < commands.txt
==12345== All heap blocks were freed -- no leaks are possible

Solution Architecture

Data Structures

typedef struct Record {
    int id;
    char* name;
    char* email;
    char* phone;
    char* title;
    struct Record* next;
    struct Record* prev;
} Record;

typedef struct ContactList {
    Record* head;
    Record* tail;
    int count;
    int next_id;  // Auto-increment ID
} ContactList;

Implementation Guide

Phase 1: Basic List Operations (2-3 hours)

ContactList* list_create(void) {
    ContactList* list = malloc(sizeof(ContactList));
    list->head = NULL;
    list->tail = NULL;
    list->count = 0;
    list->next_id = 1;
    return list;
}

Record* list_add(ContactList* list, const char* name, const char* email,
                 const char* phone, const char* title) {
    Record* rec = malloc(sizeof(Record));
    rec->id = list->next_id++;
    rec->name = strdup(name);
    rec->email = strdup(email);
    rec->phone = strdup(phone);
    rec->title = strdup(title);
    rec->next = NULL;
    rec->prev = list->tail;

    if (list->tail) {
        list->tail->next = rec;
    } else {
        list->head = rec;
    }
    list->tail = rec;
    list->count++;

    assert(list_check_invariants(list));
    return rec;
}

void list_destroy(ContactList* list) {
    Record* current = list->head;
    while (current) {
        Record* next = current->next;
        free(current->name);
        free(current->email);
        free(current->phone);
        free(current->title);
        free(current);
        current = next;
    }
    free(list);
}

Phase 2: Delete Operation (2-3 hours)

bool list_delete(ContactList* list, int id) {
    Record* target = list_find_by_id(list, id);
    if (!target) return false;

    // Handle all cases
    if (list->head == list->tail) {
        // Only node
        list->head = list->tail = NULL;
    } else if (target == list->head) {
        // First node
        list->head = target->next;
        list->head->prev = NULL;
    } else if (target == list->tail) {
        // Last node
        list->tail = target->prev;
        list->tail->next = NULL;
    } else {
        // Middle node
        target->prev->next = target->next;
        target->next->prev = target->prev;
    }

    // Free memory
    free(target->name);
    free(target->email);
    free(target->phone);
    free(target->title);
    free(target);
    list->count--;

    assert(list_check_invariants(list));
    return true;
}

Phase 3: Invariant Checker (2-3 hours)

bool list_check_invariants(ContactList* list) {
    // INV 1: Empty consistency
    if ((list->head == NULL) != (list->tail == NULL)) {
        fprintf(stderr, "INVARIANT VIOLATION: head/tail NULL mismatch\n");
        return false;
    }

    // Empty list is valid
    if (list->head == NULL) {
        return list->count == 0;
    }

    // INV 2: Head prev is NULL
    if (list->head->prev != NULL) {
        fprintf(stderr, "INVARIANT VIOLATION: head->prev != NULL\n");
        return false;
    }

    // INV 3: Tail next is NULL
    if (list->tail->next != NULL) {
        fprintf(stderr, "INVARIANT VIOLATION: tail->next != NULL\n");
        return false;
    }

    // INV 4 & 5: Bidirectional consistency + count
    int forward_count = 0;
    for (Record* r = list->head; r != NULL; r = r->next) {
        forward_count++;
        if (r->next != NULL && r->next->prev != r) {
            fprintf(stderr, "INVARIANT VIOLATION: bidirectional at id=%d\n", r->id);
            return false;
        }
    }

    // INV 6: Count matches
    if (forward_count != list->count) {
        fprintf(stderr, "INVARIANT VIOLATION: count mismatch (%d vs %d)\n",
                forward_count, list->count);
        return false;
    }

    return true;
}

Phase 4: Persistence (2-3 hours)

bool list_save(ContactList* list, const char* filename) {
    FILE* f = fopen(filename, "w");
    if (!f) return false;

    fprintf(f, "%d\n", list->count);
    for (Record* r = list->head; r != NULL; r = r->next) {
        fprintf(f, "%d\n%s\n%s\n%s\n%s\n",
                r->id, r->name, r->email, r->phone, r->title);
    }

    fclose(f);
    return true;
}

ContactList* list_load(const char* filename) {
    FILE* f = fopen(filename, "r");
    if (!f) return NULL;

    ContactList* list = list_create();

    int count;
    fscanf(f, "%d\n", &count);

    char name[256], email[256], phone[64], title[128];
    int id;

    for (int i = 0; i < count; i++) {
        fscanf(f, "%d\n", &id);
        fgets(name, sizeof(name), f); name[strcspn(name, "\n")] = 0;
        fgets(email, sizeof(email), f); email[strcspn(email, "\n")] = 0;
        fgets(phone, sizeof(phone), f); phone[strcspn(phone, "\n")] = 0;
        fgets(title, sizeof(title), f); title[strcspn(title, "\n")] = 0;

        Record* r = list_add(list, name, email, phone, title);
        r->id = id;
        if (id >= list->next_id) list->next_id = id + 1;
    }

    fclose(f);
    return list;
}

Common Pitfalls

Pitfall 1: Forgetting Edge Cases in Delete

// WRONG: Only handles middle node
target->prev->next = target->next;  // Crashes if target is head!
target->next->prev = target->prev;  // Crashes if target is tail!

// CORRECT: Handle all cases
if (target == list->head && target == list->tail) {
    list->head = list->tail = NULL;
} else if (target == list->head) {
    list->head = target->next;
    list->head->prev = NULL;
} else if (target == list->tail) {
    list->tail = target->prev;
    list->tail->next = NULL;
} else {
    target->prev->next = target->next;
    target->next->prev = target->prev;
}

Pitfall 2: Iterator Invalidation

// WRONG: Using node after delete
for (Record* r = list->head; r != NULL; r = r->next) {
    if (should_delete(r)) {
        list_delete(list, r->id);  // r is now freed!
        // r->next reads freed memory
    }
}

// CORRECT: Save next pointer
for (Record* r = list->head; r != NULL; ) {
    Record* next = r->next;  // Save before potential delete
    if (should_delete(r)) {
        list_delete(list, r->id);
    }
    r = next;
}

Interview Preparation

Common Questions

  1. โ€œWhat invariants must hold for a doubly-linked list?โ€
    • head/tail NULL consistency
    • head->prev == NULL, tail->next == NULL
    • Bidirectional: node->next->prev == node
    • Count matches traversal
  2. โ€œIterator invalidation during deletion?โ€
    • Save next pointer before deleting
    • Or collect items to delete, then delete
    • Document which operations invalidate iterators
  3. โ€œHow would you serialize a linked list?โ€
    • Write count first
    • Write each record sequentially
    • Rebuild list on load (pointers are runtime)

Self-Assessment Checklist

  • All CRUD operations work correctly
  • Invariant checker catches bugs
  • Save/load roundtrip preserves data
  • No iterator invalidation bugs
  • Valgrind clean

Next Project: P06: HTTP Server (Capstone) - Combine all concepts in a production-grade network server.