Project 7: TCP Chat Server

Build a multi-client chat server with rooms, private messages, and I/O multiplexing to master network programming fundamentals.

Quick Reference

Attribute       Value
Difficulty      Advanced
Time Estimate   2-3 weeks
Language        C++
Prerequisites   Basic networking concepts, file descriptors, C-style APIs
Key Topics      socket(), bind(), listen(), accept(), select()/poll(), message framing

1. Learning Objectives

By completing this project, you will:

  • Master the socket API: Create, bind, listen, accept, and close sockets
  • Understand I/O multiplexing: Use select() or poll() to handle multiple clients in one thread
  • Design application protocols: Create a simple text-based protocol with commands
  • Handle partial reads/writes: Deal with TCP stream semantics (no message boundaries)
  • Implement message framing: Use delimiters or length-prefixing
  • Manage client state: Track nicknames, rooms, and connection buffers
  • Write cross-platform network code: Handle POSIX vs Winsock differences

2. Theoretical Foundation

2.1 Core Concepts

The Socket Lifecycle:

                    SERVER                              CLIENT
                      │                                    │
               ┌──────▼──────┐                             │
               │  socket()   │  Create socket              │
               └──────┬──────┘                             │
                      │                                    │
               ┌──────▼──────┐                             │
               │   bind()    │  Bind to address:port       │
               └──────┬──────┘                             │
                      │                                    │
               ┌──────▼──────┐                             │
               │  listen()   │  Mark as passive socket     │
               └──────┬──────┘                             │
                      │                             ┌──────▼──────┐
                      │                             │  socket()   │
                      │                             └──────┬──────┘
                      │                                    │
                      │◄───────── TCP 3-way ──────────────┤
               ┌──────▼──────┐    handshake        ┌──────▼──────┐
               │  accept()   │◄────────────────────│  connect()  │
               └──────┬──────┘                     └──────┬──────┘
                      │                                   │
                      │         Data exchange             │
               ┌──────▼──────┐                     ┌──────▼──────┐
               │ recv/send() │◄───────────────────►│ recv/send() │
               └──────┬──────┘                     └──────┬──────┘
                      │                                   │
               ┌──────▼──────┐                     ┌──────▼──────┐
               │   close()   │                     │   close()   │
               └─────────────┘                     └─────────────┘

I/O Multiplexing with select():

┌─────────────────────────────────────────────────────────────────────┐
│                    Single-Threaded Server                            │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                      select() call                           │    │
│  │                                                              │    │
│  │   fd_set read_fds:                                          │    │
│  │   ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┐               │    │
│  │   │ fd3 │ fd4 │ fd5 │ fd6 │ fd7 │ ... │ fdN │               │    │
│  │   │  ✓  │     │  ✓  │     │  ✓  │     │     │  ← ready      │    │
│  │   └─────┴─────┴─────┴─────┴─────┴─────┴─────┘               │    │
│  │                                                              │    │
│  │   select() blocks until at least one fd has data            │    │
│  │   Returns: which fds are ready for reading                   │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                      │
│  After select returns:                                               │
│    - If server_fd ready → accept() new connection                   │
│    - If client_fd ready → recv() data from that client              │
│                                                                      │
│  No threads needed! One thread handles all clients.                  │
└─────────────────────────────────────────────────────────────────────┘

TCP Stream Semantics (The Buffering Problem):

Sender sends:                    Receiver might get:

"Hello\n"  ───────────────────►  "Hel"        (partial)
"World\n"                        "lo\nWorld\n" (combined)

OR:

"Hello\n"  ───────────────────►  "Hello\nWorld\n"  (combined)
"World\n"

OR:

"Hello\n"  ───────────────────►  "Hello\n"    (as sent)
"World\n"                        "World\n"

TCP is a STREAM, not a MESSAGE protocol!
You MUST handle partial reads and combine them.
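The reassembly step can be sketched in isolation, without any sockets. The class below is a minimal illustration, not part of the project skeleton: `LineFramer`, `feed()`, and `pending()` are hypothetical names, and each string passed to `feed()` stands in for whatever a single recv() call happened to return.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: accumulates raw bytes and yields complete lines.
class LineFramer {
public:
    // Append one recv()-sized chunk and return every line it completes.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> lines;
        std::size_t pos;
        while ((pos = buffer_.find('\n')) != std::string::npos) {
            lines.push_back(buffer_.substr(0, pos));  // line without the '\n'
            buffer_.erase(0, pos + 1);
        }
        return lines;
    }

    // Bytes received so far that do not yet form a complete line.
    const std::string& pending() const { return buffer_; }

private:
    std::string buffer_;
};

// Usage: replays the first scenario from the diagram above.
void demoLineFramer() {
    LineFramer framer;
    std::vector<std::string> first = framer.feed("Hel");
    assert(first.empty());  // partial data: nothing to process yet
    std::vector<std::string> second = framer.feed("lo\nWorld\n");
    assert(second.size() == 2 && second[0] == "Hello" && second[1] == "World");
}
```

Whatever way the bytes are split across reads, the caller sees the same sequence of lines, which is the property the rest of the server relies on.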

Message Framing Solution:

Option 1: Newline Delimited (simple, text-only)
┌────────────────────────────────────────┐
│ "NICK Alice\n" │ "JOIN general\n" │ ...│
└────────────────────────────────────────┘

Option 2: Length Prefixed (robust, binary-safe)
┌────────────────────────────────────────────────┐
│ [4 bytes: len] │ [len bytes: payload] │ ...    │
│   0x0000000A   │ "NICK Alice"         │        │
└────────────────────────────────────────────────┘
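Option 2 might look like the following sketch. It assumes a 4-byte big-endian length prefix; the function names `encodeFrame` and `tryDecodeFrame` are illustrative and not part of the project specification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Prefix the payload with its length as a 4-byte big-endian integer.
std::string encodeFrame(const std::string& payload) {
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    std::string frame;
    frame.push_back(static_cast<char>((len >> 24) & 0xFF));
    frame.push_back(static_cast<char>((len >> 16) & 0xFF));
    frame.push_back(static_cast<char>((len >> 8) & 0xFF));
    frame.push_back(static_cast<char>(len & 0xFF));
    frame += payload;
    return frame;
}

// Extract one frame from the front of `buffer` if it has fully arrived.
// Returns false (leaving `buffer` untouched) when more bytes are needed.
bool tryDecodeFrame(std::string& buffer, std::string& payload) {
    if (buffer.size() < 4) return false;  // length prefix incomplete
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        len = (len << 8) | static_cast<unsigned char>(buffer[i]);
    }
    if (buffer.size() - 4 < len) return false;  // payload incomplete
    payload = buffer.substr(4, len);
    buffer.erase(0, 4 + static_cast<std::size_t>(len));
    return true;
}

// Usage: two frames arrive back to back in one buffer.
void demoFraming() {
    std::string stream = encodeFrame("NICK Alice") + encodeFrame("JOIN general");
    std::string payload;
    bool ok = tryDecodeFrame(stream, payload);
    assert(ok && payload == "NICK Alice");
    ok = tryDecodeFrame(stream, payload);
    assert(ok && payload == "JOIN general");
}
```

A real server would also reject a declared length above some maximum before waiting for the payload; otherwise a client can announce a huge frame and make the buffer grow without bound.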

2.2 Why This Matters

Chat servers are the foundation of real-time communication:

  • Slack, Discord: Modern chat applications
  • IRC: The original chat protocol (still used!)
  • WebSocket servers: Real-time web applications
  • Game servers: Player communication

Understanding socket programming teaches:

  • How the internet actually works
  • Why HTTP/WebSocket/gRPC exist (they solve these same problems)
  • How to debug network issues
  • Foundation for distributed systems

2.3 Historical Context

Socket programming dates to 4.2BSD (1983). The socket API was designed to be protocol-agnostic: the same functions work for TCP, UDP, Unix domain sockets, and more.

Key milestones:

  • 1983: 4.2BSD introduces sockets
  • 1992: Winsock brings sockets to Windows
  • 2000: FreeBSD adds kqueue (later adopted by Mac OS X)
  • 2001: POSIX.1-2001 standardizes the socket API
  • 2002: Linux adds epoll (scales better than select)

The chat server pattern led to:

  • IRC (1988): Internet Relay Chat
  • XMPP (1999): Jabber/Google Talk
  • WebSocket (2011): Real-time web
  • Modern systems: Slack, Discord, Teams

2.4 Common Misconceptions

Misconception 1: “recv() returns one complete message” Reality: recv() returns whatever bytes are available, which may be partial, combined, or fragmented. You must buffer and parse.

Misconception 2: “send() sends all bytes” Reality: send() may send fewer bytes than requested. Check return value and retry with remaining bytes.

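Misconception 3: “select() tells you how many bytes are available” Reality: select() only tells you that a read will not block: at least 1 byte is available, the peer has closed the connection (recv() returns 0), or an error is pending. You must call recv() to find out which.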

Misconception 4: “Closing a socket is instant” Reality: close() may block if linger is enabled and there’s unsent data. TCP also has TIME_WAIT state.

Misconception 5: “Binding a server port requires root” Reality: Only ports 1-1023 are privileged on Unix. Ports 1024-65535 need no special privileges, so use them for development.
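For Misconception 2, the usual remedy on a blocking socket is a loop that resumes where the previous send() stopped. The helper name `sendAll` is an assumption made for this sketch; it is not a standard API.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: keep calling send() until every byte is written.
// Intended for blocking sockets. Returns false on an unrecoverable error.
bool sendAll(int fd, const std::string& data) {
    assert(fd >= 0);
    std::size_t sent = 0;
    while (sent < data.size()) {
        ssize_t n = send(fd, data.data() + sent, data.size() - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;                  // e.g. ECONNRESET or EPIPE
        }
        sent += static_cast<std::size_t>(n);  // may be less than requested
    }
    return true;
}
```

Note that writing to a socket whose peer has gone away can raise SIGPIPE and terminate the process. Ignoring SIGPIPE at startup, or passing MSG_NOSIGNAL to send() on Linux, turns that into an ordinary error return.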


3. Project Specification

3.1 What You Will Build

A chat server that:

  • Accepts multiple simultaneous client connections
  • Supports nicknames, chat rooms, and private messages
  • Uses select() or poll() for I/O multiplexing
  • Handles partial reads/writes correctly
  • Works with telnet or netcat as a client

3.2 Functional Requirements

Requirement Description
FR-1 Accept connections on a configurable port
FR-2 Support /nick command to set username
FR-3 Support /join command to enter a room
FR-4 Broadcast messages to all users in a room
FR-5 Support /msg for private messages
FR-6 Support /rooms to list available rooms
FR-7 Support /quit to disconnect
FR-8 Handle client disconnection gracefully

3.3 Non-Functional Requirements

Requirement Description
NFR-1 Handle at least 100 concurrent connections
NFR-2 No blocking operations in main loop
NFR-3 Graceful handling of malformed input
NFR-4 No memory leaks on client disconnect
NFR-5 Cross-platform (Linux, macOS, optionally Windows)

3.4 Example Usage / Output

Terminal 1 - Server:

$ ./chat_server 8080
Chat server listening on port 8080...
[12:00:01] Client connected from 127.0.0.1:54321
[12:00:03] Client connected from 127.0.0.1:54322
[12:00:05] Anonymous1 set nickname to Alice
[12:00:07] Anonymous2 set nickname to Bob
[12:00:10] Alice joined room: general
[12:00:12] Bob joined room: general
[12:00:15] [general] Alice: Hello everyone!
[12:00:18] [general] Bob: Hi Alice!
[12:00:45] Bob disconnected

Terminal 2 - Client (Alice):

$ nc localhost 8080
Welcome to the chat server!
Use /nick <name> to set your nickname.
Use /join <room> to join a room.
Use /msg <user> <message> for private messages.
Use /rooms to list rooms.
Use /quit to disconnect.

/nick Alice
Your nickname is now: Alice

/join general
Joined room: general

Hello everyone!
[Bob]: Hi Alice!

/msg Bob Want to chat privately?
Private message sent to Bob.

[Private from Bob]: Sure!

/quit
Goodbye!

Terminal 3 - Client (Bob):

$ nc localhost 8080
Welcome to the chat server!
...

/nick Bob
Your nickname is now: Bob

/join general
Joined room: general
Alice is in this room.

[Alice]: Hello everyone!
Hi Alice!

[Private from Alice]: Want to chat privately?
/msg Alice Sure!

3.5 Real World Outcome

$ ./chat_server 8080 &
$ ./stress_test --clients 100 --messages 1000
Stress Test Results:
  Clients connected: 100/100
  Messages sent: 100,000
  Messages received: 100,000
  Average latency: 0.3ms
  Max latency: 12ms
  Dropped connections: 0
  Memory usage: 8.2 MB (stable)

$ valgrind ./chat_server 8080
  # Run test clients
  # All clients disconnect
==12345== LEAK SUMMARY:
==12345==    definitely lost: 0 bytes
==12345==    indirectly lost: 0 bytes

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────────┐
│                         Chat Server                                  │
│                                                                      │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                    Server Class                                │  │
│  │                                                                │  │
│  │  int server_fd_                      ← Listening socket        │  │
│  │  int port_                                                     │  │
│  │  std::map<int, Client> clients_      ← fd -> client info       │  │
│  │  std::map<std::string, Room> rooms_  ← room name -> room       │  │
│  │  bool running_                                                 │  │
│  │                                                                │  │
│  │  void run()                          ← Main event loop         │  │
│  │  void acceptConnection()                                       │  │
│  │  bool handleClientData(int fd)                                 │  │
│  │  void broadcastToRoom(room, message)                           │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                      │
│  ┌─────────────────────┐  ┌─────────────────────┐                   │
│  │    Client Struct    │  │    Room Struct      │                   │
│  │                     │  │                     │                   │
│  │  int fd             │  │  std::string name   │                   │
│  │  std::string nick   │  │  std::set<int> fds  │                   │
│  │  std::string room   │  │                     │                   │
│  │  std::string buffer │  │                     │                   │
│  └─────────────────────┘  └─────────────────────┘                   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

4.2 Key Components

Component Responsibility
Server Main loop, socket management, command routing
Client Per-connection state (nickname, room, read buffer)
Room Group of clients, message broadcasting
Command Parser Parse /nick, /join, /msg, etc.
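As one possible shape for the Command Parser component, the sketch below splits a received line into a command name and its arguments. `ParsedCommand`, `parseLine`, and `splitMsgArgs` are hypothetical names; treating a line without a leading slash as room chat is an assumption based on the example session in 3.4.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: a command name plus the rest of the line.
struct ParsedCommand {
    std::string name;  // "nick", "join", "msg", ...; empty for plain chat text
    std::string args;  // everything after the first space (may be empty)
};

ParsedCommand parseLine(const std::string& line) {
    ParsedCommand cmd;
    if (line.empty() || line[0] != '/') {
        cmd.args = line;  // not a command: a message for the current room
        return cmd;
    }
    const std::size_t space = line.find(' ');
    if (space == std::string::npos) {
        cmd.name = line.substr(1);             // e.g. "/quit"
    } else {
        cmd.name = line.substr(1, space - 1);  // text between '/' and ' '
        cmd.args = line.substr(space + 1);
    }
    return cmd;
}

// Split "/msg" arguments into a recipient and the message text.
// Returns false when either part is missing.
bool splitMsgArgs(const std::string& args, std::string& user, std::string& text) {
    const std::size_t space = args.find(' ');
    if (space == std::string::npos || space == 0 || space + 1 >= args.size()) {
        return false;
    }
    user = args.substr(0, space);
    text = args.substr(space + 1);
    return true;
}

// Usage: the server would dispatch on cmd.name after parsing.
void demoParse() {
    ParsedCommand cmd = parseLine("/join general");
    assert(cmd.name == "join" && cmd.args == "general");
}
```

Keeping parsing free of socket calls makes it easy to unit test malformed input (NFR-3) without starting the server.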

4.3 Data Structures

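#include <ctime>         // time_t
#include <map>
#include <set>
#include <string>
#include <netinet/in.h>  // sockaddr_in

struct Client {
    int fd;                    // Socket returned by accept()
    std::string nickname;
    std::string current_room;  // Empty until the client joins a room
    std::string read_buffer;   // Accumulated partial reads
    std::string write_buffer;  // Pending writes (if using non-blocking)
    sockaddr_in address;       // Peer address, for logging
    time_t connected_at;
};

struct Room {
    std::string name;
    std::set<int> member_fds;  // fds of clients currently in the room
    time_t created_at;
};

class Server {
    int server_fd_;                      // Listening socket
    int port_;
    std::map<int, Client> clients_;      // fd -> client state
    std::map<std::string, Room> rooms_;  // room name -> room
    bool running_;
};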
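To show how the two maps stay in sync, here is a rough sketch of a room switch. The structs are trimmed to the fields used, `joinRoom` is a hypothetical free function, and deleting a room once it is empty is only one possible policy (section 5.5 leaves that decision open).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Simplified stand-ins for the structs above (only the fields used here).
struct Client {
    std::string nickname;
    std::string current_room;
};
struct Room {
    std::string name;
    std::set<int> member_fds;
};

// Move client `fd` into `room_name`, creating the room on first use and
// deleting the previous room if it becomes empty (an assumed policy).
void joinRoom(std::map<int, Client>& clients,
              std::map<std::string, Room>& rooms,
              int fd, const std::string& room_name) {
    std::map<int, Client>::iterator it = clients.find(fd);
    assert(it != clients.end());  // caller must pass a connected client
    Client& client = it->second;

    if (!client.current_room.empty()) {
        std::map<std::string, Room>::iterator old = rooms.find(client.current_room);
        if (old != rooms.end()) {
            old->second.member_fds.erase(fd);
            if (old->second.member_fds.empty()) rooms.erase(old);
        }
    }

    Room& room = rooms[room_name];  // operator[] creates the room if missing
    room.name = room_name;
    room.member_fds.insert(fd);
    client.current_room = room_name;
}
```

The same cleanup of `member_fds` has to happen when a client disconnects; otherwise a later broadcast would try to send to a closed (or reused) descriptor.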

4.4 Algorithm Overview

Main Event Loop:

while running:
    // Build fd_set for select
    FD_ZERO(&read_fds)
    FD_SET(server_fd, &read_fds)
    max_fd = server_fd

    for each client fd in clients:
        FD_SET(fd, &read_fds)
        max_fd = max(max_fd, fd)

    // Wait for activity (blocks until something happens)
    select(max_fd + 1, &read_fds, NULL, NULL, NULL)

    // Check for new connections
    if FD_ISSET(server_fd, &read_fds):
        acceptConnection()

    // Check each client for data
    for each client fd in clients:
        if FD_ISSET(fd, &read_fds):
            handleClientData(fd)
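The same wait step can be written with poll(), which takes an array of descriptors instead of an fd_set and so has no FD_SETSIZE ceiling. The following is a sketch only: `waitReadable` and `demoWaitReadable` are hypothetical names, and a socket pair stands in for a real client connection.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Wait until at least one of `fds` is readable (or has hung up / errored)
// and return those descriptors. timeout_ms = -1 blocks indefinitely.
std::vector<int> waitReadable(const std::vector<int>& fds, int timeout_ms) {
    assert(timeout_ms >= -1);
    std::vector<pollfd> pfds;
    pfds.reserve(fds.size());
    for (int fd : fds) {
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;  // interested in "recv()/accept() would not block"
        pfds.push_back(p);
    }

    std::vector<int> ready;
    int n = poll(pfds.data(), static_cast<nfds_t>(pfds.size()), timeout_ms);
    if (n <= 0) return ready;  // 0 = timeout, -1 = error (check errno)

    for (const pollfd& p : pfds) {
        if (p.revents & (POLLIN | POLLHUP | POLLERR)) ready.push_back(p.fd);
    }
    return ready;
}

// Usage sketch: returns true if only the receiving end is reported ready.
bool demoWaitReadable() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;
    bool idle = waitReadable({sv[0], sv[1]}, 0).empty();  // nothing sent yet
    bool wrote = (send(sv[0], "hi\n", 3, 0) == 3);
    std::vector<int> ready = waitReadable({sv[0], sv[1]}, 1000);
    bool ok = idle && wrote && ready.size() == 1 && ready[0] == sv[1];
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

In the server, the listening socket would go into the same list; when it is reported ready, call accept() instead of recv().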

Handle Client Data:

bytes = recv(fd, buffer, sizeof(buffer), 0)

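if bytes <= 0:
    // bytes == 0: client disconnected; bytes < 0: error
    removeClient(fd)
    return

// Append exactly `bytes` bytes to the client's read buffer
// (the recv buffer is not NUL-terminated)
client.read_buffer.append(buffer, bytes)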

// Process complete lines
while (pos = client.read_buffer.find('\n')) != npos:
    line = client.read_buffer.substr(0, pos)
    client.read_buffer.erase(0, pos + 1)
    processCommand(fd, line)

5. Implementation Guide

5.1 Development Environment Setup

Required headers (POSIX):

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <sys/select.h>  // or <poll.h>

For Windows (Winsock):

#include <winsock2.h>
#include <ws2tcpip.h>
#pragma comment(lib, "ws2_32.lib")
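The main POSIX/Winsock differences (socket handle type, startup call, close function) can be hidden behind a thin shim. This is a minimal sketch; `socket_t`, `kInvalidSocket`, `initSockets`, `shutdownSockets`, and `closeSocket` are names invented here, and only the POSIX branch has been exercised.

```cpp
#include <cassert>

#ifdef _WIN32
  #include <winsock2.h>
  #include <ws2tcpip.h>
  using socket_t = SOCKET;
  const socket_t kInvalidSocket = INVALID_SOCKET;
  // Winsock must be initialised once per process before any socket call.
  inline bool initSockets() {
      WSADATA data;
      return WSAStartup(MAKEWORD(2, 2), &data) == 0;
  }
  inline void shutdownSockets() { WSACleanup(); }
  inline int closeSocket(socket_t s) { return closesocket(s); }
#else
  #include <sys/socket.h>
  #include <unistd.h>
  using socket_t = int;
  const socket_t kInvalidSocket = -1;
  inline bool initSockets() { return true; }  // nothing to initialise on POSIX
  inline void shutdownSockets() {}
  inline int closeSocket(socket_t s) { return close(s); }
#endif

// Usage: open and close one socket through the portable wrappers.
inline void demoPortableSocket() {
    bool started = initSockets();
    assert(started);
    socket_t s = socket(AF_INET, SOCK_STREAM, 0);
    assert(s != kInvalidSocket);
    closeSocket(s);
    shutdownSockets();
}
```

Error reporting differs as well (errno on POSIX, WSAGetLastError() on Windows), so a fuller shim would wrap that too.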

Compile:

# Linux/macOS
g++ -std=c++17 -Wall -Wextra -o chat_server chat_server.cpp

# With debugging
g++ -std=c++17 -g -fsanitize=address -o chat_server chat_server.cpp

5.2 Project Structure

chat_server/
├── include/
│   ├── server.hpp
│   ├── client.hpp
│   └── room.hpp
├── src/
│   ├── server.cpp
│   ├── main.cpp
│   └── commands.cpp
├── tests/
│   ├── test_server.cpp
│   └── stress_test.cpp
├── CMakeLists.txt
└── README.md

5.3 The Core Question You’re Answering

“How do I handle multiple simultaneous network connections without creating a thread per connection?”

This is fundamental to building scalable network servers. Thread-per-connection scales poorly: every thread needs its own stack, and context-switching overhead grows with the number of connections. I/O multiplexing allows one thread to handle thousands of connections.

5.4 Concepts You Must Understand First

  1. What is a file descriptor?
    • Integer handle to kernel resource (file, socket, pipe)
    • Operations: read(), write(), close()
  2. What does bind() do vs listen()?
    • bind(): Associate socket with address:port
    • listen(): Mark socket as passive (accepting connections)
  3. Why is accept() blocking by default?
    • Waits until client connects
    • Returns new socket fd for that specific client
  4. What is fd_set and how does select() use it?
    • Bitmask of file descriptors
    • select() modifies it to show which are ready
  5. Why can’t you just call recv() in a loop?
    • It blocks if no data available
    • Would freeze handling other clients
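Concepts 4 and 5 can be tried out on a single descriptor before writing the full loop. The helper below is a sketch with an assumed name (`isReadable`); it shows that select() rewrites the fd_set in place and that a zero timeout turns it into a non-blocking readiness check.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Returns true if a recv() on `fd` would not block within `timeout_ms`.
bool isReadable(int fd, int timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE);  // select() cannot track larger fds
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(fd, &read_fds);  // input: the fds we are interested in

    timeval tv{};
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(fd + 1, &read_fds, nullptr, nullptr, &tv);
    // Output: read_fds now contains only the fds that are ready.
    return ready > 0 && FD_ISSET(fd, &read_fds);
}

// Usage sketch with a connected socket pair standing in for a client.
bool demoIsReadable() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;
    bool before = isReadable(sv[1], 0);     // false: nothing sent yet
    bool sent = (send(sv[0], "x", 1, 0) == 1);
    bool after = isReadable(sv[1], 1000);   // true: one byte is waiting
    close(sv[0]);
    close(sv[1]);
    return !before && sent && after;
}
```

Because the set is overwritten on return, the server has to rebuild it before every select() call.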

5.5 Questions to Guide Your Design

Architecture:

  • Single-threaded with select/poll, or multi-threaded?
  • How to handle slow clients (ones that receive slowly)?
  • Should rooms persist after all users leave?

Protocol Design:

  • What delimiter to use? Newline? NUL byte? Length prefix?
  • How to handle messages with newlines in them?
  • Maximum message length?

Error Handling:

  • What if client sends invalid command?
  • What if send() can’t send all bytes?
  • What if client buffer grows too large (DoS)?

5.6 Thinking Exercise

Trace through this scenario:

T=0:   Server starts, listening on fd=3
T=1:   Client A connects, accept() returns fd=4
T=2:   Client B connects, accept() returns fd=5
T=3:   Client A sends: "/nick Ali" (partial, no newline yet)
T=4:   Client B sends: "/nick Bob\n"
T=5:   Client A sends: "ce\n" (completing the nickname)
T=6:   Client A sends: "/join lobby\n/msg Bob hi\n" (two commands in one recv)

Questions:

  1. After T=3, what’s in Client A’s buffer? Can we process anything yet?
  2. At T=4, how do we know Bob’s message is complete?
  3. At T=6, how many commands do we process?
  4. What if the server fd and client fds are all ready at once?

5.7 Hints in Layers

Hint 1 - Starting Point (Conceptual): Start with just accepting connections and echoing back whatever clients send. Get that working first, then add command parsing, then rooms.

Hint 2 - Next Level (More Specific): The select() loop should look like:

while (running_) {
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(server_fd_, &read_fds);
    int max_fd = server_fd_;

    for (const auto& [fd, client] : clients_) {
        FD_SET(fd, &read_fds);
        if (fd > max_fd) max_fd = fd;
    }

    int ready = select(max_fd + 1, &read_fds, nullptr, nullptr, nullptr);
    if (ready < 0) {
        if (errno == EINTR) continue;  // Interrupted by a signal: retry (needs <cerrno>)
        perror("select");
        break;
    }

    // Check server socket
    if (FD_ISSET(server_fd_, &read_fds)) {
        acceptConnection();
    }

    // Check client sockets
    for (auto it = clients_.begin(); it != clients_.end(); ) {
        int fd = it->first;
        if (FD_ISSET(fd, &read_fds)) {
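            if (!handleClientData(fd)) {
                // Client disconnected: release the socket and drop its state.
                // (Also remove fd from its room's member set here.)
                close(fd);
                it = clients_.erase(it);
                continue;
            }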
        }
        ++it;
    }
}

Hint 3 - Technical Details (Socket Setup):

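// Requires <stdexcept> in addition to the POSIX headers from 5.1
int createServerSocket(int port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        throw std::runtime_error("socket() failed");
    }

    // Allow reuse of address (important for quick restart)
    int opt = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) < 0) {
        close(fd);  // Don't leak the descriptor on failure
        throw std::runtime_error("setsockopt(SO_REUSEADDR) failed");
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);            // Listen on all interfaces
    addr.sin_port = htons(static_cast<uint16_t>(port));  // Network byte order

    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(fd);
        throw std::runtime_error("bind() failed");
    }

    if (listen(fd, SOMAXCONN) < 0) {
        close(fd);
        throw std::runtime_error("listen() failed");
    }

    return fd;
}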
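The listening socket pairs with an accept step that also captures the peer address, which is what the "Client connected from 127.0.0.1:54321" log lines in 3.4 need. The sketch below assumes IPv4 only; `acceptClient` and `formatPeer` are hypothetical helper names.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>

// Accept one pending connection and fill in the peer's address.
// Returns the new client fd, or -1 on error (check errno).
int acceptClient(int server_fd, sockaddr_in& peer) {
    socklen_t len = sizeof(peer);
    return accept(server_fd, reinterpret_cast<sockaddr*>(&peer), &len);
}

// Render an IPv4 peer address as "a.b.c.d:port" for log lines.
std::string formatPeer(const sockaddr_in& addr) {
    assert(addr.sin_family == AF_INET);  // this sketch handles IPv4 only
    char ip[INET_ADDRSTRLEN] = {0};
    if (inet_ntop(AF_INET, &addr.sin_addr, ip, sizeof(ip)) == nullptr) {
        return "unknown";
    }
    // sin_port is stored in network byte order; convert before printing.
    return std::string(ip) + ":" + std::to_string(ntohs(addr.sin_port));
}
```

Forgetting ntohs() here is a common slip: on little-endian machines the logged port then comes out byte-swapped.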

Hint 4 - Message Buffering:

// Member of Server; returns false when the client should be dropped
bool Server::handleClientData(int fd) {
    char buf[1024];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);

    if (n <= 0) {
        // n == 0: client closed connection
        // n < 0: error (check errno)
        return false;
    }

    // Append exactly n bytes; buf is not NUL-terminated
    std::string& buffer = clients_[fd].read_buffer;
    buffer.append(buf, static_cast<size_t>(n));

    // Process complete lines.
    // processCommand() must not erase clients_[fd], or `buffer` would dangle.
    size_t pos;
    while ((pos = buffer.find('\n')) != std::string::npos) {
        std::string line = buffer.substr(0, pos);
        buffer.erase(0, pos + 1);

        // Remove \r if present (Windows telnet)
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }

        processCommand(fd, line);
    }

    // Whatever remains is an incomplete line; cap its size (DoS protection)
    if (buffer.size() > 4096) {
        sendToClient(fd, "ERROR: Message too long\n");
        buffer.clear();
    }

    return true;
}

5.8 The Interview Questions They’ll Ask

  1. “What’s the difference between select(), poll(), and epoll()?”
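    • select: Limited to FD_SETSIZE (often 1024); the fd_set must be rebuilt every call and scanned in O(n)
    • poll: No FD_SETSIZE limit, but every call still passes and scans all fds, so O(n)
    • epoll: Kernel maintains the interest list and returns only ready fds, so cost scales with activity rather than total connections (Linux-only)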
  2. “Why use SO_REUSEADDR?”
    • Allows binding to a port in TIME_WAIT state
    • Essential for quick server restarts during development
  3. “What happens if send() returns less than requested?”
    • Not all bytes were sent. You must retry with remaining bytes.
    • Consider using non-blocking sockets and tracking write buffers.
  4. “How would you handle 10,000 concurrent connections?”
    • Use epoll (Linux) or kqueue (macOS/BSD)
    • Consider multiple threads with work distribution
    • May need to increase system file descriptor limits
  5. “How do you prevent a slow client from blocking others?”
    • Use non-blocking I/O
    • Track write buffers per client
    • Disconnect clients with huge pending writes
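The last answer can be sketched as follows, assuming each client owns a pending-write string like `write_buffer` from section 4.3. `setNonBlocking` and `flushPending` are hypothetical helper names, and the policy for when to disconnect a slow client is left to the caller.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Switch a descriptor to non-blocking mode. Returns false on failure.
bool setNonBlocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Send as much of `pending` as the kernel accepts right now and remove
// the sent bytes from it. Returns false only on a fatal connection error.
bool flushPending(int fd, std::string& pending) {
    assert(fd >= 0);
    while (!pending.empty()) {
        ssize_t n = send(fd, pending.data(), pending.size(), 0);
        if (n > 0) {
            pending.erase(0, static_cast<std::size_t>(n));
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return true;   // socket buffer full: retry once the fd is writable
        } else if (n < 0 && errno == EINTR) {
            continue;      // interrupted by a signal: retry immediately
        } else {
            return false;  // connection error
        }
    }
    return true;
}
```

A server using this would add the fd to select()'s write set only while `pending` is non-empty, and could drop the client once `pending.size()` exceeds a limit of its choosing.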

5.9 Books That Will Help

Topic                 Book & Chapter
Socket API            “TCP/IP Sockets in C” Ch. 1-4 - Donahoo & Calvert
I/O Multiplexing      “The Linux Programming Interface” Ch. 63 - Kerrisk
TCP/IP Protocol       “TCP/IP Illustrated, Vol. 1” - Stevens
Network byte order    “Unix Network Programming” Ch. 3 - Stevens
High-performance I/O  “Linux System Programming” Ch. 4 - Robert Love

5.10 Implementation Phases

Phase 1: Basic Server (Day 1-3)

  • Create listening socket
  • Accept connections
  • Echo received data back
  • Handle disconnect

Phase 2: Multiple Clients (Day 4-6)

  • Implement select() loop
  • Handle multiple clients simultaneously
  • Add message buffering for partial reads
  • Test with multiple telnet sessions

Phase 3: Commands (Day 7-10)

  • Implement /nick command
  • Implement /join and room management
  • Implement message broadcasting to room
  • Implement /rooms listing

Phase 4: Polish (Day 11-14)

  • Add /msg private messaging
  • Add /quit command
  • Handle edge cases (no room, duplicate nicks)
  • Write stress tests

5.11 Key Implementation Decisions

Decision           Recommended Choice                            Reasoning
I/O Model          select() for learning, poll() for production  select() is simpler; poll() has no FD_SETSIZE limit
Message delimiter  Newline (\n)                                  Works with telnet/nc, simple parsing
Blocking mode      Blocking with select                          Simpler than non-blocking
Thread model       Single-threaded                               Sufficient for learning, avoids race conditions

6. Testing Strategy

Manual Testing:

# Terminal 1: Server
./chat_server 8080

# Terminal 2: Client 1
nc localhost 8080
/nick Alice
/join general
Hello!

# Terminal 3: Client 2
nc localhost 8080
/nick Bob
/join general
# Should see "Hello!" from Alice

Automated Testing:

TEST(ChatServer, AcceptsConnection) {
    Server server(8080);
    std::thread t([&]{ server.run(); });

    int client = connectTo("localhost", 8080);
    ASSERT_GE(client, 0);

    // Read welcome message
    std::string msg = readLine(client);
    EXPECT_TRUE(msg.find("Welcome") != std::string::npos);

    close(client);
    server.stop();
    t.join();
}
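The test above relies on two helpers, `connectTo` and `readLine`, that the guide does not define. One plausible way to write them is sketched below; the signatures are inferred from how the test calls them and are otherwise an assumption.

```cpp
#include <cassert>
#include <netdb.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Connect to host:port over TCP. Returns a socket fd, or -1 on failure.
int connectTo(const std::string& host, int port) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // try IPv6 and IPv4 results in turn
    hints.ai_socktype = SOCK_STREAM;
    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), std::to_string(port).c_str(),
                    &hints, &results) != 0) {
        return -1;
    }
    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;  // connected
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    return fd;
}

// Read one '\n'-terminated line (terminator stripped). Reads byte by byte:
// slow, but it never consumes data that belongs to the next line.
std::string readLine(int fd) {
    assert(fd >= 0);
    std::string line;
    char c;
    while (recv(fd, &c, 1, 0) == 1) {
        if (c == '\n') break;
        line.push_back(c);
    }
    return line;
}
```

Since the event loop in Hint 2 blocks in select() with no timeout, `server.stop()` in the test also needs some way to wake the loop, for example a select() timeout or a self-pipe; otherwise `t.join()` may never return.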

7. Common Pitfalls & Debugging

Problem                   Symptom                      Root Cause                              Fix
“Address already in use”  Can’t restart server         Port in TIME_WAIT                       Use SO_REUSEADDR
Messages combined         Two commands treated as one  Not buffering properly                  Buffer and find delimiters
Server hangs              Stops responding             Blocking recv() on wrong fd             Use select() correctly
Memory leak               Growing memory               Not removing disconnected clients       Clean up on disconnect
Garbled data              Random characters            Treating recv() data as NUL-terminated  Append exactly n bytes to the buffer

Debug with netstat:

# See listening sockets
netstat -an | grep LISTEN

# See established connections
netstat -an | grep ESTABLISHED

# See TIME_WAIT (why you need SO_REUSEADDR)
netstat -an | grep TIME_WAIT

8. Extensions & Challenges

  1. TLS Support: Add encryption with OpenSSL
  2. WebSocket: Implement WebSocket protocol for browser clients
  3. Persistence: Save chat history to database
  4. Authentication: Add username/password login
  5. Rate Limiting: Prevent spam/DoS
  6. File Transfer: Allow users to share files

9. Real-World Connections

  • IRC: Uses very similar architecture
  • Discord: Modern chat with similar concepts
  • Nginx: Uses epoll for high concurrency
  • Redis: Single-threaded event loop (epoll on Linux), handles tens of thousands of connections

10. Resources


11. Self-Assessment Checklist

  • Server accepts connections on specified port
  • Multiple clients can connect simultaneously
  • /nick command works
  • /join command works
  • Messages broadcast to room members
  • /msg private messages work
  • Clean disconnect handling
  • No memory leaks (Valgrind clean)
  • Handles partial reads correctly
  • Works with nc/telnet as client

12. Submission / Completion Criteria

Your implementation is complete when:

  1. Core features work: nick, join, rooms, msg, quit
  2. Handles 100+ simultaneous connections (NFR-1): Stress tested
  3. No resource leaks: Valgrind clean
  4. Handles edge cases: Partial reads, rapid disconnect
  5. Can explain: select() loop and message buffering in interview