Project 7: TCP Chat Server
Build a multi-client chat server with rooms, private messages, and I/O multiplexing to master network programming fundamentals.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 2-3 weeks |
| Language | C++ |
| Prerequisites | Basic networking concepts, file descriptors, C-style APIs |
| Key Topics | socket(), bind(), listen(), accept(), select()/poll(), message framing |
1. Learning Objectives
By completing this project, you will:
- Master the socket API: Create, bind, listen, accept, and close sockets
- Understand I/O multiplexing: Use select() or poll() to handle multiple clients in one thread
- Design application protocols: Create a simple text-based protocol with commands
- Handle partial reads/writes: Deal with TCP stream semantics (no message boundaries)
- Implement message framing: Use delimiters or length-prefixing
- Manage client state: Track nicknames, rooms, and connection buffers
- Write cross-platform network code: Handle POSIX vs Winsock differences
2. Theoretical Foundation
2.1 Core Concepts
The Socket Lifecycle:
SERVER CLIENT
│ │
┌──────▼──────┐ │
│ socket() │ Create socket │
└──────┬──────┘ │
│ │
┌──────▼──────┐ │
│ bind() │ Bind to address:port │
└──────┬──────┘ │
│ │
┌──────▼──────┐ │
│ listen() │ Mark as passive socket │
└──────┬──────┘ │
│ ┌──────▼──────┐
│ │ socket() │
│ └──────┬──────┘
│ │
│◄───────── TCP 3-way ──────────────┤
┌──────▼──────┐ handshake ┌──────▼──────┐
│ accept() │◄────────────────────│ connect() │
└──────┬──────┘ └──────┬──────┘
│ │
│ Data exchange │
┌──────▼──────┐ ┌──────▼──────┐
│ recv/send() │◄───────────────────►│ recv/send() │
└──────┬──────┘ └──────┬──────┘
│ │
┌──────▼──────┐ ┌──────▼──────┐
│ close() │ │ close() │
└─────────────┘ └─────────────┘
I/O Multiplexing with select():
┌─────────────────────────────────────────────────────────────────────┐
│ Single-Threaded Server │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ select() call │ │
│ │ │ │
│ │ fd_set read_fds: │ │
│ │ ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┐ │ │
│ │ │ fd3 │ fd4 │ fd5 │ fd6 │ fd7 │ ... │ fdN │ │ │
│ │ │ ✓ │ │ ✓ │ │ ✓ │ │ │ ← ready │ │
│ │ └─────┴─────┴─────┴─────┴─────┴─────┴─────┘ │ │
│ │ │ │
│ │ select() blocks until at least one fd has data │ │
│ │ Returns: which fds are ready for reading │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ After select returns: │
│ - If server_fd ready → accept() new connection │
│ - If client_fd ready → recv() data from that client │
│ │
│ No threads needed! One thread handles all clients. │
└─────────────────────────────────────────────────────────────────────┘
TCP Stream Semantics (The Buffering Problem):
Sender sends: Receiver might get:
"Hello\n" ───────────────────► "Hel" (partial)
"World\n" "lo\nWorld\n" (combined)
OR:
"Hello\n" ───────────────────► "Hello\nWorld\n" (combined)
"World\n"
OR:
"Hello\n" ───────────────────► "Hello\n" (as sent)
"World\n" "World\n"
TCP is a STREAM, not a MESSAGE protocol!
You MUST handle partial reads and combine them.
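The buffering rule above can be sketched as a small accumulator that is fed raw chunks exactly as recv() would deliver them. `LineBuffer` and its method names are hypothetical, chosen only for this illustration; it assumes newline-delimited messages and does not yet cap the buffer size.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: accumulates raw chunks and yields complete lines.
class LineBuffer {
public:
    // Append n bytes and return every line they complete (without the '\n').
    std::vector<std::string> feed(const char* data, std::size_t n) {
        assert(data != nullptr || n == 0);
        pending_.append(data, n);  // append exactly n bytes; data is not NUL-terminated
        std::vector<std::string> lines;
        std::size_t pos;
        while ((pos = pending_.find('\n')) != std::string::npos) {
            lines.push_back(pending_.substr(0, pos));
            pending_.erase(0, pos + 1);
        }
        return lines;
    }

    // Bytes received so far that do not yet end in '\n'.
    const std::string& pending() const { return pending_; }

private:
    std::string pending_;
};
```

Feeding "Hel" yields no lines and leaves "Hel" pending; feeding "lo\nWorld\n" afterwards yields "Hello" and "World" in one call.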
Message Framing Solution:
Option 1: Newline Delimited (simple, text-only)
┌────────────────────────────────────────┐
│ "NICK Alice\n" │ "JOIN general\n" │ ...│
└────────────────────────────────────────┘
Option 2: Length Prefixed (robust, binary-safe)
┌────────────────────────────────────────────────┐
│ [4 bytes: len] │ [len bytes: payload] │ ... │
│ 0x0000000A │ "NICK Alice" │ │
└────────────────────────────────────────────────┘
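As a rough sketch of Option 2, the pair of functions below writes and reads a 4-byte big-endian length prefix. The names `encodeFrame`, `decodeFrame`, `DecodeResult`, and the `kMaxFrameLen` cap are assumptions made for this example, not part of the project specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed cap for this sketch; choose a limit that suits your protocol.
constexpr std::uint32_t kMaxFrameLen = 64 * 1024;

// Prefix the payload with its length as a 4-byte big-endian integer.
std::string encodeFrame(const std::string& payload) {
    assert(payload.size() <= kMaxFrameLen);
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    std::string frame;
    frame.push_back(static_cast<char>((len >> 24) & 0xFF));
    frame.push_back(static_cast<char>((len >> 16) & 0xFF));
    frame.push_back(static_cast<char>((len >> 8) & 0xFF));
    frame.push_back(static_cast<char>(len & 0xFF));
    frame += payload;
    return frame;
}

enum class DecodeResult { NeedMoreData, Frame, TooLarge };

// Extract one frame from the front of `buffer` if it has fully arrived.
DecodeResult decodeFrame(std::string& buffer, std::string& payload) {
    if (buffer.size() < 4) return DecodeResult::NeedMoreData;
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        len = (len << 8) | static_cast<unsigned char>(buffer[i]);
    }
    if (len > kMaxFrameLen) return DecodeResult::TooLarge;  // reject before buffering more
    const std::size_t total = 4 + static_cast<std::size_t>(len);
    if (buffer.size() < total) return DecodeResult::NeedMoreData;
    payload = buffer.substr(4, len);
    buffer.erase(0, total);
    return DecodeResult::Frame;
}
```

Because the length is known up front, the payload may contain newlines or arbitrary binary data, which the delimiter approach cannot carry without escaping.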
2.2 Why This Matters
Chat servers are the foundation of real-time communication:
- Slack, Discord: Modern chat applications
- IRC: The original chat protocol (still used!)
- WebSocket servers: Real-time web applications
- Game servers: Player communication
Understanding socket programming teaches:
- How the internet actually works
- Why HTTP/WebSocket/gRPC exist (they solve these same problems)
- How to debug network issues
- Foundation for distributed systems
2.3 Historical Context
The socket API dates to 4.2BSD (1983). It was designed to be protocol-agnostic: the same functions work for TCP, UDP, Unix domain sockets, and more.
Key milestones:
- 1983: 4.2BSD introduces sockets
- 1992: Winsock brings sockets to Windows
- 2000: FreeBSD 4.1 adds kqueue (later inherited by Mac OS X)
- 2001: POSIX.1-2001 standardizes the socket API
- 2002: Linux adds epoll (scales better than select for large fd sets)
The chat server pattern led to:
- IRC (1988): Internet Relay Chat
- XMPP (1999): Jabber/Google Talk
- WebSocket (2011): Real-time web
- Modern systems: Slack, Discord, Teams
2.4 Common Misconceptions
Misconception 1: “recv() returns one complete message” Reality: recv() returns whatever bytes are available, which may be partial, combined, or fragmented. You must buffer and parse.
Misconception 2: “send() sends all bytes” Reality: send() may send fewer bytes than requested. Check return value and retry with remaining bytes.
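Misconception 3: “select() tells you how many bytes are available” Reality: select() only reports that a read will not block: at least one byte is waiting, the peer has closed the connection, or (on a listening socket) a connection is pending. Call recv() to find out how much data there is.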
Misconception 4: “Closing a socket is instant” Reality: close() may block if linger is enabled and there’s unsent data. TCP also has TIME_WAIT state.
Misconception 5: “Binding a server port always requires root” Reality: Only ports 1-1023 are privileged on Unix. Any user can bind ports 1024-65535, so pick one of those for development.
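Misconception 2 is the one that most often bites in practice. A minimal sketch of the retry loop follows, assuming a blocking socket; `sendAll` is a hypothetical helper name. Writing to a peer that has already closed can raise SIGPIPE, which this sketch does not handle.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling send() until all `len` bytes are written.
// Returns true on success, false on an unrecoverable error.
bool sendAll(int fd, const char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::size_t sent = 0;
    while (sent < len) {
        const ssize_t n = send(fd, data + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;
        }
        if (n == 0) return false;  // not expected for len > 0; bail out rather than spin
        sent += static_cast<std::size_t>(n);  // short write: loop sends the remainder
    }
    return true;
}
```

On a blocking socket this loop can stall the whole server if one client stops reading, which is why section 5.8 discusses per-client write buffers.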
3. Project Specification
3.1 What You Will Build
A chat server that:
- Accepts multiple simultaneous client connections
- Supports nicknames, chat rooms, and private messages
- Uses select() or poll() for I/O multiplexing
- Handles partial reads/writes correctly
- Works with telnet or netcat as a client
3.2 Functional Requirements
| Requirement | Description |
|---|---|
| FR-1 | Accept connections on a configurable port |
| FR-2 | Support /nick command to set username |
| FR-3 | Support /join command to enter a room |
| FR-4 | Broadcast messages to all users in a room |
| FR-5 | Support /msg for private messages |
| FR-6 | Support /rooms to list available rooms |
| FR-7 | Support /quit to disconnect |
| FR-8 | Handle client disconnection gracefully |
3.3 Non-Functional Requirements
| Requirement | Description |
|---|---|
| NFR-1 | Handle at least 100 concurrent connections |
| NFR-2 | No blocking operations in main loop |
| NFR-3 | Graceful handling of malformed input |
| NFR-4 | No memory leaks on client disconnect |
| NFR-5 | Cross-platform (Linux, macOS, optionally Windows) |
3.4 Example Usage / Output
Terminal 1 - Server:
$ ./chat_server 8080
Chat server listening on port 8080...
[12:00:01] Client connected from 127.0.0.1:54321
[12:00:03] Client connected from 127.0.0.1:54322
[12:00:05] Anonymous1 set nickname to Alice
[12:00:07] Anonymous2 set nickname to Bob
[12:00:10] Alice joined room: general
[12:00:12] Bob joined room: general
[12:00:15] [general] Alice: Hello everyone!
[12:00:18] [general] Bob: Hi Alice!
[12:00:45] Bob disconnected
Terminal 2 - Client (Alice):
$ nc localhost 8080
Welcome to the chat server!
Use /nick <name> to set your nickname.
Use /join <room> to join a room.
Use /msg <user> <message> for private messages.
Use /rooms to list rooms.
Use /quit to disconnect.
/nick Alice
Your nickname is now: Alice
/join general
Joined room: general
Hello everyone!
[Bob]: Hi Alice!
/msg Bob Want to chat privately?
Private message sent to Bob.
[Private from Bob]: Sure!
/quit
Goodbye!
Terminal 3 - Client (Bob):
$ nc localhost 8080
Welcome to the chat server!
...
/nick Bob
Your nickname is now: Bob
/join general
Joined room: general
Alice is in this room.
[Alice]: Hello everyone!
Hi Alice!
[Private from Alice]: Want to chat privately?
/msg Alice Sure!
3.5 Real World Outcome
$ ./chat_server 8080 &
$ ./stress_test --clients 100 --messages 1000
Stress Test Results:
Clients connected: 100/100
Messages sent: 100,000
Messages received: 100,000
Average latency: 0.3ms
Max latency: 12ms
Dropped connections: 0
Memory usage: 8.2 MB (stable)
$ valgrind ./chat_server 8080
# Run test clients
# All clients disconnect
==12345== LEAK SUMMARY:
==12345== definitely lost: 0 bytes
==12345== indirectly lost: 0 bytes
4. Solution Architecture
4.1 High-Level Design
┌─────────────────────────────────────────────────────────────────────┐
│ Chat Server │
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Server Class │ │
│ │ │ │
│ │ int server_socket_ ← Listening socket │ │
│ │ int port_ │ │
│ │ std::map<int, Client> clients_ ← fd -> client info │ │
│ │ std::map<std::string, Room> rooms_ ← room name -> room │ │
│ │ bool running_ │ │
│ │ │ │
│ │ void run() ← Main event loop │ │
│ │ void acceptConnection() │ │
│ │ void handleClientData(int fd) │ │
│ │ void broadcastToRoom(room, message) │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ Client Struct │ │ Room Struct │ │
│ │ │ │ │ │
│ │ int fd │ │ std::string name │ │
│ │ std::string nick │ │ std::set<int> fds │ │
│ │ std::string room │ │ │ │
│ │ std::string buffer │ │ │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
4.2 Key Components
| Component | Responsibility |
|---|---|
| Server | Main loop, socket management, command routing |
| Client | Per-connection state (nickname, room, read buffer) |
| Room | Group of clients, message broadcasting |
| Command Parser | Parse /nick, /join, /msg, etc. |
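One possible shape for the Command Parser component is sketched below. `ParsedCommand` and `parseLine` are hypothetical names; the sketch assumes the caller has already split the stream into single lines and treats anything not starting with `/` as chat text for the current room.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct ParsedCommand {
    bool is_command;   // false means plain chat text
    std::string name;  // e.g. "nick", "join", "msg" (without the slash)
    std::string args;  // rest of the line; for chat text, the whole line
};

ParsedCommand parseLine(const std::string& line) {
    assert(line.find('\n') == std::string::npos);  // caller passes one framed line
    if (line.empty() || line[0] != '/') {
        return ParsedCommand{false, std::string(), line};
    }
    const std::size_t space = line.find(' ');
    if (space == std::string::npos) {
        return ParsedCommand{true, line.substr(1), std::string()};  // e.g. "/quit"
    }
    // Skip any run of spaces between the command name and its arguments.
    const std::size_t args_start = line.find_first_not_of(' ', space);
    std::string args;
    if (args_start != std::string::npos) args = line.substr(args_start);
    return ParsedCommand{true, line.substr(1, space - 1), args};
}
```

Splitting `args` further (for example, separating the recipient from the text in `/msg`) is left to each command handler, since the commands disagree on how many fields they take.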
4.3 Data Structures
struct Client {
int fd;
std::string nickname;
std::string current_room;
std::string read_buffer; // Accumulated partial reads
std::string write_buffer; // Pending writes (if using non-blocking)
sockaddr_in address;
time_t connected_at;
};
struct Room {
std::string name;
std::set<int> member_fds;
time_t created_at;
};
class Server {
int server_fd_;
std::map<int, Client> clients_;
std::map<std::string, Room> rooms_;
bool running_;
};
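The two maps have to stay in sync: a client's `current_room` and the room's `member_fds` describe the same fact from both sides. The sketch below uses trimmed-down copies of the structs and two hypothetical free functions, `joinRoom` and `leaveRoom`; deleting a room once it is empty is one design choice among several.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Trimmed-down versions of the structs above, kept only to make the sketch compile.
struct Client { std::string nickname; std::string current_room; };
struct Room { std::string name; std::set<int> member_fds; };

// Remove fd from its current room; drop the room once it is empty.
void leaveRoom(std::map<int, Client>& clients,
               std::map<std::string, Room>& rooms, int fd) {
    auto cit = clients.find(fd);
    if (cit == clients.end() || cit->second.current_room.empty()) return;
    auto rit = rooms.find(cit->second.current_room);
    if (rit != rooms.end()) {
        rit->second.member_fds.erase(fd);
        if (rit->second.member_fds.empty()) rooms.erase(rit);
    }
    cit->second.current_room.clear();
}

// Move fd into room_name, creating the room on first join.
void joinRoom(std::map<int, Client>& clients,
              std::map<std::string, Room>& rooms,
              int fd, const std::string& room_name) {
    assert(!room_name.empty());
    auto cit = clients.find(fd);
    if (cit == clients.end()) return;
    leaveRoom(clients, rooms, fd);  // a client is in at most one room
    Room& room = rooms[room_name];
    room.name = room_name;
    room.member_fds.insert(fd);
    cit->second.current_room = room_name;
}
```

Calling `leaveRoom` from the disconnect path as well keeps stale descriptors out of `member_fds`, which matters because the kernel reuses fd numbers.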
4.4 Algorithm Overview
Main Event Loop:
while running:
// Build fd_set for select
FD_ZERO(&read_fds)
FD_SET(server_fd, &read_fds)
max_fd = server_fd
for each client fd in clients:
FD_SET(fd, &read_fds)
max_fd = max(max_fd, fd)
// Wait for activity (blocks until something happens)
select(max_fd + 1, &read_fds, NULL, NULL, NULL)
// Check for new connections
if FD_ISSET(server_fd, &read_fds):
acceptConnection()
// Check each client for data
for each client fd in clients:
if FD_ISSET(fd, &read_fds):
handleClientData(fd)
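The same loop can be built on poll() instead of select(). The following is only a sketch of the set-building and waiting steps; `buildPollSet` and `waitReadable` are hypothetical names, and the client descriptors are passed as a plain vector to keep the example independent of the Server class.

```cpp
#include <cassert>
#include <poll.h>
#include <vector>

// Build the pollfd array: listening socket first, then one entry per client.
std::vector<pollfd> buildPollSet(int server_fd, const std::vector<int>& client_fds) {
    std::vector<pollfd> fds;
    fds.reserve(client_fds.size() + 1);
    pollfd entry{};
    entry.events = POLLIN;  // interested in readability only
    entry.fd = server_fd;
    fds.push_back(entry);
    for (int fd : client_fds) {
        entry.fd = fd;
        fds.push_back(entry);
    }
    return fds;
}

// Wait up to timeout_ms (-1 blocks indefinitely) and return the fds that are
// readable, hung up, or in error. Returns an empty vector on timeout or failure.
std::vector<int> waitReadable(std::vector<pollfd>& fds, int timeout_ms) {
    assert(!fds.empty());
    std::vector<int> ready;
    const int n = poll(fds.data(), static_cast<nfds_t>(fds.size()), timeout_ms);
    if (n <= 0) return ready;
    for (const pollfd& p : fds) {
        if (p.revents & (POLLIN | POLLHUP | POLLERR)) ready.push_back(p.fd);
    }
    return ready;
}
```

Unlike fd_set, the pollfd array is not limited by FD_SETSIZE, and poll() reports results in `revents` without overwriting the requested `events`.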
Handle Client Data:
bytes = recv(fd, buffer, sizeof(buffer), 0)
if bytes <= 0:
// Client disconnected
removeClient(fd)
return
// Append to client's read buffer
client.read_buffer.append(buffer, bytes) // exactly `bytes` bytes; buffer is not NUL-terminated
// Process complete lines
while (pos = client.read_buffer.find('\n')) != npos:
line = client.read_buffer.substr(0, pos)
client.read_buffer.erase(0, pos + 1)
processCommand(fd, line)
5. Implementation Guide
5.1 Development Environment Setup
Required headers (POSIX):
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <sys/select.h> // or <poll.h>
For Windows (Winsock):
#include <winsock2.h>
#include <ws2tcpip.h>
#pragma comment(lib, "ws2_32.lib")
Compile:
# Linux/macOS
g++ -std=c++17 -Wall -Wextra -o chat_server chat_server.cpp
# With debugging
g++ -std=c++17 -g -fsanitize=address -o chat_server chat_server.cpp
5.2 Project Structure
chat_server/
├── include/
│ ├── server.hpp
│ ├── client.hpp
│ └── room.hpp
├── src/
│ ├── server.cpp
│ ├── main.cpp
│ └── commands.cpp
├── tests/
│ ├── test_server.cpp
│ └── stress_test.cpp
├── CMakeLists.txt
└── README.md
5.3 The Core Question You’re Answering
“How do I handle multiple simultaneous network connections without creating a thread per connection?”
This is fundamental to building scalable network servers. Thread-per-connection scales poorly because every thread costs stack memory and context-switch time. I/O multiplexing allows one thread to handle thousands of connections.
5.4 Concepts You Must Understand First
- What is a file descriptor?
- Integer handle to kernel resource (file, socket, pipe)
- Operations: read(), write(), close()
- What does bind() do vs listen()?
- bind(): Associate socket with address:port
- listen(): Mark socket as passive (accepting connections)
- Why is accept() blocking by default?
- Waits until client connects
- Returns new socket fd for that specific client
- What is fd_set and how does select() use it?
- Bitmask of file descriptors
- select() modifies it to show which are ready
- Why can’t you just call recv() in a loop?
- It blocks if no data available
- Would freeze handling other clients
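The last two points can be made concrete with a small sketch: ask select() whether a descriptor is readable, and call recv() only if it is. `isReadable`, `recvIfReady`, and the `kWouldBlock` sentinel are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

// True if fd becomes readable within timeout_ms (0 means check and return at once).
bool isReadable(int fd, int timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE);  // FD_SET is undefined outside this range
    fd_set read_fds;
    FD_ZERO(&read_fds);      // select() overwrites the set, so rebuild it every call
    FD_SET(fd, &read_fds);
    timeval tv{};
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    const int n = select(fd + 1, &read_fds, nullptr, nullptr, &tv);
    return n > 0 && FD_ISSET(fd, &read_fds);
}

constexpr ssize_t kWouldBlock = -2;  // sentinel invented for this sketch

// Call recv() only when it is known not to block.
ssize_t recvIfReady(int fd, char* buf, std::size_t len, int timeout_ms) {
    if (!isReadable(fd, timeout_ms)) return kWouldBlock;
    return recv(fd, buf, len, 0);  // may still return 0 (peer closed) or -1 (error)
}
```

A real server checks many descriptors in a single select() call instead of one at a time; the single-fd form here only isolates the mechanics.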
5.5 Questions to Guide Your Design
Architecture:
- Single-threaded with select/poll, or multi-threaded?
- How to handle slow clients (ones that receive slowly)?
- Should rooms persist after all users leave?
Protocol Design:
- What delimiter to use? Newline? NUL byte? Length prefix?
- How to handle messages with newlines in them?
- Maximum message length?
Error Handling:
- What if client sends invalid command?
- What if send() can’t send all bytes?
- What if client buffer grows too large (DoS)?
5.6 Thinking Exercise
Trace through this scenario:
T=0: Server starts, listening on fd=3
T=1: Client A connects, accept() returns fd=4
T=2: Client B connects, accept() returns fd=5
T=3: Client A sends: "/nick Ali" (partial, no newline yet)
T=4: Client B sends: "/nick Bob\n"
T=5: Client A sends: "ce\n" (completing the nickname)
T=6: Client A sends: "/join lobby\n/msg Bob hi\n" (two commands in one recv)
Questions:
- After T=3, what’s in Client A’s buffer? Can we process anything yet?
- At T=4, how do we know Bob’s message is complete?
- At T=6, how many commands do we process?
- What if the server fd and client fds are all ready at once?
5.7 Hints in Layers
Hint 1 - Starting Point (Conceptual): Start with just accepting connections and echoing back whatever clients send. Get that working first, then add command parsing, then rooms.
Hint 2 - Next Level (More Specific): The select() loop should look like:
while (running_) {
fd_set read_fds;
FD_ZERO(&read_fds);
FD_SET(server_fd_, &read_fds);
int max_fd = server_fd_;
for (const auto& [fd, client] : clients_) {
FD_SET(fd, &read_fds);
if (fd > max_fd) max_fd = fd;
}
int ready = select(max_fd + 1, &read_fds, nullptr, nullptr, nullptr);
if (ready < 0) {
if (errno == EINTR) continue; // Interrupted by a signal: retry (needs <cerrno>)
perror("select");
break;
}
// Check server socket
if (FD_ISSET(server_fd_, &read_fds)) {
acceptConnection();
}
// Check client sockets
for (auto it = clients_.begin(); it != clients_.end(); ) {
int fd = it->first;
if (FD_ISSET(fd, &read_fds)) {
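if (!handleClientData(fd)) {
// Client disconnected: release the descriptor before dropping its state
// (this is also the place to remove fd from its room's member set)
close(fd);
it = clients_.erase(it);
continue;
}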
}
++it;
}
}
Hint 3 - Technical Details (Socket Setup):
int createServerSocket(int port) {
int fd = socket(AF_INET, SOCK_STREAM, 0);
if (fd < 0) {
throw std::runtime_error("socket() failed");
}
// Allow reuse of address (important for quick restart)
int opt = 1;
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
sockaddr_in addr{};
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = INADDR_ANY;
addr.sin_port = htons(port);
if (bind(fd, (sockaddr*)&addr, sizeof(addr)) < 0) {
close(fd); // Don't leak the descriptor on failure
throw std::runtime_error("bind() failed");
}
if (listen(fd, SOMAXCONN) < 0) {
close(fd);
throw std::runtime_error("listen() failed");
}
return fd;
}
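Once the listening socket exists, accepting a client and producing a log line such as "Client connected from 127.0.0.1:54321" might look like the sketch below. `acceptClient` and `formatPeer` are hypothetical helper names, and the sketch assumes IPv4 only.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>

// Render an IPv4 peer address as "a.b.c.d:port".
std::string formatPeer(const sockaddr_in& addr) {
    char ip[INET_ADDRSTRLEN] = {0};
    if (inet_ntop(AF_INET, &addr.sin_addr, ip, sizeof(ip)) == nullptr) {
        return "unknown";
    }
    // sin_port is stored in network byte order, so convert before printing.
    return std::string(ip) + ":" + std::to_string(ntohs(addr.sin_port));
}

// Accept one pending connection. Returns the new client fd, or -1 on error,
// and fills `peer` with the client's address.
int acceptClient(int server_fd, sockaddr_in& peer) {
    assert(server_fd >= 0);
    socklen_t len = sizeof(peer);
    return accept(server_fd, reinterpret_cast<sockaddr*>(&peer), &len);
}
```

The descriptor returned by accept() is distinct from the listening socket: the listener keeps accepting, while the new fd carries that one client's traffic.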
Hint 4 - Message Buffering:
bool handleClientData(int fd) {
char buf[1024];
ssize_t n = recv(fd, buf, sizeof(buf), 0);
if (n <= 0) {
// n == 0: client closed connection
// n < 0: error (check errno)
return false;
}
clients_[fd].read_buffer.append(buf, n);
// Process complete lines
std::string& buffer = clients_[fd].read_buffer;
size_t pos;
while ((pos = buffer.find('\n')) != std::string::npos) {
std::string line = buffer.substr(0, pos);
buffer.erase(0, pos + 1);
// Remove \r if present (Windows telnet)
if (!line.empty() && line.back() == '\r') {
line.pop_back();
}
processCommand(fd, line);
}
// Prevent buffer from growing too large (DoS protection)
if (buffer.size() > 4096) {
sendToClient(fd, "ERROR: Message too long\n");
buffer.clear();
}
return true;
}
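Hint 4 stops at processCommand(). For plain chat lines, the next step is fanning the message out to the room. The sketch below assumes the "[nick]: text" format shown in section 3.4 and takes the actual send operation as a callback so the logic can be exercised without sockets; `broadcastToRoom`, `formatRoomMessage`, and `SendFn` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Callback that delivers `data` to one client fd (for example, a send wrapper).
using SendFn = std::function<void(int fd, const std::string& data)>;

// Wire format for a room message, following the example session output.
std::string formatRoomMessage(const std::string& nick, const std::string& text) {
    return "[" + nick + "]: " + text + "\n";
}

// Deliver `text` from sender_fd to every other member of the room.
// Returns the number of recipients.
int broadcastToRoom(const std::set<int>& member_fds, int sender_fd,
                    const std::string& nick, const std::string& text,
                    const SendFn& send_fn) {
    assert(static_cast<bool>(send_fn));
    const std::string wire = formatRoomMessage(nick, text);
    int delivered = 0;
    for (int fd : member_fds) {
        if (fd == sender_fd) continue;  // the sender already sees what they typed
        send_fn(fd, wire);
        ++delivered;
    }
    return delivered;
}
```

Building the wire string once and reusing it for every recipient avoids repeating the formatting work per member.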
5.8 The Interview Questions They’ll Ask
- “What’s the difference between select(), poll(), and epoll()?”
- select: Limited to FD_SETSIZE (often 1024), copies fd_set each call
- poll: No fd limit, still O(n) to check all fds
- epoll: Linux-only; the kernel maintains the interest list, so each wait costs O(ready fds) rather than O(watched fds)
- “Why use SO_REUSEADDR?”
- Allows binding to a port that still has old connections in TIME_WAIT state
- Essential for quick server restarts during development
- “What happens if send() returns less than requested?”
- Not all bytes were sent. You must retry with remaining bytes.
- Consider using non-blocking sockets and tracking write buffers.
- “How would you handle 10,000 concurrent connections?”
- Use epoll (Linux) or kqueue (macOS/BSD)
- Consider multiple threads with work distribution
- May need to increase system file descriptor limits
- “How do you prevent a slow client from blocking others?”
- Use non-blocking I/O
- Track write buffers per client
- Disconnect clients with huge pending writes
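The answer to the last question can be sketched as follows: switch the socket to non-blocking mode and keep unsent bytes in a per-client string. `setNonBlocking` and `flushPending` are hypothetical names; a full server would also watch the descriptor for writability whenever `pending` is non-empty, and cap how large `pending` may grow.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Put fd into non-blocking mode, preserving its other status flags.
bool setNonBlocking(int fd) {
    assert(fd >= 0);
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Send as much of `pending` as the kernel will take right now.
// Returns false only on a fatal error; unsent bytes stay in `pending`.
bool flushPending(int fd, std::string& pending) {
    while (!pending.empty()) {
        const ssize_t n = send(fd, pending.data(), pending.size(), 0);
        if (n > 0) {
            pending.erase(0, static_cast<std::size_t>(n));
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return true;  // socket buffer full: try again when fd is writable
        }
        if (n < 0 && errno == EINTR) continue;  // interrupted: retry
        return false;
    }
    return true;
}
```

With this in place a slow reader only delays its own queue; every other client is served on the next loop iteration.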
5.9 Books That Will Help
| Topic | Book & Chapter |
|---|---|
| Socket API | “TCP/IP Sockets in C” Ch. 1-4 - Donahoo & Calvert |
| I/O Multiplexing | “The Linux Programming Interface” Ch. 63 - Kerrisk |
| TCP/IP Protocol | “TCP/IP Illustrated, Vol. 1” - Stevens |
| Network byte order | “Unix Network Programming” Ch. 3 - Stevens |
| High-performance I/O | “Linux System Programming” Ch. 4 - Robert Love |
5.10 Implementation Phases
Phase 1: Basic Server (Day 1-3)
- Create listening socket
- Accept connections
- Echo received data back
- Handle disconnect
Phase 2: Multiple Clients (Day 4-6)
- Implement select() loop
- Handle multiple clients simultaneously
- Add message buffering for partial reads
- Test with multiple telnet sessions
Phase 3: Commands (Day 7-10)
- Implement /nick command
- Implement /join and room management
- Implement message broadcasting to room
- Implement /rooms listing
Phase 4: Polish (Day 11-14)
- Add /msg private messaging
- Add /quit command
- Handle edge cases (no room, duplicate nicks)
- Write stress tests
5.11 Key Implementation Decisions
| Decision | Recommended Choice | Reasoning |
|---|---|---|
| I/O Model | select() for learning, poll() for production | select() is simpler, poll() has no fd limit |
| Message delimiter | Newline (\n) | Works with telnet/nc, simple parsing |
| Blocking mode | Blocking with select | Simpler than non-blocking |
| Thread model | Single-threaded | Sufficient for learning, avoids race conditions |
6. Testing Strategy
Manual Testing:
# Terminal 1: Server
./chat_server 8080
# Terminal 2: Client 1
nc localhost 8080
/nick Alice
/join general
Hello!
# Terminal 3: Client 2
nc localhost 8080
/nick Bob
/join general
# Should see "Hello!" from Alice
Automated Testing:
TEST(ChatServer, AcceptsConnection) {
Server server(8080);
std::thread t([&]{ server.run(); });
int client = connectTo("localhost", 8080);
ASSERT_GE(client, 0);
// Read welcome message
std::string msg = readLine(client);
EXPECT_TRUE(msg.find("Welcome") != std::string::npos);
close(client);
server.stop();
t.join();
}
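The test above relies on two helpers, connectTo() and readLine(), that it does not define. One way they might be written is sketched here, assuming IPv4 and a blocking socket; reading one byte per recv() call is slow but keeps the test helper simple.

```cpp
#include <cassert>
#include <netdb.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Connect to host:port over TCP. Returns a connected fd, or -1 on failure.
int connectTo(const std::string& host, int port) {
    assert(port > 0 && port <= 65535);
    addrinfo hints{};
    hints.ai_family = AF_INET;        // assumption: the server listens on IPv4
    hints.ai_socktype = SOCK_STREAM;
    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), std::to_string(port).c_str(), &hints, &results) != 0) {
        return -1;
    }
    int fd = -1;
    for (addrinfo* p = results; p != nullptr; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;  // connected
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    return fd;
}

// Read bytes up to (and not including) the next '\n'.
// Returns what was read so far if the peer closes or an error occurs first.
std::string readLine(int fd) {
    std::string line;
    char c = 0;
    while (recv(fd, &c, 1, 0) == 1) {
        if (c == '\n') break;
        line.push_back(c);
    }
    return line;
}
```

If the server creates its listening socket inside run(), the test may also need to wait until the port is open before calling connectTo(), otherwise the connection attempt can race the server thread.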
7. Common Pitfalls & Debugging
| Problem | Symptom | Root Cause | Fix |
|---|---|---|---|
| “Address already in use” | Can’t restart server | Port in TIME_WAIT | Use SO_REUSEADDR |
| Messages combined | Two commands treated as one | Not buffering properly | Buffer and find delimiters |
| Server hangs | Stops responding | Blocking recv() on wrong fd | Use select() correctly |
| Memory leak | Growing memory | Not removing disconnected clients | Clean up on disconnect |
| Garbled data | Random characters | Treating the recv() buffer as a NUL-terminated string | Append only the number of bytes recv() returned |
Debug with netstat:
# See listening sockets
netstat -an | grep LISTEN
# See established connections
netstat -an | grep ESTABLISHED
# See TIME_WAIT (why you need SO_REUSEADDR)
netstat -an | grep TIME_WAIT
8. Extensions & Challenges
- TLS Support: Add encryption with OpenSSL
- WebSocket: Implement WebSocket protocol for browser clients
- Persistence: Save chat history to database
- Authentication: Add username/password login
- Rate Limiting: Prevent spam/DoS
- File Transfer: Allow users to share files
9. Real-World Connections
- IRC: Uses very similar architecture
- Discord: Modern chat with similar concepts
- Nginx: Uses epoll for high concurrency
- Redis: Single-threaded event loop (epoll on Linux) that serves many thousands of concurrent connections
10. Resources
11. Self-Assessment Checklist
- Server accepts connections on specified port
- Multiple clients can connect simultaneously
- /nick command works
- /join command works
- Messages broadcast to room members
- /msg private messages work
- Clean disconnect handling
- No memory leaks (Valgrind clean)
- Handles partial reads correctly
- Works with nc/telnet as client
12. Submission / Completion Criteria
Your implementation is complete when:
- Core features work: nick, join, rooms, msg, quit
- Handles 100+ simultaneous connections: Stress tested (matches NFR-1)
- No resource leaks: Valgrind clean
- Handles edge cases: Partial reads, rapid disconnect
- Can explain: select() loop and message buffering in interview