Project 5: Const Generic Matrix - Type-Level Linear Algebra

Goal: Build a Matrix library where dimensions (rows/columns) are encoded in the type system, enabling compile-time dimension checking for all matrix operations. This eliminates an entire class of runtime errors by making invalid matrix operations impossible to express.

Project Metadata

Main Programming Language: Rust
Coolness Level: Level 3: Genuinely Clever
Difficulty: Level 2: Intermediate
Knowledge Area: Mathematics / Type Systems
Estimated Time: 1-2 weeks
Prerequisites: Basic Rust (ownership, traits, generics), elementary linear algebra

Learning Objectives

By completing this project, you will:

Understand const generics - How to use constant values as type parameters (const N: usize)
Master monomorphization - How the compiler generates specialized code for each matrix size
Implement operator traits - Overload +, -, * for custom types with proper trait bounds
Apply type-level constraints - Enforce mathematical rules (M x N * N x P = M x P) at compile time
Design zero-cost abstractions - Prove through benchmarks that type safety has no runtime overhead
Use fixed-size arrays - Understand why [[T; C]; R] beats Vec<Vec<T>> for known dimensions
Explore nightly features - Learn about generic_const_exprs for computed const values
Build production-quality APIs - Create ergonomic interfaces with new, zeros, identity, transpose
Write comprehensive tests - Including compile-fail tests that verify type errors
Connect to real libraries - Understand how nalgebra, ndarray, and graphics engines use these patterns

Deep Theoretical Foundation

Before writing any code, you must internalize the concepts that make this project powerful. This section provides the theoretical underpinning for type-level programming in Rust.

Type-Level Programming: Encoding Values in Types

In most programming, values exist at runtime:

// Runtime value - known when program runs
let rows: usize = 3;
let cols: usize = 4;

In type-level programming, values exist at compile time:

// Compile-time value - known when program is compiled
struct Matrix<const R: usize, const C: usize>;
let m: Matrix<3, 4>;  // The "3" and "4" ARE the type

This shift is profound. When dimensions are part of the type:

Regular Matrix:            Const Generic Matrix:
+------------------+       +------------------+
| rows: usize      |       | NO RUNTIME DATA  |
| cols: usize      |       | Type IS the data |
| data: Vec<f32>   |       | data: [[f32;C];R]|
+------------------+       +------------------+
     |                            |
     v                            v
  Runtime check:              Compile-time check:
  if rows != other.cols       The code won't compile
     panic!("Mismatch!")      if dimensions mismatch

History of Const Generics: RFC 2000 and Rust 1.51

Const generics were one of Rust’s most requested features. Here’s the timeline:

2017: RFC 2000 proposed (Const Generics)
      "Allow generic parameters over constant values"

2019: Partial implementation lands (min_const_generics)
      Limited to integers, no const expressions

2021: Rust 1.51 stabilizes min_const_generics
      - Arrays can use const generics: [T; N]
      - Structs can have const parameters
      - Basic arithmetic NOT allowed yet

2023+: generic_const_exprs (nightly)
       - Allows R * C computations
       - Enables more complex type math

Before const generics, you had two bad options:

// Option 1: Macros (ugly, slow to compile)
macro_rules! matrix {
    ($rows:expr, $cols:expr) => {
        struct Matrix { data: [[f32; $cols]; $rows] }
    };
}

// Option 2: Runtime dimensions (slow, error-prone)
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

Const generics give us the best of both worlds: compile-time guarantees with clean syntax.

Monomorphization: How the Compiler Specializes Your Code

When you write generic code, Rust doesn’t create one “generic” function. It creates a specialized copy for each concrete type used.

// You write this:
impl<const R: usize, const C: usize> Matrix<R, C> {
    pub fn zeros() -> Self {
        Self { data: [[0.0; C]; R] }
    }
}

// The compiler generates these (conceptually):
impl Matrix<2, 2> {
    pub fn zeros() -> Self {
        Self { data: [[0.0, 0.0], [0.0, 0.0]] }
    }
}

impl Matrix<3, 4> {
    pub fn zeros() -> Self {
        Self { data: [[0.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0]] }
    }
}
// ... one for each size you actually use

Visual representation of monomorphization:

Source Code                      Compiled Binary
+------------------+             +------------------+
| Matrix<R, C>     |             | Matrix_2_2       |
| generic code     | ==========> | Matrix_3_4       |
|                  |   rustc     | Matrix_10_10     |
+------------------+             +------------------+
                                 (specialized copies)

Tradeoff: More sizes = larger binary. But each function is optimized perfectly for its specific size.

Compile-Time vs Runtime Dimension Checking

Let’s compare the two approaches:

RUNTIME CHECKING (Traditional)

fn multiply(a: DynMatrix, b: DynMatrix) -> Result<DynMatrix, Error> {
    if a.cols != b.rows {           // Check happens at RUNTIME
        return Err("Dimension mismatch!");
    }
    // ... perform multiplication
}

+----------------------------------+
| CALL multiply(m1, m2)            |
|                                  |
|   1. Load a.cols from memory     |
|   2. Load b.rows from memory     |
|   3. Compare (CPU instruction)   |
|   4. Branch to error or continue |
|                                  |
+----------------------------------+
   This check runs EVERY time.
   Error found: Program crashes or returns Err.


COMPILE-TIME CHECKING (Const Generics)

fn multiply<const R1: usize, const C1: usize, const C2: usize>
    (a: Matrix<R1, C1>, b: Matrix<C1, C2>) -> Matrix<R1, C2>
{                            ^^^      ^^^
    // No check needed!      These MUST match or code won't compile
    // ... perform multiplication
}

+----------------------------------+
| CALL multiply(m1, m2)            |
|                                  |
|   1. No dimension check          |
|   2. No branch                   |
|   3. Direct to multiplication    |
|                                  |
+----------------------------------+
   Error found: Code won't compile. Ever.

Linear Algebra Basics for Matrix Operations

Before implementing matrix operations, you must understand the mathematical rules:

Matrix Dimensions

        C columns
       <-------->
    +--+--+--+--+
R   | a| b| c| d|  <- Row 0
r   +--+--+--+--+
o   | e| f| g| h|  <- Row 1
w   +--+--+--+--+
s   | i| j| k| l|  <- Row 2
    +--+--+--+--+

This is a 3x4 matrix (R=3 rows, C=4 columns)
Storage: [[a,b,c,d], [e,f,g,h], [i,j,k,l]]

Matrix Addition (Same Dimensions Required)

Matrix A (2x3)     Matrix B (2x3)     Result (2x3)
+--+--+--+         +--+--+--+         +--+--+--+
| 1| 2| 3|    +    | 9| 8| 7|    =    |10|10|10|
+--+--+--+         +--+--+--+         +--+--+--+
| 4| 5| 6|         | 6| 5| 4|         |10|10|10|
+--+--+--+         +--+--+--+         +--+--+--+

Rule: A(R, C) + B(R, C) = Result(R, C)
      Dimensions MUST match exactly.

Matrix Multiplication (Inner Dimensions Must Match)

Matrix A (M x N)         Matrix B (N x P)         Result (M x P)
       N cols                  P cols                  P cols
      <----->                 <----->                 <----->
   +--+--+--+              +--+--+              +--+--+
M  | a| b| c|         N    | x| y|          M  | ?| ?|
r  +--+--+--+   x     r    +--+--+     =    r  +--+--+
o  | d| e| f|         o    | z| w|          o  | ?| ?|
w  +--+--+--+         w    +--+--+          w  +--+--+
s                     s    | u| v|          s
                           +--+--+

A is M x N (2 rows, 3 cols)
B is N x P (3 rows, 2 cols)
Result is M x P (2 rows, 2 cols)

CRITICAL RULE: A's columns (N) MUST equal B's rows (N)

The calculation for element [i, j] of the result:

result[i][j] = sum(A[i][k] * B[k][j]) for k in 0..N

Example: result[0][0] = a*x + b*z + c*u
         result[0][1] = a*y + b*w + c*v
         result[1][0] = d*x + e*z + f*u
         result[1][1] = d*y + e*w + f*v

The Type State Pattern

Const generics enable the “Type State” pattern - encoding valid states in types:

// Traditional approach (runtime errors possible)
struct RocketEngine {
    ignited: bool,
}

impl RocketEngine {
    fn fire(&mut self) {
        if !self.ignited {
            panic!("Engine not ignited!");  // Runtime error
        }
        // ...
    }
}

// Type State approach (compile-time safety)
struct RocketEngine<const IGNITED: bool>;

impl RocketEngine<false> {
    fn ignite(self) -> RocketEngine<true> {
        // Transform the type, not just a flag
        RocketEngine::<true>
    }
}

impl RocketEngine<true> {
    fn fire(&self) {
        // Only callable when IGNITED is true at the type level
    }
}

// This won't compile:
// let engine = RocketEngine::<false>;
// engine.fire();  // ERROR: fire() doesn't exist on RocketEngine<false>

With matrices, we use this pattern for dimension compatibility:

// Only matrices with matching inner dimension can be multiplied
impl<const R: usize, const N: usize, const C: usize>
    Mul<Matrix<N, C>> for Matrix<R, N>
{           ^^^                 ^^^
    type Output = Matrix<R, C>;
    // The constraint is IN THE TYPE SIGNATURE
}

Zero-Cost Abstractions: Proving No Overhead

A “zero-cost abstraction” means the safe, high-level code compiles to the same machine code as hand-optimized low-level code.

High-level Rust:                    Equivalent hand-optimized C:

let a: Matrix<2,2> = ...;           float a[2][2] = ...;
let b: Matrix<2,2> = ...;           float b[2][2] = ...;
let c = a + b;                      float c[2][2];
                                    c[0][0] = a[0][0] + b[0][0];
                                    c[0][1] = a[0][1] + b[0][1];
                                    c[1][0] = a[1][0] + b[1][0];
                                    c[1][1] = a[1][1] + b[1][1];

Both compile to IDENTICAL assembly!

You can verify this with:

$ cargo rustc --release -- --emit asm
# Or use Compiler Explorer (godbolt.org)

The wrapper types (Matrix<R, C>) disappear completely. They exist only at compile time.

Array vs Vec for Fixed-Size Data

Vec<Vec<f32>> (Dynamic)            [[f32; C]; R] (Static)
+-------+                          +------------------------+
| ptr --|---> [ptr, ptr, ptr]      | a | b | c | d | e | f |
| len   |      |    |    |         +------------------------+
| cap   |      v    v    v         All data contiguous!
+-------+     [ ]  [ ]  [ ]        Size known at compile time.
              Scattered in memory!  Perfect for CPU cache.
              Heap allocations.     Stack or inline allocation.
              Runtime bounds checks No bounds check if inlined.

Performance implications:

Cache Performance:

Vec<Vec<T>>:
+------+    +------+    +------+
|Row 0 | -> |Row 1 | -> |Row 2 |   (Cache misses on each row access)
+------+    +------+    +------+

[[T; C]; R]:
+------+------+------+------+------+------+
|  a   |  b   |  c   |  d   |  e   |  f   |   (All in one cache line)
+------+------+------+------+------+------+

The `generic_const_exprs` Feature (Nightly)

For advanced operations, you need computed const values:

#![feature(generic_const_exprs)]

// Flatten a matrix into a 1D array
impl<const R: usize, const C: usize> Matrix<R, C>
where
    [(); R * C]:,  // Ensures R * C is a valid array size
{
    pub fn flatten(self) -> [f32; R * C] {
        // Now we can compute R * C at compile time!
        unsafe { std::mem::transmute(self.data) }
    }
}

This is unstable because:

The compiler must prove R * C doesn’t overflow
Const evaluation is limited in edge cases
Error messages can be confusing

For now, work around it with stable Rust patterns.

Memory Layout: Row-Major vs Column-Major

Matrix (3 rows x 4 columns):

     Col 0  Col 1  Col 2  Col 3
    +------+------+------+------+
Row 0|  a   |  b   |  c   |  d   |
    +------+------+------+------+
Row 1|  e   |  f   |  g   |  h   |
    +------+------+------+------+
Row 2|  i   |  j   |  k   |  l   |
    +------+------+------+------+

ROW-MAJOR STORAGE (our choice):
Memory: [a, b, c, d, e, f, g, h, i, j, k, l]
        <-- Row 0 --><-- Row 1 --><-- Row 2 -->
Access [1][2] = memory[1*4 + 2] = memory[6] = g

COLUMN-MAJOR STORAGE (Fortran, BLAS):
Memory: [a, e, i, b, f, j, c, g, k, d, h, l]
        <-Col0-><-Col1-><-Col2-><-Col3->
Access [1][2] = memory[2*3 + 1] = memory[7] = g

Row-major is better for:
- Iterating row by row (common in most algorithms)
- C-style array compatibility
- Intuitive [[T; C]; R] representation

Column-major is better for:
- BLAS/LAPACK interop
- Matrix-vector multiplication

Our [[T; C]; R] storage is naturally row-major.

Solution Architecture

The Core Matrix Struct

/// A fixed-size matrix with dimensions known at compile time.
///
/// # Type Parameters
/// - `R`: Number of rows (must be > 0)
/// - `C`: Number of columns (must be > 0)
///
/// # Memory Layout
/// Row-major storage using nested arrays: [[T; C]; R]
/// Total size: R * C * size_of::<f64>() bytes
#[derive(Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    data: [[f64; C]; R],
}

Storage Decision

WHY [[T; C]; R] and not [[T; R]; C]?

Given: 3 rows, 4 columns

[[T; C]; R] = [[T; 4]; 3]
            = [ [_, _, _, _],    <- Row 0 (4 elements)
                [_, _, _, _],    <- Row 1 (4 elements)
                [_, _, _, _] ]   <- Row 2 (4 elements)

This matches mathematical notation: matrix[row][col]

Access pattern:
  self.data[row][col]  // Natural!

Required Trait Implementations

+----------------+----------------------------------------+
| Trait          | Purpose                                |
+----------------+----------------------------------------+
| Debug          | Pretty-print matrices                  |
| Clone, Copy    | Value semantics (for small matrices)   |
| PartialEq      | Compare matrices element-wise          |
| Default        | Create zero matrix                     |
| Index          | Access elements with matrix[r][c]      |
| Add            | Matrix + Matrix (same dimensions)      |
| Sub            | Matrix - Matrix (same dimensions)      |
| Mul            | Matrix * Matrix (compatible dims)      |
| Mul<f64>       | Scalar multiplication                  |
+----------------+----------------------------------------+

Multiplication Constraint Visualization

The Magic of Type-Level Dimension Checking:

impl<const R1: usize, const C1: usize, const C2: usize>
    Mul<Matrix<C1, C2>> for Matrix<R1, C1>
{
    type Output = Matrix<R1, C2>;
    ...
}

What this says:
- Left matrix:  Matrix<R1, C1> (R1 rows, C1 cols)
- Right matrix: Matrix<C1, C2> (C1 rows, C2 cols)
                      ^^   ^^
                      These MUST be the same type parameter!
- Result:       Matrix<R1, C2> (R1 rows, C2 cols)

Example:
  Matrix<3, 5> * Matrix<5, 2> = Matrix<3, 2>  // OK!
  Matrix<3, 5> * Matrix<4, 2>                 // ERROR: no impl for Mul<Matrix<4, 2>>

Helper Methods

impl<const R: usize, const C: usize> Matrix<R, C> {
    /// Create a new matrix from raw data
    pub fn new(data: [[f64; C]; R]) -> Self;

    /// Create a matrix filled with zeros
    pub fn zeros() -> Self;

    /// Create a matrix filled with ones
    pub fn ones() -> Self;

    /// Access element at (row, col)
    pub fn get(&self, row: usize, col: usize) -> Option<&f64>;

    /// Mutable access to element
    pub fn get_mut(&mut self, row: usize, col: usize) -> Option<&mut f64>;

    /// Get dimensions as tuple
    pub fn dimensions(&self) -> (usize, usize);
}

impl<const N: usize> Matrix<N, N> {
    /// Create an identity matrix (only for square matrices)
    pub fn identity() -> Self;

    /// Calculate trace (sum of diagonal)
    pub fn trace(&self) -> f64;
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    /// Transpose: swap rows and columns
    pub fn transpose(&self) -> Matrix<C, R>;
}

Phased Implementation Guide

Phase 1: Define the Basic Matrix Struct (Day 1)

Goal: Create the struct and basic constructors.

Tasks:

Create a new library: cargo new --lib const-matrix
Define Matrix<const R: usize, const C: usize>
Implement new(), zeros(), and basic indexing
Add Debug trait for printing

Key Code:

#[derive(Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    data: [[f64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    pub fn new(data: [[f64; C]; R]) -> Self {
        Self { data }
    }

    pub fn zeros() -> Self {
        Self { data: [[0.0; C]; R] }
    }
}

Checkpoint: Can you create Matrix::<3, 4>::zeros()?

Phase 2: Implement Debug and Basic Methods (Day 2-3)

Goal: Make matrices printable and add element access.

Tasks:

Implement std::fmt::Debug with nice formatting
Add get() and get_mut() for safe element access
Implement Index and IndexMut traits
Add dimensions() method

Key Code for Debug:

impl<const R: usize, const C: usize> std::fmt::Debug for Matrix<R, C> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        writeln!(f, "Matrix<{}, {}> [", R, C)?;
        for row in &self.data {
            write!(f, "  [")?;
            for (i, val) in row.iter().enumerate() {
                if i > 0 { write!(f, ", ")?; }
                write!(f, "{:8.4}", val)?;
            }
            writeln!(f, "]")?;
        }
        write!(f, "]")
    }
}

Checkpoint: Does println!("{:?}", matrix) show a nicely formatted matrix?

Phase 3: Implement Add for Same-Size Matrices (Day 4-5)

Goal: Enable matrix_a + matrix_b syntax.

Tasks:

Implement Add<Matrix<R, C>> for Matrix<R, C>
Implement Sub<Matrix<R, C>> for Matrix<R, C>
Implement scalar multiplication Mul<f64> for Matrix<R, C>
Write tests proving different-size addition fails to compile

Key Code:

impl<const R: usize, const C: usize> std::ops::Add for Matrix<R, C> {
    type Output = Self;

    fn add(self, rhs: Self) -> Self::Output {
        let mut result = [[0.0; C]; R];
        for i in 0..R {
            for j in 0..C {
                result[i][j] = self.data[i][j] + rhs.data[i][j];
            }
        }
        Matrix { data: result }
    }
}

Checkpoint: Does Matrix::<2, 3>::zeros() + Matrix::<2, 3>::ones() work? Does Matrix::<2, 3>::zeros() + Matrix::<3, 2>::ones() fail to compile?

Phase 4: Implement Mul with Dimension Checking (Day 6-8)

Goal: Matrix multiplication with compile-time dimension verification.

Tasks:

Implement Mul<Matrix<C1, C2>> for Matrix<R1, C1> with output Matrix<R1, C2>
Implement the actual multiplication algorithm
Add special case for square matrix identity
Write tests for various dimension combinations

Key Code:

impl<const R1: usize, const C1: usize, const C2: usize>
    std::ops::Mul<Matrix<C1, C2>> for Matrix<R1, C1>
{
    type Output = Matrix<R1, C2>;

    fn mul(self, rhs: Matrix<C1, C2>) -> Self::Output {
        let mut result = [[0.0; C2]; R1];
        for i in 0..R1 {
            for j in 0..C2 {
                for k in 0..C1 {
                    result[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: result }
    }
}

Checkpoint: Does Matrix::<3, 2>::zeros() * Matrix::<2, 4>::ones() return Matrix<3, 4>?

Phase 5: Add SIMD Optimization (Optional, Day 9+)

Goal: Use SIMD intrinsics for faster operations.

Tasks:

Add feature flag for SIMD: #[cfg(feature = "simd")]
Use std::simd (nightly) or packed_simd crate
Implement vectorized addition for aligned matrices
Benchmark against naive implementation

Key Code (nightly):

#![feature(portable_simd)]
use std::simd::f64x4;

// For matrices where C is a multiple of 4
impl<const R: usize, const C: usize> Matrix<R, C>
where
    [(); C / 4]:,  // C divisible by 4
{
    pub fn add_simd(self, rhs: Self) -> Self {
        let mut result = self;
        for i in 0..R {
            for j in (0..C).step_by(4) {
                let a = f64x4::from_slice(&self.data[i][j..j+4]);
                let b = f64x4::from_slice(&rhs.data[i][j..j+4]);
                (a + b).copy_to_slice(&mut result.data[i][j..j+4]);
            }
        }
        result
    }
}

Testing Strategy

Unit Tests for Basic Operations

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_zeros() {
        let m: Matrix<3, 4> = Matrix::zeros();
        assert_eq!(m.data, [[0.0; 4]; 3]);
    }

    #[test]
    fn test_addition() {
        let a = Matrix::new([[1.0, 2.0], [3.0, 4.0]]);
        let b = Matrix::new([[5.0, 6.0], [7.0, 8.0]]);
        let c = a + b;
        assert_eq!(c.data, [[6.0, 8.0], [10.0, 12.0]]);
    }

    #[test]
    fn test_multiplication_dimensions() {
        let a: Matrix<3, 2> = Matrix::ones();
        let b: Matrix<2, 4> = Matrix::ones();
        let c: Matrix<3, 4> = a * b;  // Dimensions checked at compile time!

        // Each element is sum of row_a * col_b = 1*1 + 1*1 = 2
        assert_eq!(c.data[0][0], 2.0);
    }
}

Compile-Fail Tests

Create a test that SHOULD fail to compile. Use trybuild crate:

// tests/compile_fail.rs
#[test]
fn compile_fail_tests() {
    let t = trybuild::TestCases::new();
    t.compile_fail("tests/ui/*.rs");
}

// tests/ui/dimension_mismatch.rs
use const_matrix::Matrix;

fn main() {
    let a: Matrix<3, 2> = Matrix::zeros();
    let b: Matrix<5, 4> = Matrix::zeros();
    let _ = a * b;  // ERROR: no impl for Mul<Matrix<5, 4>>
}

// tests/ui/add_mismatch.rs
use const_matrix::Matrix;

fn main() {
    let a: Matrix<3, 2> = Matrix::zeros();
    let b: Matrix<2, 3> = Matrix::zeros();
    let _ = a + b;  // ERROR: expected Matrix<3, 2>, found Matrix<2, 3>
}

Property-Based Tests with proptest

use proptest::prelude::*;

proptest! {
    #[test]
    fn matrix_addition_commutative(
        a00 in -1000.0..1000.0f64,
        a01 in -1000.0..1000.0f64,
        b00 in -1000.0..1000.0f64,
        b01 in -1000.0..1000.0f64,
    ) {
        let a = Matrix::new([[a00, a01]]);
        let b = Matrix::new([[b00, b01]]);
        prop_assert_eq!(a + b, b + a);
    }

    #[test]
    fn matrix_multiplication_associative(/* ... */) {
        // (A * B) * C == A * (B * C)
    }
}

Benchmark vs nalgebra

// benches/comparison.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use const_matrix::Matrix;
use nalgebra::SMatrix;

fn benchmark_addition(c: &mut Criterion) {
    let a: Matrix<100, 100> = Matrix::ones();
    let b: Matrix<100, 100> = Matrix::ones();

    let na: SMatrix<f64, 100, 100> = SMatrix::from_element(1.0);
    let nb: SMatrix<f64, 100, 100> = SMatrix::from_element(1.0);

    c.bench_function("const_matrix add 100x100", |bench| {
        bench.iter(|| black_box(a) + black_box(b))
    });

    c.bench_function("nalgebra add 100x100", |bench| {
        bench.iter(|| black_box(na) + black_box(nb))
    });
}

Common Pitfalls and Debugging

Pitfall 1: Forgetting Copy Bounds

// WRONG: T might not be Copy
impl<T, const R: usize, const C: usize> Matrix<T, R, C> {
    pub fn zeros() -> Self {
        Self { data: [[T::default(); C]; R] }  // Error!
    }
}

// CORRECT: Constrain T
impl<T: Copy + Default, const R: usize, const C: usize> Matrix<T, R, C> {
    pub fn zeros() -> Self {
        Self { data: [[T::default(); C]; R] }  // OK!
    }
}

Pitfall 2: Confusing Row/Column Order

// WRONG: Transposed indices
let m: Matrix<3, 4> = ...;  // 3 rows, 4 cols
let val = m.data[3][2];      // Out of bounds! Max row is 2

// CORRECT: [row][col]
let val = m.data[2][3];      // Row 2, Col 3 (valid!)

Pitfall 3: Stack Overflow for Large Matrices

// DANGEROUS: Huge matrix on stack
let m: Matrix<10000, 10000> = Matrix::zeros();  // 800 MB!

// SOLUTION: Box large matrices
let m: Box<Matrix<10000, 10000>> = Box::new(Matrix::zeros());

Pitfall 4: Type Inference Failures

// FAILS: Compiler can't infer dimensions
let m = Matrix::zeros();  // Error: cannot infer R and C

// WORKS: Provide type annotation
let m: Matrix<3, 4> = Matrix::zeros();
// OR
let m = Matrix::<3, 4>::zeros();

Pitfall 5: Multiplication Output Type Confusion

// The output type is determined by input types
let a: Matrix<3, 5> = Matrix::zeros();
let b: Matrix<5, 2> = Matrix::zeros();
let c = a * b;  // c is Matrix<3, 2>, NOT Matrix<3, 5>!

// If you need to store result, annotate it:
let c: Matrix<3, 2> = a * b;

Pitfall 6: Generic Functions Need Where Clauses

// WRONG: This won't work for all R, C
fn double<const R: usize, const C: usize>(m: Matrix<R, C>) -> Matrix<R, C> {
    m + m  // Error: Add might not be implemented
}

// CORRECT: Add trait bound
fn double<const R: usize, const C: usize>(m: Matrix<R, C>) -> Matrix<R, C>
where
    Matrix<R, C>: std::ops::Add<Output = Matrix<R, C>>,
{
    m + m
}

Extensions and Challenges

Extension 1: Gaussian Elimination

Implement row reduction to solve systems of linear equations.

impl<const N: usize> Matrix<N, N> {
    /// Solve Ax = b using Gaussian elimination
    pub fn solve(&self, b: Matrix<N, 1>) -> Option<Matrix<N, 1>> {
        // 1. Augment matrix [A|b]
        // 2. Forward elimination (create upper triangular)
        // 3. Back substitution
        // 4. Return solution vector
    }

    /// Calculate determinant
    pub fn determinant(&self) -> f64 {
        // Use row reduction, det = product of pivots
    }

    /// Calculate inverse (if exists)
    pub fn inverse(&self) -> Option<Self> {
        // Use Gauss-Jordan with identity augmented
    }
}

Extension 2: Generic Element Type

Make the matrix work with any numeric type:

use num_traits::{Zero, One, NumOps};

pub struct Matrix<T, const R: usize, const C: usize>
where
    T: Copy + Zero + One + NumOps,
{
    data: [[T; C]; R],
}

// Now works with f32, f64, i32, complex numbers, etc.

Extension 3: SIMD-Optimized Operations

#![feature(portable_simd)]

impl<const R: usize, const C: usize> Matrix<R, C>
where
    [(); C % 8]:,  // Ensure C is multiple of 8 for AVX
{
    pub fn mul_simd(self, rhs: Matrix<C, R>) -> Matrix<R, R> {
        // Use AVX-512 for 8 f64s at once
        // 8x speedup for large matrices
    }
}

Extension 4: Compile-Time Matrix Inversion

Using generic_const_exprs:

#![feature(generic_const_exprs)]

impl<const N: usize> Matrix<N, N>
where
    [(); N * N]:,  // Ensure we can flatten
    [(); N * 2]:,  // For augmented matrix
{
    pub const fn inverse_const() -> Option<Self> {
        // Completely evaluated at compile time!
    }
}

Real-World Connections

nalgebra

The industry-standard linear algebra library for Rust uses const generics:

use nalgebra::{SMatrix, SVector};

// Fixed-size matrix (compile-time dimensions)
type Mat3x3 = SMatrix<f64, 3, 3>;

// Your project teaches the fundamentals that nalgebra builds on

ndarray

For dynamic-sized arrays with optional compile-time dimensions:

use ndarray::{Array2, ArrayView2};

// ndarray uses different approach (Dim trait)
// Understanding const generics helps you understand their design decisions

Graphics Programming (glam, cgmath)

Game engines need fast, fixed-size math:

use glam::Mat4;  // 4x4 matrix, size known at compile time

// Transformations in 3D graphics:
let model = Mat4::from_rotation_y(angle);
let view = Mat4::look_at_rh(eye, target, up);
let projection = Mat4::perspective_rh(fov, aspect, near, far);
let mvp = projection * view * model;  // Type-safe multiplication!

Machine Learning (burn, candle)

Neural network frameworks use const generics for tensor shapes:

// In modern ML frameworks:
type LayerWeights = Matrix<784, 128>;  // MNIST input -> hidden layer
type LayerBias = Matrix<128, 1>;

// Shape mismatches caught at compile time!

The Core Question You’re Answering

“Can I force the compiler to understand linear algebra?”

Yes. By moving values (like 3 and 2) into the type system, the compiler’s trait solver becomes a dimensional analysis engine. Every matrix operation is verified before your program runs.

Concepts You Must Understand First

Const Generics basics
- How to define struct Matrix<const R: usize, const C: usize>
- Book Reference: “Programming Rust” Ch. 11
Monomorphization
- Why does the compiler generate a new version of your function for every different size?
- Book Reference: “Idiomatic Rust” Ch. 5
Trait implementation for Generics
- How to implement Mul only for matrices where columns of A match rows of B

Questions to Guide Your Design

Storage
- Should you use Vec<T> or [T; R * C]? (Hint: Since R and C are const, an array is faster!)
Operations
- How do you define the return type of a multiplication as Matrix<R1, C2>?
Bounds
- How do you handle cases where you need R * C to be calculated at compile time? (Hint: generic_const_exprs feature on Nightly)

Thinking Exercise

The Cost of Freedom

If you have 100 different matrix sizes in your program, how does that affect the binary size? Compare this to a library where dimensions are stored as runtime integers.

Answer: Each unique Matrix<R, C> type generates its own code. 100 sizes means 100 copies of each method. This is the “monomorphization cost” - larger binaries but perfect optimization for each size. Runtime-sized matrices have one code copy but add branching and bounds checks.

The Interview Questions They’ll Ask

“What is the difference between a generic type parameter and a const generic parameter?”
“How do const generics reduce runtime overhead?”
“What are the limitations of const generics in Stable Rust currently?”
“Can you explain ‘Type State’ pattern using const generics?”

Hints in Layers

Hint 1: The definition

struct Matrix<const R: usize, const C: usize> { data: [[f32; C]; R] }

Hint 2: Implementation Use the impl block:

impl<const R: usize, const C: usize> Matrix<R, C> { ... }

Hint 3: Multiplication The Mul trait implementation:

impl<const R1: usize, const C1: usize, const C2: usize>
    Mul<Matrix<C1, C2>> for Matrix<R1, C1>
{
    type Output = Matrix<R1, C2>;
    ...
}

Example Build & Run

$ cargo new --lib const-matrix
     Created library `const-matrix` package

$ cd const-matrix

$ cargo test
   Compiling const-matrix v0.1.0
    Finished test [unoptimized + debuginfo] target(s) in 1.23s
     Running unittests src/lib.rs

running 8 tests
test tests::test_zeros ... ok
test tests::test_ones ... ok
test tests::test_addition ... ok
test tests::test_subtraction ... ok
test tests::test_scalar_mul ... ok
test tests::test_matrix_mul_dimensions ... ok
test tests::test_transpose ... ok
test tests::test_identity ... ok

test result: ok. 8 passed; 0 failed; 0 ignored

$ cargo run --example demo
   Compiling const-matrix v0.1.0
    Finished dev [unoptimized + debuginfo] target(s) in 0.45s
     Running `target/debug/examples/demo`

=== Const Generic Matrix Demo ===

Creating a 3x2 matrix:
Matrix<3, 2> [
  [  1.0000,   2.0000]
  [  3.0000,   4.0000]
  [  5.0000,   6.0000]
]

Creating a 2x4 matrix:
Matrix<2, 4> [
  [  1.0000,   2.0000,   3.0000,   4.0000]
  [  5.0000,   6.0000,   7.0000,   8.0000]
]

Multiplying 3x2 * 2x4 = 3x4:
Matrix<3, 4> [
  [ 11.0000,  14.0000,  17.0000,  20.0000]
  [ 23.0000,  30.0000,  37.0000,  44.0000]
  [ 35.0000,  46.0000,  57.0000,  68.0000]
]

Dimension mismatch would be caught at compile time!
// let bad = matrix_3x2 * matrix_3x4;  // Won't compile!

Books That Will Help

Topic	Book	Chapter
Const Generics	“Programming Rust”	Ch. 11
Matrix Math	“Math for Programmers”	Ch. 4
Type-level logic	“Idiomatic Rust”	Ch. 5

Summary

This project teaches you to leverage Rust’s type system to encode mathematical constraints. By the end, you will have:

A fully functional matrix library with compile-time dimension checking
Deep understanding of const generics and monomorphization
Experience implementing operator traits with complex type bounds
A foundation for understanding production libraries like nalgebra
Appreciation for zero-cost abstractions and type-level programming

The skills transfer directly to graphics programming, scientific computing, and any domain where “types as documentation” prevents bugs before runtime.

Project 5: Const Generic Matrix - Type-Level Linear Algebra

Project Metadata

Learning Objectives

Deep Theoretical Foundation

Type-Level Programming: Encoding Values in Types

History of Const Generics: RFC 2000 and Rust 1.51

Monomorphization: How the Compiler Specializes Your Code

Compile-Time vs Runtime Dimension Checking

Linear Algebra Basics for Matrix Operations

Matrix Dimensions

Matrix Addition (Same Dimensions Required)

Matrix Multiplication (Inner Dimensions Must Match)

The Type State Pattern

Zero-Cost Abstractions: Proving No Overhead

Array vs Vec for Fixed-Size Data

The generic_const_exprs Feature (Nightly)

Memory Layout: Row-Major vs Column-Major

Solution Architecture

The Core Matrix Struct

Storage Decision

Required Trait Implementations

Multiplication Constraint Visualization

Helper Methods

Phased Implementation Guide

Phase 1: Define the Basic Matrix Struct (Day 1)

Phase 2: Implement Debug and Basic Methods (Day 2-3)

Phase 3: Implement Add for Same-Size Matrices (Day 4-5)

Phase 4: Implement Mul with Dimension Checking (Day 6-8)

Phase 5: Add SIMD Optimization (Optional, Day 9+)

Testing Strategy

Unit Tests for Basic Operations

Compile-Fail Tests

Property-Based Tests with proptest

Benchmark vs nalgebra

Common Pitfalls and Debugging

Pitfall 1: Forgetting Copy Bounds

Pitfall 2: Confusing Row/Column Order

Pitfall 3: Stack Overflow for Large Matrices

Pitfall 4: Type Inference Failures

Pitfall 5: Multiplication Output Type Confusion

Pitfall 6: Generic Functions Need Where Clauses

Extensions and Challenges

Extension 1: Gaussian Elimination

Extension 2: Generic Element Type

Extension 3: SIMD-Optimized Operations

Extension 4: Compile-Time Matrix Inversion

Real-World Connections

nalgebra

ndarray

Graphics Programming (glam, cgmath)

Machine Learning (burn, candle)

The Core Question You’re Answering

Concepts You Must Understand First

Questions to Guide Your Design

Thinking Exercise

The Interview Questions They’ll Ask

Hints in Layers

Example Build & Run

Books That Will Help

Summary

The `generic_const_exprs` Feature (Nightly)