Satellite Flight Software Engineering Mastery: Real-World Projects

Goal: Build a deep, systems-level understanding of satellite flight software (FSW) by designing the core subsystems of a CubeSat mission from first principles. You will master timekeeping, telemetry/telecommand, power and thermal budgeting, attitude estimation and control, fault management, and ground operations. By the end, you will be able to explain why space systems are built the way they are and implement a simulation-grade FSW stack that survives real mission constraints. You will also learn to reason about failure modes, validate behaviors on the ground, and integrate a full mission simulator that mirrors operations.

Introduction

Satellite flight software is the embedded, real-time software that keeps a spacecraft alive, safe, and productive. It translates mission intent into robust actions across power, communications, guidance, and payload subsystems while coping with intermittent ground contact, radiation, and tight resources.

What you will build (by the end of this guide):

A CCSDS-compliant telemetry/telecommand parser and scheduler
A CubeSat flight state machine with safe mode entry/exit logic
A high-fidelity EPS power budget simulator with eclipse modeling
A full mission simulator that fuses orbit propagation, ADCS, EPS, and ground operations

Scope (what is included):

CubeSat subsystem architecture (C&DH, EPS, ADCS, COMMS, Payload)
Orbit propagation (TLE/SGP4), timekeeping, and scheduling
Telemetry/telecommand (CCSDS Space Packet) and ground operations
Fault management: SEU mitigation, watchdogs, and FDIR

Out of scope (for this guide):

Launch vehicle systems
Onboard propulsion design
Deep space navigation and interplanetary mission design

The Big Picture (Mental Model)

           Space Environment (radiation, thermal, eclipse)
                          |
                          v
Sensors -> ADCS -> Attitude State -> Control Laws -> Actuators
   |         |                                   |
   |         v                                   v
   |     State Estimator (EKF)             Reaction Wheels
   |                                           |
   v                                           v
Telemetry -> Packetization -> Radio -> Ground Station -> Operators
   ^                                           |
   |                                           v
   +------------ C&DH + FSW Scheduler <---- Commands
                      |
                      v
                EPS Power Budget

Key Terms You Will See Everywhere

C&DH (Command and Data Handling): The software brain that routes commands, schedules tasks, and manages data.
ADCS: Attitude Determination and Control System (sensors + estimators + actuators).
EPS: Electrical Power System responsible for generation, storage, and distribution.
FSW: Flight Software, the embedded code that runs onboard the spacecraft.
CCSDS Space Packet: A standard for telemetry/telecommand packetization.

How to Use This Guide

Read the Theory Primer first like a mini-book. This builds the mental models.
Pick a learning path that matches your background (software, controls, or ops).
Build projects in order if you are new to space systems; skip around only if you are experienced.
Validate each project using its Definition of Done checklist before moving on.
Integrate everything into the final mission simulator to prove mastery.

Prerequisites & Background Knowledge

Before starting these projects, you should have foundational understanding in these areas:

Essential Prerequisites (Must Have)

Programming Skills:

C or C++ fundamentals (pointers, memory, bit operations, structs)
Python for simulations and plotting
Basic data structures (queues, ring buffers) and state machine design

Math & Physics Basics:

Linear algebra (vectors, matrices)
Basic differential equations and numerical integration
Introductory orbital mechanics (what an orbit is, Kepler basics)

Systems Fundamentals:

Embedded systems concepts (interrupts, timers, watchdogs)
Basic networking concepts (packets, headers, checksums)

Helpful But Not Required

Advanced Math:

Quaternions and 3D rotations
Kalman filtering and estimation theory

RF / Communications:

Link budgets, modulation basics
Ground station operations

Self-Assessment Questions

Before starting, ask yourself:

Can you write and debug a C program that parses a binary file with bitfields?
Can you explain what a state machine is and implement one?
Can you numerically integrate a simple ODE in Python?
Can you read a data sheet and translate requirements into code?
Can you write a unit test to validate an algorithm?

If you answered “no” to questions 1-3: spend 1-2 weeks on C and Python fundamentals before starting.

If you answered “yes” to all 5: you are ready to begin.

Development Environment Setup

Required Tools:

Linux or macOS terminal
C compiler (clang or gcc)
Python 3.11+
git and a text editor/IDE

Recommended Tools:

gdb or lldb for debugging C
matplotlib and numpy
A hex editor for inspecting CCSDS packets
Optional: sgp4 Python library for orbit propagation reference

Testing Your Setup:

$ gcc --version
$ python3 --version
$ git --version

Time Investment

Simple projects (1, 2, 7): 4-8 hours each
Moderate projects (3, 8, 9, 10, 11, 12): 1-2 weeks each
Complex projects (4, 5, 6, 13): 2-4 weeks each
Total sprint: 3-5 months if you complete everything

Important Reality Check

Flight software is unforgiving. You are building logic that must survive unreliable sensors, limited power, and minimal human intervention. Expect to iterate multiple times. The real learning happens when you debug unexpected behavior and trace it back to system design choices.

Big Picture / Mental Model

A CubeSat is an autonomous robot in orbit. It runs a tight loop that senses, decides, acts, and reports. The same patterns repeat across subsystems: measure, estimate, control, and log.

Mission Timeline
+----------+   +----------+   +--------------+   +------------+
|  Launch  |-> |  Deploy  |-> | Commission   |-> | Nominal Ops|
+----------+   +----------+   +--------------+   +------------+
                                         |
                                         v
                               +-------------------+
                               |  Safe Mode (FDIR) |
                               +-------------------+

FSW Loop (every second)
Sensors -> Estimator -> Controller -> Actuators -> Telemetry

Theory Primer

This section is the mini-book. Each chapter builds a mental model you will reuse in the projects.

Chapter 1: Flight Software Architecture, Modes, and Subsystem Contracts

Fundamentals

Flight software is the coordinator of every subsystem. It reads sensors, dispatches commands, schedules tasks, and makes sure the spacecraft stays alive. In CubeSats, you typically have a Command and Data Handling (C&DH) system that hosts the CPU and RTOS, plus subsystems like EPS, ADCS, COMMS, and Payload. Each subsystem exposes a narrow interface (telemetry and commands), usually over I2C, SPI, CAN, or UART. The FSW defines a global state machine (safe, nominal, science, comms) and enforces system-wide invariants like “never transmit if battery is below threshold.” If you only remember one thing: subsystem boundaries are real, and your software must treat them as contracts. The purpose of architecture is not beauty; it is isolation and predictability under stress.

Deep Dive into the Concept

A CubeSat architecture is constrained by mass, power, and volume. That means the flight computer (OBC) is usually a low-power microcontroller with limited RAM and flash. The software stack is layered: low-level drivers talk to hardware; a HAL abstracts sensors and buses; middleware handles telemetry, command routing, and scheduling; and application logic runs the mission. This layering is not academic. It lets you replace a magnetometer or radio without rewriting the entire system. It also isolates faults: if a bus misbehaves, the driver layer can detect and reset it without crashing the mission logic.

Subsystem interfaces are often “single-writer, multiple-reader” patterns. For example, EPS might publish battery voltage and current; ADCS and COMMS read that telemetry to decide when to run. Commands go the other way: a radio command asks EPS to disable a payload or asks ADCS to slew. These data flows must be deterministic and measurable, because in space you debug with logs, not with probes. The C&DH layer is a message broker with strict real-time constraints and a safety override. That override is safe mode logic: when power, attitude, or thermal conditions exceed limits, C&DH must preempt everything and drop into minimal survival behavior.

Scheduling is a core architectural decision. Many CubeSats use a cooperative scheduler that runs tasks in a fixed cadence (1 Hz, 10 Hz, 1/60 Hz). Some use a preemptive RTOS with fixed priorities. Both require design discipline: tasks must be short, deterministic, and bounded. A missed deadline can be as bad as a crash, because control algorithms depend on predictable timing. When you design architecture, you are really designing time budgets and failure containment boundaries.

Mode management is the other half of architecture. Modes are not just UI states; they encode power, thermal, and comms constraints. A typical safe mode is a minimum viable configuration: radio on for beacons, ADCS in sun-pointing to keep batteries charged, payload off. Mode transitions must be explicit and logged. You must prevent oscillation, where the system bounces between modes because thresholds are too tight. This is why you use hysteresis and cooldown timers. A mode chart is a specification, not a suggestion.
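A minimal sketch of hysteresis plus a cooldown timer for SAFE/NOMINAL transitions. The class name `ModeManager`, the state-of-charge thresholds, and the 600-second cooldown are illustrative assumptions, not values from any particular mission.

```python
from dataclasses import dataclass, field

SAFE, NOMINAL = "SAFE", "NOMINAL"

@dataclass
class ModeManager:
    # Hypothetical thresholds: enter SAFE below 30 % SoC, leave only above 45 %.
    enter_safe_soc: float = 0.30
    exit_safe_soc: float = 0.45
    cooldown_s: int = 600          # minimum dwell time before leaving SAFE
    mode: str = SAFE               # boot into the most conservative mode
    last_change_met: int = 0
    log: list = field(default_factory=list)

    def update(self, met_s: int, soc: float) -> str:
        """Evaluate one mode decision at mission elapsed time met_s."""
        target = self.mode
        if self.mode == NOMINAL and soc < self.enter_safe_soc:
            target = SAFE
        elif self.mode == SAFE and soc > self.exit_safe_soc:
            target = NOMINAL
        if target != self.mode:
            dwell = met_s - self.last_change_met
            # Entering SAFE is never delayed; leaving SAFE must wait out the cooldown.
            if target == SAFE or dwell >= self.cooldown_s:
                self.log.append((met_s, self.mode, target, soc))
                self.mode, self.last_change_met = target, met_s
        return self.mode
```

The gap between the two thresholds is the hysteresis band: a battery hovering around 30% cannot flip the mode on every tick. The cooldown is deliberately asymmetric, so a real fault can always force SAFE immediately.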

Interfaces are usually formalized in interface control documents (ICDs). Telemetry points have units, ranges, and update rates. Commands have parameters, constraints, and expected effects. If you do not formalize these contracts, integration fails because every subsystem will make assumptions the others cannot satisfy. In flight software, you enforce these contracts at runtime: units checks, bounds checks, and cross-subsystem invariants (for example, “payload cannot be enabled in eclipse”).
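One way to turn an ICD entry into a runtime check is sketched below. The `TelemetryPoint` structure, the point name `eps.batt_v`, and the voltage range are hypothetical; a real ICD would supply its own fields and limits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TelemetryPoint:
    name: str
    unit: str
    lo: float
    hi: float
    max_age_s: float     # derived from the ICD update rate

def check_point(point, value, unit, age_s):
    """Return a list of contract violations (empty list means the sample is usable)."""
    problems = []
    if unit != point.unit:
        problems.append(f"{point.name}: unit {unit!r} != {point.unit!r}")
    if not (point.lo <= value <= point.hi):
        problems.append(f"{point.name}: {value} outside [{point.lo}, {point.hi}]")
    if age_s > point.max_age_s:
        problems.append(f"{point.name}: stale by {age_s - point.max_age_s:.1f} s")
    return problems

def payload_enable_allowed(in_eclipse: bool, batt_problems: list) -> bool:
    # Cross-subsystem invariant from the text: payload stays off in eclipse,
    # and (an added assumption) whenever the battery reading cannot be trusted.
    return (not in_eclipse) and not batt_problems

# Hypothetical ICD entry for a battery voltage channel.
BATT_V = TelemetryPoint("eps.batt_v", "V", 6.0, 8.4, max_age_s=2.0)
```

For example, `check_point(BATT_V, 7400.0, "mV", 5.0)` reports three violations at once: wrong unit, out of range, and stale. Writing the range test as `not (lo <= value <= hi)` also rejects NaN readings.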

Finally, robust architecture requires a simulation build. A good FSW stack can compile for hardware and for a desktop simulator with fake drivers. This allows you to test mode logic, telemetry scheduling, and FDIR without risking hardware. It also enables unit tests on CI, which is essential for reliability. The best FSW design is one that can be verified without flying.

In practice, teams also use configuration tables, feature flags, and build profiles to keep one codebase across multiple missions while preserving safety defaults.

How This Fits on Projects

This architecture underpins Projects 2, 7, 11, 12, and the final mission simulator.

Definitions & Key Terms

C&DH: Core system that routes commands and aggregates telemetry.
HAL: Hardware Abstraction Layer, isolates device specifics.
Subsystem: EPS, ADCS, COMMS, Payload.
Mode: High-level behavior state (SAFE, NOMINAL, SCIENCE).
ICD: Interface Control Document defining telemetry and commands.

Mental Model Diagram

            Flight Software (FSW)
+------------------------------------+
|   Mission Logic / Modes            |
+------------------------------------+
|   Scheduler + C&DH + Telemetry     |
+------------------------------------+
|   HAL / Drivers / Bus Interfaces   |
+------------------------------------+
        |       |        |       |
       EPS     ADCS    COMMS   Payload

How It Works (Step-by-Step, Invariants, Failure Modes)

Bootloader initializes memory and starts the main loop.
Drivers bring up buses and hardware in a known state.
Scheduler runs tasks at fixed intervals.
Telemetry is sampled and queued.
Commands are validated, executed, and logged.
Health checks trigger mode changes or resets.

Invariants: never transmit below minimum battery voltage; never point away from sun in safe mode; never run payload in eclipse.

Failure modes: bus lockups, task overruns, stale telemetry, mode oscillation.

Minimal Concrete Example

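// Cooperative scheduler: one pass of the loop per 1 Hz tick
void main_loop(void) {
  while (1) {
    wait_for_next_tick();     // block until the 1 Hz timer fires (hypothetical helper)
    task_read_sensors();      // 1 Hz
    task_update_state();      // 1 Hz
    if (tick_10s()) task_pack_telemetry();   // every 10 s
    if (tick_60s()) task_propagate_orbit();  // every 60 s
    task_run_fdir_checks();   // 1 Hz, runs after state is fresh
    feed_watchdog();          // only reached if every task returned
  }
}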

Common Misconceptions

“We can debug in orbit like on Earth.” -> You mostly debug from telemetry only.
“Subsystems are independent.” -> They are coupled through power, timing, and data.

Check-Your-Understanding Questions

Why is a strict scheduler critical for FSW?
What is the difference between C&DH and the HAL?
How does safe mode override normal mission logic?

Check-Your-Understanding Answers

Control loops and telemetry timing depend on deterministic execution.
C&DH routes commands/telemetry; HAL abstracts hardware access.
Safe mode preempts tasks to keep power/thermal/attitude within bounds.

Real-World Applications

Commercial Earth-imaging CubeSats
University missions requiring reliability under constraints

Where You’ll Apply It

Projects 2, 7, 11, 12, 13

References

NASA CubeSat 101 (2017) - mission operations overview
Spacecraft Systems Engineering (Fortescue) - Data Handling chapter
Mission Success Handbook for CubeSat Missions (NASA GSFC-HDBK-8007)

Key Insight

FSW is not a program; it is a coordinated system of contracts between subsystems.

Summary

You build FSW as a layered architecture with strict timing, clear subsystem boundaries, and safe mode overrides.

Homework/Exercises

Sketch a block diagram of a CubeSat you want to build.
List the invariants that should never be violated.

Solutions to the Homework/Exercises

Example invariants: battery state of charge > 30%, radio off during eclipse, payload disabled when CPU temperature > 70 °C.

Chapter 2: Timekeeping, Scheduling, and Determinism

Fundamentals

Time is the invisible backbone of flight software. Every sensor sample, control loop, command execution, and telemetry packet is anchored to time. Onboard time is not the same as wall-clock time. It can drift, reset, or lose synchronization, and your software must survive those events. A good FSW design uses monotonic time for scheduling, absolute time for logs and passes, and synchronized time for ground operations. You also need deterministic scheduling so that tasks execute at known rates. If your control loop is designed for 1 Hz but sometimes slips to 0.7 Hz, an estimator that assumes a fixed step integrates with the wrong time interval and can diverge, and the controller can become unstable. Timekeeping is not just clocks; it is a contract that makes the rest of the system predictable.

Deep Dive into the Concept

Onboard timekeeping begins at boot. The clock source might be a crystal oscillator, an RTC, or a GPS time feed. Each has drift characteristics and failure modes. Many CubeSats use a software-maintained mission elapsed time (MET) that starts at launch or deployment. MET is monotonic and is the safest input for scheduling tasks. In parallel, the system keeps UTC or TAI for logging and pass prediction. UTC has leap seconds; TAI does not. If you do not handle leap seconds correctly, pass predictions and time-tagged commands can be off by a second for every missed leap second, which matters when your downlink window is only a few minutes and commands are timed to the second. For this reason, time conversion must be centralized and tested.

Scheduling in FSW is typically either time-triggered or event-driven. Time-triggered scheduling is simple and deterministic: tasks run at fixed intervals, with known budgets. Event-driven scheduling reacts to interrupts (radio packet received, sensor threshold exceeded). Most systems are hybrid: a periodic loop handles core tasks, while interrupts enqueue events for asynchronous processing. The trick is to keep interrupt handlers short and defer heavy work to the scheduler. You should measure worst-case execution time (WCET) for each task and make sure total load stays under your CPU budget.
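The CPU budget check can be sketched as a utilization sum over the periodic task set. The task names, WCET figures, and the 50% budget below are made-up numbers for illustration only.

```python
def cpu_utilization(tasks):
    """Sum of WCET/period over all periodic tasks (both in seconds)."""
    return sum(wcet / period for _, wcet, period in tasks)

# (name, WCET in s, period in s): illustrative values, not measurements
TASKS = [
    ("adcs_control",    0.020, 0.1),    # 10 Hz
    ("read_sensors",    0.050, 1.0),    # 1 Hz
    ("fdir_checks",     0.030, 1.0),    # 1 Hz
    ("pack_telemetry",  0.200, 10.0),   # every 10 s
    ("propagate_orbit", 0.600, 60.0),   # every 60 s
]

BUDGET = 0.50  # assumed policy: keep at least 50 % CPU margin

load = cpu_utilization(TASKS)
print(f"utilization = {load:.3f}, within budget: {load <= BUDGET}")
```

Staying under the utilization budget is necessary but not sufficient. In a cooperative scheduler, the 0.6 s orbit task in this table would block the 10 Hz control task for several of its periods, so long tasks still need to be split into bounded slices.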

Time-tagged commands are common in spacecraft operations. Instead of executing immediately, a command is queued with an execution time. This allows the ground to pre-plan a pass: upload a command stack that runs after you lose contact. To make this reliable, your onboard time must be accurate and monotonic, and your command queue must be persistent across reboots. Many missions implement a time-tag queue with validity checks, grace windows, and rejection rules for stale or unsafe commands.
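A time-tag queue with a grace window and simple rejection rules might look like the sketch below. The class name, the 5-second grace window, and the 7-day horizon are assumptions chosen for the example; persistence across reboots is not shown.

```python
import heapq
import itertools

class TimeTagQueue:
    """Time-tagged command queue keyed on MET seconds (illustrative sketch)."""

    def __init__(self, grace_s=5, max_horizon_s=7 * 86400):
        self.grace_s = grace_s              # how late a command may still run
        self.max_horizon_s = max_horizon_s  # reject tags absurdly far in the future
        self._heap = []
        self._order = itertools.count()     # tie-breaker keeps upload order

    def add(self, met_now, exec_met, command):
        """Return True if accepted, False if rejected at upload time."""
        if exec_met < met_now or exec_met - met_now > self.max_horizon_s:
            return False
        heapq.heappush(self._heap, (exec_met, next(self._order), command))
        return True

    def pop_due(self, met_now):
        """Return (to_execute, expired) for every command whose tag has passed."""
        due, expired = [], []
        while self._heap and self._heap[0][0] <= met_now:
            exec_met, _, command = heapq.heappop(self._heap)
            late = met_now - exec_met
            (due if late <= self.grace_s else expired).append(command)
        return due, expired
```

Commands that come due while the scheduler was busy or the computer was rebooting land in `expired` instead of executing late. The caller should log them so operators can see what was skipped.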

A subtle aspect is time in telemetry. Every packet should include a timestamp, but you must decide which clock and which time scale. If you mix MET and UTC in logs without labeling which is which, correlating events becomes nearly impossible. Many systems include both in separate, clearly named fields: MET for internal consistency and UTC for operators. Timestamps also drive downlink prioritization: you can drop stale payload data to protect real-time health telemetry.

Determinism is a top-level requirement. If your 1 Hz task slips occasionally, you need a strategy: skip or catch up? Most flight systems skip to maintain predictability and avoid backlog. But skipping can break control logic, so you must design algorithms that are robust to missed cycles. This is why control and estimation algorithms often include delta-time inputs so they can handle variable step sizes.
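The skip policy and the delta-time input can be sketched together. The function name `run_periodic` and its argument layout are hypothetical; the point is that a late task runs once with the real elapsed time instead of replaying every missed slot.

```python
def run_periodic(now_s, next_due_s, period_s, last_run_s, task):
    """Call task(dt) at most once per invocation; return (next_due_s, last_run_s)."""
    if now_s < next_due_s:
        return next_due_s, last_run_s            # not due yet
    dt = now_s - last_run_s                      # real elapsed time, may exceed period
    task(dt)
    missed = int((now_s - next_due_s) // period_s)
    # Skip policy: jump over every missed slot instead of replaying them.
    next_due_s += (missed + 1) * period_s
    return next_due_s, now_s
```

If a 1 Hz task due at t = 2.0 s is first serviced at t = 4.5 s, it runs once with `dt = 3.5` and is next due at t = 5.0 s. The schedule stays aligned to the original tick grid and no backlog builds up.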

Timekeeping also intersects with fault management. If the RTC fails or resets, you may need to fall back to MET and flag the event. If GPS time jumps, you must reject large discontinuities. A good system uses sanity checks: maximum allowable time step, monotonicity checks, and event logs. These are simple to implement and prevent catastrophic scheduling errors.
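These sanity checks fit in a few lines. The sketch below assumes a 2-second rejection threshold and a slew limit of 10 ms per second; both numbers and the function names are illustrative.

```python
def evaluate_time_update(current_utc_s, proposed_utc_s,
                         max_jump_s=2.0, max_slew_s_per_s=0.01):
    """Classify a proposed UTC correction and return (action, per_second_slew)."""
    offset = proposed_utc_s - current_utc_s
    if abs(offset) > max_jump_s:
        return "reject", 0.0            # large discontinuity: keep old time, log anomaly
    if offset == 0.0:
        return "accept", 0.0
    # Small offset: bleed it in gradually so UTC never steps backward.
    slew = max(-max_slew_s_per_s, min(max_slew_s_per_s, offset))
    return "slew", slew

def met_is_monotonic(prev_met_s, new_met_s, max_step_s=2):
    """MET must advance, and never by more than a sane number of seconds per tick."""
    return 0 < new_met_s - prev_met_s <= max_step_s
```

A proposed jump of +5 seconds is rejected outright, while a 0.5-second offset is absorbed at 10 ms per second. Either outcome should produce an event log entry so the ground can see how often the time source misbehaves.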

Many missions also distribute time internally using a time synchronization message that is treated like a command with strict validation. This allows subsystem microcontrollers to align their clocks within a small error bound, which improves sensor fusion and makes cross-subsystem logs comparable during post-flight analysis.

How This Fits on Projects

You will apply timekeeping and deterministic scheduling in Projects 2, 4, 7, 11, and 12, and throughout the capstone simulator.

Definitions & Key Terms

MET (Mission Elapsed Time): Monotonic time since a defined epoch (launch/deploy).
UTC: Coordinated Universal Time, includes leap seconds.
TAI: International Atomic Time, continuous without leap seconds.
WCET: Worst-case execution time for a task.
Time-tagged command: Command scheduled for a future execution time.

Mental Model Diagram

           Time Sources
   RTC ----> UTC Converter ----> Logs
   GPS ----> UTC Converter ----> Pass Prediction
   Tick ---> MET Counter ----> Scheduler

Scheduler (1 Hz)
   |-- sensor read
   |-- estimator
   |-- controller
   |-- telemetry packetize

How It Works (Step-by-Step, Invariants, Failure Modes)

Boot restores MET from non-volatile storage (MET starts at 0 only at the mission epoch) and loads last known UTC.
Scheduler ticks at fixed intervals using MET.
GPS or ground sync updates UTC periodically.
Telemetry packets include UTC + MET.
Time-tagged commands execute when MET >= target.

Invariants: MET never goes backward; tasks never run twice for the same tick.

Failure modes: RTC reset, GPS jump, scheduler drift, missed deadlines.

Minimal Concrete Example

// Simple time-tag execution check: run every command whose time has arrived
#include <stdint.h>

void process_time_tag_queue(uint32_t met_now) {
  while (!queue_empty() && queue_peek_time() <= met_now) {
    exec_command(queue_pop());
  }
}

Common Misconceptions

“UTC is always monotonic.” -> Leap seconds can create discontinuities.
“If a task slips, just run it twice.” -> This can destabilize control loops.

Check-Your-Understanding Questions

Why is MET often preferred for scheduling?
What problems do leap seconds introduce?
How would you handle a GPS time jump of +5 seconds?

Check-Your-Understanding Answers

MET is monotonic and immune to leap-second jumps.
Leap seconds can break timing assumptions and pass predictions.
Reject the jump or apply gradually, and log a timing anomaly.

Real-World Applications

Time-tagged command stacks for short ground passes
Precise timestamping for science payloads

Where You’ll Apply It

Projects 2, 4, 7, 11, 12, 13

References

CCSDS Time Code Formats (CCSDS 301.0-B)
NASA CubeSat 101 (2017) - operations timelines

Key Insight

Timekeeping is the hidden API that every subsystem depends on.

Summary

You must design with monotonic time, deterministic scheduling, and explicit handling of time discontinuities.

Homework/Exercises

Design a timing table for a 1 Hz, 10 Hz, and 1/60 Hz task set.
Define how you will handle a time sync jump of +/- 10 seconds.

Solutions to the Homework/Exercises

Example: 1 Hz loop reads sensors and runs FDIR, 10 Hz loop runs attitude control, 1/60 Hz loop propagates orbit. Time jump handling: reject jumps >2 seconds, apply slews for smaller offsets, log event.

Chapter 3: Telemetry, Telecommand, and CCSDS Packetization

Fundamentals

Telemetry and telecommand are the lifelines between spacecraft and ground. Telemetry is the stream of health and payload data sent down; telecommand is the set of instructions sent up. To make these streams interoperable across missions and ground systems, space agencies use the CCSDS Space Packet standard. This defines a primary header with routing information like APID, sequence count, and length. Your flight software must build, parse, validate, and schedule these packets. A single wrong bit in the header can make a packet undecodable or misrouted on the ground. CCSDS packetization is the language your spacecraft speaks.

Telemetry also acts as your forensic record. When something goes wrong in orbit, these packets are often the only evidence you have.

Deep Dive into the Concept

A CCSDS space packet has a fixed-length 6-byte primary header and an optional secondary header. The primary header encodes version, packet type (TM or TC), secondary header flag, APID, sequence flags, sequence count, and packet length. The packet length field is a common pitfall: it encodes the number of bytes following the primary header minus one, so a packet with 10 bytes after the header carries a length value of 9. If you miscompute this, your ground system will frame packets incorrectly. The secondary header is mission-specific and often carries timestamps, service types, and subtypes. Your software must define this and keep it consistent across subsystems.
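The primary header layout can be exercised on the ground with a few lines of Python. This sketch packs the fields into three big-endian 16-bit words; the function names and the default flag values are choices made for the example.

```python
import struct

PRIMARY_HDR = struct.Struct(">HHH")   # three big-endian 16-bit words = 6 bytes

def pack_primary_header(apid, seq_count, data_field_len,
                        is_tc=False, has_sec_hdr=True, seq_flags=0b11):
    """Build the 6-byte primary header; data_field_len = secondary header + user data."""
    if not 0 <= apid <= 0x7FF:
        raise ValueError("APID is an 11-bit field")
    if not 1 <= data_field_len <= 65536:
        raise ValueError("packet data field must be 1..65536 bytes")
    # Word 1: version (3 bits, zero) | type (1) | secondary header flag (1) | APID (11)
    word1 = (0 << 13) | (int(is_tc) << 12) | (int(has_sec_hdr) << 11) | apid
    # Word 2: sequence flags (2 bits, 0b11 = unsegmented) | sequence count (14)
    word2 = ((seq_flags & 0x3) << 14) | (seq_count & 0x3FFF)
    word3 = data_field_len - 1            # length field is "bytes after header, minus one"
    return PRIMARY_HDR.pack(word1, word2, word3)

def unpack_primary_header(raw):
    """Decode the first 6 bytes of a packet into a dict of header fields."""
    word1, word2, word3 = PRIMARY_HDR.unpack(raw[:6])
    return {
        "version": word1 >> 13,
        "is_tc": bool((word1 >> 12) & 1),
        "has_sec_hdr": bool((word1 >> 11) & 1),
        "apid": word1 & 0x7FF,
        "seq_flags": word2 >> 14,
        "seq_count": word2 & 0x3FFF,
        "data_field_len": word3 + 1,
    }
```

A telemetry packet on APID 0x001 with sequence count 5 and a 10-byte data field packs to the bytes `08 01 C0 05 00 09`, which is easy to confirm in a hex editor.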

Packetization exists because the space link is lossy and intermittent. Raw sensor data is too large to send directly. Instead, you sample and packetize telemetry into fixed structures. You can prioritize packets by APID or by packet class. For example, EPS health might be high priority, payload data low priority. You may also need segmentation: large payload files are split into multiple packets or moved via a file protocol like CCSDS CFDP. Designing telemetry is really designing a data product pipeline under bandwidth constraints.

Telecommand is the mirror. A command arrives as a CCSDS packet, is validated, and then routed. Validation includes CRC checks (if the lower layers do not already), sequence counters, authorization, and range checks on parameters. Many missions use command verification: the spacecraft acknowledges acceptance, start, completion, or failure of a command. These verification codes are essential for operators to know what actually happened. In flight software, you need a command execution pipeline that supports both immediate and time-tagged commands, and that logs everything for later analysis.

Packet scheduling is often overlooked. The downlink is time-limited by ground passes. You need a scheduler that decides which packets fit in a pass and which to defer. This is a priority scheduling problem with deadlines and size constraints. A robust scheduler must account for packet overhead, retransmission strategy, and partial sends. It should also be able to degrade gracefully: if bandwidth drops, it should keep health telemetry flowing by shedding payload data.
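A first-cut pass planner can be a greedy selection by priority. The sketch below assumes a fixed per-packet overhead and a link efficiency factor; those two parameters, the packet dictionary layout, and the function name are all illustrative.

```python
def plan_pass(packets, pass_seconds, bitrate_bps, overhead_bytes=12, efficiency=0.8):
    """Greedy selection: highest priority first, oldest first within a priority.

    packets: list of dicts with 'id', 'priority' (0 = most important),
             'size' in bytes, and 'met' (creation time).
    Returns (selected_ids, deferred_ids).
    """
    budget = int(pass_seconds * bitrate_bps * efficiency / 8)   # usable bytes
    selected, deferred = [], []
    for pkt in sorted(packets, key=lambda p: (p["priority"], p["met"])):
        cost = pkt["size"] + overhead_bytes      # framing/header overhead per packet
        if cost <= budget:
            budget -= cost
            selected.append(pkt["id"])
        else:
            deferred.append(pkt["id"])
    return selected, deferred
```

Because health packets sort first, a shorter pass or a lower bitrate sheds payload data before it touches health telemetry. A greedy pass is not optimal packing, but it is predictable and easy to reason about when debugging from the ground.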

At a lower layer, CCSDS space packets are usually wrapped in transfer frames that provide synchronization and error correction. Your FSW might not implement the physical layer, but you must understand that packets are not sent alone; they are framed. That framing can introduce constraints like maximum packet size per frame, padding rules, and segmentation. If you design packets without considering the frame size, you can end up with inefficient downlinks and wasted airtime.

Loss handling is another practical concern. Some missions rely on simple forward error correction and accept loss, while others implement acknowledgments and retransmissions at the application level. Your scheduler should be designed so that critical health telemetry can be repeated or re-requested, while large payload data can be deferred or sent via a file transfer protocol.

Security is also part of command handling. Many missions implement a simple authentication token, rolling code, or ground-station-only command window. While full cryptographic protocols can be heavy for CubeSats, you should still design for basic command validation to prevent accidental or malicious commands. This is especially critical for commercial or educational missions where ground station access might be less controlled.

How This Fits on Projects

Telemetry and telecommand concepts are central to Projects 1, 7, 10, 12, and 13.

Definitions & Key Terms

APID: Application Process Identifier, routes packets to subsystems.
Sequence Count: Per-APID counter to detect missing or out-of-order packets.
Secondary Header: Mission-specific header for timestamps and service types.
CFDP: CCSDS File Delivery Protocol for reliable file transfers.

Mental Model Diagram

Sensors -> Telemetry Packets -> Downlink Scheduler -> Radio -> Ground
Commands <- Command Parser <- Validation <- Ground Station

How It Works (Step-by-Step, Invariants, Failure Modes)

Telemetry is sampled and encoded into packet structures.
Primary header fields are set (APID, sequence, length).
Packets are queued by priority and size.
Downlink scheduler selects packets for a pass.
Commands are received, validated, and dispatched.

Invariants: sequence counts per APID increment monotonically; packet length matches payload.

Failure modes: header bit errors, length miscalculation, queue overflow, command spoofing.
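The per-APID sequence invariant is simple to check on the receiving side. This sketch relies on the sequence count being a 14-bit field that wraps to zero; the class and method names are hypothetical.

```python
SEQ_MODULO = 1 << 14   # the sequence count field is 14 bits wide

class SequenceTracker:
    """Tracks the last sequence count seen for each APID."""

    def __init__(self):
        self._last = {}

    def missing_before(self, apid, seq_count):
        """Return how many packets were skipped before this one (0 if in order)."""
        prev = self._last.get(apid)
        self._last[apid] = seq_count
        if prev is None:
            return 0                          # first packet on this APID
        return (seq_count - prev - 1) % SEQ_MODULO
```

The modulo arithmetic handles the wrap from 16383 back to 0 without a special case. A duplicated packet shows up as a gap of `SEQ_MODULO - 1`, so a very large result is better read as "duplicate or out of order" than as thousands of lost packets.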

Minimal Concrete Example

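// Set CCSDS primary header length field
// Packet data field = secondary header + user data (must be at least 1 byte)
uint16_t payload_len = data_len + secondary_len;
uint16_t pkt_len = payload_len - 1; // field value = data field bytes minus one
hdr.len_hi = (pkt_len >> 8) & 0xFF; // big-endian: high byte first
hdr.len_lo = pkt_len & 0xFF;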

Common Misconceptions

“Packet length is total length.” -> It is the number of bytes after the primary header (secondary header plus user data) minus one.
“Sequence counters are global.” -> They are per APID.

Check-Your-Understanding Questions

Why do we use APIDs instead of one big telemetry stream?
What happens if you compute CCSDS length incorrectly?
Why are command verification packets important?

Check-Your-Understanding Answers

APIDs allow routing and prioritization across subsystems.
The ground cannot frame packets correctly, causing data loss.
Operators need to know whether a command was accepted and executed.

Real-World Applications

Health telemetry for commercial satellites
Payload data return from Earth-observation CubeSats

Where You’ll Apply It

Projects 1, 7, 10, 12, 13

References

CCSDS Space Packet Protocol (CCSDS 133.0-B-1)
F Prime (JPL) CCSDS packet framing examples

Key Insight

Packetization is not just formatting; it is mission-critical reliability.

Summary

You must build, validate, schedule, and verify packets with strict correctness and tight bandwidth constraints.

Homework/Exercises

Define a telemetry dictionary with 10 fields and map them to APIDs.
Create a command verification state machine for a critical command.

Solutions to the Homework/Exercises

Example: APID 0x001 for EPS health, 0x002 for ADCS, 0x200 for payload. Verification: accepted -> started -> completed or failed.

Chapter 4: Orbit Propagation, Pass Prediction, and the Space Environment

Fundamentals

Your satellite is always moving, and your software needs to predict where it will be and when it will see the ground station or the Sun. In low Earth orbit (LEO), the orbital period is roughly 90 minutes. This creates predictable eclipse windows where solar power is zero and thermal conditions change quickly. Orbit propagation is the act of predicting position and velocity at future times. Most CubeSat missions use Two-Line Elements (TLEs) and the SGP4 model to propagate orbits. The environment is harsh: radiation, vacuum, atomic oxygen, and thermal cycling all affect hardware and software reliability. Even if you never write orbital mechanics code in flight software, you must understand the implications for scheduling, power, and comms.

Deep Dive into the Concept

Orbit models are essential for scheduling: you need to know when to transmit, when to collect payload data, and when to enter power-saving modes. A TLE encodes orbital elements derived from tracking data. SGP4 is the standard algorithm for propagating TLEs to predict position and velocity at a given time. The algorithm models the dominant perturbations: Earth’s oblateness (through zonal gravitational harmonics) and atmospheric drag, with additional lunar and solar gravity terms for high-altitude orbits. Because a TLE is only accurate near its epoch, your software must accept updated TLEs periodically.

Orbit prediction also drives power modeling. Eclipse occurs when the satellite passes through Earth’s shadow. During eclipse, solar input is near zero, so EPS must draw from battery. The duration of eclipse depends on orbit altitude, inclination, and season. That means the same orbit can have different power margins at different times of year. Your FSW must use orbit prediction to schedule payload operations only when sunlight is available or battery state-of-charge is high enough.
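A quick worked estimate shows the scale of the problem. The sketch below assumes a circular orbit, a cylindrical Earth shadow, and the worst case where the Sun lies in the orbit plane; the 500 km altitude is just an example.

```python
import math

MU_EARTH = 398600.4418      # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6378.137          # km, equatorial radius

def orbital_period_s(altitude_km):
    """Period of a circular orbit at the given altitude."""
    a = R_EARTH + altitude_km                 # circular orbit: semi-major axis = radius
    return 2.0 * math.pi * math.sqrt(a**3 / MU_EARTH)

def max_eclipse_fraction(altitude_km):
    """Worst case: Sun in the orbit plane, cylindrical Earth shadow."""
    a = R_EARTH + altitude_km
    return math.asin(R_EARTH / a) / math.pi

alt = 500.0
period = orbital_period_s(alt)
eclipse = max_eclipse_fraction(alt) * period
print(f"period  = {period / 60:.1f} min")
print(f"eclipse = {eclipse / 60:.1f} min (worst case)")
```

At 500 km this gives a period of roughly 95 minutes with about 36 minutes in shadow, so the battery has to carry more than a third of every orbit in the worst season. Orbits whose plane is tilted further from the Sun direction see shorter eclipses, which is where the seasonal variation comes from.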

The environment matters in software design. Radiation causes bit flips (SEUs) and can corrupt memory. Vacuum means no convection, so thermal control relies on radiation and conduction paths. Atomic oxygen can degrade surfaces, affecting thermal properties over time. These factors should influence your fault handling and margins in simulations. Even if you are not building hardware, your simulator should model them as constraints so the software you write has the right behaviors.

In flight software, orbit propagation is typically a background task that updates the spacecraft state vector (position and velocity) and derived values (lat/lon/alt, sun vector, ground station visibility). If you get this wrong, you may transmit at the wrong time or mis-point antennas. In real missions, ground systems are used to validate onboard orbit estimates, but the onboard logic still needs a reasonable model to make decisions.

Coordinate frames are a common trap. SGP4 output is in the TEME frame, an Earth-centered inertial (ECI) frame, while ground station visibility calculations use Earth-fixed coordinates (ECEF). You must transform between frames with correct Earth rotation and time conversions. A single frame mismatch can create huge pointing errors. This is why you always annotate vectors with their frame and time basis. Think of frames as part of the type system of your code.
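At simulator fidelity, the inertial-to-Earth-fixed transform reduces to a rotation about the z axis by the Greenwich sidereal angle. The sketch below takes that angle as an input (`gmst_rad`, assumed to come from a time library) and ignores polar motion, nutation, and the small difference between TEME and other inertial frames.

```python
import math

def eci_to_ecef(pos_eci_km, gmst_rad):
    """Rotate an inertial position into the Earth-fixed frame about the z axis."""
    x, y, z = pos_eci_km
    c, s = math.cos(gmst_rad), math.sin(gmst_rad)
    return (c * x + s * y, -s * x + c * y, z)

def ecef_to_lat_lon_deg(pos_ecef_km):
    """Geocentric latitude and longitude (spherical Earth, no ellipsoid correction)."""
    x, y, z = pos_ecef_km
    lon = math.degrees(math.atan2(y, x))
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    return lat, lon
```

A point fixed on the inertial x axis sits at longitude 0 when the sidereal angle is zero and at longitude -90 degrees a quarter rotation later, because the Earth has turned east underneath it. A test like that catches the most common bug, which is rotating in the wrong direction.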

Pass prediction requires geometry: given satellite position and ground station location, you compute elevation angle. A pass begins when the elevation rises above a threshold (for example, 10 degrees) and ends when it drops below. The total pass time is usually only a few minutes. Your scheduler must account for link acquisition time and cut off commands early enough to avoid sending into a fading link. These details are what make ground operations realistic.
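The elevation geometry and the threshold logic can be sketched as follows. This assumes a spherical Earth (so local "up" is the direction of the station position vector) and that both positions are already in ECEF; the function names are hypothetical.

```python
import math

def elevation_deg(sat_ecef_km, gs_ecef_km):
    """Elevation of the satellite above the local horizon, spherical-Earth assumption."""
    rho = [s - g for s, g in zip(sat_ecef_km, gs_ecef_km)]   # station -> satellite
    rho_norm = math.sqrt(sum(c * c for c in rho))
    gs_norm = math.sqrt(sum(c * c for c in gs_ecef_km))
    # Local "up" is the station position direction on a spherical Earth.
    sin_el = sum(r * g for r, g in zip(rho, gs_ecef_km)) / (rho_norm * gs_norm)
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_el))))

def find_passes(samples, min_elev_deg=10.0):
    """samples: iterable of (time_s, elevation_deg). Returns [(aos_s, los_s), ...]."""
    passes, aos = [], None
    for t, elev in samples:
        if elev >= min_elev_deg and aos is None:
            aos = t                               # acquisition of signal
        elif elev < min_elev_deg and aos is not None:
            passes.append((aos, t))               # loss of signal
            aos = None
    return passes
```

`find_passes` only reports passes that both start and end inside the sampled window, and its timing resolution is the sample spacing. A scheduler built on top of it would typically trim a margin from each end to allow for link acquisition and the fading tail of the pass.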

Finally, orbit propagation is a modeling problem under uncertainty. TLEs are derived from tracking observations and represent a best-fit orbit at a specific epoch. SGP4 is a mathematical model that applies perturbations to approximate the orbit. The accuracy depends on how close you are to the epoch and how well the model captures dominant forces. In LEO, drag is the biggest variable, and it changes with solar activity. That means a TLE from a week ago may already be inaccurate. Onboard predictions should be treated as advisory and validated with ground planning.

How This Fits on Projects

Orbit propagation and pass prediction are core to Projects 4, 7, 9, 12, and 13.

Definitions & Key Terms

TLE: Two-Line Element set describing orbital elements.
SGP4: Standard propagation model for TLEs.
ECI/TEME/ECEF: Common reference frames for orbital states.
Eclipse: Period when spacecraft is in Earth’s shadow.

Mental Model Diagram

TLE -> SGP4 -> Position/Velocity -> Frame Transform -> Ground Pass
                                      |
                                      v
                                  Sun Vector

How It Works (Step-by-Step, Invariants, Failure Modes)

Parse TLE and load epoch.
Propagate orbit to current time using SGP4.
Convert to Earth-fixed coordinates.
Compute elevation for ground station.
Compute sun vector and eclipse status.

Invariants: timestamps are consistent; frames are clearly labeled.

Failure modes: stale TLEs, frame mismatch, time conversion error.
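The eclipse step can be approximated with a cylindrical shadow model. This is a simplified sketch: it ignores the penumbra and treats the Sun as infinitely far away, and `in_eclipse` is a hypothetical helper name. Both input vectors must be in the same inertial frame.

```python
EARTH_RADIUS_KM = 6378.137  # equatorial radius


def in_eclipse(sat_pos_km, sun_unit):
    """Cylindrical-shadow test.

    sat_pos_km: satellite position from Earth's center (km).
    sun_unit:   unit vector from Earth's center toward the Sun.
    """
    along = sum(r * s for r, s in zip(sat_pos_km, sun_unit))
    if along >= 0.0:
        return False  # satellite is on the sunlit side of the Earth
    # Squared distance from the Earth-Sun line (Pythagoras).
    perp_sq = sum(r * r for r in sat_pos_km) - along * along
    return perp_sq < EARTH_RADIUS_KM ** 2


sun = (1.0, 0.0, 0.0)
behind_earth = in_eclipse((-7000.0, 0.0, 0.0), sun)
sunlit_side = in_eclipse((7000.0, 0.0, 0.0), sun)
```

The boolean result is what the EPS model needs to zero the solar input and what the estimator needs to ignore sun sensor readings.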

Minimal Concrete Example

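# Pseudocode for pass prediction (helper functions are placeholders)
state = sgp4_propagate(tle, t)            # position/velocity in TEME
pos_ecef = eci_to_ecef(state.pos, t)      # rotate into the Earth-fixed frame
elev = elevation_angle(pos_ecef, gs_location)
pass_active = elev > 10.0                 # degrees; also clears the flag when the pass ends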

Common Misconceptions

“TLEs are exact.” -> They are approximate and drift over time.
“Frame conversions are minor.” -> They can dominate pointing errors.

Check-Your-Understanding Questions

Why do TLEs need frequent updates?
What is the risk of mixing ECI and ECEF frames?
How does eclipse timing affect EPS budgets?

Check-Your-Understanding Answers

Drag and perturbations make predictions drift over days.
Pointing and pass prediction can be wrong by large angles.
Eclipse determines when solar input is zero, driving battery usage.

Real-World Applications

Ground station scheduling and automation
Power planning and safe mode timing

Where You’ll Apply It

Projects 4, 7, 9, 12, 13

References

CelesTrak documentation on TLE format
Vallado et al. “Revisiting Spacetrack Report #3” (SGP4)
NASA CubeSat 101 (2017)

Key Insight

Orbit prediction is not a math exercise; it is the timing backbone of your mission.

Summary

You must model orbits, frames, and eclipse accurately enough to make safe operational decisions.

Homework/Exercises

Parse a real TLE and compute the orbital period.
Simulate a 24-hour pass schedule for a single ground station.

Solutions to the Homework/Exercises

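Read the mean motion n (revolutions per day) from line 2 of the TLE; the orbital period is 1440 / n minutes.
Use SGP4 to propagate over 24 hours, compute elevation, and log pass start/end times.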

Chapter 5: Power and Thermal Budgeting (EPS + Thermal Control)

Fundamentals

Power is the ultimate constraint in small satellites. Every subsystem competes for energy that comes from solar arrays and is stored in batteries. The EPS decides what can run, for how long, and under what conditions. Thermal control is tied to power: heaters draw energy, and electronics generate heat. In LEO, eclipse can last 30-40 minutes per orbit, which means no solar input and rapid thermal changes. Your flight software must plan around these constraints, shed loads during low power, and prevent thermal runaway. A satellite that runs out of power or overheats is effectively dead, regardless of how perfect its algorithms are.

Deep Dive into the Concept

EPS modeling starts with a power budget: each subsystem has a power draw and duty cycle. For example, the radio might draw 2 W during transmit, the payload 5 W during imaging, and the ADCS 1 W continuously. The solar array produces power when in sunlight, which depends on sun vector, panel orientation, and degradation. Battery state-of-charge (SoC) is then updated using a simple energy balance: SoC(t+dt) = SoC(t) + (P_in - P_out) * dt / BatteryCapacity. The units must be consistent: with power in watts and dt in hours, BatteryCapacity is in watt-hours and SoC is a fraction between 0 and 1. This is the core simulation for EPS. If you implement this correctly, you can test whether a given mission schedule is viable.
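A worked example makes the energy balance easier to check by hand. The sketch below uses the power draws from the text; the duty cycles, the 4 W solar input, the 20 Wh battery, and the 60/35-minute sunlit/eclipse split are assumptions chosen for illustration, and all function names are hypothetical.

```python
# name: (power in W when on, duty cycle as a fraction). Duty cycles are assumed.
LOADS_W = {"radio_tx": (2.0, 0.10), "payload": (5.0, 0.20), "adcs": (1.0, 1.0)}


def average_load_w(loads):
    """Orbit-average power: sum of power times duty cycle."""
    return sum(power * duty for power, duty in loads.values())


def step_soc(soc, p_in_w, p_out_w, dt_h, capacity_wh):
    """One energy-balance step; SoC is a fraction clamped to [0, 1]."""
    soc += (p_in_w - p_out_w) * dt_h / capacity_wh
    return max(0.0, min(1.0, soc))


def simulate_orbit(soc, solar_w, load_w, capacity_wh,
                   sunlit_min=60, eclipse_min=35):
    """Step SoC minute by minute: sunlight first, then eclipse."""
    history = []
    for minute in range(sunlit_min + eclipse_min):
        p_in = solar_w if minute < sunlit_min else 0.0
        soc = step_soc(soc, p_in, load_w, 1.0 / 60.0, capacity_wh)
        history.append(soc)
    return history


avg_w = average_load_w(LOADS_W)  # 0.2 + 1.0 + 1.0 = 2.2 W
history = simulate_orbit(0.80, solar_w=4.0, load_w=avg_w, capacity_wh=20.0)
```

Here the orbit gains 1.8 Wh in sunlight and loses about 1.28 Wh in eclipse, so SoC ends slightly above where it started. Using the average load hides peaks, so a fuller simulation would schedule each load explicitly.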

Load shedding is a critical strategy. When SoC drops below a threshold, you must disable non-essential loads, reduce comms, or enter safe mode. This requires a priority list of loads and clear rules. For instance, thermal heaters might be higher priority than payload because they protect the battery. Your FSW must enforce these rules automatically, because you cannot rely on ground intervention during short passes.
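The priority list and rules can be expressed as plain data. In this sketch the load names, priorities, and SoC thresholds are all hypothetical values picked for illustration; a real mission would derive them from its own power budget.

```python
# (load name, priority): lower number means more essential. Values are assumed.
LOAD_PRIORITY = [
    ("obc", 0), ("battery_heater", 1), ("radio_rx", 2),
    ("adcs", 3), ("radio_tx", 4), ("payload", 5),
]

# (SoC threshold, highest priority still allowed), sorted by ascending threshold.
SHED_RULES = [(0.20, 2), (0.30, 3), (0.40, 4)]


def allowed_loads(soc):
    """Return the loads that may stay powered at this state of charge."""
    max_priority = max(priority for _, priority in LOAD_PRIORITY)
    for threshold, keep_up_to in SHED_RULES:
        if soc < threshold:
            max_priority = keep_up_to  # first (lowest) matching rule wins
            break
    return [name for name, priority in LOAD_PRIORITY if priority <= max_priority]


nominal = allowed_loads(0.90)
low = allowed_loads(0.25)
```

Keeping the rules in a table makes them easy to review, test exhaustively, and update by command without touching the control logic.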

Thermal modeling can be surprisingly simple and still useful. A single-node model treats the spacecraft as one thermal mass with a net heat input. More detailed models use multiple nodes for battery, payload, and structure. The key is to capture the dynamics: heating during sunlight, cooling in eclipse, and heater control with hysteresis. Hysteresis prevents heater thrashing by turning on at a low threshold and off at a higher threshold. This is a classic control pattern that shows up repeatedly in FSW.
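A single-node model with a hysteresis heater fits in two small functions. This sketch linearizes heat loss as a conductance to a fixed sink temperature, which is a simplification (radiative loss actually scales with the fourth power of absolute temperature). The thresholds, heat capacity, conductance, and heater power are assumed values, and the function names are hypothetical.

```python
def heater_command(temp_c, heater_on, on_below_c=0.0, off_above_c=5.0):
    """Hysteresis: switch on below the low threshold, off above the high one,
    and hold the previous state in between."""
    if temp_c < on_below_c:
        return True
    if temp_c > off_above_c:
        return False
    return heater_on


def step_temperature(temp_c, q_in_w, heater_on, dt_s,
                     heat_capacity_j_per_k=500.0,
                     conductance_w_per_k=0.1,
                     sink_temp_c=-20.0,
                     heater_w=1.0):
    """Forward-Euler step of a single thermal node (linearized loss)."""
    q_heater = heater_w if heater_on else 0.0
    q_loss = conductance_w_per_k * (temp_c - sink_temp_c)
    return temp_c + (q_in_w + q_heater - q_loss) * dt_s / heat_capacity_j_per_k


# Cool down in eclipse from 10 C and let the heater logic react.
temp, heater = 10.0, False
for _ in range(600):  # 600 steps of 10 s
    heater = heater_command(temp, heater)
    temp = step_temperature(temp, 0.0, heater, 10.0)
```

Between the two thresholds the heater keeps its previous state, which is what prevents thrashing when the temperature hovers near a single setpoint.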

EPS and thermal systems are tightly coupled. If you run heaters during eclipse, you must budget that power draw. If you run the radio at high power, you create heat that might reduce heater usage. These interactions should be simulated together. The goal is to avoid surprise interactions that cause battery depletion or thermal violations.

In practice, EPS telemetry is one of the highest priority streams. Voltage, current, and temperature are the first clues that something is wrong. Your FSW should monitor these values continuously, and FDIR should trigger safe mode if thresholds are crossed. You should also include rate-of-change checks: a sudden voltage drop is more alarming than a slow drift. These checks are simple but highly effective.
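A limit check and a rate-of-change check can share one function. The thresholds below (6.5 V minimum, 0.05 V/s maximum drop rate) and the fault names are hypothetical placeholders.

```python
def check_voltage(prev_v, curr_v, dt_s, min_v=6.5, max_drop_v_per_s=0.05):
    """Return a list of fault codes for one pair of bus-voltage samples."""
    faults = []
    if curr_v < min_v:
        faults.append("UNDERVOLTAGE")        # absolute limit crossed
    rate_v_per_s = (curr_v - prev_v) / dt_s
    if rate_v_per_s < -max_drop_v_per_s:
        faults.append("VOLTAGE_DROP_RATE")   # falling faster than allowed
    return faults


slow_drift = check_voltage(8.00, 7.99, 1.0)
sudden_drop = check_voltage(8.00, 7.50, 1.0)
```

The rate check catches the sudden drop while the voltage is still well above the absolute limit, which buys FDIR time to act.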

Battery modeling benefits from a little more nuance than a single capacity number. Real batteries have charge and discharge efficiency, internal resistance, and temperature-dependent performance. A simple improvement is to include a coulomb-counting model with efficiency factors and a temperature-based derating curve. You do not need a full electrochemical model to get value; even a rough derating factor can prevent overly optimistic plans that would fail in orbit.
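The following sketch shows one way to combine coulomb counting, efficiency factors, and a temperature derating curve. The derating breakpoints and the 95% efficiencies are invented for illustration and do not describe any particular cell; the function names are hypothetical.

```python
# (temperature in C, fraction of stored charge deliverable). Assumed values.
DERATING_POINTS = [(-20.0, 0.5), (0.0, 0.8), (20.0, 1.0), (40.0, 1.0)]


def derating_factor(temp_c):
    """Piecewise-linear interpolation, clamped at both ends of the table."""
    if temp_c <= DERATING_POINTS[0][0]:
        return DERATING_POINTS[0][1]
    for (t0, f0), (t1, f1) in zip(DERATING_POINTS, DERATING_POINTS[1:]):
        if t0 <= temp_c <= t1:
            return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)
    return DERATING_POINTS[-1][1]


def step_charge_ah(charge_ah, current_a, dt_h, rated_ah,
                   charge_eff=0.95, discharge_eff=0.95):
    """Coulomb counting; positive current charges the battery."""
    if current_a >= 0.0:
        charge_ah += current_a * dt_h * charge_eff      # some charge is lost
    else:
        charge_ah += current_a * dt_h / discharge_eff   # drawing costs extra
    return max(0.0, min(rated_ah, charge_ah))


def deliverable_ah(charge_ah, temp_c):
    """Charge the planner may count on at this temperature."""
    return charge_ah * derating_factor(temp_c)


cold_budget = deliverable_ah(1.0, -20.0)  # half the stored charge when cold
```

Planning against `deliverable_ah` rather than the raw counter keeps eclipse schedules conservative when the battery is cold.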

Battery modeling requires conservative assumptions. Lithium batteries degrade over time and suffer from temperature extremes. Your model should include a margin factor (for example, 20% capacity loss) so your simulation is not optimistic. Thermal modeling should also include worst-case assumptions for eclipse duration and sun angles. The guiding principle is: if it works in the worst case, it will likely work in the nominal case.

How This Fits on Projects

Power and thermal concepts are central to Projects 3, 9, 11, and 13.

Definitions & Key Terms

SoC (State of Charge): Remaining battery charge as a fraction or percentage of full capacity.
Load Shedding: Disabling non-essential loads to conserve power.
Hysteresis: Control technique using different on/off thresholds.
Eclipse: Period with zero solar input.

Mental Model Diagram

Sunlight -> Solar Array -> Battery -> Loads
                         ^           |
                         |           v
                       EPS Logic  Thermal/Heaters

How It Works (Step-by-Step, Invariants, Failure Modes)

Compute solar input based on sun vector and eclipse.
Sum subsystem power draws.
Update battery SoC.
If SoC below threshold, shed loads.
Update thermal model and heater states.

Invariants: SoC never exceeds 100%; critical heaters are always powered if possible.

Failure modes: battery depletion, heater thrashing, optimistic power model.

Minimal Concrete Example

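# Simple SoC update (soc is a fraction from 0.0 to 1.0)
# p_in and p_out in W, dt in hours, battery_capacity in Wh
soc = soc + (p_in - p_out) * dt / battery_capacity
soc = max(0.0, min(1.0, soc))  # clamp to the empty/full limits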

Common Misconceptions

“Power budget is just average draw.” -> Peaks and duty cycles dominate failures.
“Thermal is hardware only.” -> Software controls heaters and modes.

Check-Your-Understanding Questions

Why is eclipse modeling critical for EPS?
What is the purpose of hysteresis in heater control?
Why should battery capacity be derated in simulations?

Check-Your-Understanding Answers

Eclipse defines when solar input is zero, so battery must carry the load.
Hysteresis prevents rapid on/off switching that wastes power and wears components.
Batteries degrade and have temperature-dependent capacity.

Real-World Applications

Load-shedding logic in operational CubeSats
Thermal safety in long-duration missions

Where You’ll Apply It

Projects 3, 9, 11, 13

References

Spacecraft Systems Engineering - Power and Thermal chapters
NASA CubeSat 101 (2017)

Key Insight

Power and thermal are not separate problems; they are the same system in different units.

Summary

You must model energy flows and thermal dynamics with conservative margins to keep the spacecraft alive.

Homework/Exercises

Build a simple power budget table for a 3U CubeSat.
Simulate SoC over two orbits with a 30-minute eclipse.

Solutions to the Homework/Exercises

Example: sum subsystem duty-cycle power, compute net energy per orbit, and verify SoC stays above 30%.

Chapter 6: Attitude Determination and Control (ADCS)

Fundamentals

ADCS is how the spacecraft knows and controls its orientation in space. The attitude determines whether solar panels face the Sun, antennas point to the ground, and payloads are aimed correctly. Attitude determination uses sensors like gyros, magnetometers, and sun sensors to estimate orientation. Attitude control uses actuators like reaction wheels and magnetorquers to achieve desired orientations. In small satellites, ADCS is often the most complex subsystem because it blends physics, estimation, and control in real time.

Because the spacecraft is a rigid body in orbit, you must reason about angular rates, inertia, and reference frames, not just pointing direction.

Deep Dive into the Concept

Attitude is commonly represented as quaternions to avoid the singularities of Euler angles. A unit quaternion is a 4-element vector that represents a rotation. The FSW must integrate gyro rates to propagate attitude, then correct it with sensor measurements. This is usually done with an Extended Kalman Filter (EKF) or complementary filter. The EKF is a nonlinear estimator that predicts the state (quaternion + gyro bias) and updates it with measurements. The update step requires a measurement model that maps state to sensor readings. The filter must be tuned with realistic noise covariances; otherwise it can diverge or respond too slowly to real motion.
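The propagation half of the estimator can be sketched without any filter machinery. This example assumes a scalar-first quaternion (w, x, y, z), body-frame rates in rad/s, and a rate held constant over each step; the function names are hypothetical, and the measurement update is deliberately left out.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two scalar-first quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def normalize(q):
    """Rescale to unit norm; a zero quaternion is treated as an error."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    return tuple(c / n for c in q)


def propagate(q, omega_body, dt):
    """Advance attitude by the rotation omega_body * dt (q_next = q * dq)."""
    wx, wy, wz = omega_body
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:  # small-angle form avoids dividing by a tiny rate
        dq = (1.0, 0.5 * wx * dt, 0.5 * wy * dt, 0.5 * wz * dt)
    else:
        s = math.sin(angle / 2.0) / rate
        dq = (math.cos(angle / 2.0), wx * s, wy * s, wz * s)
    return normalize(quat_multiply(q, dq))


# Spin about the body Z axis at 90 deg/s for one second.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = propagate(q, (0.0, 0.0, math.pi / 2), 0.01)
```

After one second the result is a 90-degree rotation about Z, (cos 45°, 0, 0, sin 45°), which makes a convenient regression test. In a full estimator, the gyro bias estimate would be subtracted from the measured rate before this step.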

Control is often done with reaction wheels. You compute an attitude error between the current quaternion and the target quaternion, convert it to an error vector, and apply a control law (often PID). The controller outputs desired torque, which is translated into wheel speed commands. Reaction wheels can saturate, so you need momentum dumping using magnetorquers or a safe mode pointing that reduces angular momentum. This is a classic constraint: your controller must respect actuator limits or it will fail.
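The error-to-torque path can be sketched as follows. Note the substitutions: this is a PD law (the integral term of a PID is omitted for brevity), saturation is applied per axis, and the gains and torque limit are placeholder numbers. Quaternions are scalar-first, and the function names are hypothetical.

```python
import math


def quat_conjugate(q):
    """Inverse of a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def quat_multiply(a, b):
    """Hamilton product of two scalar-first quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def attitude_error(q_current, q_target):
    """Rotation from target to current, forced onto the short path."""
    qe = quat_multiply(quat_conjugate(q_target), q_current)
    return tuple(-c for c in qe) if qe[0] < 0.0 else qe


def pd_torque(q_current, q_target, omega, kp, kd, max_torque):
    """PD law on the error vector part, clamped to the actuator limit."""
    qe = attitude_error(q_current, q_target)
    torque = []
    for axis in range(3):
        # 2 * vector part approximates the rotation-vector error (small angles).
        t = -kp * 2.0 * qe[axis + 1] - kd * omega[axis]
        torque.append(max(-max_torque, min(max_torque, t)))
    return tuple(torque)


identity = (1.0, 0.0, 0.0, 0.0)
yaw_90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
cmd = pd_torque(yaw_90, identity, (0.0, 0.0, 0.0), kp=0.01, kd=0.05, max_torque=0.001)
```

For the 90-degree error the unclamped Z torque would be about -0.014 N·m, so the command saturates at the limit. Per-axis clamping changes the torque direction when only some axes saturate; scaling the whole vector is a common alternative.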

Sensor availability is mission-dependent. Sun sensors only work in sunlight; magnetometers can be noisy near disturbances; gyros drift. A robust estimator must handle missing sensors and switch measurement models based on conditions (eclipse, sensor fault). This is why you simulate sensor dropout and noise in the projects.

The dynamics model matters. A rigid-body spacecraft is governed by Euler’s rotational equations, which include the inertia matrix and external torques. Even a simplified 1-axis model can capture the essence: torque changes angular rate, and angular rate integrates to attitude. Disturbances like atmospheric drag, gravity-gradient torque, and magnetic torques can introduce slow drifts that the controller must counteract. Including these disturbances in simulation makes your estimator and controller more realistic and exposes tuning issues earlier.
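Euler's rotational equations, I * dω/dt = τ - ω × (I ω), reduce to three scalar equations when the inertia matrix is diagonal. The sketch below assumes principal axes aligned with the body frame and uses forward-Euler integration, which is adequate for illustration but not for long simulations; `euler_step` is a hypothetical name.

```python
def euler_step(omega, torque, inertia_diag, dt):
    """One forward-Euler step of rigid-body rotation (diagonal inertia).

    omega:        body rates (rad/s)
    torque:       net external torque in the body frame (N*m)
    inertia_diag: principal moments (Ix, Iy, Iz) in kg*m^2
    """
    ix, iy, iz = inertia_diag
    wx, wy, wz = omega
    tx, ty, tz = torque
    # Components of omega x (I omega) appear as the gyroscopic coupling terms.
    ax = (tx - (iz - iy) * wy * wz) / ix
    ay = (ty - (ix - iz) * wz * wx) / iy
    az = (tz - (iy - ix) * wx * wy) / iz
    return (wx + ax * dt, wy + ay * dt, wz + az * dt)


INERTIA = (0.01, 0.02, 0.03)  # assumed values for a small CubeSat
spun_up = euler_step((0.0, 0.0, 0.0), (0.001, 0.0, 0.0), INERTIA, 0.1)
coupled = euler_step((0.1, 0.2, 0.0), (0.0, 0.0, 0.0), INERTIA, 0.1)
```

The second call shows cross-axis coupling: with rates on X and Y and no torque at all, the Z rate still changes because the principal moments differ.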

Calibration and alignment are practical concerns. Sensors are mounted with small misalignments, scale factors, and biases. If you ignore these, your attitude estimate can be off by degrees. A common approach is to include calibration parameters in the estimator or to apply a pre-flight calibration matrix. On the control side, you must map commanded torques to actual actuator capabilities, which is a control allocation problem when you have multiple actuators. This allocation must respect saturation and avoid commanding impossible torques.

ADCS is tightly coupled with power and thermal. Sun-pointing is essential for charging; payload pointing might conflict with power needs. ADCS modes (detumble, sun-point, nadir-point) must be integrated with overall mission modes. This is another example of subsystem contracts: ADCS promises to keep the attitude within bounds, but only if power and mode constraints allow it.

Timing is critical. Control loops have fixed update rates, often 1-10 Hz for CubeSats. If timing slips, the controller can become unstable. This is why ADCS code is often given high priority in the scheduler. It is also why you must carefully bound execution time in software.

Finally, ADCS validation is mostly simulation. You can simulate dynamics and sensors to validate your estimator and controller before hardware exists. This is a common workflow in aerospace: software-in-the-loop (SIL) before hardware-in-the-loop (HIL). The projects here mirror that approach.

How This Fits on Projects

ADCS concepts drive Projects 5, 6, and the capstone simulator.

Definitions & Key Terms

Quaternion: Four-element representation of 3D rotation.
EKF: Extended Kalman Filter for nonlinear state estimation.
Reaction Wheel: Actuator that controls attitude by spinning a flywheel.
Detumble: Mode to reduce angular rates after deployment.

Mental Model Diagram

Sensors -> Estimator (EKF) -> Attitude State -> Controller -> Actuators
        ^                                                      |
        +---------------------- Feedback ----------------------+

How It Works (Step-by-Step, Invariants, Failure Modes)

Propagate attitude using gyro integration.
Predict sensor readings using model.
Update state with measurements.
Compute attitude error vs target.
Apply control law and command actuators.

Invariants: quaternion norm stays 1; control outputs respect actuator limits.

Failure modes: filter divergence, wheel saturation, sensor dropout.

Minimal Concrete Example

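// Normalize quaternion after update (sqrtf requires <math.h>)
float norm = sqrtf(q0*q0 + q1*q1 + q2*q2 + q3*q3);
if (norm > 0.0f) {  // guard against dividing by zero on a corrupted state
  q0 /= norm; q1 /= norm; q2 /= norm; q3 /= norm;
}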

Common Misconceptions

“Euler angles are fine.” -> They suffer from singularities (gimbal lock).
“More sensors always improve estimation.” -> Bad sensors can destabilize the filter.

Check-Your-Understanding Questions

Why are quaternions preferred for spacecraft attitude?
What is wheel saturation and why does it matter?
How does an EKF handle nonlinear measurement models?

Check-Your-Understanding Answers

Quaternions avoid singularities and are numerically stable for 3D rotations.
Saturation prevents the wheel from producing required torque, causing loss of control.
The EKF linearizes the model around the current estimate and updates with covariance.

Real-World Applications

Earth-pointing and sun-pointing for CubeSats
Payload pointing for imaging missions

Where You’ll Apply It

Projects 5, 6, 13

References

Spacecraft Attitude Determination and Control (Wertz)
Fundamentals of Space Systems - ADCS chapters

Key Insight

ADCS is the closed-loop nervous system of the spacecraft.

Summary

You must estimate attitude robustly and control it within actuator limits under strict timing constraints.

Homework/Exercises

Simulate a constant-rate rotation and integrate gyro measurements.
Implement a simple PID for a 1-axis rotation and tune it.

Solutions to the Homework/Exercises

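Integrate a constant gyro rate omega with a fixed dt and confirm that theta grows as omega * t; adding a small constant bias shows how integration error accumulates.
Use a 1-axis model: theta_ddot = wheel_torque / I (torque sets angular acceleration, not rate), integrate with dt = 0.1 s, and adjust PID gains to minimize overshoot.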

Chapter 7: Fault Detection, Isolation, and Recovery (FDIR)

Fundamentals

Space is hostile and unforgiving. Radiation flips bits, sensors fail, and software hangs. FDIR is the discipline of detecting faults, isolating the root cause, and recovering safely. In CubeSats, FDIR is often a simple but critical set of rules: watchdog resets, safe mode entry, and load shedding. You cannot rely on human operators to fix problems in real time, because passes are short and delays are long. FDIR is how your spacecraft survives when things go wrong.

Good FDIR is less about clever algorithms and more about consistent, conservative behavior under stress. It is also the discipline of knowing when to do nothing.

Deep Dive into the Concept

FDIR starts with detection. The simplest detection is a watchdog timer: if the main loop does not kick it, the system resets. More sophisticated detection uses health checks: if a sensor reading is out of range, if a task misses its deadline, or if telemetry stops updating. You should classify faults by severity. A single sensor glitch might only trigger a warning, while a persistent power drop triggers safe mode.
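Both detection styles can be simulated with very little code. In this sketch `Watchdog` models the timer with explicit timestamps so it can be tested deterministically, and `PersistenceMonitor` separates a one-off glitch from a persistent fault. The class names, the limits, and the three-sample persistence rule are assumptions.

```python
class Watchdog:
    """Simulated watchdog: expires if not kicked within timeout_s."""

    def __init__(self, timeout_s, now_s=0.0):
        self.timeout_s = timeout_s
        self.last_kick_s = now_s

    def kick(self, now_s):
        self.last_kick_s = now_s

    def expired(self, now_s):
        return now_s - self.last_kick_s > self.timeout_s


class PersistenceMonitor:
    """WARNING on an out-of-range sample, FAULT after N in a row."""

    def __init__(self, low, high, fault_after=3):
        self.low, self.high = low, high
        self.fault_after = fault_after
        self.count = 0

    def update(self, value):
        if self.low <= value <= self.high:
            self.count = 0  # any good sample clears the streak
            return "OK"
        self.count += 1
        return "FAULT" if self.count >= self.fault_after else "WARNING"


wd = Watchdog(timeout_s=2.0)
wd.kick(0.0)
bus_voltage = PersistenceMonitor(low=6.5, high=8.4)
results = [bus_voltage.update(v) for v in (7.4, 6.0, 7.4, 6.0, 6.1, 6.2)]
```

The single low sample only produces a warning, while three consecutive low samples escalate to a fault.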

Isolation is about identifying which subsystem is misbehaving. For example, if the I2C bus is stuck, you can reset the bus and see if telemetry resumes. If the radio does not respond, you can reboot the radio or power-cycle it. This requires you to design your hardware and software with isolation hooks: power switches, bus resets, and task restarts. Without these, recovery is impossible.

Recovery is a staged process. A common pattern is: retry the task, reset the subsystem, reboot the OBC, then enter safe mode. Each stage should have a cooldown period to avoid rapid cycling. Safe mode is the last resort: minimal power draw, sun-pointing if possible, radio beacons only. The key is to keep the spacecraft alive long enough for ground intervention.
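The escalation ladder with cooldowns is essentially a small state machine. The sketch below uses the stages named above; the 60-second cooldown and the `RecoveryLadder` class are illustrative assumptions, and the actual recovery actions are left to the caller.

```python
STAGES = ["RETRY_TASK", "RESET_SUBSYSTEM", "REBOOT_OBC", "SAFE_MODE"]


class RecoveryLadder:
    """Escalate one stage per fault report, with a cooldown between actions."""

    def __init__(self, cooldown_s=60.0):
        self.cooldown_s = cooldown_s
        self.stage = -1            # -1 means no recovery attempted yet
        self.last_action_s = None

    def on_fault(self, now_s):
        """Return the next action, or None while still cooling down."""
        if (self.last_action_s is not None
                and now_s - self.last_action_s < self.cooldown_s):
            return None
        if self.stage < len(STAGES) - 1:
            self.stage += 1        # safe mode is terminal: stay there
        self.last_action_s = now_s
        return STAGES[self.stage]

    def on_healthy(self):
        """Reset the ladder once the fault has cleared."""
        self.stage = -1
        self.last_action_s = None


ladder = RecoveryLadder(cooldown_s=60.0)
actions = [ladder.on_fault(t) for t in (0, 10, 60, 120, 180, 240)]
```

The report at t = 10 s is ignored because the cooldown has not elapsed, which is what prevents rapid cycling. A fuller design would also require health to be sustained for some time before `on_healthy` resets the ladder.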

Memory corruption is a special case. Single Event Upsets (SEUs) can flip bits in RAM or registers. Many systems use EDAC (Error Detection and Correction) with Hamming codes, plus periodic memory scrubbing. Scrubbing scans memory for correctable errors and fixes them before they accumulate into uncorrectable ones. This is a software pattern you can simulate and implement.
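A toy version of EDAC plus scrubbing can be built on the Hamming(7,4) code, which protects 4 data bits with 3 parity bits and corrects any single-bit error. This is a teaching sketch: real EDAC hardware typically works on wider words and adds double-error detection, which this code does not provide. The function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (bit i holds position i+1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(bit << i for i, bit in enumerate(bits))


def hamming74_decode(code):
    """Return (nibble, corrected). Fixes at most one flipped bit."""
    bits = [(code >> i) & 1 for i in range(7)]
    syndrome = 0
    for position in range(1, 8):
        if bits[position - 1]:
            syndrome ^= position  # XOR of set positions is 0 when valid
    corrected = syndrome != 0
    if corrected:
        bits[syndrome - 1] ^= 1   # the syndrome is the flipped position
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, corrected


def scrub(memory):
    """Scan every word, rewrite corrected ones, and return the fix count."""
    fixes = 0
    for i, word in enumerate(memory):
        nibble, corrected = hamming74_decode(word)
        if corrected:
            memory[i] = hamming74_encode(nibble)
            fixes += 1
    return fixes


clean = [hamming74_encode(n) for n in range(16)]
memory = list(clean)
memory[3] ^= 1 << 2   # simulate two SEUs in different words
memory[9] ^= 1 << 6
fixed = scrub(memory)
```

Running the scrubber periodically matters because two upsets in the same word would exceed what this code can correct.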

FDIR also needs telemetry. Every fault detection and recovery action should be logged with a reason code. Operators need to know if a reset was due to watchdog, power drop, or thermal limit. If you do not log this, you will be blind to the system’s health.

FDIR logic should be driven by explicit fault trees. A fault tree is a structured map from symptoms to likely causes and recovery actions. For example, a low voltage event could be caused by eclipse, payload overuse, or battery degradation. The recovery action may differ depending on the cause, but you can still implement a conservative default: shed loads, enter safe mode, and notify ground. Fault trees help you avoid ad hoc logic that becomes brittle over time.

Redundancy is another dimension. You may have redundant sensors, redundant buses, or a cold spare radio. FDIR must know how to switch to the redundant path and how to verify the switch succeeded. Even if you do not have physical redundancy, you can implement software redundancy by cross-checking sensors and using voting logic. These techniques are common in higher-reliability missions and should be understood even if not fully implemented in CubeSats.
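Voting logic for three redundant readings is often implemented as mid-value selection. The sketch below also reports which inputs disagree with the selected value; the tolerance and the sample readings are illustrative, and `vote` is a hypothetical name.

```python
def vote(readings, tolerance):
    """Mid-value select over three readings.

    Returns (selected_value, outlier_indices). The median is immune to a
    single wild reading, and the outlier list tells FDIR which sensor to
    distrust.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    median = sorted(readings)[1]
    outliers = [i for i, r in enumerate(readings)
                if abs(r - median) > tolerance]
    return median, outliers


value, suspects = vote([20.1, 20.3, 85.0], tolerance=1.0)
```

Here the third sensor is flagged and the selected temperature is unaffected by its failure.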

Finally, every recovery action should be reversible or at least observable in telemetry.

Testing FDIR is as important as implementing it. You should inject faults in simulation: stall a task, corrupt memory, drop battery voltage. A good FDIR system behaves predictably and never makes things worse. This is why FDIR rules should be simple and conservative.

How This Fits on Projects

FDIR is central to Projects 8 and 11, and integrates with Projects 3 and 13.

Definitions & Key Terms

FDIR: Fault Detection, Isolation, and Recovery.
Watchdog: Hardware or software timer that resets the system when heartbeats are missed.
Safe Mode: Minimal operational mode to keep spacecraft alive.
EDAC: Error Detection and Correction, often via Hamming codes.

Mental Model Diagram

Fault -> Detect -> Isolate -> Recover
           |          |         |
           v          v         v
        Log Code   Reset    Safe Mode

How It Works (Step-by-Step, Invariants, Failure Modes)

Monitor health telemetry and task heartbeats.
If thresholds exceeded, classify fault.
Attempt local recovery (reset task/bus).
If persistent, escalate to subsystem reboot.
If still unresolved, enter safe mode.

Invariants: never reset endlessly; safe mode always possible.

Failure modes: reset loops, false positives, missed detection.

Minimal Concrete Example

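// Staged response to a stalled COMMS task (helper functions are placeholders)
if (task_heartbeat_missing("COMMS")) {
  log_fault(FAULT_COMMS_STALL);               // record the reason code first
  reset_subsystem(COMMS);                     // local recovery attempt
  if (fault_persistent()) enter_safe_mode();  // escalate only if the fault remains
}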

Common Misconceptions

“A reboot always fixes it.” -> Some faults persist and require mode change.
“More checks are always better.” -> Excess checks can trigger false positives.

Check-Your-Understanding Questions

Why is staged recovery preferable to immediate safe mode?
How does memory scrubbing reduce SEU risk?
What telemetry should be logged for a watchdog reset?

Check-Your-Understanding Answers

It preserves mission function when faults are transient.
It corrects single-bit errors before they accumulate.
Reason code, timestamp, and last known task health.

Real-World Applications

Autonomous fault recovery in commercial missions
High-reliability systems in educational CubeSats

Where You’ll Apply It

Projects 8, 11, 13

References

NASA Mission Success Handbook for CubeSat Missions (GSFC-HDBK-8007)
Making Embedded Systems - reliability and watchdog patterns

Key Insight

FDIR is the difference between a mission that survives and one that dies silently.

Summary

You must detect faults early, isolate the cause, and recover with conservative rules.

Homework/Exercises

Design a fault tree for “battery voltage below threshold”.
Simulate a bus lockup and define recovery steps.

Solutions to the Homework/Exercises

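Example: detect low voltage -> shed payload -> disable radio -> safe mode if still low.
Example: detect stalled bus telemetry -> reset the bus -> power-cycle the suspect device -> reboot the OBC -> safe mode if still stuck.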

Chapter 8: Ground Segment, Operations, and Mission Planning

Fundamentals

The spacecraft is only half the system. The ground segment is where operators plan passes, decode telemetry, and send commands. Because CubeSats have short passes and limited bandwidth, ground operations must be efficient and automated. Your flight software must integrate with this reality: it must generate telemetry that is easy to interpret, accept commands with validation, and support time-tagged command stacks. If the ground segment is weak, the mission fails regardless of onboard software quality.

Good ground tools also preserve mission memory: logs, annotations, and pass reports become the living history of the spacecraft. That history is often what enables anomaly recovery months later.

Deep Dive into the Concept

Ground operations revolve around the pass schedule. A pass is a short window when the satellite is above the horizon. Operators must prioritize what to downlink and which commands to send. This is why telemetry must be prioritized and why command stacks are often uploaded in bulk. The flight software must support this workflow by accepting command queues and executing them safely when no contact exists.
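A time-tagged command queue is a natural fit for a priority heap keyed on execution time. This is a minimal sketch: `CommandQueue` and the command names are hypothetical, times are Mission Elapsed Time in seconds, and validation is assumed to happen before commands are queued.

```python
import heapq


class CommandQueue:
    """Holds time-tagged commands and releases them when they come due."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker keeps upload order for equal times

    def add(self, execute_at_met_s, command):
        heapq.heappush(self._heap, (execute_at_met_s, self._counter, command))
        self._counter += 1

    def pop_due(self, now_met_s):
        """Return all commands whose time tag has been reached, in order."""
        due = []
        while self._heap and self._heap[0][0] <= now_met_s:
            due.append(heapq.heappop(self._heap)[2])
        return due


queue = CommandQueue()
queue.add(300, "PAYLOAD_ON")
queue.add(100, "BEACON_FAST")
queue.add(300, "PAYLOAD_CAPTURE")

early = queue.pop_due(50)
first = queue.pop_due(100)
later = queue.pop_due(400)
```

The scheduler calls `pop_due` on every tick, so commands uploaded in one pass execute at their tagged times even when no ground contact exists.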

Telemetry decoding is another key function. On the ground, you must parse CCSDS packets, convert raw values into engineering units, and display them in dashboards. This requires a consistent telemetry dictionary and unit conversions. If the onboard software changes a field without updating the ground dictionary, you will misinterpret telemetry. This is a common and costly error. Ground tools must also track trends over time to identify slow failures (battery degradation, thermal drift).
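The decode path can be sketched end to end: split the 6-byte CCSDS Space Packet primary header, look up the APID in a dictionary, and scale raw fields into engineering units. The header layout follows the standard, but the dictionary contents (field names, scale factors, units) are hypothetical, and the packet has no secondary header in this example.

```python
import struct


def parse_primary_header(packet):
    """Split the 6-byte CCSDS Space Packet primary header into fields."""
    if len(packet) < 6:
        raise ValueError("packet shorter than primary header")
    word1, word2, length = struct.unpack(">HHH", packet[:6])
    return {
        "version": word1 >> 13,
        "type": (word1 >> 12) & 0x1,         # 0 = telemetry, 1 = telecommand
        "sec_hdr_flag": (word1 >> 11) & 0x1,
        "apid": word1 & 0x7FF,
        "seq_flags": word2 >> 14,
        "seq_count": word2 & 0x3FFF,
        "data_length": length + 1,           # field stores octet count minus 1
    }


# Hypothetical telemetry dictionary: (field, struct format, scale, offset, unit).
TLM_DICT = {
    0x001: ("EPS_HEALTH", [("bus_voltage", ">H", 0.001, 0.0, "V"),
                           ("battery_temp", ">h", 0.1, 0.0, "degC")]),
}


def decode(packet):
    """Return (packet name, {field: (engineering value, unit)})."""
    header = parse_primary_header(packet)
    name, fields = TLM_DICT[header["apid"]]
    offset, values = 6, {}
    for field, fmt, scale, bias, unit in fields:
        (raw,) = struct.unpack_from(fmt, packet, offset)
        offset += struct.calcsize(fmt)
        values[field] = (raw * scale + bias, unit)
    return name, values


# Build a test packet: APID 0x001, unsegmented, sequence count 5.
data = struct.pack(">Hh", 7400, -53)  # 7.400 V, -5.3 degC
packet = struct.pack(">HHH", 0x0001, (3 << 14) | 5, len(data) - 1) + data
name, values = decode(packet)
```

Because the scale factors live in the dictionary rather than in code, the same table can be shared between flight and ground software, which helps keep the two from drifting apart.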

Commanding is a safety-critical process. A command console should validate parameters and prevent unsafe commands in certain modes. For example, payload commands should be blocked in safe mode. Authentication is also important: you must ensure only authorized operators can send commands, even if this is a simple token or passphrase. The flight software should reject invalid or unsafe commands with clear error codes.
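Mode-based validation is easiest to audit when the rules are tables rather than scattered conditionals. In this sketch the modes, command names, parameter limits, and error codes are all hypothetical.

```python
# Which commands each mode accepts (assumed example sets).
ALLOWED_IN_MODE = {
    "SAFE": {"PING", "SET_MODE", "DUMP_TELEMETRY"},
    "NOMINAL": {"PING", "SET_MODE", "DUMP_TELEMETRY",
                "PAYLOAD_CAPTURE", "SET_TX_POWER"},
}

# Inclusive parameter limits for commands that take a numeric argument.
PARAM_LIMITS = {"SET_TX_POWER": (0.1, 2.0)}  # watts, assumed range

ALL_COMMANDS = set().union(*ALLOWED_IN_MODE.values())


def validate_command(mode, name, param=None):
    """Return "OK" or an error code explaining the rejection."""
    if name not in ALL_COMMANDS:
        return "ERR_UNKNOWN_COMMAND"
    if name not in ALLOWED_IN_MODE.get(mode, set()):
        return "ERR_BLOCKED_IN_MODE"
    if name in PARAM_LIMITS:
        low, high = PARAM_LIMITS[name]
        if param is None or not (low <= param <= high):
            return "ERR_PARAM_RANGE"
    return "OK"


blocked = validate_command("SAFE", "PAYLOAD_CAPTURE")
accepted = validate_command("NOMINAL", "SET_TX_POWER", 1.0)
```

The same tables can drive both the ground console and the onboard validator, so an unsafe command is stopped twice.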

Ground systems also manage data products. Raw telemetry is not very useful by itself; it must be archived, indexed, and transformed into engineering units and trends. A common pattern is to store raw packets, decoded telemetry, and derived metrics (like battery degradation rate) separately. This separation allows you to reprocess data later if you discover an error in decoding. It also supports mission reporting and long-term analysis, which are critical for fleet operations where you need to compare multiple satellites.

Anomaly response is another operational reality. When something goes wrong, operators need a playbook: what to check first, which commands are allowed, and how to confirm recovery. Your ground tools should make this easy by highlighting abnormal telemetry and providing a fault history timeline. Even for small missions, this can be the difference between a quick recovery and a total loss. Designing software with anomaly response in mind makes the entire mission more resilient.

Security and access control are part of operations as well. Even a simple CubeSat should restrict commanding to authenticated operators and defined contact windows. A lightweight approach might include a shared secret token, command sequence counters, and ground-station-only acceptance rules. These measures reduce the risk of accidental or malicious commanding without requiring heavy cryptography.
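The token and sequence-counter idea can be sketched with Python's standard library. Note the substitution: instead of sending the shared secret itself as a token, this example derives a per-command tag with HMAC-SHA256, which costs little and avoids exposing the secret on the link. The class name, error codes, and secret are hypothetical.

```python
import hashlib
import hmac


class CommandAuthenticator:
    """Shared-secret tag check plus a strictly increasing sequence counter."""

    def __init__(self, secret):
        self._secret = secret
        self._last_seq = -1

    def sign(self, seq, payload):
        """Tag binds the payload to its sequence number."""
        message = seq.to_bytes(4, "big") + payload
        return hmac.new(self._secret, message, hashlib.sha256).digest()

    def accept(self, seq, payload, tag):
        """Return "OK", "ERR_AUTH", or "ERR_REPLAY"."""
        if not hmac.compare_digest(self.sign(seq, payload), tag):
            return "ERR_AUTH"
        if seq <= self._last_seq:
            return "ERR_REPLAY"   # old or repeated command
        self._last_seq = seq
        return "OK"


secret = b"example-shared-secret"  # placeholder, not a real key
ground = CommandAuthenticator(secret)
spacecraft = CommandAuthenticator(secret)

tag = ground.sign(1, b"PING")
first_try = spacecraft.accept(1, b"PING", tag)
replayed = spacecraft.accept(1, b"PING", tag)
```

The replayed command is rejected even though its tag is valid, which is the job of the sequence counter.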

Pass automation is increasingly common. Scripts can automatically track when a pass begins, establish link, downlink telemetry, and upload command stacks. The onboard software can assist by sending a beacon at predictable intervals and providing a consistent downlink format. Your project will implement a simplified version of this workflow.

Finally, mission planning is about constraints. You cannot run all payloads all the time. You must plan around power, thermal, and comms. This planning often happens on the ground, but the onboard software must enforce it. The mission simulator project integrates ground planning with onboard constraints to demonstrate this full loop.

How This Fits on Projects

Ground operations are central to Projects 7, 12, and 13, and influence telemetry design in Project 1.

Definitions & Key Terms

Pass: Visibility window between ground station and spacecraft.
Command Stack: Batch of time-tagged commands uploaded in one pass.
Telemetry Dictionary: Mapping of packet fields to engineering units.
Beacon: Low-rate periodic health transmission.

Mental Model Diagram

Ground Planner -> Command Stack -> Uplink -> FSW
FSW -> Telemetry Packets -> Downlink -> Ground Dashboard

How It Works (Step-by-Step, Invariants, Failure Modes)

Ground predicts pass schedule and plans commands.
Uplink sends command stack during pass.
FSW validates and queues commands.
Telemetry downlink provides health and payload data.
Ground decodes and updates dashboards.

Invariants: unsafe commands are rejected; telemetry dictionary is consistent.

Failure modes: command timing mismatch, unit conversion errors, unauthorized access.

Minimal Concrete Example

# Simple telemetry decode mapping
apid_map = {0x001: "EPS_HEALTH", 0x002: "ADCS_HEALTH"}

Common Misconceptions

“Ground ops is just UI.” -> It is mission-critical automation and validation.
“Commands can be sent anytime.” -> Only during passes unless autonomous.

Check-Your-Understanding Questions

Why are command stacks important for CubeSats?
What happens if the telemetry dictionary is wrong?
How can you prevent unsafe commands in safe mode?

Check-Your-Understanding Answers

Passes are short; batching commands reduces overhead.
Operators misinterpret data, leading to wrong decisions.
Implement mode-based command validation in FSW and ground UI.

Real-World Applications

Automated ground stations for academic CubeSats
Commercial mission operations centers

Where You’ll Apply It

Projects 1, 7, 12, 13

References

NASA CubeSat 101 (2017)
CCSDS Mission Operations Services (CCSDS 520.0/521.0 series)

Key Insight

Ground operations are the other half of flight software; you must design both together.

Summary

You must design telemetry, commands, and automation that match real-world pass constraints.

Homework/Exercises

Define a pass timeline with a command stack and downlink priorities.
Design a telemetry dashboard layout for EPS + ADCS.

Solutions to the Homework/Exercises

Example: 7-minute pass, first 2 minutes downlink health, next 3 minutes downlink payload, last 2 minutes uplink commands.

Glossary (High-Signal)

ADCS: Attitude Determination and Control System; handles orientation and pointing.
APID: Application Process Identifier; routes CCSDS packets.
CFDP: CCSDS File Delivery Protocol for reliable file transfer.
C&DH: Command and Data Handling; central flight software core.
Detumble: Initial attitude control mode to reduce angular rates.
Eclipse: Time when the spacecraft is in Earth’s shadow.
EKF: Extended Kalman Filter for nonlinear state estimation.
FDIR: Fault Detection, Isolation, and Recovery.
MET: Mission Elapsed Time, monotonic time since epoch.
SoC: State of Charge of the battery.
TLE: Two-Line Element, standard orbit data format.

Why Satellite Flight Software Matters

The Modern Problem It Solves

Small satellites now dominate space activity, but they operate with limited power, intermittent comms, and harsh environments. Flight software is the control system that makes these missions possible: it keeps the spacecraft safe, schedules operations, and ensures data gets home despite short contact windows.

Real-world impact with recent statistics:

Smallsat launches: Nearly 2,800 smallsats were launched in 2024, representing about 97% of all spacecraft launched (BryceTech Smallsats by the Numbers 2025).
Operational satellites: The UCS Satellite Database lists over 7,560 operational satellites with data current through May 2023 (page updated January 2, 2024).
CubeSat standardization: NASA CubeSat 101 identifies the basic CubeSat size as roughly 10 x 10 x 11 cm with about 1.33 kg per 1U, enabling standardized deployers and mass production.

Sources: BryceTech Smallsats by the Numbers 2025; UCS Satellite Database (updated Jan 2, 2024; data through May 1, 2023); NASA CubeSat 101 (2017).

These numbers mean two things for software engineers: (1) CubeSat-scale missions are no longer rare, and (2) the reliability burden has shifted to software because hardware is small and resource-constrained.

OLD APPROACH                         NEW APPROACH
+------------------------+           +-----------------------+
| Large single satellite |           | Constellations of     |
| Custom hardware        |           | small, standardized   |
| Long design cycles     |           | satellites            |
+------------------------+           +-----------------------+

Context & Evolution (Brief)

CubeSats began as educational platforms, but their standardized form factor and lower costs have driven commercial and government adoption. The rise of large constellations means that robust, repeatable flight software is now a competitive advantage.

Concept Summary Table

This section provides a map of the mental models you will build during these projects.

Concept Cluster	What You Need to Internalize
FSW Architecture & Modes	How subsystem contracts, scheduling, and modes prevent failure.
Timekeeping & Determinism	How clocks, time scales, and task scheduling keep behavior predictable.
Telemetry & Telecommand (CCSDS)	How packets are structured, validated, and prioritized for downlink.
Orbit & Environment Modeling	How pass prediction and eclipse drive mission operations.
Power & Thermal Budgeting	How energy flows and thermal cycles constrain mission behavior.
ADCS Estimation & Control	How sensors, filters, and actuators keep the spacecraft pointed.
FDIR & Reliability	How faults are detected, isolated, and recovered safely.
Ground Operations	How passes, command stacks, and dashboards complete the system.

Project-to-Concept Map

Project	What It Builds	Primer Chapters It Uses
Project 1: Space Packet Parser	CCSDS packet decode/encode	3, 2
Project 2: Flight State Machine	Mode management + scheduler	1, 2
Project 3: EPS Power Budget Simulator	Energy model + load shedding	5
Project 4: SGP4 Orbit Propagator	Orbit prediction + frames	4, 2
Project 5: Attitude Estimator	EKF + sensor fusion	6, 2
Project 6: Reaction Wheel PID Controller	Control law + actuator limits	6
Project 7: Telemetry Scheduler	Pass-aware priority scheduling	3, 2, 8
Project 8: Memory Scrubbing Simulator	EDAC + reliability	7
Project 9: Thermal Profile Forecaster	Thermal model + heater control	5
Project 10: Payload Image Compressor	Data reduction + packetization	3
Project 11: FDIR Watchdog	Fault detection + safe mode	7, 1
Project 12: Ground Station Console	Telemetry decoding + command validation	8, 3
Project 13: Full Mission Simulator	Integrated system	1-8

Deep Dive Reading by Concept

This section maps each concept to specific book chapters or technical standards for deeper understanding.

Fundamentals & Architecture

Concept	Book or Standard	Why This Matters
FSW architecture	Making Embedded Systems by Elecia White - Ch. 1-3	Embedded architecture patterns and reliability mindset.
Systems design	Fundamentals of Software Architecture by Richards/Ford - Ch. 4-6	Trade-offs and architecture decision records.
CubeSat mission lifecycle	NASA CubeSat 101 (2017 presentation)	High-level mission flow and constraints.

Comms & Packetization

Concept	Book or Standard	Why This Matters
CCSDS Space Packet	CCSDS 133.0-B-1	Primary packet format for telemetry/telecommand.
File delivery	CCSDS 727.0-B (CFDP)	Reliable downlink of large payloads.
Ground ops services	CCSDS 520/521 series	Mission operations service concepts.

Orbit, Time, and Environment

Concept	Book or Standard	Why This Matters
SGP4 propagation	Vallado et al. “Revisiting Spacetrack Report #3”	Canonical SGP4 implementation details.
TLE format	CelesTrak TLE documentation	Correct parsing and interpretation of orbital data.
Time formats	CCSDS 301.0-B	Consistent spacecraft timekeeping.

Power, Thermal, and ADCS

Concept	Book or Standard	Why This Matters
Power systems	Spacecraft Systems Engineering (Fortescue) - Power chapter	Realistic EPS design constraints.
Thermal control	Fundamentals of Space Systems - Thermal chapter	Thermal cycles and heater strategies.
Attitude control	Spacecraft Attitude Determination and Control (Wertz)	Estimation and control methods.

Reliability & FDIR

Concept	Book or Standard	Why This Matters
FDIR patterns	Making Embedded Systems - Reliability sections	Practical watchdog and recovery techniques.
CubeSat mission success	NASA GSFC-HDBK-8007	Mission success guidance and risk management.

Quick Start

Feeling overwhelmed? Start here instead of reading everything:

Day 1 (4 hours):

Read only “Introduction” and Chapter 1 (FSW Architecture).
Skim Chapter 3 (CCSDS) to understand packet structure.
Start Project 1 - just decode a CCSDS header from a binary file (Hint 1).
Do not worry about secondary headers yet.

Day 2 (4 hours):

Add APID routing and sequence count validation to Project 1.
Write a tiny script that generates a packet and prints the fields.
Read Project 2 “Core Question” and build a minimal mode state machine.
Run a small simulation with 3 modes and a timer.

End of Weekend: You now understand CCSDS packet structure and basic mode logic. These two ideas carry most of the mental model; many of the other projects build on them.

Next Steps:

If it clicked: Continue to Project 3
If confused: Re-read Chapter 3 and the Project 1 hints
If frustrated: Take a break. FSW is hard. Come back in a week.

Recommended Learning Paths

Path 1: The Software Engineer (Recommended Start)

Best for: Developers with strong programming background but new to space.

Start with Project 1 (Space Packet Parser) - Learn CCSDS basics.
Then Project 2 (Flight State Machine) - Learn modes and scheduling.
Then Project 7 (Telemetry Scheduler) - Learn pass constraints.
Advanced: Projects 3, 11, 12, then capstone.

Path 2: The Controls Engineer

Best for: Controls/robotics engineers who know dynamics but not ops.

Start with Project 5 (Attitude Estimator).
Then Project 6 (Reaction Wheel PID).
Then Project 4 (SGP4 Orbit Propagator).
Advanced: Project 13 capstone with ADCS + orbit integration.

Path 3: The Ops Engineer

Best for: Mission operations or systems engineering background.

Start with Project 7 (Telemetry Scheduler).
Then Project 12 (Ground Station Console).
Then Project 1 (Packet Parser) for deeper packet knowledge.
Advanced: Project 13 capstone for full mission simulation.

Path 4: The Completionist

Best for: Those building a complete CubeSat lab environment.

Phase 1: Foundation (Weeks 1-2)

Project 1 (Packet Parser)
Project 2 (State Machine)

Phase 2: Subsystems (Weeks 3-6)

Project 3 (EPS Simulator)
Project 4 (SGP4 Propagator)
Project 5 (Attitude Estimator)
Project 6 (PID Controller)

Phase 3: Ops & Reliability (Weeks 7-9)

Project 7 (Telemetry Scheduler)
Project 8 (Memory Scrubber)
Project 9 (Thermal Forecaster)
Project 11 (FDIR Watchdog)

Phase 4: Integration (Weeks 10-12)

Project 10 (Image Compressor)
Project 12 (Ground Console)
Project 13 (Full Mission Simulator)

Success Metrics

You can explain the CCSDS primary header fields and compute packet length correctly.
You can design a mode state machine with explicit safe mode transitions.
Your power budget simulator stays within SoC limits for a 24-hour run.
Your orbit propagator predicts passes within a few minutes over 24 hours.
Your attitude estimator converges on simulated data with sensor dropout.
Your FDIR logic avoids reboot loops and always reaches a stable safe mode.
Your capstone simulator integrates EPS, ADCS, comms, and ground ops end-to-end.

Appendix A: CCSDS Packet Cheat Sheet

Primary Header (6 bytes total)
- Version (3 bits)
- Type (1 bit)
- Secondary Header Flag (1 bit)
- APID (11 bits)
- Sequence Flags (2 bits)
- Sequence Count (14 bits)
- Packet Length (16 bits) = (payload length + secondary header length) - 1
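
The field layout above can be sketched as a small decoder. This is a minimal illustration, not a reference implementation: the function name `decode_primary_header` and the dictionary keys are hypothetical, and the sketch assumes the header arrives as six big-endian bytes.

```python
import struct

def decode_primary_header(buf: bytes) -> dict:
    """Decode the 6-byte CCSDS primary header (three big-endian 16-bit words)."""
    if len(buf) < 6:
        raise ValueError("need at least 6 bytes for the primary header")
    word0, word1, length_field = struct.unpack(">HHH", buf[:6])
    return {
        "version": (word0 >> 13) & 0x7,        # 3 bits
        "type": (word0 >> 12) & 0x1,           # 1 bit: 0 = telemetry, 1 = telecommand
        "sec_hdr_flag": (word0 >> 11) & 0x1,   # 1 bit
        "apid": word0 & 0x7FF,                 # 11 bits
        "seq_flags": (word1 >> 14) & 0x3,      # 2 bits
        "seq_count": word1 & 0x3FFF,           # 14 bits
        "length_field": length_field,          # data field length minus 1
        "data_field_bytes": length_field + 1,  # secondary header + payload
        "total_bytes": 6 + length_field + 1,   # primary header + data field
    }

# Example header: APID 0x002, unsegmented (flags 0b11), sequence 105, 64-byte data field.
example = struct.pack(">HHH", 0x0002, (0b11 << 14) | 105, 64 - 1)
header = decode_primary_header(example)
print(header)
```

Keeping `length_field`, `data_field_bytes`, and `total_bytes` as separate keys makes off-by-one mistakes visible when you compare against a capture.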

Appendix B: Coordinate Frames and Time Conversions

ECI (inertial) -> ECEF (Earth-fixed) -> Ground Station Topo
   |                                  |
   +---- requires Earth rotation -----+

Time scales:
UTC -> TAI -> GPS (use known offsets; handle leap seconds)
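
A minimal sketch of the offset chain, assuming a hardcoded leap second count. TAI minus GPS is fixed at 19 s; TAI minus UTC is 37 s for dates on or after 2017-01-01 and will change at the next leap second, so flight code would load it from a table. The function names are hypothetical.

```python
from datetime import datetime, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
TAI_MINUS_GPS_S = 19   # fixed offset between TAI and GPS time
TAI_MINUS_UTC_S = 37   # assumption: valid from 2017-01-01 until the next leap second
LEAP_TABLE_VALID_FROM = datetime(2017, 1, 1, tzinfo=timezone.utc)

def utc_to_gps_seconds(utc):
    """Seconds since the GPS epoch for a timezone-aware UTC datetime."""
    if utc < LEAP_TABLE_VALID_FROM:
        raise ValueError("this sketch only knows the leap second count after 2017-01-01")
    # datetime arithmetic ignores leap seconds, so add them back explicitly.
    elapsed = (utc - GPS_EPOCH).total_seconds()
    return elapsed + (TAI_MINUS_UTC_S - TAI_MINUS_GPS_S)

def gps_week_and_tow(gps_seconds):
    """Split GPS seconds into (week number, time of week in seconds)."""
    return int(gps_seconds // 604800), gps_seconds % 604800

gps_s = utc_to_gps_seconds(datetime(2017, 1, 1, tzinfo=timezone.utc))
print(gps_s, gps_week_and_tow(gps_s))
```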

Appendix C: FDIR Checklist

Watchdog kicks every main loop cycle
Task heartbeats monitored with timeouts
Reset escalation ladder implemented
Safe mode entry criteria logged
Recovery actions rate-limited to avoid loops

Appendix D: Simulation Stack

Orbit propagator (SGP4 or reference)
EPS power model with eclipse
ADCS estimator + controller
Telemetry packetization and scheduler
Ground console for visualization

Project Overview Table

Project	Difficulty	Time	Core Skill	Output
1. CCSDS Parser	Level 2	Weekend	Binary protocols	Packet decoder/encoder
2. State Machine	Level 2	Weekend	System architecture	Mode manager
3. EPS Simulator	Level 2	1 week	Power modeling	SoC simulator
4. SGP4 Propagator	Level 3	1-2 weeks	Orbit mechanics	Pass predictor
5. Attitude Estimator	Level 4	2 weeks	Sensor fusion	EKF estimator
6. PID Controller	Level 3	1 week	Control systems	Slew controller
7. Telemetry Scheduler	Level 2	Weekend	Data management	Pass scheduler
8. Memory Scrubber	Level 3	1 week	Reliability	EDAC simulator
9. Thermal Forecaster	Level 2	1 week	Thermal modeling	Temperature simulator
10. Image Compressor	Level 3	1 week	Data reduction	Compression CLI
11. FDIR Watchdog	Level 3	1 week	Fault management	Recovery logic
12. Ground Console	Level 2	2 weeks	Ops tooling	UI + command console
13. Full Mission Simulator	Level 4	2-4 weeks	Integration	Digital twin

Project List

Project 1: The Space Packet Parser (CCSDS Protocol)

Main Programming Language: C
Alternative Programming Languages: Rust, Python
Coolness Level: Level 3: The Mission Decoder
Business Potential: Medium. The “Ground Segment Tooling” niche
Difficulty: Level 2: Intermediate
Knowledge Area: Space Communications / Protocols
Software or Tool: CCSDS Space Packet Protocol
Main Book: “Making Embedded Systems” by Elecia White

What you’ll build: A CCSDS space packet parser and encoder that reads binary telemetry files, extracts header fields, validates length/sequence counters, and emits JSON.

Why it teaches satellite FSW: CCSDS is the lingua franca of space telemetry and command. You cannot build ground tools or onboard routing without mastering this packet format.

Core challenges you’ll face:

Binary parsing -> Bit-level fields and endian correctness
Length rules -> Correct payload length calculation
Sequence handling -> Per-APID counters and missing packet detection

Real World Outcome

You will run a CLI that reads a downlink capture and prints a decoded stream with validation status.

What you will see:

Header decode with APID, sequence count, and packet length.
Validation showing missing or out-of-order packets.
JSON output ready for ground dashboards.

Command Line Outcome Example:

$ ./ccsds_parse --in downlink.bin --apid 0x002
[PKT] APID=0x002 SEQ=105 LEN=64 TYPE=TM
[PKT] APID=0x002 SEQ=106 LEN=64 TYPE=TM
[WARN] Missing packet: expected SEQ=107, got SEQ=108
[PKT] APID=0x002 SEQ=108 LEN=64 TYPE=TM
[OUT] wrote telemetry.json (3 packets)

The Core Question You’re Answering

“How do you speak the standard language of spacecraft telemetry without losing a single bit?”

A single header error can make an entire downlink unreadable. This project teaches you how to build reliable, defensive packet parsing.

Concepts You Must Understand First

CCSDS Primary Header Fields
- What does APID mean?
- How is packet length computed?
- Book Reference: CCSDS 133.0-B-1
Binary Parsing and Endianness
- How do you extract bitfields from bytes?
- How do you test for off-by-one errors?
- Book Reference: “Effective C” by Robert C. Seacord - Ch. 5
Sequence Counting
- Why are sequence counts per APID?
- How do you detect missing packets?
- Book Reference: “Making Embedded Systems” - Ch. 7

Questions to Guide Your Design

Header decoding
- How will you isolate bitfields without masking errors?
- What is your strategy for endian-safe decoding?
Validation
- How will you compute and verify packet length?
- How will you track sequence counts per APID?
Output format
- Do you output raw bytes, decoded fields, or both?
- How will you represent malformed packets?

Thinking Exercise

The “One Bit Off” Problem

If a single bit in the primary header is flipped, how will your parser behave? Sketch the chain of errors and decide which checks can detect it early.

The Interview Questions They’ll Ask

“How is the CCSDS packet length field defined?”
“Why are sequence counts per APID?”
“How would you detect out-of-order packets?”
“What are the risks of endian confusion?”
“How would you extend this parser for secondary headers?”

Hints in Layers

Hint 1: Start with header-only parsing

uint16_t apid = ((buf[0] & 0x07) << 8) | buf[1];

Hint 2: Compute length carefully. The CCSDS length field is the packet data field length (secondary header, if present, plus payload) minus 1. Do not include the 6-byte primary header.

Hint 3: Track sequence per APID Use a hash map (or array indexed by APID) to store last sequence count.

Hint 4: Add a validator mode

$ ./ccsds_parse --validate --in downlink.bin
[OK] 512 packets validated
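
Hint 3 can be sketched as a small per-APID tracker. The class name `SequenceTracker` is hypothetical, and the sketch assumes the 14-bit counter wraps from 16383 to 0. A duplicate or out-of-order packet shows up as a very large gap, which a validator can flag separately.

```python
SEQ_MODULO = 1 << 14  # 14-bit sequence count wraps at 16384

class SequenceTracker:
    """Remembers the last sequence count seen for each APID."""

    def __init__(self):
        self.last = {}  # apid -> last seen sequence count

    def check(self, apid, seq):
        """Return how many packets are missing before this one.

        Returns None for the first packet of an APID and 0 when the packet
        is the expected next one.
        """
        prev = self.last.get(apid)
        self.last[apid] = seq
        if prev is None:
            return None
        expected = (prev + 1) % SEQ_MODULO
        return (seq - expected) % SEQ_MODULO

tracker = SequenceTracker()
for seq in (105, 106, 108):
    print(seq, tracker.check(0x002, seq))
```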

Books That Will Help

Topic	Book	Chapter
Binary parsing	“Effective C” by Robert C. Seacord	Ch. 5-6
Embedded data handling	“Making Embedded Systems” by Elecia White	Ch. 6-7
Protocol standards	CCSDS 133.0-B-1	Sections 3-4

Common Pitfalls & Debugging

Problem: “Packet length mismatch”

Why: You included the 6-byte header in the length calculation.
Fix: Length field = packet data field length - 1 (secondary header plus payload). Total packet size = length field + 7.
Quick test: Parse a known test vector and compare lengths.

Problem: “Sequence counts look random”

Why: Sequence is per APID, not global.
Fix: Maintain a per-APID counter table.
Quick test: Group packets by APID and verify that counts increase by 1, wrapping from 16383 to 0.

Definition of Done

Correctly parses CCSDS primary header fields
Computes packet length and detects mismatches
Tracks sequence counts per APID
Emits JSON output with decoded fields
Handles malformed packets without crashing

Project 2: The Flight State Machine (The Life Cycle)

Main Programming Language: C
Alternative Programming Languages: Rust, Python
Coolness Level: Level 3: The Mode Master
Business Potential: Medium. The “Flight Software Core” niche
Difficulty: Level 2: Intermediate
Knowledge Area: System Architecture / FSW Core
Software or Tool: State machine + scheduler
Main Book: “Making Embedded Systems” by Elecia White

What you’ll build: A mode manager that transitions between SAFE, NOMINAL, and SCIENCE modes, with explicit entry/exit actions and timing rules.

Why it teaches satellite FSW: Mode logic is the backbone of spacecraft autonomy. It encodes operational constraints and safety rules.

Core challenges you’ll face:

Mode transitions -> Avoid oscillation and race conditions
Timing -> Ensure transitions respect cooldown timers
Safety -> Safe mode overrides all other logic

Real World Outcome

You will run a simulator that shows mode transitions based on power and attitude state.

$ ./mode_sim --battery 25 --attitude LOST
[MODE] NOMINAL -> SAFE (reason: low battery)
[SAFE] payload OFF, radio beacon ON, sun-pointing

The Core Question You’re Answering

“How do you keep a spacecraft alive when everything else fails?”

Concepts You Must Understand First

Finite State Machines
- How do you encode modes and transitions?
- Book Reference: “Clean Architecture” by Robert C. Martin - Ch. 7
Safety Overrides
- Why does safe mode preempt other logic?
- Book Reference: “Making Embedded Systems” - Ch. 10
Timing and Cooldowns
- How do you avoid mode thrashing?
- Book Reference: CCSDS time code guidance

Questions to Guide Your Design

What conditions trigger SAFE mode?
What is the minimum dwell time before leaving SAFE?
Which subsystem commands are allowed in SAFE?

Thinking Exercise

Design a transition table for SAFE -> NOMINAL -> SCIENCE, and include hysteresis thresholds for battery voltage.

The Interview Questions They’ll Ask

“How do you prevent mode oscillation?”
“What actions happen on SAFE entry?”
“How would you test mode transitions?”
“What happens if telemetry is stale?”

Hints in Layers

Hint 1: Start with a simple enum for states.

Hint 2: Add explicit transition functions with guard conditions.

Hint 3: Use timers to enforce minimum dwell time.

Hint 4: Log every transition with reason codes.
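
The four hints can be combined into one small sketch. All thresholds, the dwell time, and the names `ModeManager` and `step` are hypothetical placeholders. The points to notice are that SAFE entry is checked first and bypasses the dwell timer, and that each enter/exit pair uses different thresholds (hysteresis).

```python
from enum import Enum

class Mode(Enum):
    SAFE = "SAFE"
    NOMINAL = "NOMINAL"
    SCIENCE = "SCIENCE"

# Hypothetical thresholds (percent SoC) and dwell time.
SAFE_ENTER_SOC = 30.0
SAFE_EXIT_SOC = 45.0       # higher than SAFE_ENTER_SOC: hysteresis band
SCIENCE_ENTER_SOC = 70.0
SCIENCE_EXIT_SOC = 55.0
MIN_DWELL_S = 600.0

class ModeManager:
    def __init__(self):
        self.mode = Mode.SAFE
        self.entered_at = 0.0
        self.log = []  # (time, from, to, reason)

    def _transition(self, new_mode, now, reason):
        self.log.append((now, self.mode.value, new_mode.value, reason))
        self.mode = new_mode
        self.entered_at = now

    def step(self, now, soc, attitude_ok):
        # Safety override: evaluated first and not subject to the dwell timer.
        if self.mode is not Mode.SAFE:
            if soc < SAFE_ENTER_SOC:
                self._transition(Mode.SAFE, now, "low battery")
                return self.mode
            if not attitude_ok:
                self._transition(Mode.SAFE, now, "attitude lost")
                return self.mode
        # All other transitions wait for the minimum dwell time.
        if now - self.entered_at < MIN_DWELL_S:
            return self.mode
        if self.mode is Mode.SAFE and soc > SAFE_EXIT_SOC and attitude_ok:
            self._transition(Mode.NOMINAL, now, "recovered")
        elif self.mode is Mode.NOMINAL and soc > SCIENCE_ENTER_SOC:
            self._transition(Mode.SCIENCE, now, "power margin available")
        elif self.mode is Mode.SCIENCE and soc < SCIENCE_EXIT_SOC:
            self._transition(Mode.NOMINAL, now, "power margin low")
        return self.mode
```

Feeding `step` a battery value that hovers around a single threshold is a quick way to confirm the mode does not oscillate.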

Books That Will Help

Topic	Book	Chapter
State machines	“Clean Architecture” by Robert C. Martin	Ch. 7
Embedded reliability	“Making Embedded Systems” by Elecia White	Ch. 10

Common Pitfalls & Debugging

Problem: “Mode oscillates rapidly”

Why: No hysteresis or cooldown.
Fix: Add minimum dwell time and hysteresis thresholds.
Quick test: Simulate fluctuating battery voltage.

Definition of Done

SAFE, NOMINAL, SCIENCE modes implemented
Entry/exit actions logged
Hysteresis prevents oscillation
SAFE mode overrides all other logic

Project 3: EPS Power Budget Simulator (Energy Management)

Main Programming Language: Python
Alternative Programming Languages: C
Coolness Level: Level 3: The Power Guardian
Business Potential: Medium. Power simulation tooling
Difficulty: Level 2: Intermediate
Knowledge Area: Power Systems / EPS
Software or Tool: EPS simulator
Main Book: “Spacecraft Systems Engineering” by Fortescue

What you’ll build: A power budget simulator that models solar input, battery SoC, and load shedding across orbit cycles.

Why it teaches satellite FSW: Power constraints drive every operational decision. You must prove your mission is energy-feasible.

Core challenges you’ll face:

Energy balance -> Solar input vs load draw
Eclipse modeling -> Zero input periods
Load shedding -> Priority-based shutdowns

Real World Outcome

A CLI that simulates SoC over multiple orbits and outputs a plot.

$ python eps_sim.py --orbits 5
[INFO] Min SoC: 28%
[INFO] Max SoC: 88%
[SHED] Payload OFF at SoC 28%
[PLOT] eps_soc.png generated

The Core Question You’re Answering

“Can your spacecraft survive the night side of every orbit?”

Concepts You Must Understand First

Energy Balance
- How do you compute SoC updates?
- Book Reference: Spacecraft Systems Engineering - Power chapter
Eclipse Modeling
- How long is eclipse for your orbit?
- Book Reference: NASA CubeSat 101
Load Shedding
- Which subsystems are critical?
- Book Reference: Making Embedded Systems - reliability sections

Questions to Guide Your Design

What is the solar array power curve?
What SoC threshold triggers safe mode?
How do you model battery degradation?

Thinking Exercise

If your payload draws 5 W for 10 minutes every orbit, how does that change minimum SoC? Compute it for a 90-minute orbit.
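
One way to set up the arithmetic, as a sketch. The 20 Wh battery capacity is an assumption chosen only for illustration, and the SoC figure is the worst case where the whole payload window falls in eclipse.

```python
payload_power_w = 5.0
payload_minutes = 10.0
orbit_minutes = 90.0
battery_capacity_wh = 20.0  # assumption: hypothetical pack size, not from this guide

# Energy drawn by the payload in one orbit.
energy_wh = payload_power_w * payload_minutes / 60.0
# Worst case: all of it comes out of the battery (payload runs in eclipse).
soc_drop_pct = 100.0 * energy_wh / battery_capacity_wh
# Equivalent continuous load spread over the whole orbit.
orbit_average_w = energy_wh / (orbit_minutes / 60.0)

print(f"{energy_wh:.3f} Wh per orbit, {soc_drop_pct:.2f}% SoC, "
      f"{orbit_average_w:.3f} W orbit-average")
```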

The Interview Questions They’ll Ask

“How do you compute battery SoC?”
“What happens if eclipse is longer than expected?”
“How do you prioritize loads?”

Hints in Layers

Hint 1: Start with a constant power draw model.

Hint 2: Add eclipse windows and set solar input to zero.

Hint 3: Implement load shedding when SoC < threshold.

Hint 4: Plot SoC vs time to validate behavior.
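
Hints 1 to 3 can be sketched as a single loop. Every default below (orbit period, eclipse length, capacity, powers, thresholds) is a hypothetical value, the eclipse is modeled as a fixed window at the end of each orbit, and battery efficiency is ignored. Plotting the returned history covers Hint 4.

```python
def simulate_soc(orbits=5, dt_s=60.0, period_s=5400.0, eclipse_s=2100.0,
                 capacity_wh=20.0, solar_w=6.0, base_load_w=2.0, payload_w=3.0,
                 shed_soc=0.30, restore_soc=0.50, soc0=0.80):
    """Return (soc_history, shed_events); SoC is a fraction in [0, 1]."""
    soc, payload_on = soc0, True
    history, shed_events = [], []
    for k in range(int(round(orbits * period_s / dt_s))):
        t = k * dt_s
        # Eclipse occupies the last `eclipse_s` seconds of every orbit.
        in_eclipse = (t % period_s) >= (period_s - eclipse_s)
        generation_w = 0.0 if in_eclipse else solar_w
        # Load shedding with hysteresis so the payload does not toggle rapidly.
        if payload_on and soc < shed_soc:
            payload_on = False
            shed_events.append((t, soc))
        elif not payload_on and soc > restore_soc:
            payload_on = True
        load_w = base_load_w + (payload_w if payload_on else 0.0)
        # Energy balance: watts * hours / capacity, then clamp to [0, 1].
        soc += (generation_w - load_w) * (dt_s / 3600.0) / capacity_wh
        soc = min(1.0, max(0.0, soc))
        history.append(soc)
    return history, shed_events

history, shed_events = simulate_soc()
print(f"min={min(history):.2f} max={max(history):.2f} sheds={len(shed_events)}")
```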

Books That Will Help

Topic	Book	Chapter
Power systems	“Spacecraft Systems Engineering”	Power chapter
Embedded design	“Making Embedded Systems”	Reliability sections

Common Pitfalls & Debugging

Problem: “SoC grows above 100%”

Why: No bounds check.
Fix: Clamp SoC to [0, 1].
Quick test: Run with high solar input and verify clamp.

Definition of Done

SoC simulator works across multiple orbits
Eclipse modeled with zero solar input
Load shedding triggers at thresholds
Plot output generated

Project 4: SGP4 Orbit Propagator (Finding Your Place)

Main Programming Language: Python
Alternative Programming Languages: C
Coolness Level: Level 4: The Navigator
Business Potential: Medium. Ground planning tools
Difficulty: Level 3: Advanced
Knowledge Area: Orbital Mechanics
Software or Tool: SGP4 propagator
Main Book: “Fundamentals of Astrodynamics and Applications” by Vallado

What you’ll build: A TLE-driven orbit propagator that predicts satellite position, ground passes, and eclipse windows.

Why it teaches satellite FSW: Orbit prediction drives comms windows, power planning, and payload scheduling.

Core challenges you’ll face:

TLE parsing -> Strict format
Frame conversions -> ECI to ECEF
Pass prediction -> Elevation calculations

Real World Outcome

A CLI that prints upcoming passes for a given ground station.

$ python pass_predict.py --tle iss.tle --lat 28.5 --lon -80.6
[PASS] 2026-01-02T01:12:15Z start
[PASS] max elev 62 deg, duration 7m12s
[PASS] 2026-01-02T02:49:40Z start

The Core Question You’re Answering

“Where is my satellite, and when can I talk to it?”

Concepts You Must Understand First

TLE Format
- What fields define the orbit?
- Book Reference: CelesTrak documentation
SGP4 Model
- Why is it the standard?
- Book Reference: Vallado “Revisiting Spacetrack Report #3”
Coordinate Frames
- How do you convert to ground station coordinates?
- Book Reference: Fundamentals of Astrodynamics - frames chapter

Questions to Guide Your Design

How will you validate TLE checksum?
What elevation threshold defines a pass?
How do you compute eclipse from sun vector?
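
For the checksum question, a minimal sketch of the TLE rule: sum the digits in the first 68 columns, count each minus sign as 1, ignore everything else, and compare the sum modulo 10 with the digit in column 69. The function names are hypothetical.

```python
def tle_checksum(line):
    """Modulo-10 checksum over the first 68 columns of a TLE line."""
    total = 0
    for ch in line[:68]:
        if ch in "0123456789":
            total += int(ch)
        elif ch == "-":
            total += 1  # minus signs count as 1; letters, spaces, '.', '+' count as 0
    return total % 10

def tle_line_is_valid(line):
    """True when the line is 69 columns long and its checksum digit matches."""
    line = line.rstrip("\r\n")
    if len(line) != 69 or line[68] not in "0123456789":
        return False
    return tle_checksum(line) == int(line[68])
```

Checking the length before the checksum catches lines whose column alignment was damaged by copy and paste, which is a common cause of parse errors.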

Thinking Exercise

If your TLE is 5 days old, how could your pass prediction be wrong? Estimate the effect on pass start time.

The Interview Questions They’ll Ask

“Why is SGP4 used for TLEs?”
“What is the difference between ECI and ECEF?”
“How do you compute a ground station pass?”

Hints in Layers

Hint 1: Use a reference SGP4 library to validate your results.

Hint 2: Implement a simple elevation calculation first.

Hint 3: Add pass start/stop detection at 10 degrees elevation.

Hint 4: Plot the ground track for visual validation.
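
Hints 2 and 3 can be sketched without an orbit propagator, assuming you already have the satellite position in ECEF kilometres (for example from a reference SGP4 library after the ECI to ECEF rotation). The station position uses the WGS-84 ellipsoid; the function names are hypothetical.

```python
import math

WGS84_A_KM = 6378.137          # equatorial radius
WGS84_E2 = 6.69437999014e-3    # first eccentricity squared

def station_ecef_km(lat_deg, lon_deg, alt_km=0.0):
    """Geodetic latitude/longitude/altitude to ECEF position in km."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A_KM / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    return ((n + alt_km) * math.cos(lat) * math.cos(lon),
            (n + alt_km) * math.cos(lat) * math.sin(lon),
            (n * (1.0 - WGS84_E2) + alt_km) * math.sin(lat))

def elevation_deg(sat_ecef_km, lat_deg, lon_deg, alt_km=0.0):
    """Elevation of the satellite above the station's local horizon."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    gs = station_ecef_km(lat_deg, lon_deg, alt_km)
    rho = [s - g for s, g in zip(sat_ecef_km, gs)]          # station -> satellite
    up = (math.cos(lat) * math.cos(lon),                     # local vertical
          math.cos(lat) * math.sin(lon),
          math.sin(lat))
    rng = math.sqrt(sum(c * c for c in rho))
    sin_el = sum(r * u for r, u in zip(rho, up)) / rng
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_el))))

def find_passes(times, elevations, min_elev_deg=10.0):
    """Return (start, end) pairs; end is the first sample below the threshold."""
    passes, start = [], None
    for t, el in zip(times, elevations):
        if el >= min_elev_deg and start is None:
            start = t
        elif el < min_elev_deg and start is not None:
            passes.append((start, t))
            start = None
    if start is not None:
        passes.append((start, times[-1]))  # pass still in progress at end of data
    return passes
```

Pass edges are only as precise as the sample spacing, so a finer time step or interpolation near the threshold improves start and stop times.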

Books That Will Help

Topic	Book	Chapter
Orbit propagation	“Fundamentals of Astrodynamics and Applications”	SGP4 section
TLE format	CelesTrak documentation	TLE format

Common Pitfalls & Debugging

Problem: “Pass times are wrong”

Why: Frame conversion error or time conversion issue.
Fix: Validate with a known pass prediction tool.
Quick test: Compare against online pass predictor for a known satellite.

Definition of Done

Parses TLEs correctly
Propagates orbit with SGP4
Predicts pass windows for a ground station
Validated against reference tool

Project 5: The Attitude Estimator (Sensor Fusion)

Main Programming Language: Python
Alternative Programming Languages: C
Coolness Level: Level 4: The Attitude Whisperer
Business Potential: Medium. ADCS algorithm dev
Difficulty: Level 4: Expert
Knowledge Area: Estimation / ADCS
Software or Tool: EKF attitude estimator
Main Book: “Spacecraft Attitude Determination and Control” by Wertz

What you’ll build: An EKF that fuses gyro, magnetometer, and sun sensor data to estimate attitude.

Why it teaches satellite FSW: Estimation is the foundation of control; without it, you cannot point or stabilize.

Core challenges you’ll face:

Quaternion math -> Normalize and avoid drift
Noise tuning -> Keep filter stable
Sensor dropout -> Handle missing measurements

Real World Outcome

A simulator that tracks true attitude and estimated attitude over time.

$ python ekf_attitude.py --sim
[EKF] RMS error: 0.8 deg
[EKF] Sensor dropouts handled: 3
[PLOT] attitude_error.png generated

The Core Question You’re Answering

“How do you know where you are pointing when your sensors lie?”

Concepts You Must Understand First

Quaternions
- Why not Euler angles?
- Book Reference: Spacecraft Attitude Determination and Control - Ch. 3
Kalman Filtering
- How do you tune process/measurement noise?
- Book Reference: Estimation theory text
Sensor Models
- How do sun and magnetometer measurements map to attitude?
- Book Reference: ADCS chapters in Fundamentals of Space Systems

Questions to Guide Your Design

How will you model gyro bias?
How will you handle eclipse (no sun sensor)?
How will you normalize quaternions?

Thinking Exercise

If your sun sensor provides no measurement during eclipse, how should the filter weight the magnetometer and gyro?

The Interview Questions They’ll Ask

“Why do we use quaternions instead of Euler angles?”
“How does an EKF handle nonlinear models?”
“What happens if a sensor fails?”
“How do you tune measurement noise?”

Hints in Layers

Hint 1: Start with a basic gyro propagation step.

Hint 2: Implement measurement update using magnetometer only.

Hint 3: Add sun sensor when in sunlight.

Hint 4: Normalize quaternion after each update.
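
Hints 1 and 4 can be sketched as the propagation step only; this is not a full EKF. The sketch assumes a scalar-first Hamilton quaternion `[w, x, y, z]` describing body attitude, gyro rates in the body frame, and a constant rate over each step. The function names are hypothetical.

```python
import math

def quat_multiply(a, b):
    """Hamilton product of two scalar-first quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def quat_normalize(q):
    """Rescale to unit length so rounding error does not accumulate."""
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def propagate(q, gyro_rad_s, bias_rad_s, dt):
    """Advance attitude by one step using bias-corrected gyro rates."""
    wx, wy, wz = (g - b for g, b in zip(gyro_rad_s, bias_rad_s))
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        return quat_normalize(q)  # no measurable rotation this step
    s = math.sin(angle / 2.0) / rate
    dq = (math.cos(angle / 2.0), wx * s, wy * s, wz * s)
    return quat_normalize(quat_multiply(q, dq))

# Example: spin about body z at 90 deg/s for one second.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = propagate(q, (0.0, 0.0, math.pi / 2.0), (0.0, 0.0, 0.0), 0.01)
print(q)
```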

Books That Will Help

Topic	Book	Chapter
Attitude determination	“Spacecraft Attitude Determination and Control”	Ch. 3-5
Estimation	“Fundamentals of Space Systems”	ADCS chapters

Common Pitfalls & Debugging

Problem: “Filter diverges”

Why: Bad tuning or numerical instability.
Fix: Increase measurement noise or normalize quaternions.
Quick test: Run on simulated data with known truth.

Definition of Done

EKF converges on simulated data
Handles sensor dropout
Produces stable attitude estimate

Project 6: Reaction Wheel PID Controller (The Mover)

Main Programming Language: C
Alternative Programming Languages: Python
Coolness Level: Level 4: The Attitude Sculptor
Business Potential: Medium. ADCS control logic
Difficulty: Level 3: Advanced
Knowledge Area: Control Systems / ADCS
Software or Tool: PID controller
Main Book: “Spacecraft Attitude Determination and Control” by Wertz

What you’ll build: A PID controller that rotates the satellite to a target attitude with minimal overshoot.

Why it teaches satellite FSW: Control translates estimation into action; it makes the spacecraft point where you need it.

Core challenges you’ll face:

Gain tuning -> Stability vs response time
Saturation handling -> Wheel speed limits
Timing -> Stable update rate

Real World Outcome

A simulation showing a 90-degree slew and settle.

$ ./pid_slew --target 90deg
[INIT] Error: 90 deg
[CTRL] Wheel speed: 1200 rpm
[CTRL] Error: 5 deg
[CTRL] Error: 0.3 deg
[OK] Slew complete in 18s

The Core Question You’re Answering

“How do you stop spinning in a world without friction?”

Concepts You Must Understand First

PID Control
- How do P, I, and D terms affect response?
- Book Reference: Control systems text
Reaction Wheel Dynamics
- How does wheel torque affect attitude?
- Book Reference: ADCS chapters
Saturation
- What happens when wheel speed maxes out?
- Book Reference: Control systems text

Questions to Guide Your Design

What gains achieve critical damping?
How will you detect saturation?
What is the update frequency?

Thinking Exercise

Sketch an error vs time curve for a poorly tuned PID (overshoot). How would you fix it?

The Interview Questions They’ll Ask

“How do you tune PID gains?”
“What is momentum dumping?”
“How do you prevent overshoot in space?”

Hints in Layers

Hint 1: Start with proportional control only.

Hint 2: Add derivative term to damp oscillations.

Hint 3: Add integral term to remove bias.

Hint 4: Clamp wheel speed and log saturation events.
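
A single-axis sketch of the hint sequence. Note the substitutions: it clamps commanded torque rather than wheel speed, takes the derivative term from the measured body rate, and models the spacecraft as a frictionless rigid body. The inertia, gains, and torque limit are hypothetical; the default integral gain is zero because, with no disturbance torque, the integral term only adds overshoot.

```python
import math

class PID:
    def __init__(self, kp, ki, kd, out_limit):
        self.kp, self.ki, self.kd, self.out_limit = kp, ki, kd, out_limit
        self.integral = 0.0
        self.saturation_events = 0

    def update(self, error, rate, dt):
        """error in rad, rate in rad/s; returns a clamped torque command in N*m."""
        candidate = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate - self.kd * rate
        out = max(-self.out_limit, min(self.out_limit, raw))
        if out == raw:
            self.integral = candidate    # anti-windup: integrate only when unsaturated
        else:
            self.saturation_events += 1  # log saturation for later analysis
        return out

def simulate_slew(target_deg=90.0, duration_s=30.0, dt=0.1, inertia=0.01,
                  kp=0.0025, ki=0.0, kd=0.01, torque_limit=0.002):
    """Return (error history in degrees, controller) for a rest-to-rest slew."""
    pid = PID(kp, ki, kd, torque_limit)
    angle, rate, target = 0.0, 0.0, math.radians(target_deg)
    errors_deg = []
    for _ in range(int(round(duration_s / dt))):
        torque = pid.update(target - angle, rate, dt)
        rate += (torque / inertia) * dt   # rigid body: I * d(rate)/dt = torque
        angle += rate * dt
        errors_deg.append(math.degrees(target - angle))
    return errors_deg, pid

errors_deg, controller = simulate_slew()
print(f"final error {errors_deg[-1]:.4f} deg, "
      f"saturated steps {controller.saturation_events}")
```

The default gains correspond to critical damping for the assumed inertia (`kp = I * wn**2`, `kd = 2 * I * wn` with `wn = 0.5` rad/s), which is one starting point for the tuning questions above.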

Books That Will Help

Topic	Book	Chapter
PID control	Control systems text	PID chapter
ADCS	“Spacecraft Attitude Determination and Control”	Control chapters

Common Pitfalls & Debugging

Problem: “Overshoot and oscillation”

Why: Gains too aggressive.
Fix: Reduce P or increase D.
Quick test: Step response plot.

Definition of Done

Achieves <1 deg error in <30s
No sustained oscillation
Handles wheel saturation

Project 7: Priority Telemetry Scheduler (The Traffic Cop)

Main Programming Language: C
Alternative Programming Languages: Python
Coolness Level: Level 3: The Bandwidth Boss
Business Potential: Medium. Ground segment planning
Difficulty: Level 2: Intermediate
Knowledge Area: Data Management / Ops
Software or Tool: Telemetry scheduler
Main Book: “Space Mission Engineering” (SMAD)

What you’ll build: A telemetry scheduler that prioritizes health packets over payload data and fits into a downlink window.

Why it teaches satellite FSW: Bandwidth is limited; you must decide what is mission-critical to downlink.

Core challenges you’ll face:

Priority policy -> Health vs payload
Bandwidth budgeting -> Capacity per pass
Drop handling -> Log and retry

Real World Outcome

A scheduler that outputs which packets were downlinked during a 7-minute pass.

$ ./tm_sched --pass 420 --queue tm_queue.json
[PASS] 7 minutes available
[SENT] Health APID 0x001 (all)
[SENT] Power APID 0x002 (all)
[SENT] Attitude APID 0x003 (partial)
[DROP] Payload APID 0x200 (insufficient time)

The Core Question You’re Answering

“When you only have 7 minutes to talk, what do you send first?”

Concepts You Must Understand First

Packet Prioritization
- Which telemetry is critical?
- Book Reference: Space Mission Engineering ops chapters
Bandwidth Budgeting
- How many bytes fit in a pass?
- Book Reference: Spacecraft Systems Engineering comms chapters
Queueing
- FIFO vs priority queues
- Book Reference: Algorithms (Sedgewick) - priority queues

Questions to Guide Your Design

How do you define priority classes?
How do you handle partial packets or segmentation?
How do you log dropped packets?

Thinking Exercise

If health telemetry is 5% of the data but 90% of the importance, how do you schedule it?

The Interview Questions They’ll Ask

“How do you prioritize telemetry?”
“How do you manage limited downlink bandwidth?”
“What happens when the queue overflows?”

Hints in Layers

Hint 1: Start with a fixed priority queue.

Hint 2: Compute pass capacity in bytes.

Hint 3: Stop scheduling when capacity exhausted.

Hint 4: Track dropped packets for later retransmit.
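
The four hints can be sketched with the standard-library heap. The packet dictionary keys, the function name, and the link rate parameter are hypothetical; `size_bytes` is assumed to already include header overhead. Following Hint 3, scheduling stops at the first packet that does not fit rather than backfilling with smaller low-priority packets.

```python
import heapq

def schedule_pass(packets, pass_seconds, link_bps):
    """Return (sent, dropped, unused_bytes) for one downlink pass.

    Each packet is a dict with 'apid', 'priority' (lower is more urgent),
    and 'size_bytes'.
    """
    capacity = int(pass_seconds * link_bps // 8)  # pass capacity in bytes
    # The index breaks priority ties so dicts are never compared directly.
    heap = [(p["priority"], i, p) for i, p in enumerate(packets)]
    heapq.heapify(heap)
    sent, dropped = [], []
    while heap:
        _, _, pkt = heapq.heappop(heap)
        if pkt["size_bytes"] > capacity:
            # Capacity exhausted: keep this and all remaining packets for retransmit.
            dropped.append(pkt)
            dropped.extend(p for _, _, p in sorted(heap))
            break
        capacity -= pkt["size_bytes"]
        sent.append(pkt)
    return sent, dropped, capacity

queue = [
    {"apid": 0x200, "priority": 3, "size_bytes": 5000},
    {"apid": 0x001, "priority": 0, "size_bytes": 100},
    {"apid": 0x002, "priority": 1, "size_bytes": 200},
]
sent, dropped, unused = schedule_pass(queue, pass_seconds=1, link_bps=9600)
print([hex(p["apid"]) for p in sent], [hex(p["apid"]) for p in dropped], unused)
```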

Books That Will Help

Topic	Book	Chapter
Mission ops	“Space Mission Engineering”	Ops planning
Data structures	“Algorithms” by Sedgewick	Priority queues

Common Pitfalls & Debugging

Problem: “Pass capacity exceeded”

Why: Not accounting for packet overhead.
Fix: Include headers in size calculations.
Quick test: Simulate a small pass and verify byte counts.

Definition of Done

Prioritizes health packets
Schedules within pass time
Logs dropped packets

Project 8: Memory Scrubbing Simulator (Defeating Radiation)

Main Programming Language: C
Alternative Programming Languages: Python
Coolness Level: Level 3: The Bit Guardian
Business Potential: Low. Reliability tooling
Difficulty: Level 3: Advanced
Knowledge Area: Reliability / FDIR
Software or Tool: EDAC + scrubbing simulator
Main Book: “Making Embedded Systems” by Elecia White

What you’ll build: A memory scrubbing task that detects and corrects SEUs using Hamming codes.

Why it teaches satellite FSW: Radiation-induced bit flips are common; software must detect and correct them.

Core challenges you’ll face:

Hamming codes -> Correct syndrome decoding
Scrub scheduling -> Background task design
Error logging -> Corrected vs uncorrectable

Real World Outcome

A simulator showing bit flips detected and corrected.

$ ./scrub_sim --memory 1024 --rate 1e-5
[SEU] Bit flip at address 0x1A4
[FIX] Corrected single-bit error
[STATS] 12 errors corrected, 0 uncorrectable

The Core Question You’re Answering

“How do you keep software correct when memory is unreliable?”

Concepts You Must Understand First

SEU Effects
- What causes bit flips?
- Book Reference: NASA mission success handbook
Hamming Codes
- How do you detect/correct errors?
- Book Reference: Coding theory text
Scrubbing Schedules
- How often to scrub?
- Book Reference: Reliability engineering sections

Questions to Guide Your Design

What is your scrub interval?
How do you detect uncorrectable errors?
How do you log scrub statistics?

Thinking Exercise

If your SEU rate doubles during a solar storm, how should scrub frequency change?

The Interview Questions They’ll Ask

“What is memory scrubbing?”
“Why do you need EDAC in space?”
“How do you handle double-bit errors?”

Hints in Layers

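Hint 1: Implement a simple Hamming(7,4) code first. It corrects single-bit errors but silently miscorrects double-bit errors, so add an overall parity bit (SECDED) before you try to report uncorrectable errors.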

Hint 2: Add a background loop that scans memory.

Hint 3: Inject random bit flips for testing.

Hint 4: Log corrected vs uncorrectable errors.
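
A minimal sketch of Hamming(7,4) extended with one overall parity bit, so that single-bit errors are corrected and double-bit errors are reported instead of miscorrected. The layout (index 0 for overall parity, indices 1 to 7 for the classic Hamming positions) and the function names are choices made for this example.

```python
def secded_encode(nibble):
    """Encode 4 data bits into an 8-bit codeword stored as a list of bits."""
    code = [0] * 8
    # Data bits go to positions 3, 5, 6, 7 (the non-power-of-two positions).
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    # Each parity bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Overall parity makes the total number of 1 bits even.
    code[0] = sum(code[1:]) % 2
    return code

def secded_decode(code):
    """Return (nibble, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos          # XOR of set positions is 0 for a valid word
    overall_odd = sum(code) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # One bit flipped: syndrome names it (0 means the overall parity bit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status

word = secded_encode(0b1011)
word[6] ^= 1  # inject a single-event upset
print(secded_decode(word))
```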

Books That Will Help

Topic	Book	Chapter
Reliability	“Making Embedded Systems”	Reliability sections
Coding theory	Any coding text	Hamming codes

Common Pitfalls & Debugging

Problem: “False corrections”

Why: Incorrect syndrome decoding.
Fix: Verify with known test vectors.
Quick test: Flip one bit and verify correction.

Definition of Done

Detects and corrects single-bit errors
Logs uncorrectable errors
Scrubs on a configurable interval

Project 9: Thermal Profile Forecaster (Heat Management)

Main Programming Language: Python
Alternative Programming Languages: C
Coolness Level: Level 2: The Thermal Whisperer
Business Potential: Low. Modeling tool
Difficulty: Level 2: Intermediate
Knowledge Area: Thermal Systems
Software or Tool: Thermal simulator
Main Book: “Fundamentals of Space Systems”

What you’ll build: A thermal simulator that predicts temperature over orbit cycles with heater control.

Why it teaches satellite FSW: Thermal control is software-driven in small satellites and tied to EPS.

Core challenges you’ll face:

Thermal balance -> Radiative vs conductive modeling
Eclipse effects -> Rapid cooling
Heater control -> Hysteresis logic

Real World Outcome

A plot showing temperature variations and heater activations.

$ python thermal_sim.py --orbits 5
[INFO] Min temp: -12C
[INFO] Max temp: 38C
[HEATER] Activated for 18 minutes total
[PLOT] thermal_profile.png generated

The Core Question You’re Answering

“How do you keep components alive while temperature swings every orbit?”

Concepts You Must Understand First

Thermal Balance
- Radiative vs conductive heat transfer
- Book Reference: Fundamentals of Space Systems - Thermal Systems
Eclipse Effects
- Why does temperature drop in shadow?
- Book Reference: Fundamentals of Space Systems
Heater Control
- How does hysteresis prevent oscillation?
- Book Reference: Control systems text

Questions to Guide Your Design

How many thermal nodes are needed for accuracy?
What is the heater power draw?
How do you integrate with EPS budget?

Thinking Exercise

If a heater draws 2 W and must run 30 minutes per orbit, how much energy does it use per orbit, and how does that affect battery SoC?
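
One way to frame the arithmetic: 2 W for 0.5 h is 1 Wh per orbit, and the SoC impact is that energy divided by the battery capacity. The sketch below assumes a 20 Wh battery, which is a placeholder and not a figure from this guide, and it ignores converter and battery losses.

```python
def heater_soc_drop(power_w, minutes_on, battery_wh):
    """Return (energy per orbit in Wh, SoC drop in percent of capacity)."""
    energy_wh = power_w * minutes_on / 60.0
    return energy_wh, 100.0 * energy_wh / battery_wh


# Assumed 20 Wh battery (hypothetical value for illustration).
energy_wh, soc_drop_pct = heater_soc_drop(power_w=2.0, minutes_on=30.0, battery_wh=20.0)
print(f"{energy_wh} Wh per orbit, {soc_drop_pct}% of capacity")
```

Under that assumption the heater alone costs about 5% of capacity per orbit, which has to be recovered during the sunlit arc on top of every other load.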

The Interview Questions They’ll Ask

“Why is thermal control software-driven?”
“What is the role of eclipse in thermal cycles?”
“How do you prevent heater thrashing?”

Hints in Layers

Hint 1: Start with a single-node thermal model.

Hint 2: Add eclipse-driven solar input changes.

Hint 3: Implement heater with hysteresis.

Hint 4: Integrate with EPS simulation for power impact.
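
Hints 1 through 3 can be sketched in a few dozen lines. The model below is a single node that gains heat from the sun, internal dissipation, and a heater, and loses it only by radiation. Every numeric default (heat capacity, area, emissivity, power levels, thresholds, orbit and eclipse durations) is a placeholder assumption, not a value from a real spacecraft, and the function names are illustrative.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def heater_command(temp_k, heater_on, t_on_k, t_off_k):
    """Hysteresis: turn on below t_on_k, off above t_off_k, otherwise hold state."""
    if temp_k < t_on_k:
        return True
    if temp_k > t_off_k:
        return False
    return heater_on


def simulate(temp0_k, steps, dt_s=10.0, solar_w=8.0, internal_w=2.0,
             heater_w=2.0, t_on_k=268.0, t_off_k=273.0,
             orbit_s=5400.0, eclipse_s=2100.0,
             heat_capacity_j_per_k=1000.0, emissivity=0.8, area_m2=0.03):
    """Explicit-Euler integration of C * dT/dt = Q_in - eps * sigma * A * T^4."""
    temp_k, heater_on, history = temp0_k, False, []
    for i in range(steps):
        t_s = i * dt_s
        # Assumption: the last eclipse_s seconds of each orbit are in shadow.
        in_eclipse = (t_s % orbit_s) >= (orbit_s - eclipse_s)
        heater_on = heater_command(temp_k, heater_on, t_on_k, t_off_k)
        q_in_w = internal_w
        if not in_eclipse:
            q_in_w += solar_w
        if heater_on:
            q_in_w += heater_w
        q_out_w = emissivity * SIGMA * area_m2 * temp_k ** 4
        temp_k += dt_s * (q_in_w - q_out_w) / heat_capacity_j_per_k
        history.append((t_s, temp_k, heater_on))
    return history


# Sanity check: with every heat input zeroed the node can only radiate,
# so the temperature must fall at every step.
cooling = [temp for _, temp, _ in simulate(290.0, 200, solar_w=0.0,
                                           internal_w=0.0, heater_w=0.0)]
assert all(later < earlier for earlier, later in zip(cooling, cooling[1:]))
```

The gap between `t_on_k` and `t_off_k` is what prevents thrashing: inside the band the heater keeps its previous state instead of toggling on every sample.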

Books That Will Help

Topic	Book	Chapter
Thermal control	“Fundamentals of Space Systems”	Thermal Systems
Power coupling	“Spacecraft Systems Engineering”	Power + Thermal

Common Pitfalls & Debugging

Problem: “Temperature diverges”

Why: Incorrect sign in the heat balance equation, or an explicit integration time step that is too large for the model.
Fix: Verify each term with simple cases.
Quick test: Run with zero solar input and verify cooling.

Definition of Done

Simulates temperature for multiple orbits
Models eclipse transitions
Heater control with hysteresis

Project 10: Payload Image Compressor (Space JPEG)

Main Programming Language: C
Alternative Programming Languages: Python
Coolness Level: Level 3: The Bandwidth Magician
Business Potential: Medium. Payload data pipelines
Difficulty: Level 3: Advanced
Knowledge Area: Payload Ops / Data Compression
Software or Tool: DCT-based compressor
Main Book: Image processing text

What you’ll build: A lightweight image compressor suitable for low-power downlinks.

Why it teaches satellite FSW: Payload data must be reduced to fit limited bandwidth.

Core challenges you’ll face:

DCT + quantization -> Lossy compression design
Packetization -> Segment compressed data
Quality metrics -> PSNR calculation

Real World Outcome

A CLI that compresses an image and reports size savings.

$ ./img_compress input.pgm output.bin
[OK] Original: 1024x1024 (1.0 MB)
[OK] Compressed: 120 KB (8.3x reduction)
[PSNR] 34.2 dB

The Core Question You’re Answering

“How do you send useful images over tiny bandwidth?”

Concepts You Must Understand First

DCT & Quantization
- How does quantization discard high-frequency DCT coefficients?
- Book Reference: Image processing text
Packetization
- How do you segment compressed output?
- Book Reference: CCSDS packetization
Quality Metrics
- What is PSNR?
- Book Reference: Image processing text

Questions to Guide Your Design

What block size should you use?
How will you store quantization tables?
How will you recover from missing packets?

Thinking Exercise

If you lose 10% of packets, how would you reconstruct a partial image?

The Interview Questions They’ll Ask

“Why use DCT for compression?”
“How do you balance quality and size?”
“How do you handle packet loss?”

Hints in Layers

Hint 1: Start with a small grayscale image.

Hint 2: Implement 8x8 DCT blocks.

Hint 3: Add quantization tables.

Hint 4: Add run-length encoding for zeros.
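
As a starting point for Hints 2 and 3, the sketch below runs one 8x8 block through a forward DCT, quantization, and reconstruction, then reports PSNR. It is pure Python for clarity, not speed. The single uniform quantization step is a simplifying assumption (JPEG-style coders use a per-frequency table), and the test block is synthetic.

```python
import math

N = 8  # block size

# Orthonormal DCT-II basis: BASIS[k][n] = a(k) * cos((2n + 1) * k * pi / (2N))
BASIS = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
          * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def dct2(block):
    """Forward 2D DCT of an 8x8 block: BASIS * block * BASIS^T."""
    rows = [[sum(BASIS[k][n] * block[n][c] for n in range(N))
             for c in range(N)] for k in range(N)]
    return [[sum(rows[r][n] * BASIS[k][n] for n in range(N))
             for k in range(N)] for r in range(N)]


def idct2(coeffs):
    """Inverse 2D DCT: BASIS^T * coeffs * BASIS."""
    rows = [[sum(BASIS[k][n] * coeffs[k][c] for k in range(N))
             for c in range(N)] for n in range(N)]
    return [[sum(rows[r][k] * BASIS[k][c] for k in range(N))
             for c in range(N)] for r in range(N)]


def quantize(coeffs, step):
    """Uniform quantization (assumption: one step size for all frequencies)."""
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit data."""
    mse = sum((a - b) ** 2 for ro, rr in zip(original, reconstructed)
              for a, b in zip(ro, rr)) / (N * N)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Synthetic gradient block standing in for real image data.
block = [[16 * r + 8 * c for c in range(N)] for r in range(N)]
levels = quantize(dct2(block), step=16)
restored = idct2(dequantize(levels, step=16))
print(f"zero coefficients: {sum(q == 0 for row in levels for q in row)}/64")
print(f"PSNR: {psnr(block, restored):.1f} dB")
```

The count of zero coefficients is what Hint 4's run-length encoding exploits: a larger step zeroes more coefficients and shrinks the output, at the cost of a lower PSNR.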

Books That Will Help

Topic	Book	Chapter
Image processing	Standard text	DCT and compression
Telemetry	CCSDS 133.0-B-1	Packetization

Common Pitfalls & Debugging

Problem: “Output is garbage”

Why: DCT or inverse DCT incorrect.
Fix: Validate with known test images.
Quick test: Reconstruct and compare with original.

Definition of Done

Compresses images by a ratio of at least 4x
Reconstructs a viewable image from the compressed output
Reports PSNR

Project 11: FDIR Watchdog (The Dead Man’s Switch)

Main Programming Language: C
Alternative Programming Languages: Python
Coolness Level: Level 3: The Survivor
Business Potential: Medium. Reliability patterns
Difficulty: Level 3: Advanced
Knowledge Area: Fault Management
Software or Tool: Watchdog + recovery logic
Main Book: “Making Embedded Systems”

What you’ll build: A watchdog and fault recovery system that resets subsystems on failure.

Why it teaches satellite FSW: Autonomous recovery is essential when you only get short passes.

Core challenges you’ll face:

Heartbeat monitoring -> Task liveness
Escalation ladder -> Reset vs safe mode
Logging -> Fault reason codes

Real World Outcome

A simulator showing watchdog resets and safe mode entry.

$ ./fdir_watchdog --sim
[OK] All tasks alive
[FAULT] COMMS task stalled
[RECOVER] Rebooted radio
[FAULT] Battery low
[SAFE] Entering safe mode

The Core Question You’re Answering

“How does a spacecraft recover when the software is hung?”

Concepts You Must Understand First

Watchdog Timers
- How do they detect hangs?
- Book Reference: Making Embedded Systems - reliability sections
FDIR Policies
- What actions are safe vs risky?
- Book Reference: NASA GSFC-HDBK-8007
Safe Mode
- What must remain on?
- Book Reference: Space Mission Engineering ops sections

Questions to Guide Your Design

What is the timeout for each task?
When do you reboot vs enter safe mode?
How do you avoid reboot loops?

Thinking Exercise

Design a recovery tree: first restart the task, then reboot the subsystem, then enter safe mode.

The Interview Questions They’ll Ask

“What is the difference between hardware and software watchdogs?”
“How do you avoid cascading resets?”
“What triggers safe mode?”

Hints in Layers

Hint 1: Start with a global watchdog timer.

Hint 2: Add per-task heartbeat counters.

Hint 3: Implement escalating recovery actions.

Hint 4: Log all resets with reason codes.
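
The four hints can be combined into a small heartbeat monitor. The sketch below is a simulation-level illustration, not flight code: the class and field names, the three-step ladder, and the 60-second cooldown are all assumptions chosen for the example. Time is passed in explicitly so the logic can be tested deterministically.

```python
from dataclasses import dataclass

# Escalation ladder, mildest action first.
LADDER = ("RESTART_TASK", "REBOOT_SUBSYSTEM", "SAFE_MODE")


@dataclass
class TaskState:
    timeout_s: float
    last_beat_s: float
    strikes: int = 0           # recovery attempts not yet forgiven
    last_fault_s: float = 0.0


class Watchdog:
    def __init__(self, cooldown_s=60.0):
        self.cooldown_s = cooldown_s
        self.tasks = {}
        self.log = []  # (time_s, task, action) reason records

    def register(self, name, timeout_s, now_s=0.0):
        self.tasks[name] = TaskState(timeout_s, now_s)

    def heartbeat(self, name, now_s):
        task = self.tasks[name]
        task.last_beat_s = now_s
        # Forgive past faults only after a quiet cooldown, so a task that
        # beats once after every restart still climbs the ladder.
        if task.strikes and now_s - task.last_fault_s >= self.cooldown_s:
            task.strikes = 0

    def check(self, now_s):
        """Return the recovery actions due at now_s and log each one."""
        actions = []
        for name, task in self.tasks.items():
            if now_s - task.last_beat_s > task.timeout_s:
                action = LADDER[min(task.strikes, len(LADDER) - 1)]
                task.strikes += 1
                task.last_fault_s = now_s
                task.last_beat_s = now_s  # give the recovery one full timeout
                self.log.append((now_s, name, action))
                actions.append((name, action))
        return actions


wd = Watchdog(cooldown_s=60.0)
wd.register("COMMS", timeout_s=5.0)
wd.heartbeat("COMMS", now_s=4.0)
for now_s in (10.0, 16.0, 22.0):  # COMMS never beats again
    print(now_s, wd.check(now_s))
```

On real hardware the strike counter would have to survive a processor reset (for example in non-volatile memory), otherwise a full reboot silently restarts the ladder from the bottom.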

Books That Will Help

Topic	Book	Chapter
Embedded reliability	“Making Embedded Systems”	Reliability
FDIR	NASA GSFC-HDBK-8007	Fault management

Common Pitfalls & Debugging

Problem: “Infinite reboot loop”

Why: The recovery action retriggers the same fault.
Fix: Add cooldown timers and safe mode fallback.
Quick test: Inject a persistent fault and ensure safe mode triggers.

Definition of Done

Detects stalled tasks
Resets subsystems with escalation
Enters safe mode on persistent faults

Project 12: Ground Station Command Console (The HMI)

Main Programming Language: TypeScript/React
Alternative Programming Languages: Python (CLI)
Coolness Level: Level 3: The Operator
Business Potential: Medium. Ground segment UX
Difficulty: Level 2: Intermediate
Knowledge Area: Ground Operations
Software or Tool: Web console + telemetry decoder
Main Book: “Clean Code” by Robert C. Martin

What you’ll build: A web dashboard that decodes telemetry and sends commands with validation.

Why it teaches satellite FSW: Ground operations are the other half of the system, and flight software must be designed around their constraints.

Core challenges you’ll face:

Telemetry decoding -> CCSDS parsing
Command validation -> Safety gating
Pass scheduling -> Visibility windows

Real World Outcome

A web UI showing system health, pass timers, and a command console.

What you will see:

A health panel with battery SoC, temperature, and mode.
A pass countdown timer with next visibility window.
A command console that validates commands before sending.

The Core Question You’re Answering

“How do you safely control a spacecraft you can only talk to briefly?”

Concepts You Must Understand First

Telemetry Decoding
- How do you parse CCSDS packets?
- Book Reference: CCSDS 133.0-B-1
Command Validation
- How do you prevent unsafe commands?
- Book Reference: Clean Code - input validation
Pass Scheduling
- How do you compute pass windows?
- Book Reference: Orbit mechanics references

Questions to Guide Your Design

What telemetry is always shown on the dashboard?
How do you authenticate commands?
How do you handle partial packet loss?

Thinking Exercise

Design an operator workflow for a 7-minute pass: what is shown, what actions are allowed, and in what order?

The Interview Questions They’ll Ask

“What safety features should a command console have?”
“How do you decode telemetry in real time?”
“How do you schedule commands for a pass?”

Hints in Layers

Hint 1: Start with a mock telemetry JSON stream.

Hint 2: Decode CCSDS headers and display APID data.

Hint 3: Add command validation rules (no payload commands in SAFE mode).

Hint 4: Add pass timer using orbit predictions.
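
For Hint 2, the CCSDS Space Packet primary header is six bytes: a 3-bit version, 1-bit packet type, 1-bit secondary header flag, 11-bit APID, 2-bit sequence flags, 14-bit sequence count, and a 16-bit length field that stores the data field size minus one. The sketch below decodes it in Python (the project's CLI alternative language); the function name, dictionary keys, and sample bytes are illustrative choices.

```python
import struct


def decode_primary_header(packet: bytes) -> dict:
    """Decode the 6-byte CCSDS Space Packet primary header (big-endian)."""
    if len(packet) < 6:
        raise ValueError("need at least 6 bytes for the primary header")
    word0, word1, length_field = struct.unpack(">HHH", packet[:6])
    return {
        "version": word0 >> 13,
        "type": (word0 >> 12) & 0x1,       # 0 = telemetry, 1 = telecommand
        "sec_hdr": (word0 >> 11) & 0x1,    # 1 = secondary header present
        "apid": word0 & 0x7FF,
        "seq_flags": word1 >> 14,          # 3 = unsegmented
        "seq_count": word1 & 0x3FFF,
        "data_len": length_field + 1,      # field stores (octets - 1)
    }


# Hand-built sample: telemetry, APID 100, sequence count 42, 4 data bytes.
sample = bytes([0x08, 0x64, 0xC0, 0x2A, 0x00, 0x03]) + b"\x01\x02\x03\x04"
header = decode_primary_header(sample)
print(header)
assert len(sample) == 6 + header["data_len"]  # complete-packet check
```

Comparing `6 + data_len` against the bytes actually received is a cheap first defense against truncated packets before any payload decoding is attempted.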

Books That Will Help

Topic	Book	Chapter
Input validation	“Clean Code” by Robert C. Martin	Ch. 5
Ops concepts	“Space Mission Engineering”	Ops chapters

Common Pitfalls & Debugging

Problem: “Commands sent outside pass”

Why: Commands are not gated by the pass schedule.
Fix: Disable the command UI when the satellite is not visible.
Quick test: Simulate pass start/stop and verify UI state.
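
The gating logic itself is small enough to test without any UI. The sketch below combines the pass-window check with the SAFE-mode rule from Hint 3. The command names and the `(start_s, end_s)` window format are hypothetical, chosen only for illustration.

```python
# Hypothetical command names; a real mission defines these in its command database.
PAYLOAD_COMMANDS = {"PAYLOAD_CAPTURE", "PAYLOAD_DOWNLINK"}


def validate_command(command, mode, now_s, pass_windows):
    """Return (allowed, reason) for a command at time now_s.

    pass_windows is a list of (start_s, end_s) visibility intervals.
    """
    in_pass = any(start_s <= now_s < end_s for start_s, end_s in pass_windows)
    if not in_pass:
        return False, "no visibility window"
    if mode == "SAFE" and command in PAYLOAD_COMMANDS:
        return False, "payload commands blocked in SAFE mode"
    return True, "ok"


windows = [(1000, 1420)]  # one simulated 7-minute pass
print(validate_command("PAYLOAD_CAPTURE", "NOMINAL", 900, windows))
print(validate_command("PAYLOAD_CAPTURE", "SAFE", 1100, windows))
print(validate_command("SET_MODE_NOMINAL", "SAFE", 1100, windows))
```

Returning a reason string alongside the decision lets the console show the operator why a command was rejected, not just that it was.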

Definition of Done

Dashboard updates in real time
Commands validated before send
Pass windows displayed correctly

Project 13: Full Mission Simulator (The Digital Twin)

Main Programming Language: Python
Alternative Programming Languages: C/C++
Coolness Level: Level 5: The Mission Architect
Business Potential: High. Full mission simulation tooling
Difficulty: Level 4: Expert
Knowledge Area: Systems Integration
Software or Tool: Integrated simulator
Main Book: “Spacecraft Systems Engineering” by Fortescue

What you’ll build: A unified simulator integrating orbit propagation, EPS, ADCS, telemetry, and ground operations into a single digital twin. The simulator runs a full day of mission operations, schedules passes, and verifies safe mode transitions.

Why it teaches satellite FSW: Integration is the ultimate test. This project proves your subsystems work together under realistic constraints.

Core challenges you’ll face:

Interface contracts -> Ensure subsystems agree on units and timing
Timing integration -> Coordinated scheduler across models
End-to-end validation -> Telemetry + command loop

Real World Outcome

A full-day simulation run with logs, plots, and pass reports.

$ python mission_sim.py --hours 24
[SIM] Started 24h run
[PASS] 12 passes scheduled
[SAFE] Entered safe mode at t=03:41 due to low SoC
[RECOVER] Returned to NOMINAL at t=04:20
[OUT] mission_log.json, power_plot.png, attitude_plot.png

The Core Question You’re Answering

“Can all subsystems work together without violating mission constraints?”

Concepts You Must Understand First

Subsystem Contracts
- How do you define interfaces and units?
- Book Reference: Fundamentals of Software Architecture - Ch. 6
Time Integration
- How do you synchronize models with different rates?
- Book Reference: CCSDS time guidance
Operational Constraints
- How do you enforce safe mode and load shedding across the system?
- Book Reference: Spacecraft Systems Engineering - systems integration
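
One common way to synchronize models that run at different rates is a single master tick, with each model stepped every fixed number of ticks. The sketch below uses integer milliseconds so that simulated time never accumulates floating-point drift. The function name, the model rates, and the callback signature are assumptions for illustration.

```python
def run_multirate(duration_ms, base_dt_ms, models):
    """Step each model every `period_ticks` base ticks.

    models maps a name to (period_ticks, step_fn), and step_fn is called
    as step_fn(t_ms, dt_ms) with that model's own step size.
    """
    for tick in range(duration_ms // base_dt_ms):
        t_ms = tick * base_dt_ms
        for period_ticks, step_fn in models.values():
            if tick % period_ticks == 0:
                step_fn(t_ms, period_ticks * base_dt_ms)


calls = {"adcs": 0, "eps": 0, "thermal": 0}


def make_counter(name):
    def step(t_ms, dt_ms):
        calls[name] += 1  # a real model would advance its state by dt_ms
    return step


# One simulated minute on a 100 ms master tick.
run_multirate(60_000, 100, {
    "adcs": (1, make_counter("adcs")),          # 10 Hz
    "eps": (10, make_counter("eps")),           # 1 Hz
    "thermal": (100, make_counter("thermal")),  # 0.1 Hz
})
print(calls)
```

Because dictionary order is preserved, the models are stepped in a fixed order on every shared tick, which keeps runs reproducible for regression testing.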

Questions to Guide Your Design

What is the simulation time step for each subsystem?
How do you log and visualize cross-subsystem behavior?
How do you validate that the mission schedule is feasible?

Thinking Exercise

Design a 24-hour mission timeline with payload, comms, and safe mode contingencies. Where do you expect the tightest constraints?

The Interview Questions They’ll Ask

“How do you validate integration between EPS and ADCS?”
“How do you ensure timing consistency across models?”
“What is your strategy for regression testing the simulator?”

Hints in Layers

Hint 1: Start by connecting orbit + EPS only.

Hint 2: Add ADCS and verify that sun-pointing keeps the orbit-average power margin positive, so SoC does not trend downward from orbit to orbit.

Hint 3: Add telemetry scheduler and ground passes.

Hint 4: Inject faults and verify FDIR behavior.
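
A first integration step in the spirit of Hint 1 might look like the sketch below: a fixed eclipse fraction stands in for the orbit model, and a battery integrator drives safe mode entry and exit. All numbers (battery size, loads, solar power, thresholds, orbit and eclipse durations) are placeholder assumptions, and the eclipse pattern would be replaced by the real propagator output.

```python
def simulate_eps(hours, dt_s=60, battery_wh=20.0, soc0=0.6, solar_w=6.0,
                 nominal_load_w=5.0, safe_load_w=1.5,
                 orbit_s=5400, eclipse_s=2100,
                 safe_enter=0.3, safe_exit=0.5):
    """Integrate SoC over time and record NOMINAL/SAFE mode transitions."""
    soc, mode, events = soc0, "NOMINAL", []
    for step in range(int(hours * 3600 / dt_s)):
        t_s = step * dt_s
        # Assumption: the first part of each orbit is sunlit, the rest is eclipse.
        sunlit = (t_s % orbit_s) < (orbit_s - eclipse_s)
        load_w = nominal_load_w if mode == "NOMINAL" else safe_load_w
        net_w = (solar_w if sunlit else 0.0) - load_w
        soc = min(1.0, max(0.0, soc + net_w * dt_s / 3600.0 / battery_wh))
        # Separate entry and exit thresholds keep the mode from chattering.
        if mode == "NOMINAL" and soc < safe_enter:
            mode = "SAFE"
            events.append((t_s, "SAFE"))
        elif mode == "SAFE" and soc > safe_exit:
            mode = "NOMINAL"
            events.append((t_s, "NOMINAL"))
    return soc, events


final_soc, events = simulate_eps(24)
print(f"final SoC: {final_soc:.2f}, transitions: {len(events)}")
```

With these defaults the nominal load exceeds the orbit-average generation, so the run is expected to cycle through safe mode; lowering `nominal_load_w` below the orbit-average solar input should remove the transitions entirely.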

Books That Will Help

Topic	Book	Chapter
Systems integration	“Spacecraft Systems Engineering”	Integration chapters
Architecture tradeoffs	“Fundamentals of Software Architecture”	Ch. 6-8

Common Pitfalls & Debugging

Problem: “Subsystems disagree on units”

Why: Mixed radians/degrees or seconds/minutes.
Fix: Enforce unit annotations and conversions.
Quick test: Add unit tests for each interface.
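
One lightweight way to enforce units at an interface is to pass small typed values in place of bare floats, so the unit is named wherever a number is created or read. The sketch below is one possible pattern; the `Angle` and `Duration` classes and the `slew_time` function are hypothetical examples, not part of any existing library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Angle:
    radians: float  # canonical internal unit

    @classmethod
    def from_degrees(cls, degrees):
        return cls(math.radians(degrees))

    @property
    def degrees(self):
        return math.degrees(self.radians)


@dataclass(frozen=True)
class Duration:
    seconds: float  # canonical internal unit

    @classmethod
    def from_minutes(cls, minutes):
        return cls(minutes * 60.0)


def slew_time(angle, rate_rad_per_s):
    """Time to slew through `angle` at a constant rate; rejects bare numbers."""
    if not isinstance(angle, Angle):
        raise TypeError("angle must be an Angle, not a bare number")
    return Duration(angle.radians / rate_rad_per_s)


# Interface test: 90 degrees at 1 degree per second should take 90 seconds.
result = slew_time(Angle.from_degrees(90.0), math.radians(1.0))
assert math.isclose(result.seconds, 90.0)
```

The unit still appears in plain parameter names such as `rate_rad_per_s`; wrapping every quantity is a tradeoff between safety and verbosity that each interface can decide separately.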

Definition of Done

Runs a 24-hour simulation without crashes
Generates logs and plots for power, attitude, passes
Demonstrates safe mode entry/exit
Validates telemetry + command loop end-to-end

Summary

This guide turns satellite flight software into a concrete engineering skill. By completing these projects, you will have a full-stack understanding of how CubeSat missions survive, communicate, and deliver value under severe constraints.

Expected Outcomes

You can design and implement a complete FSW stack with safe mode logic.
You can validate telemetry protocols and scheduling under bandwidth limits.
You can integrate orbit, power, thermal, and ADCS models into a mission simulator.