Sprint: Jido Elixir AI Agents Mastery - Real World Projects
Goal: Build first-principles mastery of agent engineering in Elixir using the Jido ecosystem (jido, jido_ai, req_llm) and core BEAM capabilities. You will learn to design deterministic agent cores, model side effects as directives, and run autonomous workflows on fault-tolerant supervision trees. You will also learn how to combine reasoning strategies (ReAct, Chain-of-Thought, Tree-of-Thoughts, Graph-of-Thoughts, Adaptive) with robust tool contracts, streaming, observability, and cost controls. By the end, you will be able to design, test, and operate production-style AI agent systems that recover from failure, survive node instability, and remain explainable under pressure.
Introduction
- What is this topic? It is the intersection of LLM systems engineering and BEAM-native reliability engineering, centered on Jido's v2 architecture (Actions with run/2 + StateOps + Directives + AgentServer runtime execution).
- What problem does it solve today? It closes the gap between "demo agents" and production systems by combining deterministic core logic (Actions returning results, StateOps for in-strategy state mutations, and Directives for external effects) with asynchronous side-effect execution, supervision, and telemetry.
- What will you build? 20 projects that progress from single-agent tool use to distributed multi-agent autonomous systems with safety, persistence, and release discipline.
- In scope: Jido Action system and Plan DAGs, Jido.AI strategy patterns, ReqLLM multi-provider abstraction, Plugin/Skill composition, Signal Bus/Router/Dispatch/Journal, BEAM supervision/distribution, production operations.
- Out of scope: training foundation models from scratch, full MLOps platform design, non-BEAM runtime internals.
+------------------------------+
| Human / API |
| Goals, Constraints, |
| Approval Decisions |
+---------------+--------------+
|
v
+--------------------+ +-----------+---------------+ +------------------------+
| Signal Bus/Router +------>+ Jido Agent Core +------>+ Directive Queue |
| HTTP/PubSub/Cron | | Action.run(params, ctx) | | LLMStream/ToolExec/ |
| Dispatch adapters | | -> {:ok, result} | | LLMGenerate/LLMEmbed/ |
+--------------------+ | -> {:ok, result, dirs} | | EmitToolError/Emit |
| StateOps: Set/Replace/ | +-----------+------------+
| Delete/SetPath/DelPath | |
+-----------+---------------+ v
| +---------+-----------+
| | Runtime Executor |
| | AgentServer GenSrv |
| +---------+-----------+
| |
v v
+-----------+-------------+ +-----------+-----------+
| Agent State Snapshot | | External Effects |
| status, memory, metrics | | LLM/API/DB/Tools |
+-------------------------+ +-----------+-----------+
|
v
+------------+-----------+
| Observability + Cost |
| Telemetry + Usage + SLA |
+-------------------------+
How to Use This Guide
- Read the Theory Primer before building projects. The projects assume the mental models from that section.
- Choose one learning path in Recommended Learning Paths based on your goal.
- Build each project with a strict Definition of Done and collect evidence (logs, traces, deterministic transcripts).
- Treat every project as a production rehearsal: include failure tests, timeout paths, and rollback behavior.
Prerequisites & Background Knowledge
Essential Prerequisites (Must Have)
- Elixir basics: modules, pattern matching, structs, processes, OTP applications.
- HTTP and JSON fundamentals, API auth keys, and schema validation basics.
- Basic LLM API familiarity (prompts, context windows, temperature, token budgets).
- Recommended Reading: “Designing Elixir Systems with OTP” by James Edward Gray II and Bruce A. Tate.
Helpful But Not Required
- Phoenix + LiveView basics (you will use this in projects 4 and 15).
- Distributed Erlang fundamentals (you will learn this deeply in projects 13 and 20).
- Telemetry/OpenTelemetry familiarity (you will apply this in projects 6, 17, and 20).
Self-Assessment Questions
- Can you explain why “process per request” can work on BEAM but fails on many thread-based runtimes?
- Can you design a JSON-schema-like contract for a tool call and reason about invalid arguments?
- Can you describe how a supervisor should react to repeated child crashes?
Development Environment Setup
Required Tools:
- Elixir ~> 1.17
- Erlang/OTP ~> 27 or newer
- GitHub CLI (gh) for repository inspection
- Docker (recommended for reproducible service dependencies)
Recommended Tools:
- Phoenix + LiveView stack for interactive projects
- OpenTelemetry collector for local observability experiments
- A managed or self-hosted Redis/Postgres pair for selected projects
Testing Your Setup:
$ elixir --version
Erlang/OTP 27 or 28
Elixir 1.17.x or newer
$ gh --version
gh version 2.x
$ mix --version
Mix 1.17.x (compiled with Erlang/OTP 27+)
Time Investment
- Simple projects: 4-8 hours each
- Moderate projects: 10-20 hours each
- Complex projects: 20-40 hours each
- Total sprint: 4-8 months part-time
Important Reality Check Agent systems fail in ways CRUD systems do not: loops, tool misuse, stale context, token overrun, and cascading retries. You will learn fastest by intentionally injecting failures and proving recovery behavior. If you only test happy-path prompts, you will not build production intuition.
Big Picture / Mental Model
A Jido-based system should be understood as two coupled but separated machines:
- Machine A (deterministic): Actions implement run(params, context) returning {:ok, result}, {:ok, result, directives}, or {:error, error}. StateOps (SetState, ReplaceState, DeleteKeys, SetPath, DeletePath) handle in-strategy state mutations. This layer is testable without processes.
- Machine B (effectful): The AgentServer GenServer processes signals, routes to strategies, and executes directives (LLMStream, ToolExec, LLMGenerate, LLMEmbed, EmitToolError, EmitRequestError) under supervision. Outcomes re-enter as signals.
Machine A (Deterministic) Machine B (Effectful)
+------------------------------------------------+ +----------------------------------------+
| Inputs: signal/action params + context | | Inputs: directives + runtime context |
| | | |
| 1) Action.run(params, context) | | 1) call LLM/provider/tool |
| 2) Schema validation pipeline | | 2) spawn/stop child processes |
| (before_validate -> schema -> | | 3) schedule/cancel delayed work |
| after_validate -> run -> | | 4) emit outcome signals via Bus |
| validate_output -> output_schema) | | |
| 3) StateOps for in-strategy state changes | | AgentServer processes signals and |
| 4) Directives for external effects (data only) | | routes to strategies via Router |
| | | |
| Output: {:ok, result} | {:ok, result, dirs} | | Output: external effects + feedback |
+-------------------+----------------------------+ +------------------+---------------------+
| |
+-------------------------feedback signals-----------+
This split is the key to scaling complexity: deterministic logic stays readable and testable, while runtime behavior remains observable and controllable by OTP semantics. The Jido.Exec module provides the execution engine with timeout, retries, and backoff for running Action-based Plans (DAGs of Instructions with dependency resolution).
Theory Primer
Concept 1: Deterministic Agent Core (Actions, StateOps, Directives) and AgentServer Runtime
Fundamentals
Jido formalizes an important separation that many agent frameworks blur: state transition logic is deterministic and explicit, while side effects are deferred and described as directives. In v2, the core contract is built around the Jido.Action behaviour (defined in the jido_action package): actions implement run(params, context) and return {:ok, result}, {:ok, result, directives}, or {:error, error}. State mutations within a strategy are handled by StateOps (Jido.Agent.StateOp.SetState, ReplaceState, DeleteKeys, SetPath, DeletePath – with helper constructors StateOp.set_state/1, StateOp.replace_state/1, StateOp.delete_keys/1, StateOp.set_path/2, StateOp.delete_path/1). The Jido.Agent.StateOps module provides apply_result/2 for deep-merging action results and apply_state_ops/2 which separates StateOps from external directives: it reduces a list of structs, applies state operations to the agent, and collects non-StateOp structs as external directives to return.
External side effects are expressed as two directive families. Core directives (from Jido.Agent.Directive) handle BEAM-level operations: Emit (dispatch a signal via Jido.Signal.Dispatch), Error (wrap a Jido.Error.t()), Spawn (fire-and-forget BEAM child), SpawnAgent (child agent with parent-child hierarchy tracking, monitors, and children map), StopChild (graceful child stop by tag), Schedule (delayed message via delay_ms), Stop (stop self), Cron (recurring schedule via cron expression), and CronCancel (stop a recurring job by job_id). AI directives (from Jido.AI.Directive) handle LLM operations: LLMStream (streaming generation with id, model, context, tools, tool_choice, max_tokens, temperature, timeout, metadata), ToolExec (tool execution with id, tool_name, action_module, arguments, context), LLMGenerate (non-streaming generation), LLMEmbed (embedding generation with model, texts, dimensions), EmitToolError (immediate error for unknown tools – prevents Machine deadlock), and EmitRequestError (immediate error when agent is busy). All directives implement the Jido.AgentServer.DirectiveExec protocol for polymorphic execution.
The AgentServer GenServer (at Jido.AgentServer) processes signals, routes them to strategies via Jido.Signal.Router, and executes directives through its drain loop. Its public API: start/1, start_link/1, call/3 (sync signal), cast/2 (async signal), state/1, status/1, await_completion/2 (event-driven wait for terminal status), stream_status/2, attach/2/detach/2/touch/1 (lifecycle attachment for LiveView sockets), set_debug/2, recent_events/2. This makes the agent itself testable as pure behavior. You can reason about invariants like terminal status, retry counters, safety flags, and tool budget without needing processes, network, or provider mocks.
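To ground that API, here is a minimal sketch of driving an agent through the AgentServer surface described above. The DemoAgent module, the start options, and the signal payload are assumptions for illustration; exact option names and return shapes may differ between Jido versions.

# Minimal sketch, assuming a DemoAgent module defined elsewhere with `use Jido.Agent`.
{:ok, pid} = Jido.AgentServer.start_link(agent: DemoAgent, id: "demo-1")

# Wrap the request in a CloudEvents-style signal and send it asynchronously.
{:ok, signal} =
  Jido.Signal.new(%{type: "react.input", source: "/demo", data: %{query: "What is 7 * 6?"}})

:ok = Jido.AgentServer.cast(pid, signal)

# Event-driven wait for a terminal status ("completed" or "error"); timeout in ms (argument shape assumed).
{:ok, final_state} = Jido.AgentServer.await_completion(pid, 30_000)
IO.inspect(Jido.AgentServer.status(pid), label: "status")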
Deep Dive The deterministic core pattern matters because AI systems are stochastic at the model edge but do not need to be stochastic everywhere. If your entire architecture is probabilistic, incidents become impossible to debug. Jido’s Action-based pattern gives you a deterministic center: state transitions are ordinary data transformations. Even when a model output is uncertain, your handling of that output does not need to be. For example, a model can request a tool call with malformed arguments; your deterministic layer can reject, normalize, retry, or escalate based on explicit policies. This is exactly where reliability is won.
The Action system (defined with use Jido.Action) provides a rich validation pipeline with six overridable lifecycle hooks: on_before_validate_params/1 -> schema validation (via Zoi or NimbleOptions schemas) -> on_after_validate_params/1 -> run/2 -> on_after_run/1 -> on_before_validate_output/1 -> output schema validation -> on_after_validate_output/1. Actions also support compensation (for rollback via on_error/4), with configurable compensation: %{enabled: true, max_retries: N, timeout: N}. The Jido.Action.Tool.to_tool/0 callback converts any Action into a JSON-schema tool definition compatible with OpenAI function calling and similar LLM tool formats, bridging deterministic Elixir code with LLM tool calling. The Action config is validated at compile time using Zoi schemas – invalid configs raise CompileError before the module ever loads.
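As a sketch of how those hooks compose, the Action below normalizes input before validation and compensates on failure. The module, fields, and hook return shapes are illustrative assumptions that follow the pipeline described above; exact callback signatures may differ slightly.

defmodule NormalizeDate do
  use Jido.Action,
    name: "normalize_date",
    description: "Parses a date string into an ISO 8601 date",
    schema: [date: [type: :string, required: true]],
    compensation: %{enabled: true, max_retries: 2, timeout: 5_000}

  # Runs before schema validation: coerce sloppy input into the declared shape (return shape assumed).
  def on_before_validate_params(params) do
    {:ok, Map.update(params, :date, nil, &String.trim/1)}
  end

  def run(%{date: date}, _context) do
    case Date.from_iso8601(date) do
      {:ok, parsed} -> {:ok, %{date: parsed}}
      {:error, reason} -> {:error, reason}
    end
  end

  # Called on failure; report that rollback/cleanup completed (argument order assumed).
  def on_error(failed_params, _error, _context, _opts) do
    {:ok, %{compensated: true, params: failed_params}}
  end
end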
A common anti-pattern in agent implementations is putting remote calls directly inside runtime callbacks, then mutating in-memory state opportunistically. That pattern couples latency, error handling, and business rules into one untestable thread. Jido’s directive approach decouples these concerns. The core emits typed intention structs. For example, the AI directive LLMStream carries fields id (call correlation), model (e.g. "anthropic:claude-haiku-4-5"), model_alias (e.g. :fast, resolved via Jido.AI.resolve_model/1), system_prompt, context (conversation messages), tools (list of ReqLLM.Tool.t()), tool_choice (:auto | :none | {:required, name}), max_tokens, temperature, and timeout. Similarly, ToolExec carries id, tool_name, action_module (direct module execution bypassing Registry), arguments, and context. The AgentServer runtime executes directives via the DirectiveExec protocol dispatch – each directive type implements exec/3 which receives (directive, input_signal, state). Execution is async: LLMStream and ToolExec spawn tasks under a per-agent Task.Supervisor, and results re-enter as signals (react.llm.response, react.tool.result) through AgentServer.cast/2.
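To make those directive shapes tangible, here is a hedged sketch of constructing the structs directly with the fields listed above. The concrete values, the Calculator module, and the exact field set accepted by new!/1 are assumptions.

alias Jido.AI.Directive.{LLMStream, ToolExec}

llm =
  LLMStream.new!(%{
    id: "call_42",                          # call correlation id
    model: "anthropic:claude-haiku-4-5",    # or a model_alias such as :fast
    context: [%{role: :user, content: "Total the order"}],
    tools: [],                              # list of ReqLLM.Tool.t()
    tool_choice: :auto,
    max_tokens: 512,
    temperature: 0.2,
    timeout: 30_000
  })

tool =
  ToolExec.new!(%{
    id: "tool_001",
    tool_name: "calculator",
    action_module: Calculator,              # hypothetical Action module
    arguments: %{expression: "19.99 * 3"},
    context: %{}
  })

# Returned from an Action as data describing future effects:
{:ok, %{status: "awaiting_llm"}, [llm, tool]}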
The invariants are strong and practical:
- Action run/2 results are complete at that point in logical time; StateOps are applied atomically.
- Directives do not mutate already-returned state; they describe future effects.
- Runtime outcomes must re-enter through signals if they should affect future state.
This model resembles event-sourced control loops without forcing full event sourcing everywhere. You can still persist snapshots via Jido.Signal.Journal (with InMemory, ETS, or Mnesia backends), but your deterministic contract remains simple enough for property-based thinking. For complex multi-agent systems, this contract is essential. Without it, child lifecycle events, tool completion events, and retry loops create hidden state transitions.
Directive semantics also improve security posture. When effects are explicit data structs validated at construction time (each Directive uses Zoi.struct/3 schemas with new!/1 constructors that raise on invalid data), you can inspect, filter, and gate them. Core directives provide helper constructors: Directive.emit/2, Directive.spawn_agent/3, Directive.stop_child/2, Directive.schedule/2, Directive.cron/3, Directive.cron_cancel/1, and Directive.emit_to_parent/3 (for child-to-parent communication). A policy layer can reject high-risk ToolExec directives unless an approval bit is set. The SpawnAgent directive includes a meta field for passing context to child agents, and children are tracked by tag in the parent’s children map with process monitors for exit detection. This is harder when effects are direct function calls hidden in procedural code.
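A minimal sketch of such a policy gate, under the assumption that directives arrive as a plain list before the runtime drains them. The DirectivePolicy module and its allowlist are hypothetical; the point is that ToolExec structs can be pattern matched and filtered as data.

defmodule DirectivePolicy do
  alias Jido.AI.Directive.ToolExec

  @approved_tools MapSet.new(["price_lookup", "calculator"])

  # Partition directives into those allowed to execute and those rejected by policy.
  def filter(directives) when is_list(directives) do
    Enum.split_with(directives, &allowed?/1)
  end

  defp allowed?(%ToolExec{tool_name: name}), do: MapSet.member?(@approved_tools, name)
  defp allowed?(_other), do: true
end

# {allowed, rejected} = DirectivePolicy.filter(directives)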
At operations time, this split helps with replay and incident analysis. The AgentServer supports debug mode (set_debug/2) with an in-memory ring buffer (max 50 events) recording :signal_received and :directive_started events with monotonic timestamps. You can also replay a sequence of signals against historical state snapshots (using Jido.Signal.Journal persistence) and compare whether the same directives were emitted. Divergence indicates nondeterminism introduced accidentally. The AgentServer emits structured telemetry at [:jido, :agent_server, :signal, :start | :stop | :exception] and [:jido, :agent_server, :directive, :start | :stop | :exception], plus [:jido, :agent_server, :queue, :overflow] for queue saturation. This replay and telemetry discipline is the foundation for trustworthy autonomous systems, where you must explain why an agent acted.
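A short sketch of subscribing to those telemetry events with the standard :telemetry API. The handler id is arbitrary, and the measurement/metadata keys printed here are assumptions about what the emitting code attaches.

require Logger

:telemetry.attach_many(
  "agent-server-observability",   # hypothetical handler id
  [
    [:jido, :agent_server, :signal, :stop],
    [:jido, :agent_server, :directive, :exception],
    [:jido, :agent_server, :queue, :overflow]
  ],
  fn event, measurements, metadata, _config ->
    # Forward to your metrics pipeline; key names depend on the emitting code.
    Logger.info("#{inspect(event)} measurements=#{inspect(measurements)} id=#{inspect(metadata[:id])}")
  end,
  nil
)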
Failure modes to design around:
- Directive queue saturation: caused by large bursts or slow downstream providers.
- Stale feedback loops: when delayed tool results return after state has moved on.
- Implicit state mutation leaks: when helper functions mutate shared mutable containers outside the Action contract.
- Schema validation failures: when Action params do not match the declared schema, caught by the validation pipeline.
Design countermeasures include strict state versioning, idempotency keys for tool responses, bounded queue sizes, timeout-based demotion paths, and Action compensation for rollback. In interviews and real systems, the engineer who understands this deterministic/effectful split can usually move from prototype to production faster than teams that keep adding retries to opaque loops.
How this fits into the projects You will apply this concept in projects 1, 2, 3, 8, 16, 17, and 20.
Definitions & key terms
- Action (Jido.Action): Behaviour in the jido_action package. Modules use Jido.Action with a schema, implement run(params, context), and return {:ok, result}, {:ok, result, directives}, or {:error, error}. Six lifecycle hooks: on_before_validate_params/1, on_after_validate_params/1, on_after_run/1, on_before_validate_output/1, on_after_validate_output/1, on_error/4.
- StateOp (Jido.Agent.StateOp): In-strategy state mutation operations: SetState (deep merge), ReplaceState (wholesale), DeleteKeys (top-level), SetPath (nested set), DeletePath (nested delete). Applied by Jido.Agent.StateOps.apply_state_ops/2.
- Core Directive (Jido.Agent.Directive): BEAM-level effect structs: Emit, Error, Spawn, SpawnAgent, StopChild, Schedule, Stop, Cron, CronCancel.
- AI Directive (Jido.AI.Directive): LLM/tool effect structs: LLMStream, ToolExec, LLMGenerate, LLMEmbed, EmitToolError, EmitRequestError.
- DirectiveExec: Protocol (Jido.AgentServer.DirectiveExec) with an exec/3 callback; each directive type implements it for polymorphic execution.
- AgentServer (Jido.AgentServer): GenServer that processes signals, routes to strategies, and executes directives. Public API: start/1, start_link/1, call/3, cast/2, state/1, status/1, await_completion/2, attach/2/detach/2, set_debug/2, recent_events/2.
- Zoi: Schema validation library used for compile-time config validation and runtime struct construction (Zoi.struct/3, new!/1).
- Deterministic core: Pure state transition logic where the same inputs lead to the same outputs.
- Invariant: Condition that must stay true across state transitions.
Mental model diagram
Input Signal (via Jido.Signal.Bus)
|
v
[Jido.Signal.Router (trie matching)]
|
v
[AgentServer routes to Strategy]
|
v
[Action.run(params, ctx)] (validated via Zoi schema pipeline)
|
+-- {:ok, result}
| |
| v
| [Jido.Agent.StateOps.apply_result/2] (deep-merge into agent state)
|
+-- {:ok, result, mixed_structs}
| |
| v
| [Jido.Agent.StateOps.apply_state_ops/2]
| |
| +-- StateOps (SetState, ReplaceState, DeleteKeys, SetPath, DeletePath)
| | -> applied atomically to agent state
| |
| +-- External Directives (non-StateOp structs)
| -> enqueued for AgentServer drain loop
|
+-- {:error, error}
-> error handling / compensation
[AgentServer Drain Loop]
|
v
[DirectiveExec.exec/3 protocol dispatch]
|
+-- Core (Jido.Agent.Directive):
| Emit, Error, Spawn, SpawnAgent, StopChild,
| Schedule, Stop, Cron, CronCancel
|
+-- AI (Jido.AI.Directive):
LLMStream, ToolExec, LLMGenerate, LLMEmbed,
EmitToolError, EmitRequestError
(spawn Task under per-agent Task.Supervisor)
|
v
[Feedback Signals re-enter via AgentServer.cast/2]
react.llm.response, react.tool.result, react.llm.delta
|
v
[Next Signal Cycle]
How it works (step-by-step, with invariants and failure modes)
- A signal arrives via Jido.Signal.Bus and is matched by Jido.Signal.Router (trie-based pattern matching). AgentServer routes the signal to the appropriate strategy.
- The strategy selects and runs Action.run(params, context) through the validation pipeline.
- StateOps (if any) are applied atomically to strategy state.
- Directives (if any) are enqueued for runtime execution by AgentServer.
- Runtime outcomes re-enter as new signals through the Bus, closing the loop.
- Handle failures with explicit retry/abort state transitions and Action compensation.
Minimal concrete example
PSEUDOCODE (v2 Action pattern)
defmodule PriceLookup do
  use Jido.Action,
    name: "price_lookup",
    schema: [sku: [type: :string, required: true]]

  def run(%{sku: sku}, _context) do
    {:ok, %{price: lookup(sku)}}
  end

  # Stand-in price source; replace with a real catalog lookup.
  defp lookup(_sku), do: 42.0
end
# In strategy, Action returns result + directives:
Action.run(params, context)
=> {:ok, %{tool_calls: [%{name: "price_lookup", args: %{sku: "A1"}}]},
[%ToolExec{id: "tool_001", tool_name: "price_lookup",
action_module: PriceLookup, arguments: %{sku: "A1"}}]}
# StateOps applied in strategy:
StateOp.set_state(%{status: "awaiting_tool", pending_tool_calls: [%{id: "tool_001", name: "price_lookup", arguments: %{sku: "A1"}, result: nil}]})
Common misconceptions
- “Directives are just async function calls.” No: they are data contracts that can be audited and governed.
- “If LLM output is random, deterministic state is pointless.” Wrong: deterministic handling is where reliability is created.
Check-your-understanding questions
- Why should directive execution (via DirectiveExec.exec/3) not mutate already-returned state?
- What happens when Jido.Agent.StateOps.apply_state_ops/2 encounters a struct that is not a StateOp type (e.g., an LLMStream directive)?
- Why does AgentServer.set_debug/2 use a ring buffer (max 50 events) rather than unbounded logging?
- What is the difference between Directive.spawn/2 (fire-and-forget BEAM child) and Directive.spawn_agent/3 (child agent with parent-child hierarchy)?
- How does the Action lifecycle hook on_before_validate_params/1 differ from on_after_validate_params/1 in terms of when each is useful?
Check-your-understanding answers
- It preserves logical time and deterministic reasoning about transitions. If exec/3 mutated state directly, you would lose the guarantee that state changes are traceable to specific Action results.
- Non-StateOp structs are collected and returned as external directives. apply_state_ops/2 pattern-matches on StateOp types, applies them to state, and accumulates everything else as directives for the AgentServer drain loop.
- A ring buffer prevents memory growth in long-running agents. In production, unbounded debug logs would eventually cause OOM. The bounded buffer captures the most recent 50 events, which is sufficient for immediate incident analysis.
- Spawn creates a bare BEAM child process with no lifecycle tracking. SpawnAgent creates a supervised child agent with a parent-child hierarchy: the parent tracks children by tag in its children map, monitors processes for exit detection, and supports emit_to_parent/3 for child-to-parent communication.
- on_before_validate_params/1 runs before schema validation, useful for normalizing or enriching raw input (e.g., converting string dates to DateTime). on_after_validate_params/1 runs after schema validation, useful for cross-field validation or derived-value injection when the params are already well-typed.
Real-world applications
- Regulated workflow agents where actions must be auditable.
- Tool-heavy copilots with strict permissioning.
- Autonomous service remediation loops with guardrails.
Where you’ll apply it
- Projects 1, 2, 3, 8, 16, 17, and 20.
References
- Jido README
- Jido Core Loop Guide
- Jido Directives Guide
- Jido Action source - Jido.Action behaviour definition (in the jido_action package)
- Jido AgentServer source - GenServer runtime
- Jido.AI Directive source - LLMStream, ToolExec, etc.
Key insights Deterministic Actions plus typed StateOps and explicit Directive structs are the shortest path from demo agent to production-grade agent.
Summary The deterministic core/directive runtime split gives you testability, replayability, and governance without sacrificing asynchronous power.
Homework/Exercises to practice the concept
- Write state invariants for a ReAct agent that uses AgentServer.cast/2 for tool calls and Directive.schedule/2 for retries. Include invariants for pending_tool_calls, iteration, and max_queue_size.
- Draw a failure timeline showing: (a) ToolExec directive emitted, (b) Task.Supervisor spawns async execution, (c) tool times out, (d) late react.tool.result signal arrives after state has moved to "error". Show how the current_llm_call_id check prevents stale application.
- Write a directive policy gate that inspects DirectiveExec dispatch and rejects ToolExec directives whose tool_name is not in an approved allowlist. Show how Zoi schema validation at construction time (via new!/1) complements this runtime check.
- Trace the full path from Action.run/2 returning {:ok, result, [%SpawnAgent{tag: "worker-1", ...}]} through StateOps.apply_state_ops/2 separating the SpawnAgent directive, then through DirectiveExec.exec/3 starting the child agent with process monitoring.
Solutions to the homework/exercises
- Invariants: length(pending_tool_calls) <= max_concurrent_tools, iteration <= max_iterations, AgentServer queue_size <= max_queue_size (default 10000), status in ["idle", "awaiting_llm", "awaiting_tool", "completed", "error"], and status == "completed" implies length(pending_tool_calls) == 0.
- Timeline: t0: emit %ToolExec{id: "call_99"} -> t1: Task.Supervisor.async spawns -> t2: 30s timeout fires, state transitions to "error", current_llm_call_id cleared -> t3: late react.tool.result arrives with id "call_99" -> ReAct.Machine checks current_llm_call_id != "call_99" and rejects with {:request_error, call_id, :stale, msg}.
- In the AgentServer drain loop, before DirectiveExec.exec/3: pattern match on %ToolExec{tool_name: name}, check name in approved_tools, and reject with %Error{reason: :tool_not_allowed} if not. Construction-time validation via ToolExec.new!(%{tool_name: "unknown"}) catches structurally invalid directives; runtime policy catches semantically unauthorized ones.
- Action.run/2 returns {:ok, %{worker_started: true}, [%SpawnAgent{tag: "worker-1", module: WorkerAgent, meta: %{task: "analyze"}}]}. apply_state_ops/2 deep-merges %{worker_started: true} into state, then collects %SpawnAgent{...} as an external directive. The AgentServer drain loop calls DirectiveExec.exec/3 on the SpawnAgent, which starts the child agent via the agent's DynamicSupervisor, adds {"worker-1", pid} to the parent's children map, and sets up a process monitor for exit detection.
Concept 2: Signal Contracts, Bus/Router/Dispatch/Journal Architecture, and Multi-Agent Routing on BEAM
Fundamentals
Jido and Jido.AI rely on signal-driven communication so components remain decoupled and routable. The jido_signal library provides the full signal infrastructure: Jido.Signal.Bus (GenServer pub/sub), Jido.Signal.Router (trie-based pattern matching with wildcards * and **), Jido.Signal.Dispatch (multi-adapter dispatch to pid, pubsub, http, webhook, logger, console, noop, named targets), and Jido.Signal.Journal (event persistence with InMemory, ETS, and Mnesia backends). Signals carry typed envelopes (event type, source, payload) aligned with CloudEvents-style event modeling. Real signal types from jido_ai include react.input, react.llm.response, react.tool.result, react.llm.delta, react.register_tool, react.unregister_tool, react.set_tool_context, and react.usage (noop, observability only). Routing precedence is deterministic: strategy routes, then agent routes, then plugin routes.
Deep Dive Signal contracts are the social contract of your agent system. If your signal types and payloads are ad hoc, multi-agent coordination collapses under complexity. If your signal contracts are explicit and versioned, teams can independently evolve agents, tools, and orchestration rules. Jido’s routing precedence (strategy routes, then agent routes, then plugin routes) gives a deterministic dispatch order. Signal routes are merged from all three layers with deterministic precedence. This is subtle but critical in large systems where multiple capabilities may match the same incoming event.
The Jido.Signal.Router uses a trie data structure for pattern matching, supporting single-level wildcards (*) and multi-level wildcards (**). This means you can subscribe to react.* to catch all top-level react signals, or react.** to catch all signals in the react namespace at any depth. The Router maps signal types to handler tuples like {:strategy_cmd, :react_start} for react.input, {:strategy_cmd, :react_llm_result} for react.llm.response, and {:strategy_cmd, :react_tool_result} for react.tool.result. Partial streaming results arrive via react.llm.delta mapped to {:strategy_cmd, :react_llm_partial}.
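As an illustration of those patterns, the sketch below declares routes and wildcard subscriptions in that style. The Router.new/1 and Router.route/2 calls are assumptions about the jido_signal API; the route tuples and wildcard semantics follow the description above.

routes = [
  {"react.input", {:strategy_cmd, :react_start}},
  {"react.llm.response", {:strategy_cmd, :react_llm_result}},
  {"react.tool.result", {:strategy_cmd, :react_tool_result}},
  {"react.llm.delta", {:strategy_cmd, :react_llm_partial}},
  {"react.*", :audit_top_level},    # single-level wildcard
  {"react.**", :audit_all_depths}   # multi-level wildcard
]

{:ok, signal} =
  Jido.Signal.new(%{type: "react.tool.result", source: "/agents/demo-1", data: %{}})

# Constructor and lookup names assumed; matching handlers come back in precedence order.
{:ok, router} = Jido.Signal.Router.new(routes)
{:ok, handlers} = Jido.Signal.Router.route(router, signal)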
The Jido.Signal.Dispatch module provides multi-adapter dispatch in three modes: synchronous dispatch/2, asynchronous dispatch_async/2, and batched dispatch_batch/3 (with configurable max_concurrency for parallel processing). A single signal can be routed to multiple adapter targets simultaneously: :pid (direct process message), :bus (Jido.Signal.Bus), :named (registered process name), :pubsub (Phoenix.PubSub broadcast), :logger (structured log output), :console (human-readable output), :noop (no-op for testing), :http (HTTP POST), and :webhook (webhook POST with retries). This is powerful for observability: the same signal that drives agent behavior can also be logged, persisted, and forwarded to monitoring systems.
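A hedged sketch of fanning one signal out to several adapters at once. Dispatch.dispatch/2 is named in the text above; the per-adapter option keys (target:, topic:, level:) and the MyApp.PubSub name are assumptions.

{:ok, signal} =
  Jido.Signal.new(%{
    type: "react.tool.result",
    source: "/agents/demo-1",
    data: %{id: "tool_001", result: %{value: 345}}
  })

agent_pid = self()  # stand-in for the AgentServer pid that owns the state machine

targets = [
  {:pid, [target: agent_pid]},                        # drive the state machine
  {:logger, [level: :info]},                          # structured audit log
  {:pubsub, [target: MyApp.PubSub, topic: "agents"]}  # LiveView / monitoring fanout
]

:ok = Jido.Signal.Dispatch.dispatch(signal, targets)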
The Jido.Signal.Journal provides event persistence with pluggable backends: InMemory for tests, ETS for single-node high-speed persistence, and Mnesia for distributed durable persistence. This enables signal replay for debugging, audit trails, and crash recovery.
CloudEvents matters here because it standardizes event metadata and encourages compatibility with broader event ecosystems. You do not need to enforce every field from day one, but adopting stable signal naming and source semantics (domain.subdomain.event, stable source namespaces) pays off quickly. It makes observability cleaner, policy enforcement easier, and replay safer.
On BEAM, routing is not just a software architecture concern; it is a concurrency control mechanism. You can map high-volume signal classes to dedicated worker pools, isolate slow providers, and prevent one class of work from starving another. This is where BEAM’s scheduling model and process isolation shine. Instead of monolithic queues, you build many small bounded queues with supervision boundaries.
Parent-child agent hierarchies introduce additional routing opportunities and pitfalls. Parent agents can spawn child workers for parallel subtasks, receive completion or failure signals, and aggregate results. If done well, you get deterministic orchestration with graceful failure handling. If done poorly, you get orphaned children, duplicate aggregations, and unbounded pending maps.
Failure modes in signal systems are often semantic, not mechanical:
- Schema drift: producers and consumers disagree on field shapes.
- Ordering assumptions: consumers assume strict ordering across independent channels.
- Ambiguous ownership: multiple agents believe they are authoritative responders.
- Retry storms: repeated emission of the same signal without dedupe.
Countermeasures include schema version fields, correlation IDs, causation IDs, dedupe stores, and explicit ownership tags in payload metadata. Jido’s signal_routes and plugin patterns allow precise control, but you must design the protocol intentionally.
Distribution multiplies these concerns. Once signals cross nodes, latency and partition behavior appear. The winning architecture is explicit about eventual consistency: use reconciliation signals, timeout fences, and state snapshots rather than assuming immediate global truth. Design for net-splits as a normal state. This is the right mindset for autonomous systems that must remain safe under partial failure.
A practical design pattern is the “controller + workers” topology:
- Controller handles intent, budgets, and policy.
- Workers execute isolated tasks and report back.
- Controller finalizes state and emits external result.
This pattern maps well to Jido’s SpawnAgent and StopChild directives and aligns with OTP supervision. It also mirrors how robust distributed systems are built in other domains: control plane + data plane, with explicit contracts between them.
How this fits into the projects You will apply this concept in projects 2, 7, 11, 12, 13, 15, and 20.
Definitions & key terms
- Signal contract: Stable definition of event type + payload shape + metadata.
- Jido.Signal.Bus: GenServer-based pub/sub for signal distribution.
- Jido.Signal.Router: Trie-based pattern matcher with * (single-level) and ** (multi-level) wildcards.
- Jido.Signal.Dispatch: Multi-adapter dispatcher (pid, pubsub, http, webhook, logger, console, noop, named).
- Jido.Signal.Journal: Event persistence with InMemory, ETS, and Mnesia backends.
- Correlation ID: Shared identifier linking events for one logical transaction.
- Causation ID: Event ID that directly triggered the current event.
- Routing precedence: Ordered matching: strategy routes -> agent routes -> plugin routes (deterministic).
Mental model diagram
[External Event / Internal Signal]
|
v
[Jido.Signal.Bus (GenServer pub/sub)]
|
v
[Jido.Signal.Router (Trie pattern matching)]
| | |
v v v
[react.*] [github.*] [maintenance.*]
| | |
v v v
Precedence: Strategy > Agent > Plugin
|
v
[Jido.Signal.Dispatch]
| | | |
v v v v
[pid] [pubsub] [webhook] [logger]
|
v
[Action.run(params, ctx)] -> [StateOps] + [Directives]
|
v
[AgentServer Runtime Executor]
|
v
[Jido.Signal.Journal (InMemory|ETS|Mnesia)]
|
v
[New Signals + Metrics -> back to Bus]
How it works (step-by-step, with invariants and failure modes)
- Ingest external event and wrap as typed signal.
- Route using deterministic precedence rules.
- Execute action and emit directives.
- Runtime executes effects and emits outcome signals.
- Correlate all events via IDs and versioned schema.
- Reconcile stale or duplicate outcomes safely.
Minimal concrete example
PSEUDOCODE SIGNALS (v2 jido_ai signal types and router mappings)
react.input -> {:strategy_cmd, :react_start} # User query arrives
react.llm.response -> {:strategy_cmd, :react_llm_result} # LLM returns tool_calls or final
react.tool.result -> {:strategy_cmd, :react_tool_result} # Tool execution completes
react.llm.delta -> {:strategy_cmd, :react_llm_partial} # Streaming partial token
react.register_tool -> {:strategy_cmd, :react_register_tool} # Dynamic tool registration
react.usage -> Noop (observability only) # Token/cost metrics
Router pattern matching:
"react.*" matches react.input, react.usage (single level)
"react.**" matches react.input, react.llm.response, react.tool.result (all depths)
Dispatch example:
signal "react.tool.result" dispatched to:
- pid: AgentServer process (drives state machine)
- logger: structured log for audit
- journal: Mnesia backend for replay
All messages carry: correlation_id=REQ-901, schema_version=2
Common misconceptions
- “Signal type naming is cosmetic.” No: naming is an operational contract.
- “Message order is globally guaranteed.” No: only per-mailbox order is guaranteed locally.
Check-your-understanding questions
- Why is routing precedence important in plugin-heavy agents?
- What does correlation ID solve that process ID does not?
- How do you handle duplicate result signals from retries?
Check-your-understanding answers
- It prevents ambiguous handlers and unpredictable behavior.
- Correlation spans processes and nodes, process ID does not.
- Idempotency keys + dedupe map + terminal state checks.
Real-world applications
- Incident response automation.
- Multi-agent research pipelines.
- Human-in-the-loop approval workflows.
Where you’ll apply it
- Projects 2, 7, 11, 12, 13, 15, and 20.
References
- CloudEvents Specification v1.0.2
- CloudEvents CNCF Graduation Announcement
- Jido Signals Guide
- Jido Orchestration Guide
- jido_signal Repository - Bus, Router, Dispatch, Journal
- Jido.Signal.Router source - Trie-based pattern matching
- Jido.Signal.Dispatch source - Multi-adapter dispatch
- Jido.Signal.Journal source - Event persistence backends
Key insights A distributed agent system is only as reliable as its signal contracts, Bus/Router/Dispatch wiring, and Journal persistence discipline.
Summary Signals are the protocol layer of autonomous systems; treat them as durable contracts, not ad hoc payloads.
Homework/Exercises to practice the concept
- Design a v1 and v2 schema for react.tool.result and define compatibility rules.
- Draw a parent-child spawn/aggregate protocol including timeout and cancellation.
- Define dedupe rules for a retried web-search tool signal.
Solutions to the homework/exercises
- Add optional fields in v2, preserve required v1 keys, and include schema_version.
- Spawn -> child.started -> work.sent -> result.received -> child.stopped.
- Use tool_call_id + correlation_id as the dedupe composite key.
Concept 3: Reasoning Strategies as Explicit State Machines (ReAct, CoT, ToT, GoT, Adaptive, TRM)
Fundamentals
Jido.AI treats reasoning strategies as explicit state machines rather than hidden prompt loops. This is a major engineering advantage. Each strategy is a named module under Jido.AI.Strategies.* with a companion state machine (using Fsmx for FSM transitions). The real strategy modules from the jido_ai codebase are: Jido.AI.Strategies.ReAct (with Jido.AI.ReAct.Machine), Jido.AI.Strategies.ChainOfThought, Jido.AI.Strategies.TreeOfThoughts, Jido.AI.Strategies.GraphOfThoughts, Jido.AI.Strategies.Adaptive (auto-selects based on task analysis), and Jido.AI.Strategies.TRM (Tiny Recursive Model for recursive reasoning). Side effects are represented as directives (LLMStream, ToolExec, LLMGenerate, LLMEmbed) and outcomes return via typed signals.
The Jido.AI.ReAct.Machine has these states: "idle" -> "awaiting_llm" -> "awaiting_tool" -> "completed" | "error". Its fields include: status, iteration, thread, pending_tool_calls, result, current_llm_call_id, termination_reason, streaming_text, streaming_thinking, thinking_trace, usage, and started_at. This level of explicit state makes it straightforward to enforce limits (max_iterations, timeout budgets, tool call caps) and avoid unbounded loops.
Deep Dive
Strategy-as-state-machine design is the bridge between research patterns and production control. Research papers such as ReAct, Chain-of-Thought, and Tree-of-Thoughts demonstrate reasoning improvements, but production systems need more than benchmark gains. They need bounded execution, observability, and graceful failure behavior. State machines (built on Fsmx) provide these guarantees with explicit transition maps and guard clauses.
ReAct (Jido.AI.Strategies.ReAct) is a loop pattern: reason, act (tool), observe, repeat. In Jido.AI, loop progress is explicit in the ReAct.Machine struct: iteration counters, pending_tool_calls list, current_llm_call_id, streaming_text accumulator, streaming_thinking accumulator, thinking_trace list, usage metrics, and termination_reason struct. The Fsmx transition map is declared explicitly:
"idle"=>["awaiting_llm"]"awaiting_llm"=>["awaiting_tool", "completed", "error"]"awaiting_tool"=>["awaiting_llm", "completed", "error"]"completed"=>["awaiting_llm"](allows re-entry for new queries)"error"=>["awaiting_llm"](allows recovery)
The Machine processes messages: {:start, query, call_id}, {:llm_result, call_id, result}, {:llm_partial, call_id, delta, chunk_type}, {:tool_result, call_id, result}. A busy rejection mechanism returns {:request_error, call_id, :busy, msg} when the machine is in a non-idle state. The run_tool_context field (ephemeral, cleared on terminal states) allows passing additional context into tool execution, while base_tool_context (persistent across requests) provides stable context like database connections.
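A condensed sketch of declaring such a machine with Fsmx, mirroring the transition map above. The DemoReActMachine module is illustrative, not the library source, and the state_field option name is an assumption about Fsmx configuration.

defmodule DemoReActMachine do
  defstruct status: "idle",
            iteration: 0,
            pending_tool_calls: [],
            current_llm_call_id: nil,
            result: nil,
            termination_reason: nil

  use Fsmx.Struct,
    state_field: :status,
    transitions: %{
      "idle" => ["awaiting_llm"],
      "awaiting_llm" => ["awaiting_tool", "completed", "error"],
      "awaiting_tool" => ["awaiting_llm", "completed", "error"],
      "completed" => ["awaiting_llm"],
      "error" => ["awaiting_llm"]
    }
end

# Fsmx.transition/2 enforces the map: an illegal jump returns {:error, _} instead of corrupting state.
# {:ok, m} = Fsmx.transition(%DemoReActMachine{}, "awaiting_llm")
# {:error, _} = Fsmx.transition(%DemoReActMachine{}, "completed")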
Chain-of-Thought (Jido.AI.Strategies.ChainOfThought) is simpler, often single-pass with structured reasoning text. It is useful for stepwise logic but can be fragile if treated as free-form prose. Its state machine wrapper still enforces lifecycle discipline: start, think, finalize, or error.
Tree-of-Thoughts (Jido.AI.Strategies.TreeOfThoughts) introduces branching with configurable branching_factor, max_depth, and traversal_strategy (:bfs for breadth-first, :dfs for depth-first, or :best_first for score-guided). These are powerful for planning and exploration but expensive in tokens and latency.
Graph-of-Thoughts (Jido.AI.Strategies.GraphOfThoughts) extends branching to arbitrary graph structures with configurable max_nodes, max_depth, and aggregation_strategy (:voting, :weighted, or :synthesis). This enables convergent reasoning where multiple thought paths can be combined.
Adaptive (Jido.AI.Strategies.Adaptive) auto-selects the best strategy based on task analysis: it examines the query, available tools, and budget constraints to choose the most appropriate strategy. This is the practical answer to “when is ToT cost justified?”
TRM (Jido.AI.Strategies.TRM, Tiny Recursive Model) provides recursive reasoning for problems that benefit from iterative refinement of solutions.
Tool calling adds another dimension: argument validation, execution timing, retry policy, and result normalization. Jido.AI’s tool system (registry, adapter, executor with Jido.Action.Tool.to_tool/0 for AI conversion) creates a consistent path from model intent to BEAM action execution. This consistency is critical.
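A small sketch of what that bridge yields when invoked on the PriceLookup Action from Concept 1. The exact keys of the returned tool map are assumptions; the idea is that the same schema that validates params becomes the JSON-schema parameters advertised to the model.

# Derived from the Action's own schema (see the PriceLookup module in Concept 1).
tool = PriceLookup.to_tool()

# Illustrative shape only; key names may differ:
# %{
#   name: "price_lookup",
#   description: "...",
#   parameters_schema: %{
#     "type" => "object",
#     "properties" => %{"sku" => %{"type" => "string"}},
#     "required" => ["sku"]
#   }
# }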
Streaming behavior is where state machines are especially valuable. Partial deltas arrive as react.llm.delta signals mapped to {:strategy_cmd, :react_llm_partial}. The machine accumulates streaming_text and streaming_thinking separately, preserving call correlation via current_llm_call_id, and decides when to transition from streaming to terminal states.
Failure modes to account for:
- Call-ID mismatch: a stale LLM/tool result applied to current state (guard with a current_llm_call_id check).
- Branch explosion: ToT/GoT traversal exceeds the max_depth or max_nodes budget.
- Tool schema mismatch: the model emits malformed args repeatedly (handle via the on_error/4 callback in the Action).
- Silent deadlock: the strategy waits forever for a signal that is never emitted (use timeout transitions in Fsmx).
Countermeasures include strict call-ID correlation checks, budget-aware traversal pruning (configurable branching_factor and max_depth), schema repair loops with escalation, and watchdog transitions (await_timeout -> error/recover).
From a hiring/interview perspective, this is high-value knowledge because it demonstrates that you can translate AI reasoning ideas into runtime-safe systems. Teams need engineers who can reason about token economics, failure surfaces, and lifecycle constraints, not just prompt templates.
In advanced systems, multiple strategies can coexist in one deployment: ReAct for tool tasks, CoT for quick logic tasks, ToT for planning tasks, and Adaptive as a meta-controller. This architecture is easier to maintain when each strategy conforms to common interfaces and emits common telemetry.
How this fits into the projects You will apply this concept in projects 1, 4, 8, 14, 17, and 20.
Definitions & key terms
- Reasoning strategy: Named module under Jido.AI.Strategies.* implementing a control-flow pattern.
- Fsmx: FSM library used by Jido.AI for explicit state machine definitions with transition guards.
- ReAct.Machine: Struct with fields: status, iteration, thread, pending_tool_calls, result, current_llm_call_id, termination_reason, streaming_text, streaming_thinking, thinking_trace, usage, started_at.
- Traversal policy: Branch exploration approach for ToT: :bfs, :dfs, :best_first.
- Aggregation strategy: Convergence approach for GoT: :voting, :weighted, :synthesis.
- Termination reason: Structured explanation for loop exit (max_iterations, timeout, error, success).
Mental model diagram
[query via react.input signal]
|
v
[state:"idle"] --react_start--> [state:"awaiting_llm"]
|
+-----------+-----------+
| |
tool_calls final_answer
(react.llm.response) (react.llm.response)
| |
v v
[state:"awaiting_tool"] [state:"completed"]
|
tool_result
(react.tool.result)
|
v
[state:"awaiting_llm"] (next iteration)
|
+--> [state:"error"] on invalid transition/timeout/max_iterations
Machine fields tracked at each state:
iteration=N, pending_tool_calls={...}, current_llm_call_id="call_XXX",
streaming_text="...", streaming_thinking="...", thinking_trace=[...],
usage=%{input_tokens: N, output_tokens: N, total_cost: $X.XX}
How it works (step-by-step, with invariants and failure modes)
- Initialize strategy state from query and config.
- Emit LLM directive or tool directive based on current state.
- Consume signals and validate call identity.
- Transition state only via allowed transition map.
- Stop on completion, budget breach, or hard error.
Minimal concrete example
PSEUDOCODE (v2 ReAct.Machine with actual directive types)
machine = %ReAct.Machine{status: "idle", iteration: 0, current_llm_call_id: nil}
on react.input(query):
machine.status = "awaiting_llm"
machine.iteration = 1
machine.current_llm_call_id = "call_42"
emit %LLMStream{id: "call_42", model: :fast, system_prompt: "...", tools: [...]}
on react.llm.response(tool_calls=[{name: "calc", args: {...}}]):
machine.status = "awaiting_tool"
machine.pending_tool_calls = [%{id: "tool_001", name: "calc", arguments: %{...}, result: nil}]
emit %ToolExec{id: "tool_001", tool_name: "calc", action_module: Calc, arguments: {...}}
on react.tool.result(id="tool_001", result={value: 345}):
machine.status = "awaiting_llm"
machine.iteration = 2
machine.current_llm_call_id = "call_43"
emit %LLMStream{id: "call_43", model: :fast, context: [tool_result...]}
on react.llm.response(final_answer="445"):
machine.status = "completed"
machine.result = "445"
machine.termination_reason = :success
# Streaming partial tokens arrive as:
on react.llm.delta(content="4"):
machine.streaming_text = machine.streaming_text <> "4"
Common misconceptions
- “Strategies are just prompt styles.” No: in production they are control-flow programs.
- “Adaptive always means better results.” Not if routing heuristics are poor or unobservable.
Check-your-understanding questions
- Why should a tool result include the original call ID?
- When should Adaptive choose ReAct over ToT?
- What is the difference between the completed and error terminal states operationally?
Check-your-understanding answers
- To prevent stale result application and ensure causal linkage.
- When tool usage is required and branching cost is unjustified.
- completed emits a user-facing result; error emits a remediation/alert path.
Real-world applications
- Customer support copilots with tool verification.
- Multi-step compliance workflows requiring explicit reasoning traces.
- Planning agents for incident response and change automation.
Where you’ll apply it
- Projects 1, 4, 8, 14, 17, and 20.
References
- Jido.AI Strategies Guide
- Jido.AI State Machines Guide
- Jido.AI.Strategies.ReAct source
- Jido.AI.ReAct.Machine source
- Jido.AI.Strategies.TreeOfThoughts source
- Jido.AI.Strategies.GraphOfThoughts source
- Jido.AI.Strategies.Adaptive source
- Jido.AI.Strategies.TRM source
- Fsmx library - FSM transitions used by Jido.AI
- ReAct Paper
- Chain-of-Thought Prompting
- Tree of Thoughts
- Graph of Thoughts
Key insights Reasoning quality without explicit control flow is a demo; reasoning quality with Fsmx-backed state machines and typed Machine structs is an operable system.
Summary Treat strategy selection and reasoning loops as engineered state machines with bounded policies and observable transitions.
Homework/Exercises to practice the concept
- Define a 6-state ReAct machine including timeout and cancellation transitions.
- Compare token/latency budget between CoT and ToT for one planning task.
- Write criteria for Adaptive routing misclassification detection.
Solutions to the homework/exercises
- Include idle, awaiting_llm, awaiting_tool, completed, timeout, error.
- ToT usually spends more tokens; use it only when branch quality gains justify the cost.
- Track per-route success, retries, and human override frequency.
Concept 4: Unified LLM Provider Layer with ReqLLM plus Production Controls (Usage, Cost, Persistence, Scheduling, Safety)
Fundamentals
ReqLLM provides a canonical interface for multi-provider LLM operations across 45+ providers and 665+ models (via LLMDB model metadata). Your application logic is not tightly bound to one provider’s wire format. The high-level API consists of four core functions: ReqLLM.generate_text/3 (synchronous text generation), ReqLLM.stream_text/3 (streaming text generation), ReqLLM.generate_object/4 (structured output with schema validation), and ReqLLM.embed/3 (embedding generation). The low-level API exposes a provider plugin system with prepare_request, attach, encode_body, and decode_response callbacks. Combined with Jido runtime features (scheduling, persistence, worker pools, telemetry, error policies), this allows teams to run AI workflows with explicit SLOs, budgets, and recovery paths.
Deep Dive
Provider abstraction is not optional at scale. Teams start with one provider, then quickly need fallback, specialty models, or cost controls. Without abstraction, migration is expensive because model payloads, tool formats, streaming semantics, and usage metrics all differ. ReqLLM addresses this by offering canonical data structures and two usage layers: the high-level helpers (generate_text, stream_text, generate_object, embed) and the low-level Req plugin control for custom provider integration.
The streaming architecture is built on a StreamServer GenServer with backpressure via a high_watermark queue. StreamChunk types include :content (text tokens), :thinking (reasoning tokens for models that support extended thinking), :tool_call (incremental tool call data), and :meta (usage and metadata). The MetadataHandle module enables concurrent async usage collection, critical for accurate billing when multiple streams are active. This architecture means you can process partial tokens for live UIs while still accumulating complete usage data.
Canonicalization matters for both correctness and economics. Correctness: tool call structures, content parts, and responses become typed and inspectable. Economics: the ReqLLM.Billing module provides ReqLLM.Billing.calculate(usage, model) which returns line_items with a detailed cost breakdown (input cost, output cost, cache hits, image tokens). That means you can implement strategy-independent cost guardrails such as “abort if estimated spend > X” or “route to cheaper model if confidence threshold allows.”
Model metadata and provider capabilities are managed through LLMDB, which maintains a database of 45+ providers and 665+ models with per-model metadata: context_window, capabilities (tools, json, streaming, vision, thinking), input_cost, output_cost. ReqLLM’s model sync workflow (from models.dev + local patches) highlights a mature operational pattern: treat model catalogs as versioned infrastructure, not hardcoded constants. This supports repeatable testing and controlled rollout of new models.
Now combine this with BEAM operations:
- Worker pools manage expensive agent initialization and smooth latency.
- Persistence and checkpointing preserve state and thread histories across lifecycle events.
- Scheduling and cron directives enable autonomous recurring workflows, with explicit tradeoffs around at-most-once timer semantics.
- Telemetry events provide latency, error, queue, and directive execution visibility.
- Error policies provide predictable escalation behavior instead of ad hoc exception cascades.
Safety controls should be integrated into this layer, not added later. Tool permission gates, schema validation, policy filters, and human approval checkpoints are easier when data is typed and routing is explicit. This is where many teams fail: they have rich model features but weak control planes.
Failure modes to plan around:
- Provider outage or degraded latency: require fallback routing and timeout partitioning.
- Hidden cost spikes: unbounded branch strategies or tool-heavy prompts.
- Streaming metadata gaps: final usage not arriving reliably without robust collector logic.
- Timer persistence assumptions: in-memory schedules lost on crash/restart.
Mitigations:
- Multi-provider router with health and budget rules.
- Per-request budgets and cumulative daily guardrails.
- Separate telemetry for token cost, tool cost, image cost.
- Persist critical schedules externally when exactly-once semantics are required.
This concept is the difference between a clever agent and a production platform. A platform must answer: what did it do, why, at what cost, under which policy, and how fast can it recover?
How this fits into the projects You will apply this concept in projects 3, 5, 6, 9, 10, 12, 18, 19, and 20.
Definitions & key terms
- Canonical model: Provider-agnostic structure for messages/tools/responses.
- StreamServer: GenServer with backpressure (high_watermark queue) for streaming tokens.
- StreamChunk: Typed chunk: :content, :thinking, :tool_call, :meta.
- MetadataHandle: Concurrent async usage collection for billing accuracy.
- ReqLLM.Billing: calculate(usage, model) returning line_items with a cost breakdown.
- LLMDB: Model metadata database (45+ providers, 665+ models) with context_window, capabilities, and costs.
- Usage accounting: Normalized token and tool cost tracking via the Billing module.
- At-most-once scheduling: timer model where missed runs are possible on failure.
- Policy gate: Rule layer approving or rejecting risky actions.
Mental model diagram
[Agent Strategy / Directive]
|
v
[ReqLLM High-Level API]
generate_text/3 | stream_text/3 | generate_object/4 | embed/3
|
v
[LLMDB Model Metadata] --> [Provider Selection]
45+ providers, 665+ models |
context_window, capabilities |
input_cost, output_cost |
| |
v v
[ReqLLM Provider Plugin API]
prepare_request -> attach -> encode_body -> decode_response
| | |
v v v
[Provider A] [Provider B] [Provider C]
|
v
[StreamServer GenServer (backpressure via high_watermark)]
|
v
[StreamChunk: :content | :thinking | :tool_call | :meta]
|
v
[MetadataHandle (concurrent async usage)]
|
v
[ReqLLM.Billing.calculate(usage, model)] -> [line_items + cost breakdown]
|
v
[Budget Policy] -> [Telemetry] -> [Persistence/Recovery]
How it works (step-by-step, with invariants and failure modes)
- Select model/provider via policy and aliases.
- Build canonical context and tools.
- Execute generation/streaming with timeout and retries.
- Normalize response and usage into one accounting pipeline.
- Apply budget/safety decisions before next action.
- Persist key lifecycle state and emit telemetry.
Minimal concrete example
PSEUDOCODE (v2 ReqLLM API with actual function names)
# High-level synchronous generation
response = ReqLLM.generate_text("openai:gpt-4o-mini", context, tools: tools)
# response.content, response.tool_calls, response.usage
# High-level streaming
stream = ReqLLM.stream_text("anthropic:claude-haiku-4-5", context, tools: tools)
# Yields StreamChunk structs: %StreamChunk{type: :content, data: "..."}
# %StreamChunk{type: :thinking, data: "..."}
# %StreamChunk{type: :tool_call, data: %{...}}
# %StreamChunk{type: :meta, data: %{usage: ...}}
# Structured output with schema
object = ReqLLM.generate_object("openai:gpt-4o-mini", context, schema, mode: :strict)
# Returns validated object matching schema
# Embedding
embedding = ReqLLM.embed("openai:text-embedding-3-small", texts)
# Cost calculation
line_items = ReqLLM.Billing.calculate(response.usage, "openai:gpt-4o-mini")
# %{input_cost: 0.0015, output_cost: 0.0019, total: 0.0034}
# LLMDB model lookup
model_info = LLMDB.get("openai:gpt-4o-mini")
# %{context_window: 128000, capabilities: [:tools, :json, :streaming], ...}
if line_items.total > 0.02 then set state.status="degraded" and switch model_alias=:fast
Common misconceptions
- “Provider abstraction hides all differences.” No: you still need provider-specific options and capability checks.
- “Cron means guaranteed execution.” No: in-memory timers imply at-most-once behavior unless persisted externally.
Check-your-understanding questions
- Why normalize usage/cost at the provider boundary?
- When should you use worker pools instead of spawn-per-request agents?
- How do you protect against tool permission escalation?
Check-your-understanding answers
- To run unified budget policies independent of provider format.
- When initialization is expensive and predictable throughput is required.
- Directive policy gates + schema validation + approval signals.
Real-world applications
- Cost-aware enterprise copilots.
- Multi-provider failover APIs.
- Autonomous periodic maintenance/reporting agents.
Where you’ll apply it
- Projects 3, 5, 6, 9, 10, 12, 18, 19, and 20.
References
- ReqLLM README
- ReqLLM Core Concepts
- ReqLLM Usage and Billing Guide
- ReqLLM StreamServer source - Backpressure streaming
- ReqLLM Billing source - Cost calculation
- LLMDB Repository - Model metadata (45+ providers, 665+ models)
- Jido Worker Pools Guide
- Jido Persistence and Storage Guide
- Jido Observability Guide
Key insights Provider abstraction without operational controls is portability theater; generate_text/3 + Billing.calculate/2 + LLMDB metadata is the production trifecta.
Summary ReqLLM + Jido runtime operations give you the control plane needed for reliable and economical AI systems.
Homework/Exercises to practice the concept
- Define a budget policy that demotes strategy/model based on spend thresholds.
- Design fallback rules for provider outage with latency/cost constraints.
- List which scheduled tasks must move from in-memory timers to durable schedulers.
Solutions to the homework/exercises
- Example: a request cost above $0.03 triggers a downgrade to the :fast model alias and disables ToT.
- Primary provider timeout at 6s, fallback provider at 8s, hard fail at 12s.
- Billing-critical and compliance reports require durable external scheduler.
Concept 5: Action System, Instructions, and Plan DAGs
Fundamentals
The Jido.Action behaviour (defined in the jido_action package, not the core jido package) is the fundamental unit of work in Jido. Every Action is a module declared with use Jido.Action and a config including: name, description, category, tags, vsn, schema (param validation via Zoi/NimbleOptions), output_schema, and compensation (%{enabled: true, max_retries: N, timeout: N}). The config is validated at compile time using Zoi schemas – invalid configs raise CompileError. Actions implement run(params, context) returning {:ok, result}, {:ok, result, directives}, or {:error, error}. The full validation pipeline has six overridable lifecycle hooks: on_before_validate_params/1 -> schema validation -> on_after_validate_params/1 -> run/2 -> on_after_run/1 -> on_before_validate_output/1 -> output schema validation -> on_after_validate_output/1. The on_error/4 callback handles errors and compensation logic. The Jido.Action.Tool.to_tool/0 callback converts any Action into a JSON-schema tool definition compatible with LLM tool calling. Execution is handled by Jido.Exec with sub-modules: Validator, Telemetry, Retry, Compensation, Async, Chain, and Closure (default timeout 30000ms). Instructions (Jido.Instruction struct with id, action, params, context, opts) wrap Actions for execution, and Plans (Jido.Plan) organize Instructions into DAGs with dependency resolution.
Deep Dive Understanding the Action system is essential because it is where your business logic lives. Unlike frameworks that mix model interaction with business rules, Jido forces you to write Actions as pure, schema-validated modules. This has several engineering consequences.
First, the schema validation pipeline is comprehensive. When Jido.Exec.run/4 processes an Action, it runs through six overridable lifecycle hooks: on_before_validate_params/1 -> schema validation (using Zoi/NimbleOptions schema declarations) -> on_after_validate_params/1 -> run/2 -> on_after_run/1 -> on_before_validate_output/1 -> output schema validation -> on_after_validate_output/1. The Jido.Exec module delegates to specialized sub-modules: Exec.Validator (schema checks), Exec.Telemetry (event emission at [:jido, :exec, :start | :stop | :exception]), Exec.Retry (configurable backoff), Exec.Compensation (rollback on downstream failure), Exec.Async (Task-based async execution), Exec.Chain (sequential pipeline), and Exec.Closure (anonymous function wrapping). This means invalid inputs are caught before execution and invalid outputs are caught before consumption. In production, this prevents entire classes of bugs where malformed tool results silently corrupt agent state.
Second, the Jido.Instruction struct wraps an Action with its execution context. An Instruction has fields: id (unique identifier), action (the Action module), params (validated parameters), context (execution context passed to run/2), and opts (execution options like timeout and retries). Instructions are the building blocks of Plans.
Third, Jido.Plan provides DAG-based execution planning. Plan.build/2 constructs a Plan from a list of Instructions with declared dependencies. Plan.add/3 adds Instructions to an existing Plan. The Plan resolves dependencies to determine execution order, enabling parallel execution of independent Instructions. This is powerful for multi-tool workflows: if tool A and tool B are independent, they execute concurrently; if tool C depends on both, it waits. The execution engine (Jido.Exec.run/4) handles timeout, retries, and exponential backoff with a default timeout of 30000ms. Jido.Exec.run_async/3 provides Task-based async execution, and Jido.Exec.await/1 collects the result. The Exec.Chain sub-module enables sequential pipelines where the output of one Action feeds as input to the next.
Fourth, Action compensation enables rollback. When an Action declares compensation: true, the system can call its compensation logic if downstream Actions fail. Combined with max_retries and timeout configuration, this creates a robust execution model for multi-step workflows. For example, if a tool creates a resource (step 1) and the next step fails (step 2), the compensation for step 1 can clean up the created resource.
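As a sketch of that rollback path, the Action below enables compensation and undoes its external effect when a later step fails. It mirrors the option and callback names described in this section (the compensation map plus on_error/4); the argument order of on_error/4 and the TicketAPI module are assumptions to verify against the jido_action docs.

```elixir
# Compensation sketch: create a resource in step 1, undo it if a downstream
# step fails. Option names follow this section; TicketAPI and the on_error/4
# argument order are illustrative.
defmodule CreateTicket do
  use Jido.Action,
    name: "create_ticket",
    schema: [title: [type: :string, required: true]],
    compensation: %{enabled: true, max_retries: 1, timeout: 5_000}

  # Step 1: create an external resource; a later plan step may still fail.
  def run(%{title: title}, _ctx) do
    {:ok, ticket_id} = TicketAPI.create(title)
    {:ok, %{ticket_id: ticket_id}}
  end

  # Invoked when a downstream step fails, so the orphaned ticket is removed.
  def on_error(_failed_params, _error, %{ticket_id: ticket_id}, _opts) do
    :ok = TicketAPI.delete(ticket_id)
    {:ok, %{compensated: ticket_id}}
  end
end
```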
Fifth, the Jido.Action.Tool.to_tool/0 callback is the bridge between deterministic Elixir code and LLM tool calling. Any Action module that implements this callback can be automatically registered as an AI tool. The tool definition includes the name, description, and parameter schema derived from the Action’s schema. This means your tools are always schema-validated, documented, and testable independent of any LLM.
The execution engine respects BEAM process semantics: each Action runs in the context of the AgentServer or within a Task for async execution. Timeouts are enforced at the execution level, and retries use configurable backoff strategies. This means your Action execution is bounded and observable, not hidden in retry loops.
How this fit on projects You will apply this concept in projects 1, 2, 5, 7, 8, 9, 11, 16, and 20.
Definitions & key terms
- Jido.Action: Behaviour in the jido_action package. Config: name, description, category, tags, vsn, schema, output_schema, compensation. Returns {:ok, result}, {:ok, result, directives}, or {:error, error}.
- Schema validation pipeline: on_before_validate_params/1 -> Zoi/NimbleOptions schema -> on_after_validate_params/1 -> run/2 -> on_after_run/1 -> on_before_validate_output/1 -> output_schema -> on_after_validate_output/1 (six hooks total).
- Jido.Instruction: Struct with id, action, params, context, opts wrapping an Action for execution.
- Jido.Plan: DAG-based execution plan built from Instructions with dependency resolution via Plan.build/2 and Plan.add/3.
- Jido.Exec: Execution engine with run/4, run_async/3, await/1. Sub-modules: Validator, Telemetry, Retry, Compensation, Async, Chain, Closure. Default timeout 30000ms.
- Action compensation: Rollback logic triggered by the on_error/4 callback when downstream steps fail. Configurable: compensation: %{enabled: true, max_retries: N, timeout: N}.
- to_tool/0: Callback converting an Action into a JSON-schema tool definition for LLM function calling.
Mental model diagram
[Action Module Definition]
use Jido.Action, name: "...", schema: [...]
def run(params, context), do: {:ok, result}
def to_tool(), do: %{name: "...", description: "...", parameters: schema}
|
v
[Jido.Instruction]
%Instruction{id: "instr_1", action: MyAction, params: %{...}, context: ctx, opts: [...]}
|
v
[Jido.Plan (DAG)]
Plan.build([instr_1, instr_2, instr_3], deps: %{instr_3 => [instr_1, instr_2]})
|
v
[Jido.Exec.run/4]
+-- Validation Pipeline: before_validate -> schema -> run -> after_validate
+-- Timeout enforcement
+-- Retry with backoff
+-- Compensation on failure
|
+-----------+-----------+
| | |
v v v
[instr_1] [instr_2] (parallel, no deps)
| |
+-----+-----+
|
v
[instr_3] (depends on 1 and 2)
|
v
{:ok, final_result}
How it works (step-by-step, with invariants and failure modes)
- Define the Action module with use Jido.Action and declare a schema for params.
- Implement run(params, context) with business logic.
- Optionally implement on_before_validate_params/1, on_after_run/1, on_error/4, to_tool/0.
- Wrap the Action in a Jido.Instruction struct with params and context.
- Build a Jido.Plan from Instructions with dependency declarations.
- Execute with Jido.Exec.run/4, which resolves the DAG, runs the validation pipeline, and enforces timeouts.
- On failure: retry with backoff, then compensate if configured, then propagate the error.
Invariants:
- Schema validation must pass before run/2 is called.
- Plan dependencies must form a DAG (no cycles).
- Timeout applies to each Instruction individually.
- Compensation is called only if the Action declared compensation: true and a downstream step fails.
Failure modes:
- Schema validation failure: Invalid params rejected before execution.
- Timeout exceeded: Action killed and error returned.
- Retry exhaustion: All retries fail, compensation triggered if enabled.
- DAG cycle: Plan construction fails at build time.
Minimal concrete example
PSEUDOCODE (v2 Action + Plan + Exec)
defmodule FetchWeather do
use Jido.Action,
name: "fetch_weather",
schema: [city: [type: :string, required: true]]
def run(%{city: city}, _ctx) do
{:ok, %{temp: 72, conditions: "sunny"}}
end
def to_tool do
%{name: "fetch_weather", description: "Get weather for a city",
parameters: %{city: %{type: "string", required: true}}}
end
end
# Build instruction
instr = %Instruction{id: "w1", action: FetchWeather, params: %{city: "NYC"}}
# Execute directly
{:ok, result} = Jido.Exec.run(FetchWeather, %{city: "NYC"}, ctx, timeout: 5000)
# Or build a Plan with dependencies
plan = Plan.build([instr_fetch, instr_analyze], deps: %{instr_analyze => [instr_fetch]})
{:ok, results} = Jido.Exec.run(plan, ctx, timeout: 10000, max_retries: 2)
Common misconceptions
- “Actions are just functions.” No: they are schema-validated, lifecycle-managed units with compensation, timeout, and retry semantics.
- “Plans are sequential pipelines.” No: Plans are DAGs; independent Instructions execute in parallel.
- “to_tool is only for LLMs.” No: tool definitions can also be used for documentation, testing, and API generation.
Check-your-understanding questions
- What happens if an Action's output schema validation fails after run/2 succeeds?
- Why does Jido.Plan require a DAG rather than allowing arbitrary graphs?
- How does Action compensation differ from retry?
Check-your-understanding answers
- The result is rejected and an error is returned, even though run/2 succeeded. This prevents invalid data from entering agent state.
- DAGs guarantee a topological execution order without infinite loops. Cycles would create deadlocks.
- Retry re-executes the same Action hoping for success. Compensation undoes the effect of a previously successful Action when a downstream step fails.
Real-world applications
- Multi-tool agent workflows where tools have dependencies (fetch data, then analyze, then summarize).
- Automated deployment pipelines with rollback on failure.
- Data processing pipelines with schema validation at every stage.
Where you’ll apply it
- Projects 1, 2, 5, 7, 8, 9, 11, 16, and 20.
References
- Jido.Action source (jido_action package) - The Jido.Action behaviour definition
- Jido.Exec source (jido_action package) - Execution engine with Validator, Retry, Compensation, Async, Chain, Closure sub-modules
- Jido.Instruction source
- Jido.Plan source
- Jido.Action.Tool source
Key insights Actions are the atomic unit of reliability in Jido; Plans are the composition mechanism; Exec is the bounded executor. Together they make multi-step workflows deterministic and recoverable.
Summary The Action/Instruction/Plan/Exec stack provides schema-validated, DAG-scheduled, timeout-bounded, compensation-capable execution for all agent workflows.
Homework/Exercises to practice the concept
- Write an Action with both input schema and output schema validation. Inject an invalid output and verify the pipeline catches it.
- Build a 3-Instruction Plan where two Instructions are independent and one depends on both. Verify parallel execution.
- Implement Action compensation for a “create resource” Action and trigger it by failing the next step.
Solutions to the homework/exercises
- Declare output_schema: [result: [type: :map, required: true]] in the Action options. Return a non-map value and assert {:error, _} (a test sketch follows this list).
- Use Plan.build/2 with a deps map. Instrument each Action with timestamps. Verify the two independent ones overlap.
- Add compensation: true to the Action options. Implement the compensate/2 callback. Force a downstream failure and verify compensation is called.
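For the first exercise, a test along these lines can exercise the output-schema path. It assumes the option names described in this concept and that a failed output validation surfaces as an error tuple from Jido.Exec.run/4; confirm both against the jido_action docs.

```elixir
# Exercise 1 sketch: an Action whose run/2 result violates its output_schema.
# Option names follow this section's description; the exact error shape may differ.
defmodule BadOutputAction do
  use Jido.Action,
    name: "bad_output_action",
    schema: [],
    output_schema: [result: [type: :map, required: true]]

  # Returns a string where the output schema requires a map.
  def run(_params, _ctx), do: {:ok, %{result: "not a map"}}
end

defmodule BadOutputActionTest do
  use ExUnit.Case, async: true

  test "output validation rejects the result even though run/2 succeeded" do
    assert {:error, _reason} = Jido.Exec.run(BadOutputAction, %{}, %{}, timeout: 1_000)
  end
end
```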
Concept 6: Plugin and Skill Composition Architecture
Fundamentals
Jido’s Plugin and Skill systems provide modular composition for agent capabilities. A Plugin (Jido.Plugin) extends agent behavior through runtime hooks: mount (initialization), handle_signal (signal processing), and transform_result (output transformation). Each Plugin declares a Jido.Plugin.Spec struct (defined in Jido.Plugin.Spec) containing: module, name, state_key, description, category, vsn, schema, config_schema, config, signal_patterns, tags, and actions. The state_key field provides state isolation so multiple plugins can coexist without namespace collisions. The config_schema validates plugin configuration at mount time, while signal_patterns declares which signal types the plugin subscribes to. Signal routes from plugins are merged with strategy and agent routes using deterministic precedence: strategy > agent > plugin. Skills (Jido.AI.Skill) provide a higher-level abstraction for prompt-driven capabilities with use Jido.AI.Skill macro. Skills declare name, description, license, allowed_tools, actions, plugins, and body/body_file (system prompt content). The Jido.AI.Skill.Loader loads skill definitions from SKILL.md markdown files at runtime, and Jido.AI.Skill.Registry manages available skills for lookup. Skills implement callbacks: manifest/0, body/0, allowed_tools/0, actions/0, and plugins/0.
Deep Dive The Plugin system solves a fundamental agent architecture problem: how do you add capabilities without creating a monolith? In many frameworks, adding a new tool or behavior means modifying the core agent code. In Jido, plugins are self-contained modules with explicit boundaries.
The Jido.Plugin.Spec struct declares everything the plugin needs and provides: module (the plugin module), name (human-readable identifier), state_key (isolated state namespace in the agent’s state map), description, category, vsn (version), schema (state shape validation), config_schema (configuration validation at mount time), config (runtime configuration values), signal_patterns (patterns the plugin subscribes to), tags (metadata labels), and actions (Action modules the plugin provides). This Spec is introspectable at boot time, enabling composition validation before the agent starts processing. Note that signal routing for plugins is handled through the signal_patterns field rather than explicit route maps – the AgentServer merges these patterns into the overall route table during initialization.
The mount callback initializes plugin state under its state_key in the agent’s state map. This isolation is critical: if plugin A uses state_key: :chat and plugin B uses state_key: :memory, they cannot accidentally corrupt each other’s state. The handle_signal callback processes signals relevant to the plugin, and transform_result allows plugins to modify results before they reach the caller.
Signal routes from plugins are merged into the agent’s route table with deterministic precedence: strategy routes take priority, then agent routes, then plugin routes. This means if a strategy and a plugin both claim to handle the same signal type, the strategy wins. This precedence is essential for predictable behavior and should be logged at boot time for debugging.
The Skill system (Jido.AI.Skill) operates at a higher level. While plugins provide runtime behavior (actions, signal handling, state management), Skills provide prompt-level capabilities. A Skill module is defined with use Jido.AI.Skill and declares options: name, description, license, allowed_tools (list of tool names the skill may invoke), actions (Action modules it provides), plugins (Plugin modules it depends on), and body/body_file (the system prompt content, either inline or from a file path). The Skill implements callbacks: manifest/0 (returns the Spec), body/0 (returns the system prompt text), allowed_tools/0, actions/0, and plugins/0. The Jido.AI.Skill.Loader loads skill definitions from SKILL.md markdown files at runtime – a powerful pattern for dynamic capability injection without code changes. The Jido.AI.Skill.Registry manages available skills for lookup by name.
When a skill is loaded, it brings a system prompt (from body/0 or body_file), allowed tools list, and associated actions/plugins. The rendered prompt includes only the tools that the skill is authorized to use via allowed_tools, implementing least-privilege at the prompt level. This is complementary to runtime tool permission enforcement: the prompt restricts what the model can attempt, and the runtime policy restricts what can actually execute.
The interplay between Plugins and Skills creates a layered architecture: Plugins handle runtime behavior and state, Skills handle prompt construction and tool scoping, and the agent’s strategy coordinates them all. This separation enables independent evolution: you can update a skill’s prompt without changing plugin code, add a new plugin without modifying skills, or replace a strategy without touching either.
Composition order matters. Plugins mount in declaration order, and later plugins can depend on earlier ones (via requires). Conflicting signal routes from different plugins are resolved by declaration order within the plugin layer. This is deterministic but must be documented and tested.
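One way to enforce the state_key uniqueness invariant is a boot-time guard over the plugin specs. This is not a built-in Jido check; the helper below is an illustrative sketch that only assumes each plugin exposes its Jido.Plugin.Spec.

```elixir
# Boot-time guard sketch (not a Jido built-in): fail fast if two plugin specs
# claim the same state_key, before any plugin mounts.
defmodule PluginGuards do
  def assert_unique_state_keys!(specs) do
    duplicates =
      specs
      |> Enum.map(& &1.state_key)
      |> Enum.frequencies()
      |> Enum.filter(fn {_key, count} -> count > 1 end)
      |> Enum.map(fn {key, _count} -> key end)

    case duplicates do
      [] -> :ok
      keys -> raise ArgumentError, "duplicate plugin state_keys: #{inspect(keys)}"
    end
  end
end

# Usage during agent boot, assuming each plugin module exposes spec/0:
# PluginGuards.assert_unique_state_keys!(Enum.map(plugins, & &1.spec()))
```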
How this fit on projects You will apply this concept in projects 7, 11, 15, 16, 18, and 20.
Definitions & key terms
- Jido.Plugin: Module extending agent behavior via mount, handle_signal, transform_result callbacks.
- Jido.Plugin.Spec: Struct declaring plugin metadata: module, name, state_key, description, category, vsn, schema, config_schema, config, signal_patterns, tags, actions.
- state_key: Isolated namespace for plugin state within the agent state map. Must be unique across all mounted plugins.
- config_schema: Plugin configuration validation schema, checked at mount time.
- Jido.AI.Skill: Higher-level prompt-driven capability defined with use Jido.AI.Skill. Declares name, description, license, allowed_tools, actions, plugins, body/body_file.
- Skill callbacks: manifest/0, body/0, allowed_tools/0, actions/0, plugins/0.
- SKILL.md: Markdown file format for defining skills loaded at runtime by Jido.AI.Skill.Loader.
- allowed_tools: Per-skill tool whitelist implementing least-privilege at the prompt level.
- Route precedence: Deterministic merge order: strategy > agent > plugin.
Mental model diagram
[Agent State Map]
|
+-- :strategy_state (owned by active strategy)
+-- :chat (owned by ChatPlugin, isolated by state_key)
+-- :memory (owned by MemoryPlugin, isolated by state_key)
+-- :tools (owned by ToolPlugin, isolated by state_key)
[Plugin Spec]
%Jido.Plugin.Spec{
module: ChatPlugin,
name: "chat",
state_key: :chat,
description: "Conversational chat capabilities",
category: :communication,
vsn: "1.0.0",
schema: [entries: [type: :list]],
config_schema: [max_history: [type: :integer, default: 100]],
config: %{max_history: 100},
signal_patterns: ["chat.*"],
tags: [:conversation, :llm],
actions: [ChatAction, SummarizeAction]
}
[Skill Loading]
SKILL.md (on disk) --> Jido.AI.Skill.Loader --> Skill.Registry
|
v
[Skill.Prompt.render]
system_prompt + allowed_tools filter
|
v
[Strategy receives scoped prompt + tools]
[Signal Route Merge Precedence]
Strategy routes: react.* -> {:strategy_cmd, :react_*} (HIGHEST)
Agent routes: agent.* -> {:agent_cmd, :handle_*} (MIDDLE)
Plugin routes: chat.* -> {:plugin_cmd, :handle_chat} (LOWEST)
How it works (step-by-step, with invariants and failure modes)
- Plugin modules declare Jido.Plugin.Spec with state_key, config_schema, signal_patterns, actions.
- On agent boot, plugins mount in declaration order via the mount callback.
- Each plugin initializes its state under its state_key in the agent state map.
- Signal routes from all plugins are merged with strategy and agent routes using precedence rules.
- Skills are loaded from SKILL.md files by Jido.AI.Skill.Loader and registered in Skill.Registry.
- When a strategy needs a prompt, Skill.Prompt.render produces the system prompt + filtered tool list.
- Runtime signals are routed through the merged route table; plugins handle signals matching their patterns.
Invariants:
- state_key must be unique across all mounted plugins.
- Plugin requires must be satisfied by already-mounted plugins.
- signal_routes from different plugins must not create ambiguous matches within the plugin layer.
- allowed_tools in skills must be a subset of tools available in the runtime registry.
Failure modes:
- State key collision: Two plugins use the same state_key, causing data corruption.
- Unresolved dependency: Plugin A requires Plugin B, which is not mounted.
- Route conflict: Two plugins claim the same signal pattern with different handlers.
- Skill tool mismatch: Skill allowlists reference tools not registered in the runtime.
Minimal concrete example
PSEUDOCODE (v2 Plugin + Skill composition with actual structs)
# Plugin definition
defmodule MemoryPlugin do
use Jido.Plugin
def spec do
%Jido.Plugin.Spec{
module: __MODULE__,
name: "memory",
state_key: :memory,
description: "Long-term memory storage for agent",
config_schema: [max_entries: [type: :integer, default: 1000]],
config: %{max_entries: 1000},
signal_patterns: ["memory.*"],
actions: [StoreMemory, RecallMemory]
}
end
def mount(agent_state, config) do
put_in(agent_state, [:memory], %{entries: [], max_entries: config.max_entries})
end
def handle_signal(%{type: "memory.store"} = signal, state) do
# Process memory storage signal
{:ok, updated_state, []}
end
end
# Skill module definition
defmodule IncidentAnalyst do
use Jido.AI.Skill,
name: "incident-analyst",
description: "Analyzes production incidents",
allowed_tools: ["search_logs", "summarize", "create_ticket"],
actions: [SearchLogs, Summarize, CreateTicket],
body_file: "priv/skills/incident_analyst.md"
# Callbacks: manifest/0, body/0, allowed_tools/0, actions/0, plugins/0
# are auto-generated by use Jido.AI.Skill
end
# Or load from SKILL.md at runtime:
{:ok, skill_spec} = Jido.AI.Skill.Loader.load("priv/skills/SKILL.md")
Jido.AI.Skill.Registry.register("incident-analyst", skill_spec)
# Contents of priv/skills/incident_analyst.md (body text):
#   "You are an incident analyst. Use available tools to..."
# At boot:
# 1. Mount plugins -> state = %{memory: %{entries: [], ...}, chat: %{...}}
# 2. Merge routes -> strategy routes + agent routes + plugin routes
# 3. Load skills -> Registry.register("incident-analyst", skill_spec)
# 4. Render prompt -> Skill.Prompt.render("incident-analyst") -> system_prompt + 3 tools
Common misconceptions
- “Plugins and Skills are the same thing.” No: Plugins handle runtime behavior and state; Skills handle prompt construction and tool scoping.
- “Plugin order does not matter.” Yes it does: mount order determines dependency resolution and route priority within the plugin layer.
- “Skills can access any tool.” No: Skills declare allowed_tools, which filters the tool list in the rendered prompt.
Check-your-understanding questions
- What happens if two plugins declare the same state_key?
- How does route precedence prevent plugin routes from overriding strategy routes?
- Why are Skills loaded from markdown files rather than compiled modules?
Check-your-understanding answers
- Data corruption: both plugins read/write the same state namespace, causing unpredictable behavior. This should be caught at mount time with a validation check.
- During route merge, strategy routes are checked first. If a match is found at the strategy level, plugin routes are never consulted for that signal type.
- Markdown files enable runtime loading without code recompilation, supporting dynamic capability injection, A/B testing of prompts, and non-developer skill authoring.
Real-world applications
- Modular agent marketplaces where plugins can be installed/removed independently.
- Multi-tenant systems where different tenants have different skill configurations.
- Composable assistant systems where capabilities are added based on user role or subscription tier.
Where you’ll apply it
- Projects 7, 11, 15, 16, 18, and 20.
References
- Jido.Plugin source
- Jido.Plugin.Spec source
- Jido.AI.Skill source
- Jido.AI.Skill.Loader source
- Jido.AI.Skill.Registry source
- Jido.AI.Skill.Prompt source
Key insights Plugins provide runtime extensibility with state isolation; Skills provide prompt-level capability scoping. Together they enable composable agents without monolithic coupling.
Summary Plugin Specs with state_key isolation + Skill SKILL.md loading create a layered composition architecture where runtime behavior, state management, and prompt capabilities evolve independently.
Homework/Exercises to practice the concept
- Write two plugins with different state_keys and verify they cannot corrupt each other’s state.
- Create a SKILL.md file with an allowed_tools list and verify that rendered prompts only include those tools.
- Test route precedence by creating a strategy route and a plugin route for the same signal type. Verify the strategy route wins.
Solutions to the homework/exercises
- Mount both plugins, write to each state_key, and assert the other is unchanged. Add a boot-time check for duplicate state_keys.
- Load the SKILL.md, render the prompt, and assert the tool list is the intersection of allowed_tools and the runtime registry.
- Emit a signal matching both routes, log which handler fires, and assert it is the strategy handler.
Glossary
- Action (Jido.Action): Behaviour module implementing run(params, context) with a schema validation pipeline. The atomic unit of work in Jido.
- AgentServer: GenServer that processes signals, routes to strategies, and executes directives.
- Bus (Jido.Signal.Bus): GenServer-based pub/sub system for signal distribution.
- Core Directives (Jido.Agent.Directive): BEAM-level effect structs: Emit, Error, Spawn, SpawnAgent, StopChild, Schedule, Stop, Cron, CronCancel. Helper constructors: Directive.emit/2, Directive.spawn/2, Directive.spawn_agent/3, Directive.schedule/2, Directive.cron/3, Directive.cron_cancel/1, Directive.emit_to_parent/3.
- AI Directives (Jido.AI.Directive): LLM/tool effect structs: LLMStream, ToolExec, LLMGenerate, LLMEmbed, EmitToolError, EmitRequestError.
- DirectiveExec (Jido.AgentServer.DirectiveExec): Protocol with an exec/3 callback for polymorphic directive execution. Each directive type implements this protocol.
- Dispatch (Jido.Signal.Dispatch): Multi-adapter signal dispatcher supporting pid, pubsub, http, webhook, logger, console, noop, and named targets. Three modes: sync dispatch/2, async dispatch_async/2, batched dispatch_batch/3.
- Exec (Jido.Exec): Execution engine with run/4, run_async/3, await/1. Sub-modules: Validator, Telemetry, Retry, Compensation, Async, Chain, Closure. Default timeout 30000ms.
- Fsmx: FSM library used by Jido.AI for explicit state machine definitions with transition guards.
- Idempotency Key: Identifier that prevents duplicate effect execution.
- Instruction (Jido.Instruction): Struct wrapping an Action with id, action, params, context, opts for execution.
- Journal (Jido.Signal.Journal): Event persistence layer with InMemory, ETS, and Mnesia backends.
- LLMDB: Model metadata database tracking 45+ providers and 665+ models with context_window, capabilities, and costs.
- MetadataHandle: Concurrent async usage collection module for accurate billing across multiple streams.
- Netsplit: Temporary network partition between distributed BEAM nodes.
- Plan (Jido.Plan): DAG-based execution plan built from Instructions with dependency resolution via Plan.build/2.
- Plugin (Jido.Plugin): Module extending agent behavior via mount, handle_signal, transform_result callbacks with Spec metadata.
- Plugin Spec (Jido.Plugin.Spec): Struct declaring plugin metadata: module, name, state_key, description, category, vsn, schema, config_schema, config, signal_patterns, tags, actions.
- Router (Jido.Signal.Router): Trie-based pattern matcher for signals supporting * (single-level) and ** (multi-level) wildcards.
- Signal: A typed event envelope used for routing and feedback, aligned with CloudEvents.
- Skill (Jido.AI.Skill): Prompt-driven capability with Spec, Loader, Registry, Prompt modules. Loaded from SKILL.md files.
- StateOp: In-strategy state mutation operation: SetState, ReplaceState, DeleteKeys, SetPath, DeletePath.
- Strategy: Reasoning/control policy module under Jido.AI.Strategies.* (ReAct, CoT, ToT, GoT, Adaptive, TRM).
- StreamChunk: Typed streaming chunk from ReqLLM: :content, :thinking, :tool_call, :meta.
- Thread: Conversation history/context maintained across interactions within a strategy or agent session.
- Usage Telemetry: Token/cost/latency metrics used for operational control, calculated via ReqLLM.Billing.
- Worker Pool: Pre-warmed bounded set of workers for low-latency execution.
- Zoi: Schema validation library used throughout Jido for struct definitions (via Zoi.struct/3) and compile-time config validation. Actions, Directives, and Plugin Specs all use Zoi schemas. Invalid configs raise CompileError before the module loads.
Why Jido + BEAM Agent Engineering Matters
- AI-native engineering is now default behavior: Stack Overflow Developer Survey 2025 reports 84% of respondents are using or planning to use AI tools in development, and 50.6% of professional developers report daily AI-tool usage.
- Agent workflows are now operational, not experimental: In the same 2025 survey, about 70% of AI-agent users report reduced task time and 69% report productivity gains, while only 17% report improved team collaboration. This gap is exactly where reliability, policy, and observability engineering matters.
- Open event standards matured: CloudEvents was approved as a CNCF Graduated project on January 25, 2024, and the project page lists cross-cloud adopters (for example AWS EventBridge, Azure Event Grid, Google Eventarc, Knative Eventing).
- BEAM remains uniquely suitable for autonomous loops: Erlang/OTP docs state a default process limit of 1,048,576 processes, configurable with +P up to 134,217,727, which is a practical fit for isolated-agent process topologies.
- Jido ecosystem momentum (as of February 12, 2026 UTC):
  - agentjido/jido: 887 GitHub stars; Hex latest 2.0.0-rc.4, latest stable 1.2.0, 16,125 all-time downloads.
  - agentjido/req_llm: 383 GitHub stars; Hex latest 1.5.1, 30,659 all-time downloads.
  - agentjido/jido_ai: 114 GitHub stars; Hex latest 0.5.2, 3,930 all-time downloads.
  - agentjido/jido_signal: Signal infrastructure library (Bus, Router, Dispatch, Journal) providing the event backbone.
  - agentjido/llm_db: LLMDB model metadata covering 45+ providers and 665+ models with per-model context_window, capabilities, input_cost, output_cost data.
  - agentjido/jido_browser: Browser automation library for multimodal agent pipelines.
  - agentjido/jido_studio: LiveView-based agent observation and HITL control center.
Context & Evolution
Early LLM systems optimized for one-shot prompts and short-lived request handlers. Modern agent systems are long-running control loops that need explicit event contracts, bounded retries, budget-aware routing, permission gates, and replayable traces. The shift is from “prompt integration” to “autonomous runtime engineering.” Jido v2 crystallizes this shift with Actions (run/2 with Zoi schema validation and 6 lifecycle hooks), StateOps (deterministic state mutations via Jido.Agent.StateOp.*), two directive families (Core from Jido.Agent.Directive for BEAM operations, AI from Jido.AI.Directive for LLM/tool operations), and the Signal Bus/Router/Dispatch/Journal infrastructure from the jido_signal package.
Old "LLM app" pattern New BEAM-native agent pattern (Jido v2)
-------------------------------------- ------------------------------------------------
Prompt -> Provider -> Text Signal -> Bus -> Router -> Strategy FSM (Fsmx)
(single call, minimal control) Action.run(params, ctx) -> StateOps + Directives
No schema validation Zoi schema validation at compile + runtime
Side effects mixed with logic Core Directives (BEAM) + AI Directives (LLM/tool)
No state contract DirectiveExec protocol for polymorphic execution
No cost visibility AgentServer drain loop under supervision
Feedback signals via Dispatch close the loop
Journal (InMemory|ETS|Mnesia) for replay + audit
Billing.calculate + LLMDB enforce cost boundaries
Plugin.Spec + Skill.allowed_tools for composition
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Deterministic Agent Core and Directives | Jido.Action behaviour (6 lifecycle hooks, Zoi validation), StateOp.* for atomic state mutations, Core Directives (Emit/Spawn/SpawnAgent/Cron/Stop) + AI Directives (LLMStream/ToolExec) via DirectiveExec protocol; keep transitions pure and auditable. |
| Signal Contracts and BEAM Routing | Bus/Router/Dispatch/Journal provide typed event infrastructure; route with deterministic precedence. |
| Reasoning Strategies as State Machines | ReAct/CoT/ToT/GoT/Adaptive/TRM as Fsmx-backed bounded control-flow systems, not prompt tricks. |
| ReqLLM + Production Control Plane | 45+ providers, 665+ models via LLMDB; generate_text/3 + Billing.calculate/2 for budgets, observability, safety. |
| Action System, Instructions, and Plan DAGs | Jido.Action behaviour (jido_action package) with Zoi schema pipeline, Instruction structs, Plan DAG execution with Jido.Exec sub-modules (Validator, Retry, Compensation, Chain, Async). |
| Plugin and Skill Composition | Plugin Specs (Jido.Plugin.Spec) with state_key isolation, Skill use Jido.AI.Skill with allowed_tools/body/body_file, SKILL.md runtime loading, deterministic route merge precedence. |
Project-to-Concept Map
| Project | Concepts Applied |
|---|---|
| Project 1 | Deterministic Agent Core, Reasoning Strategies, Action System |
| Project 2 | Signal Contracts, Deterministic Agent Core, Action System |
| Project 3 | ReqLLM Control Plane, Deterministic Agent Core |
| Project 4 | Reasoning Strategies, ReqLLM Control Plane |
| Project 5 | ReqLLM Control Plane, Signal Contracts, Action System |
| Project 6 | ReqLLM Control Plane, Deterministic Agent Core |
| Project 7 | Signal Contracts, Deterministic Agent Core, Plugin/Skill Composition, Action System |
| Project 8 | Reasoning Strategies, Deterministic Agent Core, Action System |
| Project 9 | ReqLLM Control Plane, BEAM Routing, Action System |
| Project 10 | ReqLLM Control Plane, Deterministic Agent Core |
| Project 11 | Signal Contracts, BEAM Routing, Plugin/Skill Composition, Action System |
| Project 12 | ReqLLM Control Plane, Signal Contracts |
| Project 13 | Signal Contracts, BEAM Routing |
| Project 14 | Reasoning Strategies, ReqLLM Control Plane |
| Project 15 | Signal Contracts, ReqLLM Control Plane, Plugin/Skill Composition |
| Project 16 | Deterministic Agent Core, ReqLLM Control Plane, Plugin/Skill Composition, Action System |
| Project 17 | Reasoning Strategies, ReqLLM Control Plane |
| Project 18 | ReqLLM Control Plane, Signal Contracts, Plugin/Skill Composition |
| Project 19 | BEAM Routing, ReqLLM Control Plane |
| Project 20 | All six concept clusters |
Deep Dive Reading by Concept
| Concept | Book and Chapter | Why This Matters |
|---|---|---|
| Deterministic Agent Core and Directives | “Designing Elixir Systems with OTP” - supervision and process boundaries chapters | Teaches reliable state/effect separation under OTP. |
| Signal Contracts and BEAM Routing | “Erlang and OTP in Action” - distributed messaging and supervision chapters | Connects message protocol design with fault boundaries. |
| Reasoning Strategies as State Machines | “AI Engineering” by Chip Huyen - agent workflows + evaluation sections | Frames reasoning patterns as engineering systems. |
| ReqLLM + Production Control Plane | “Designing Data-Intensive Applications” - reliability and observability themes | Helps reason about fault tolerance, consistency, and operational tradeoffs. |
| Action System, Instructions, and Plan DAGs | “Designing Elixir Systems with OTP” - data transformation and validation chapters | Teaches schema-driven pipeline design with compensation. |
| Plugin and Skill Composition | “Domain-Driven Design” - bounded contexts and context mapping | Frames modular capability composition as bounded context integration. |
Quick Start: Your First 48 Hours
Day 1:
- Read the entire ## Theory Primer.
- Clone jido, jido_ai, and req_llm and inspect their guides.
- Build Project 1 and produce deterministic command transcripts.
Day 2:
- Validate Project 1 against its Definition of Done.
- Start Project 2 and add failure-mode tests for malformed tool arguments.
- Record one page of lessons on state invariants and routing mistakes.
Recommended Learning Paths
Path 1: The Reliability Engineer
- Project 1 -> Project 3 -> Project 9 -> Project 13 -> Project 19 -> Project 20
Path 2: The AI Product Builder
- Project 1 -> Project 2 -> Project 4 -> Project 5 -> Project 6 -> Project 15 -> Project 20
Path 3: The Research-Oriented Agent Engineer
- Project 1 -> Project 8 -> Project 14 -> Project 17 -> Project 18 -> Project 20
Success Metrics
- You can explain and defend state invariants for every strategy transition.
- You can run one workload across at least two providers with stable behavior and tracked cost.
- You can recover from at least three injected failures (provider timeout, child crash, netsplit) without manual emergency patches.
- You can show one capstone transcript with policy-compliant autonomous behavior.
Optional Domain Appendices
Operational Debugging Checklist
- Verify every signal includes correlation_id, causation_id, and schema_version (see the guard sketch after this list).
- Verify every tool execution has an idempotency key and timeout budget.
- Verify every terminal state includes a machine-readable termination_reason.
- Verify policy denials emit explicit audit events (not silent drops).
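A guard like the sketch below can automate the first checklist item. It assumes signals are maps carrying these metadata keys; it is illustrative, not a Jido.Signal API.

```elixir
# Checklist guard sketch (illustrative): reject signals missing required
# correlation metadata before they reach strategy routing.
defmodule SignalChecks do
  @required [:correlation_id, :causation_id, :schema_version]

  def validate_metadata(signal) when is_map(signal) do
    case Enum.reject(@required, &Map.has_key?(signal, &1)) do
      [] -> :ok
      missing -> {:error, {:missing_metadata, missing}}
    end
  end
end

# SignalChecks.validate_metadata(%{correlation_id: "c1", causation_id: "c0"})
# #=> {:error, {:missing_metadata, [:schema_version]}}
```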
Common Failure Signatures
| Symptom | Probable Cause | First Diagnostic |
|---|---|---|
| Agent loops without completion | Missing or weak termination guards (max_iterations, timeout fences) | Inspect strategy state transitions for repeated non-progress edges |
| Wrong tool result applied to current state | Missing call_id/version checks | Compare pending_tool_calls list to incoming tool_result |
| Cost spikes during complex prompts | Adaptive router escalates strategy/model too aggressively | Plot usage telemetry by route and enforce downgrade thresholds |
| Burst events overwhelm runtime | Unbounded ingress or worker saturation | Track mailbox/queue depth and apply batch+backpressure controls |
| Cron jobs skipped after restart | In-memory timer assumption | Compare scheduled jobs against persisted last_run_at checkpoints |
Golden Evidence Pack (per project)
- One deterministic transcript that includes signal, directive, and terminal state sequence.
- One failure-injection transcript showing recovery behavior.
- One metric snapshot (latency, retries, cost, policy denials).
- One short postmortem note describing what invariant failed or held.
Project Overview Table
| # | Project | Focus | Difficulty | Time |
|---|---|---|---|---|
| 1 | Signal-Native ReAct Calculator Agent | deterministic Action/StateOp/Directive loop + tools | Level 2 | 8-12h |
| 2 | Tool-Governed Web Research Agent | tool contracts + safe routing | Level 2 | 10-14h |
| 3 | Multi-Provider Failover Gateway | provider abstraction + fallback | Level 3 | 12-18h |
| 4 | Streaming Observability Console | token streaming + telemetry UI | Level 3 | 12-18h |
| 5 | Structured Output Contracts | schema-first object generation | Level 2 | 8-12h |
| 6 | Cost-Aware Model Router | spend-aware policy routing | Level 3 | 12-18h |
| 7 | Skill/Plugin Composition Lab | modular capabilities + routing | Level 3 | 12-18h |
| 8 | Strategy State Machine Switchboard | adaptive strategy orchestration | Level 4 | 16-24h |
| 9 | Worker Pool Load Lab | bounded concurrency + latency | Level 3 | 12-18h |
| 10 | Persistent Thread Memory | checkpoint + journal lifecycle | Level 3 | 12-20h |
| 11 | Sensor-Driven Incident Triage | event bridges + reactive ops | Level 3 | 12-20h |
| 12 | Cron Autonomous Maintenance | recurring jobs + reliability | Level 3 | 12-20h |
| 13 | Distributed Netsplit Recovery Drill | cluster fault recovery | Level 4 | 18-28h |
| 14 | ETS/Mnesia Hybrid Agent Memory | high-speed memory + consistency | Level 4 | 18-28h |
| 15 | LiveView HITL Control Center | human approval workflows | Level 3 | 14-20h |
| 16 | Tool Permission Firewall | safety policy engine | Level 4 | 16-24h |
| 17 | Red-Team Eval Harness | adversarial testing + scoring | Level 4 | 16-24h |
| 18 | Multimodal Agent Pipeline | image+text+tools workflows | Level 4 | 16-24h |
| 19 | Hot Upgrade Release Drill | runtime upgrades under load | Level 5 | 24-36h |
| 20 | BEAM Autonomous Ops Swarm (Capstone) | end-to-end autonomous platform | Level 5 | 30-50h |
Project List
The following projects guide you from single-agent deterministic loops to distributed, policy-controlled autonomous systems on Elixir/BEAM.
Project 1: Signal-Native ReAct Calculator Agent
- File: P01-react-calculator-agent.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang, Gleam, Rust (sidecar tools)
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The Micro-SaaS / Pro Tool
- Difficulty: Level 2: Intermediate
- Knowledge Area: Deterministic agent loops, tool calls, signal routing
- Software or Tool: jido, jido_ai, req_llm
- Main Book: “Designing Elixir Systems with OTP” by James Edward Gray II and Bruce A. Tate
What you will build: A ReAct calculator agent that only answers through validated tool calls and emits auditable react.* signals.
Why it teaches Jido deeply: You implement the exact Action.run/2 -> {:ok, result, directives} contract with StateOps and observe how react.llm.response and react.tool.result signals close the loop.
Core challenges you will face:
- Routing react.* signals correctly -> maps to strategy signal_routes/1
- Normalizing JSON tool args -> maps to Jido.AI.Executor behavior
- Converging to terminal state -> maps to bounded max_iterations
Real World Outcome
You run one deterministic demo where the model asks for calculator, the tool executes, and the final answer is emitted only after the tool result signal.
$ mix run scripts/p01_react_calculator_demo.exs
[boot] agent=calculator_agent strategy=ReAct model=:fast
[signal] type=react.user_query payload="what is (15*23)+100?"
[directive] LLMStream id=call_001
[signal] type=react.llm.response result=tool_calls tool=calculator args={"a":15,"b":23,"operation":"multiply"}
[directive] ToolExec id=tool_001 name=calculator
[signal] type=react.tool.result tool=calculator ok=true result={"value":345}
[directive] LLMStream id=call_002
[signal] type=react.llm.response result=final_answer text="445"
[done] status=completed iterations=2 total_cost_usd=0.0009
The Core Question You Are Answering
“How do I make model reasoning observable and reproducible instead of magical?”
Concepts You Must Understand First
- Action.run/2 determinism, StateOps, and directives
- Can the same input produce different directives in your implementation?
- Book Reference: “Designing Elixir Systems with OTP” - supervision and state boundaries
- Tool argument normalization and validation
- How do string-key JSON args become typed action params safely?
- Book Reference: “Clean Architecture” - boundary validation patterns
- Signal lifecycle in ReAct
- Which signal means “tool request” versus “final answer”?
- Book Reference: “Operating Systems: Three Easy Pieces” - event-loop mental model
Questions to Guide Your Design
- State design
  - Which fields are mandatory (status, iteration, pending_tool_calls, usage)?
  - What transition is illegal and must fail closed?
- Tool execution
  - How do you correlate tool_call_id between LLM output and tool result?
  - Where do you enforce max_iterations?
- Operational evidence
  - Which log line proves the answer came from tool output, not hallucination?
Thinking Exercise
Sketch the loop for two turns: user query -> LLM tool call -> tool result -> final answer. Mark exactly where state mutates.
The Interview Questions They Will Ask
- “Why not call the tool directly from the strategy?”
- “How do you prevent stale ToolResult signals from mutating current state?”
- “What metric tells you ReAct loops are stuck?”
- “How do you prove deterministic behavior in tests?”
- “Where do you cap cost and iteration count?”
Hints in Layers
Hint 1: Start with one tool only Model/tool complexity is easier once one tool path is perfect.
Hint 2: Enforce strict signal types
Reject unknown react.* events early.
Hint 3: Keep a call-id ledger
Track current_llm_call_id and pending_tool_calls in state.
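A minimal ledger sketch for this hint, assuming agent state is a plain map; the module and key names are illustrative rather than part of the Jido API.

```elixir
# Call-id ledger sketch (illustrative): only a pending tool call may apply its
# result; stale or unknown call_ids are rejected instead of mutating state.
defmodule CallLedger do
  def apply_tool_result(%{pending_tool_calls: pending} = state, %{call_id: call_id} = result) do
    if Map.has_key?(pending, call_id) do
      new_state =
        state
        |> Map.update!(:pending_tool_calls, &Map.delete(&1, call_id))
        |> Map.update(:tool_results, [result], &[result | &1])

      {:ok, new_state}
    else
      {:error, {:stale_tool_result, call_id}}
    end
  end
end
```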
Hint 4: Capture a golden transcript Use a fixed prompt and model alias for reproducibility.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| OTP state boundaries | “Designing Elixir Systems with OTP” | Process boundaries chapters |
| Defensive contracts | “Clean Architecture” | Interface adapters |
| Event loops | “Operating Systems: Three Easy Pieces” | Concurrency intro |
Common Pitfalls and Debugging
Problem 1: “Agent never leaves awaiting_tool”
- Why: Tool result signal type mismatch or missing call_id.
- Fix: Validate the incoming react.tool.result schema and correlation id.
- Quick test: Inject a valid and an invalid ToolResult and assert only valid transitions.
Problem 2: “Final answer arrives without tool execution”
- Why: Tool calls not enforced in prompt/policy.
- Fix: Add explicit tool-required policy for arithmetic prompts.
- Quick test: Run 20 math prompts and assert at least one tool directive per prompt.
Definition of Done
- ReAct loop produces both react.llm.response and react.tool.result
- Tool arguments are normalized and validated before action execution
- Iteration and cost limits are enforced with explicit terminal reasons
- Golden transcript is reproducible across runs
Project 2: Tool-Governed Web Research Agent
- File: P02-web-research-tool-agent.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang, Gleam, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Policy routing, tool governance, source grounding
- Software or Tool: jido_ai, req_llm web search, jido_signal
- Main Book: “The Pragmatic Programmer”
What you will build: A research agent that can use web search only through an allowlist and returns citation-first answers.
Why it teaches Jido deeply: It forces you to separate model planning from policy enforcement and to handle tool-deny branches as first-class transitions.
Core challenges you will face:
- Allow/deny gate for tools -> maps to directive pre-execution policy
- Domain filtering and citation format -> maps to tool result post-processing
- Safe fallback when search fails -> maps to fail-closed transitions
Real World Outcome
$ mix run scripts/p02_research_agent_demo.exs "latest jido release notes"
[policy] allowed_tools=[web_search,fetch_url] denied=[bash,fs_write]
[signal] react.user_query "latest jido release notes"
[directive] LLMStream call_id=call_100
[signal] react.llm.response type=tool_calls tool=web_search
[directive] ToolExec tool=web_search args={"query":"agentjido jido changelog"}
[signal] react.tool.result tool=web_search count=5
[citation] 1. github.com/agentjido/jido/CHANGELOG.md
[citation] 2. agentjido.xyz/blog
[final] status=completed grounded=true tool_denials=0
The Core Question You Are Answering
“How do I let an agent search the web without letting it do dangerous things?”
Concepts You Must Understand First
- Tool allowlists and deny-by-default
- Book Reference: “Foundations of Information Security” - access control basics
- Signal-based policy feedback
- Book Reference: “Clean Architecture” - policy boundaries
- Grounded response formatting
- Book Reference: “The Pragmatic Programmer” - traceability mindset
Questions to Guide Your Design
- What is the canonical policy object shape?
- How do you represent “tool denied” in agent state and user output?
- What minimum citation fields are required (title, url, retrieved_at)?
Thinking Exercise
Draw two branches for the same prompt: tool approved and tool denied. Compare terminal statuses.
The Interview Questions They Will Ask
- “What does fail-closed look like for tool use?”
- “How do you detect citation spoofing?”
- “How do you distinguish model failure from policy denial?”
- “What metrics indicate policy is too strict?”
- “How would you add per-tenant policy overrides safely?”
Hints in Layers
Hint 1: Policy first, prompts second
Hint 2: Emit explicit tool.denied signals
Hint 3: Normalize citations before final answer
Hint 4: Add replay tests for denied paths
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Access control thinking | “Foundations of Information Security” | Policy and controls chapters |
| Boundary contracts | “Clean Architecture” | Boundaries |
| Reliable workflows | “The Pragmatic Programmer” | Tracer bullets |
Common Pitfalls and Debugging
Problem 1: “Agent returns uncited claims”
- Why: Final answer generated without required citation schema check.
- Fix: Validate answer structure before completion.
- Quick test: Fail build if citation array is empty.
Problem 2: “Deny rules never trigger”
- Why: Tool name mismatch (web-search vs web_search).
- Fix: Canonicalize tool names before policy lookup (see the sketch after this list).
- Quick test: Unit test alias map for tool names.
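A canonicalization sketch for the fix above; the alias map and naming rules are illustrative.

```elixir
# Tool-name canonicalization sketch (illustrative): normalize case and dashes,
# then apply an explicit alias map, before any policy lookup.
defmodule ToolNames do
  @aliases %{"search_web" => "web_search"}

  def canonicalize(name) when is_binary(name) do
    normalized =
      name
      |> String.downcase()
      |> String.replace("-", "_")

    Map.get(@aliases, normalized, normalized)
  end
end

# ToolNames.canonicalize("Web-Search") #=> "web_search"
# ToolNames.canonicalize("search_web") #=> "web_search"
```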
Definition of Done
- Tool allowlist is enforced with deny-by-default semantics
- Denied tools produce explicit user-visible policy output
- Final response includes normalized citations
- Replay test covers tool-approved and tool-denied paths
Project 3: Multi-Provider Failover Gateway
- File: P03-multi-provider-failover-gateway.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 4. The Open Core Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: Provider abstraction, fallback, reliability budgets
- Software or Tool: req_llm, llm_db, jido_ai config aliases
- Main Book: “Designing Data-Intensive Applications”
What you will build: A gateway that routes requests by model alias and fails over across providers when latency, error, or budget thresholds are hit.
Why it teaches Jido deeply: It operationalizes model selection as stateful policy, not ad-hoc if/else logic.
Core challenges you will face:
- Fallback ordering and circuit state -> maps to deterministic policy state
- Provider-specific option translation -> maps to ReqLLM provider adapters
- Cost-aware model downgrades -> maps to usage telemetry feedback loop
Real World Outcome
$ mix run scripts/p03_failover_gateway_demo.exs
[gateway] alias=:fast primary=openai:gpt-4o-mini fallback=anthropic:claude-haiku-4-5
[request] id=req_42 timeout_ms=3000
[provider] openai status=429 retry_after=2
[fallback] switching_to=anthropic reason=rate_limit
[provider] anthropic status=200 latency_ms=1187
[usage] input_tokens=312 output_tokens=144 total_cost_usd=0.0017
[result] status=ok provider=anthropic degraded=true
The Core Question You Are Answering
“How do I keep agent behavior stable when model providers are unstable?”
Concepts You Must Understand First
- Failure classification (rate limit, timeout, malformed)
- Book Reference: “Designing Data-Intensive Applications” - reliability chapters
- Idempotent retries and request correlation
- Book Reference: “The Linux Programming Interface” - robust I/O patterns
- Model alias resolution
- Book Reference: “Clean Architecture” - configuration boundaries
Questions to Guide Your Design
- Which failures trigger immediate fallback vs retry?
- How do you avoid retry storms across all providers?
- What telemetry drives automatic downgrade to cheaper models?
Thinking Exercise
Design a fallback matrix: failure type x current provider -> next provider + backoff.
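One way to make this exercise concrete is to encode the matrix as data so the policy is inspectable and testable. Providers, failure reasons, and backoff values below are illustrative.

```elixir
# Fallback-matrix sketch (illustrative): {failure_reason, current_provider} maps
# to {next_provider, backoff_ms}; unknown combinations halt instead of looping.
defmodule FallbackMatrix do
  @matrix %{
    {:rate_limit, :openai} => {:anthropic, 0},
    {:timeout, :openai} => {:anthropic, 500},
    {:rate_limit, :anthropic} => {:openai, 2_000},
    {:timeout, :anthropic} => {:openai, 500}
  }

  def next(reason, provider), do: Map.get(@matrix, {reason, provider}, :halt)
end

# FallbackMatrix.next(:rate_limit, :openai) #=> {:anthropic, 0}
# FallbackMatrix.next(:malformed, :openai)  #=> :halt
```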
The Interview Questions They Will Ask
- “How do you avoid double-billing when retries happen?”
- “What is your fallback policy when every provider is degraded?”
- “How do you validate provider parity for JSON outputs?”
- “Where do you store circuit state and why?”
- “How do you test failover deterministically?”
Hints in Layers
Hint 1: Start with two providers and one alias
Hint 2: Persist request and attempt IDs
Hint 3: Separate transport errors from model errors
Hint 4: Add synthetic chaos tests for 429/5xx/timeout
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Reliability patterns | “Designing Data-Intensive Applications” | Fault tolerance |
| Timeout/backoff design | “The Linux Programming Interface” | Robust system call patterns |
| Boundary design | “Clean Architecture” | Policy vs detail |
Common Pitfalls and Debugging
Problem 1: “Fallback loop never exits”
- Why: No max-attempt guard.
- Fix: Enforce bounded attempts per request.
- Quick test: Simulate all providers failing; assert terminal fallback failure.
Problem 2: “Costs spike after failover”
- Why: Fallback model is more expensive than primary.
- Fix: Add policy layer that checks projected token cost before route.
- Quick test: Run load with cost cap and assert downgrade events.
Definition of Done
- Gateway handles provider 429/5xx/timeout with deterministic fallback
- Attempt IDs and routing decisions are logged and queryable
- Cost telemetry is captured per provider attempt
- Chaos test suite validates fallback matrix
Project 4: Streaming Observability Console
- File: P04-streaming-observability-console.md
- Main Programming Language: Elixir
- Alternative Programming Languages: TypeScript (front-end overlays), Erlang
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Telemetry, trace correlation, LiveView dashboards
- Software or Tool: jido_live_dashboard, jido_studio, :telemetry
- Main Book: “Clean Architecture”
What you will build: A LiveView console that streams agent state transitions, directive timings, and correlated traces for one end-to-end request.
Why it teaches Jido deeply: You make hidden runtime behavior explicit by wiring Jido telemetry into an operator-facing control plane.
Core challenges you will face:
- Correlating spans across signals/directives -> maps to trace_id discipline
- Rendering high-rate event streams -> maps to bounded buffers and backpressure
- Separating debug vs production verbosity -> maps to observability policy
Real World Outcome
You can open /dashboard and /studio, trigger an agent run, and watch synchronized signal and directive timelines in near real-time.
UI behavior:
- Runtime page lists active AgentServer PIDs and statuses.
- Traces page groups events by trace_id and exposes span durations.
- Clicking a trace shows signal type, directive type, result, and latency.
$ mix phx.server
[info] mounted JidoLiveDashboard pages at /dashboard
[info] mounted JidoStudio at /studio
[telemetry] [:jido,:agent_server,:signal,:start] trace_id=tr_88
[telemetry] [:jido,:agent_server,:directive,:stop] directive_type=ToolExec duration_ms=92
The Core Question You Are Answering
“Can I explain exactly why an agent made a decision during an incident review?”
Concepts You Must Understand First
- Telemetry event shape and handler cost
- Book Reference: “Clean Architecture” - observability and boundaries
- LiveView event-stream rendering
- Book Reference: “The Pragmatic Programmer” - feedback loops
- Trace correlation IDs
- Book Reference: “Designing Data-Intensive Applications” - distributed traces
Questions to Guide Your Design
- Which events are mandatory for incident triage?
- How do you avoid UI lockups under event bursts?
- What retention window is enough for debugging without memory blowup?
Thinking Exercise
Design an incident timeline table with columns: timestamp, signal, directive, result, duration, cost.
The Interview Questions They Will Ask
- “What is the minimum telemetry set for production readiness?”
- “How do you sample traces without losing critical incidents?”
- “How do you correlate child-agent events with parent requests?”
- “How do you protect PII in logs?”
- “How does this dashboard change your on-call MTTR?”
Hints in Layers
Hint 1: Attach to a small subset of events first
Hint 2: Keep trace buffer bounded
Hint 3: Normalize metadata keys across events
Hint 4: Add one-click trace export for postmortems
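For Hint 2, a bounded per-trace buffer keeps memory flat under event bursts. The sketch below is framework-agnostic and illustrative.

```elixir
# Bounded trace-buffer sketch (illustrative): keep at most @max_events recent
# events per trace_id so telemetry bursts cannot grow memory without limit.
defmodule TraceBuffer do
  @max_events 500

  def put(buffer, trace_id, event) do
    Map.update(buffer, trace_id, [event], fn events ->
      Enum.take([event | events], @max_events)
    end)
  end

  # Events in arrival order for rendering a timeline.
  def events(buffer, trace_id), do: buffer |> Map.get(trace_id, []) |> Enum.reverse()
end
```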
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Observability design | “Clean Architecture” | System quality attributes |
| Feedback-driven engineering | “The Pragmatic Programmer” | Tracer bullets |
| Incident analysis | “Designing Data-Intensive Applications” | Monitoring and operations |
Common Pitfalls and Debugging
Problem 1: “Trace view misses events”
- Why: Inconsistent trace_id propagation.
- Fix: Inject trace metadata when the signal enters the runtime.
- Quick test: End-to-end trace must contain both signal and directive spans.
Problem 2: “Dashboard slows app”
- Why: Heavy synchronous handlers.
- Fix: Forward telemetry to async workers.
- Quick test: Load test with and without dashboard; compare latency delta.
Definition of Done
- Runtime and trace dashboards show live AgentServer activity
- Event correlation works from request signal to final directive result
- Buffer limits prevent unbounded memory growth
- One incident replay can be reconstructed from captured traces
Project 5: Structured Output Contracts
- File: P05-structured-output-contracts.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Python, TypeScript
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The Micro-SaaS / Pro Tool
- Difficulty: Level 2: Intermediate
- Knowledge Area: Schema-first AI responses, contract tests
- Software or Tool: `ReqLLM.generate_object/4`, NimbleOptions, Zoi
- Main Book: “Clean Architecture”
What you will build: A structured-output gateway that validates every model object against explicit schemas and rejects malformed payloads.
Why it teaches Jido deeply: It transforms model output into typed contracts that strategies can trust.
Core challenges you will face:
- Schema drift across providers -> maps to provider compatibility tests
- Strict vs relaxed mode behavior -> maps to policy decisions
- Error surfacing for retries -> maps to actionable failures
Real World Outcome
$ mix run scripts/p05_structured_output_demo.exs
[schema] ticket={priority:enum,severity:enum,summary:string,actions:list}
[request] model=openai:gpt-4o-mini mode=strict
[result] validation=ok object={"priority":"high","severity":"s2","summary":"db timeout","actions":["restart pool"]}
[request] model=anthropic:claude-haiku-4-5 mode=strict
[result] validation=error field=severity reason="not in enum"
[policy] retry_with_repair_prompt=true attempt=2
[result] validation=ok
The Core Question You Are Answering
“How do I treat LLM output like API data instead of free-form text?”
Concepts You Must Understand First
- Schema compilation and validation
- Book Reference: “Clean Architecture” - data contracts
- Provider-specific structured output modes
- Book Reference: “The Pragmatic Programmer” - adaptability
- Retry with targeted repair prompts
- Book Reference: “Designing Data-Intensive Applications” - robust pipelines
Questions to Guide Your Design
- Which fields are hard-required vs optional defaults?
- How do you present validation errors to strategy state?
- What repair strategy is deterministic and bounded?
Thinking Exercise
Create three malformed payload examples and map each to a repair action.
The Interview Questions They Will Ask
- “Why not parse JSON manually?”
- “How do you prevent silent schema downgrades?”
- “What’s your strict-mode fallback?”
- “How do you test cross-provider parity?”
- “How do you avoid infinite repair loops?”
Hints in Layers
Hint 1: Start with one small schema
Hint 2: Capture field-level validation errors
Hint 3: Build repair prompts from validation failures
Hint 4: Add provider matrix tests for the same schema
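A minimal sketch of the parse-then-validate split, assuming Jason for JSON decoding; the ticket fields mirror the demo output above and the module name is hypothetical. The field-level error tuple is exactly what Hint 3's repair prompt should be built from:

```elixir
defmodule MyApp.TicketContract do
  @priorities ~w(low medium high)
  @severities ~w(s1 s2 s3 s4)

  # Stage 1: JSON syntax. Stage 2: business schema. Never conflate the two.
  def from_llm(raw) when is_binary(raw) do
    with {:ok, decoded} <- Jason.decode(raw),
         :ok <- validate(decoded) do
      {:ok, decoded}
    else
      {:error, %Jason.DecodeError{} = e} -> {:error, {:parse, Exception.message(e)}}
      {:error, field, reason} -> {:error, {:validation, field, reason}}
    end
  end

  defp validate(%{"priority" => p, "severity" => s, "summary" => sum, "actions" => acts})
       when is_binary(sum) and is_list(acts) do
    cond do
      p not in @priorities -> {:error, "priority", "not in enum"}
      s not in @severities -> {:error, "severity", "not in enum"}
      true -> :ok
    end
  end

  defp validate(_), do: {:error, "root", "missing required fields"}
end
```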
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Contracts and boundaries | “Clean Architecture” | Entities and DTOs |
| Robust parsing | “The Pragmatic Programmer” | Design by contracts |
| Data pipeline reliability | “Designing Data-Intensive Applications” | Data quality |
Common Pitfalls and Debugging
Problem 1: “Valid JSON but invalid business object”
- Why: JSON parse success mistaken for schema success.
- Fix: Separate parse and validation stages.
- Quick test: Invalid enum must fail despite valid JSON syntax.
Problem 2: “Repair prompt makes object worse”
- Why: Entire object rewritten each retry.
- Fix: Ask model to patch only failed fields.
- Quick test: Preserve unchanged fields across retries.
Definition of Done
- Structured outputs are validated against explicit schemas
- Invalid payloads generate field-level errors and bounded retries
- Same schema works across at least two providers
- Contract tests prevent silent drift
Project 6: Cost-Aware Model Router
- File: P06-cost-aware-model-router.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Go, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: AI FinOps, policy routing, usage telemetry
- Software or Tool: `req_llm` usage metadata, `llm_db`, `:telemetry`
- Main Book: “Designing Data-Intensive Applications”
What you will build: A router that automatically chooses model/provider by budget, latency target, and required capabilities.
Why it teaches Jido deeply: It turns cost from an afterthought into a deterministic control variable in strategy execution.
Core challenges you will face:
- Capability constraints vs budget constraints -> maps to policy precedence
- Rolling cost windows -> maps to stateful telemetry aggregation
- Downgrade safety -> maps to quality guardrails
Real World Outcome
$ mix run scripts/p06_cost_router_demo.exs
[policy] minute_budget_usd=0.05 require={tools:true,json:true}
[route] req=1 model=openai:gpt-4o-mini projected_cost=0.0031
[usage] req=1 total_cost=0.0034 rolling_minute=0.0034
[route] req=7 model=anthropic:claude-haiku-4-5 reason=budget_pressure
[usage] req=7 total_cost=0.0012 rolling_minute=0.0468
[route] req=8 model=anthropic:claude-haiku-4-5 reason=capabilities_ok_budget_guard
[alert] budget_near_limit=true
The Core Question You Are Answering
“How do I keep quality acceptable while preventing runaway model spend?”
Concepts You Must Understand First
- Usage and cost fields from `ReqLLM.Response`
- Book Reference: “Designing Data-Intensive Applications” - metrics and feedback
- Capability-based selection (`tools`, `json`, `streaming`)
- Book Reference: “Clean Architecture” - policy decisions
- Rolling-window aggregation
- Book Reference: “Algorithms, Fourth Edition” - sliding windows
Questions to Guide Your Design
- Which constraints are hard stops vs soft preferences?
- How do you model budget by tenant/team/request class?
- What quality fallback happens when only cheap models remain?
Thinking Exercise
Define a policy table for three request classes: critical, normal, batch.
The Interview Questions They Will Ask
- “How do you prevent policy oscillation between models?”
- “How do you price unknown models?”
- “What is your outage strategy if cheap models fail?”
- “How do you audit routing fairness across tenants?”
- “How do you test budget logic deterministically?”
Hints in Layers
Hint 1: Start with static cost metadata
Hint 2: Add rolling-minute state next
Hint 3: Keep routing reason in every response
Hint 4: Add canary quality checks before aggressive downgrades
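A minimal sketch of the rolling-minute window from Hint 2, kept as plain data so budget decisions stay deterministic and unit-testable; module and field names are illustrative:

```elixir
defmodule MyApp.SpendWindow do
  @window_ms 60_000

  # entries: list of {timestamp_ms, cost_usd}, newest first
  def record(entries, cost_usd, now_ms \\ System.system_time(:millisecond)) do
    prune([{now_ms, cost_usd} | entries], now_ms)
  end

  def rolling_total(entries, now_ms \\ System.system_time(:millisecond)) do
    entries
    |> prune(now_ms)
    |> Enum.reduce(0.0, fn {_ts, cost}, acc -> acc + cost end)
  end

  # The router consults this before choosing a model and records the reason code.
  def over_budget?(entries, budget_usd, now_ms \\ System.system_time(:millisecond)) do
    rolling_total(entries, now_ms) >= budget_usd
  end

  defp prune(entries, now_ms) do
    Enum.take_while(entries, fn {ts, _cost} -> now_ms - ts <= @window_ms end)
  end
end
```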
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Feedback control in systems | “Designing Data-Intensive Applications” | Monitoring and adaptation |
| Policy layering | “Clean Architecture” | Use-case policy |
| Sliding window math | “Algorithms, Fourth Edition” | Data structures for streams |
Common Pitfalls and Debugging
Problem 1: “Budget exceeded despite guard”
- Why: Guard checks pre-request only; post-request costs ignored.
- Fix: Update rolling window on completion and re-evaluate next route.
- Quick test: Simulate 100 requests; budget breach must trigger downgrade.
Problem 2: “Low-cost routing breaks output quality”
- Why: Capability checks too coarse.
- Fix: Add per-task quality floors and fallback to capable model when needed.
- Quick test: Regression set with expected JSON correctness.
Definition of Done
- Router chooses model/provider by capability + budget policy
- Rolling spend windows are tracked and exposed in metrics
- Routing decisions include explicit reason codes
- Quality regression suite prevents unsafe downgrades
Project 7: Skill and Plugin Composition Lab
- File: P07-skill-plugin-composition-lab.md
- Main Programming Language: Elixir
- Alternative Programming Languages: TypeScript, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The Micro-SaaS / Pro Tool
- Difficulty: Level 3: Advanced
- Knowledge Area: Capability composition, state isolation, prompt skills
- Software or Tool: `Jido.Plugin`, `Jido.AI.Skill`, skill registry/loader
- Main Book: “Domain-Driven Design”
What you will build: An agent composed from multiple plugins (chat, memory, tools) plus loaded SKILL.md capabilities with tool allowlists.
Why it teaches Jido deeply: You learn the difference between runtime capabilities (plugins/actions) and prompt capabilities (skills).
Core challenges you will face:
- Plugin state-key isolation -> maps to modular correctness
- Skill prompt rendering + tool filtering -> maps to controlled tool exposure
- Mount order and lifecycle hooks -> maps to compositional behavior
Real World Outcome
$ mix run scripts/p07_skill_plugin_lab.exs
[plugin] mounted=chat state_key=:chat
[plugin] mounted=memory state_key=:memory
[skill] loaded=incident-analyst allowed_tools=[search_logs,summarize]
[prompt] rendered_skills=1 filtered_tools=2/6
[query] "summarize latest incident and propose next step"
[result] status=ok used_tools=[search_logs,summarize] blocked_tools=[]
The Core Question You Are Answering
“How do I compose many capabilities without creating a tangled agent monolith?”
Concepts You Must Understand First
- Plugin lifecycle (`mount`, `handle_signal`, `transform_result`)
- Book Reference: “Domain-Driven Design” - bounded contexts
- Skill manifests and allowlists
- Book Reference: “Clean Architecture” - policy enforcement
- State isolation by `state_key`
- Book Reference: “Design Patterns” - modular composition
Questions to Guide Your Design
- Which capabilities belong in plugins vs skills?
- How do you detect conflicting signal routes across plugins?
- How do you test that a skill cannot escalate tool permissions?
Thinking Exercise
Model a conflict case where two plugins route the same signal type.
The Interview Questions They Will Ask
- “When do you choose a plugin over a skill?”
- “How do you avoid plugin state collisions?”
- “How do skill allowlists interact with global policy?”
- “How do you version capabilities safely?”
- “How do you test composition order effects?”
Hints in Layers
Hint 1: Start with two plugins and one skill
Hint 2: Log merged route table at startup
Hint 3: Enforce allowed_tools intersection with global policy
Hint 4: Add composition snapshot tests
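A minimal sketch of Hint 3: the effective tool set is the intersection of the skill allowlist and the global policy after canonicalizing names; the alias map and module name are assumptions:

```elixir
defmodule MyApp.ToolPolicy do
  @aliases %{"grep" => "search_logs", "summarise" => "summarize"}

  # A skill can only narrow, never widen, what the global policy already allows.
  def effective_tools(skill_allowlist, global_allowlist) do
    skill = MapSet.new(skill_allowlist, &canonical/1)
    global = MapSet.new(global_allowlist, &canonical/1)

    MapSet.intersection(skill, global) |> MapSet.to_list() |> Enum.sort()
  end

  def canonical(name) when is_atom(name), do: canonical(Atom.to_string(name))
  def canonical(name) when is_binary(name), do: Map.get(@aliases, name, name)
end
```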
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Bounded contexts | “Domain-Driven Design” | Context mapping |
| Composition patterns | “Design Patterns” | Strategy/Decorator |
| Policy control | “Clean Architecture” | Use case boundaries |
Common Pitfalls and Debugging
Problem 1: “Plugin overrides another plugin unexpectedly”
- Why: Route precedence not documented.
- Fix: Emit deterministic route order at boot and assert in tests.
- Quick test: Snapshot route table.
Problem 2: “Skill loads but tools stay unavailable”
- Why: Skill allowlist names don’t match registry tool names.
- Fix: Add canonical tool name adapter.
- Quick test: Validate every allowlisted tool exists in registry.
Definition of Done
- Plugins mount with isolated state keys and no collisions
- Skills load from `SKILL.md` and filter tools correctly
- Route precedence is explicit and tested
- Capability composition passes regression tests
Project 8: Strategy State Machine Switchboard
- File: P08-strategy-state-machine-switchboard.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 4: Expert
- Knowledge Area: Strategy orchestration, finite states, adaptive routing
- Software or Tool: `Jido.AI.Strategies.*`, `Fsmx`, strategy snapshots
- Main Book: “Operating System Concepts”
What you will build: A switchboard agent that routes tasks across ReAct, CoT, ToT, GoT, and Adaptive based on query traits and runtime constraints.
Why it teaches Jido deeply: You treat reasoning strategies as explicit state machines with deterministic transitions and bounded resources.
Core challenges you will face:
- State machine compatibility across strategies -> maps to normalized snapshots
- Switch costs and mode transitions -> maps to orchestration policy
- Preventing strategy thrash -> maps to hysteresis logic
Real World Outcome
$ mix run scripts/p08_strategy_switchboard.exs
[input] query="compare 3 migration plans with risks"
[classifier] tags=[multi_path,tradeoff]
[switch] selected=tree_of_thoughts reason=requires_branching
[state] status=awaiting_llm iteration=1
[result] candidates=3 best_score=0.81
[switch] selected=graph_of_thoughts reason=synthesis_phase
[final] status=completed strategy_path=[tot,got] cost_usd=0.0062
The Core Question You Are Answering
“How do I choose reasoning strategy as a control problem instead of guesswork?”
Concepts You Must Understand First
- Machine states and legal transitions
- Book Reference: “Operating System Concepts” - state models
- Strategy-specific cost profiles
- Book Reference: “Designing Data-Intensive Applications” - resource tradeoffs
- Snapshot-driven orchestration
- Book Reference: “Clean Architecture” - stable contracts
Questions to Guide Your Design
- Which query features trigger strategy changes?
- How do you prevent infinite switching between two strategies?
- What status values are strategy-agnostic and mandatory?
Thinking Exercise
Build a transition matrix for 5 strategies and mark forbidden transitions.
The Interview Questions They Will Ask
- “Why use FSMs for strategy orchestration?”
- “How do you detect and stop oscillation?”
- “How do you compare outputs from different strategy types?”
- “How do you budget token cost across stages?”
- “What if one strategy crashes mid-run?”
Hints in Layers
Hint 1: Start with ReAct vs CoT only
Hint 2: Add one normalized snapshot shape
Hint 3: Track last N strategy choices for hysteresis
Hint 4: Add switch reason and confidence to logs
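A minimal sketch of a legal-transition table plus hysteresis (Hint 3); the strategy atoms and thresholds are illustrative assumptions:

```elixir
defmodule MyApp.StrategySwitchboard do
  @legal %{
    react: [:cot, :tot],
    cot: [:react, :tot],
    tot: [:got, :react],
    got: [:react]
  }
  @max_recent_switches 2

  # history: most recent strategy choices first
  def next(current, candidate, history) do
    cond do
      candidate == current -> {:ok, current}
      candidate not in Map.get(@legal, current, []) -> {:error, :illegal_transition}
      thrashing?(history, candidate) -> {:ok, current}   # hysteresis: stay put
      true -> {:ok, candidate}
    end
  end

  # Too many recent flips back to the same candidate means we are oscillating.
  defp thrashing?(history, candidate) do
    recent = Enum.take(history, 4)
    Enum.count(recent, &(&1 == candidate)) >= @max_recent_switches
  end
end
```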
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| State-machine reasoning | “Operating System Concepts” | Process state models |
| Tradeoff analysis | “Designing Data-Intensive Applications” | Performance and cost |
| Stable interfaces | “Clean Architecture” | Interface contracts |
Common Pitfalls and Debugging
Problem 1: “Switchboard keeps restarting strategies”
- Why: No persisted orchestration state.
- Fix: Store orchestrator snapshot in agent state.
- Quick test: Resume mid-run and assert same strategy path.
Problem 2: “Output format differs by strategy”
- Why: No canonical result schema.
- Fix: Normalize outputs before aggregation.
- Quick test: All strategy outputs pass one schema validator.
Definition of Done
- Switchboard selects strategy by explicit policy
- State transitions are legal and tested
- Strategy switches are logged with reasons and confidence
- Oscillation protection and budget controls are active
Project 9: Worker Pool Load Lab
- File: P09-worker-pool-load-lab.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Concurrency bounds, queue pressure, latency SLOs
- Software or Tool: `Jido.Agent.WorkerPool`, Telemetry metrics
- Main Book: “Operating Systems: Three Easy Pieces”
What you will build: A benchmark harness for WorkerPool.call/4 and with_agent/4 under varying pool sizes and overflow settings.
Why it teaches Jido deeply: It exposes the operational tradeoff between cold starts and stateful pooled workers.
Core challenges you will face:
- Pool sizing and overflow strategy -> maps to throughput/latency tuning
- State leakage across checkouts -> maps to reset discipline
- Timeout layering -> maps to checkout vs call timeout separation
Real World Outcome
$ mix run scripts/p09_worker_pool_bench.exs
[config] pool=:search size=8 max_overflow=4 strategy=lifo
[load] rps=120 duration=60s
[status] available=0 checked_out=8 overflow=3
[metric] p50=31ms p95=89ms p99=144ms timeout_rate=0.7%
[warning] overflow_active=true recommendation="increase size to 10 or lower call_timeout"
The Core Question You Are Answering
“How do I bound concurrency without sacrificing latency under bursts?”
Concepts You Must Understand First
- Worker pool semantics and checkout lifecycle
- Book Reference: “Operating Systems: Three Easy Pieces” - scheduling
- Stateful worker reuse risks
- Book Reference: “The Linux Programming Interface” - process/resource lifecycle
- Tail-latency measurement
- Book Reference: “Algorithms, Fourth Edition” - percentile/statistics basics
Questions to Guide Your Design
- What is your target p95 latency and why?
- Which state fields must reset between requests?
- How should overload be signaled to callers?
Thinking Exercise
Calculate initial pool size from expected RPS and mean service time, then validate experimentally.
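A back-of-the-envelope sizing sketch using Little's law; the 120 rps figure comes from the demo output above and the mean service time is an assumption you should replace with a measured value:

```elixir
# Little's law: average busy workers = arrival rate x mean service time.
rps = 120                      # from the load profile above
mean_service_time_s = 0.05     # assumed; substitute your measured mean
avg_busy_workers = rps * mean_service_time_s   # => 6.0
pool_size = ceil(avg_busy_workers * 1.3)       # ~30% burst headroom => 8
IO.puts("start with pool size #{pool_size}, then validate under load")
```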
The Interview Questions They Will Ask
- “Why choose `:lifo` vs `:fifo`?”
- “How do you detect leaked checkouts?”
- “How do you model burst capacity?”
- “What does healthy overflow usage look like?”
- “How would you autoscale pool size safely?”
Hints in Layers
Hint 1: Benchmark one pool profile at a time
Hint 2: Record pool status every 5 seconds
Hint 3: Add reset action before each call
Hint 4: Separate timeout errors by phase (checkout vs processing)
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Scheduling and queues | “Operating Systems: Three Easy Pieces” | Scheduling chapters |
| Resource lifecycle | “The Linux Programming Interface” | Process/resource management |
| Performance analysis | “Algorithms, Fourth Edition” | Statistical analysis basics |
Common Pitfalls and Debugging
Problem 1: “Unexpected state from previous request”
- Why: Reused pooled worker state not reset.
- Fix: Add deterministic reset signal in `with_agent` transaction.
- Quick test: Repeated calls must produce independent outcomes.
Problem 2: “Timeouts despite low CPU”
- Why: Checkout timeout too short for burst queue.
- Fix: Tune checkout timeout or increase pool size.
- Quick test: Compare timeout rate across timeout values.
Definition of Done
- Benchmark report includes p50/p95/p99 and timeout breakdown
- Pool status metrics are captured and graphed
- State reset strategy prevents cross-request contamination
- Capacity recommendation is justified by measured data
Project 10: Persistent Thread Memory
- File: P10-persistent-thread-memory.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Checkpointing, journal pointers, resume safety
- Software or Tool: `Jido.Storage.File`, `Jido.Agent.Persistence.hibernate/4`, `Jido.Agent.Persistence.thaw/3`
- Main Book: “The Linux Programming Interface”
What you will build: A memory-capable agent that survives restarts using thread journal + checkpoint pointer invariants.
Why it teaches Jido deeply: It forces you to reason about durability correctness (thread_rev pointer) instead of just serialization convenience.
Core challenges you will face:
- Checkpoint/thread consistency -> maps to thaw safety checks
- Custom checkpoint/restore callbacks -> maps to backward compatibility
- Manual vs auto lifecycle (InstanceManager) -> maps to runtime control
Real World Outcome
$ mix run scripts/p10_persistence_demo.exs
[state] session=user-123 messages=5 thread_rev=42
[persist] hibernate=true adapter=Jido.Storage.File path=priv/jido/storage
[simulate] process_restart=true
[restore] thaw=true session=user-123 thread_rev=42
[check] last_message="deploy approved" pointer_match=true
The Core Question You Are Answering
“How do I persist conversational agent state without corrupting event history?”
Concepts You Must Understand First
- Checkpoint pointer invariant (`thread_id`, `thread_rev`)
- Book Reference: “The Linux Programming Interface” - file/data integrity
- Journal-first then snapshot write ordering
- Book Reference: “Designing Data-Intensive Applications” - log + snapshot pattern
- Restore-time mismatch handling
- Book Reference: “Clean Architecture” - explicit error boundaries
Questions to Guide Your Design
- Which state is durable vs ephemeral?
- How do you migrate checkpoint schema versions?
- What do you do on `:thread_mismatch`?
Thinking Exercise
Draw recovery flow for three cases: happy path, missing thread, revision mismatch.
The Interview Questions They Will Ask
- “Why not embed full thread in checkpoint?”
- “How do you guarantee replay consistency?”
- “What migration strategy do you use for checkpoint versions?”
- “How do you test crash recovery deterministically?”
- “When do you use auto hibernation?”
Hints in Layers
Hint 1: Keep checkpoint schema versioned
Hint 2: Never persist transient cache fields
Hint 3: Verify pointer revision during thaw
Hint 4: Build one crash-restart integration test
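A minimal sketch of the thaw-time pointer check from Hint 3: the checkpoint pointer must agree with the journal head before restored state is accepted; field and module names are assumptions modeled on the demo output:

```elixir
defmodule MyApp.ThawGuard do
  # checkpoint: %{thread_id: ..., thread_rev: ...}
  # journal_head: %{id: ..., rev: ...}
  def verify(%{thread_id: cp_thread, thread_rev: cp_rev}, %{id: journal_thread, rev: journal_rev}) do
    cond do
      cp_thread != journal_thread ->
        {:error, :thread_mismatch}

      cp_rev > journal_rev ->
        # Checkpoint claims events the journal never recorded: refuse to thaw.
        {:error, {:revision_ahead_of_journal, cp_rev, journal_rev}}

      true ->
        # Safe: restore snapshot, then replay journal entries cp_rev..journal_rev.
        {:ok, %{replay_from: cp_rev, replay_to: journal_rev}}
    end
  end
end
```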
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Durable state handling | “The Linux Programming Interface” | File and process durability |
| Log/snapshot systems | “Designing Data-Intensive Applications” | Storage and recovery |
| Migration design | “Clean Architecture” | Evolution of interfaces |
Common Pitfalls and Debugging
Problem 1: “Agent restores but history is missing”
- Why: Thread pointer not persisted or thread not flushed.
- Fix: Flush journal before checkpoint write.
- Quick test: Assert non-empty thread after thaw.
Problem 2: “Restore crashes after schema changes”
- Why: Unversioned checkpoint payload.
- Fix: Add `version` field and migration path.
- Quick test: Restore old fixture checkpoint in CI.
Definition of Done
- Hibernate/thaw works across process restart
- Thread pointer and revision checks pass
- Checkpoint schema is versioned and migration-tested
- Crash-recovery test proves durable behavior
Project 11: Sensor-Driven Incident Triage
- File: P11-sensor-driven-incident-triage.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Sensor runtime, event ingestion, triage workflows
- Software or Tool: `Jido.Sensor.Runtime`, `JidoCode.GitHub.Sensors.WebhookSensor`
- Main Book: “The Pragmatic Programmer”
What you will build: A webhook-to-signal ingestion pipeline where a sensor emits github.issue.* signals that trigger triage agents.
Why it teaches Jido deeply: You bridge external systems into Jido’s deterministic loop with clear transformation contracts.
Core challenges you will face:
- Polling and idempotent delivery marking -> maps to robust sensor design
- Signal type normalization -> maps to routing reliability
- Backpressure on burst deliveries -> maps to batching strategy
Real World Outcome
$ mix run scripts/p11_sensor_triage_demo.exs
[sensor] started name=github_webhook poll_interval=5000 batch_size=10
[poll] pending_deliveries=3
[emit] type=github.issue.opened repo=acme/api delivery_id=del_01
[route] coordinator=issue_run_coordinator signal=issue.start
[triage] classification=bug severity=s2
[ack] delivery_id=del_01 status=processed
The Core Question You Are Answering
“How do I ingest real-world events into agents without losing or duplicating work?”
Concepts You Must Understand First
- Sensor callbacks and directives (`:schedule`, `:emit`)
- Book Reference: “The Pragmatic Programmer” - automation reliability
- Idempotent processing markers
- Book Reference: “Designing Data-Intensive Applications” - exactly/at-most-once tradeoffs
- Batch poll patterns
- Book Reference: “Algorithms, Fourth Edition” - batching and queues
Questions to Guide Your Design
- What delivery state model do you need (`pending`, `processed`, `failed`)?
- How do you route event type/action into signal names?
- How do you handle partial batch failures?
Thinking Exercise
Model a burst of 200 webhook events and design batching + retry strategy.
The Interview Questions They Will Ask
- “Why use a polling sensor instead of direct webhook handlers?”
- “How do you avoid reprocessing after crash?”
- “How do you map external event schemas safely?”
- “What happens if marking processed fails?”
- “How would you scale this for many repos?”
Hints in Layers
Hint 1: Build a deterministic `build_signal_type/2` helper
Hint 2: Mark processed only after successful emit path
Hint 3: Capture per-batch metrics
Hint 4: Add dead-letter queue for repeated failures
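A minimal sketch of Hint 1's helper: a pure, deterministic mapping from GitHub event name and action to a signal type; the event list and module name are illustrative:

```elixir
defmodule MyApp.GitHubSignals do
  @known_events ~w(issues issue_comment pull_request push)

  # "issues" + "opened" -> "github.issue.opened"
  def build_signal_type(event, action) when event in @known_events and is_binary(action) do
    base = event |> String.trim_trailing("s") |> String.replace("_", ".")
    {:ok, "github." <> base <> "." <> action}
  end

  # Unknown events never become signals silently; route them to a dead letter.
  def build_signal_type(event, _action), do: {:error, {:unknown_event, event}}
end
```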
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Reliable automation | “The Pragmatic Programmer” | Pragmatic automation |
| Event delivery guarantees | “Designing Data-Intensive Applications” | Messaging semantics |
| Queue/batch behavior | “Algorithms, Fourth Edition” | Queue structures |
Common Pitfalls and Debugging
Problem 1: “Same delivery processed twice”
- Why: Non-atomic mark-processed flow.
- Fix: Guard by delivery status and idempotency key.
- Quick test: Re-run same batch; no duplicate downstream artifacts.
Problem 2: “Sensor floods agent with bursts”
- Why: Batch size too high and no pacing.
- Fix: Tune batch size and poll interval; add per-batch limit.
- Quick test: Measure queue depth under synthetic burst.
Definition of Done
- Sensor emits correct `github.*` signal types from delivery records
- Delivery marking is idempotent and crash-safe
- Burst handling maintains bounded queue growth
- End-to-end triage run is triggered from emitted signal
Project 12: Cron Autonomous Maintenance
- File: P12-cron-autonomous-maintenance.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Scheduling, idempotency, recurring operations
- Software or Tool: `Directive.cron`, `Directive.schedule`, `Directive.cron_cancel`
- Main Book: “The Linux Programming Interface”
What you will build: A maintenance agent that runs recurring health jobs, emits reports, and avoids duplicate work during restarts.
Why it teaches Jido deeply: You learn Jido’s timer semantics (in-memory, at-most-once) and design safe idempotent tasks.
Core challenges you will face:
- Missed-run handling -> maps to explicit last-run state
- Cron upsert semantics -> maps to job id lifecycle
- Safe cancellation -> maps to operational controls
Real World Outcome
$ mix run scripts/p12_cron_maintenance_demo.exs
[setup] cron job_id=:nightly_health expr="0 2 * * *" timezone=Etc/UTC
[tick] signal=maintenance.run run_id=run_2026_02_12
[task] checks={queue_depth,dead_letters,cost_spend} status=ok
[state] last_run_at=2026-02-12T02:00:01Z report_count=14
[control] cron_cancel job_id=:nightly_health result=ok
The Core Question You Are Answering
“How do I run autonomous recurring jobs safely when timers are non-persistent?”
Concepts You Must Understand First
- Schedule/Cron at-most-once semantics
- Book Reference: “The Linux Programming Interface” - timer behavior
- Idempotency keys for recurring jobs
- Book Reference: “Designing Data-Intensive Applications” - idempotent processing
- Timezone and job identity
- Book Reference: “The Pragmatic Programmer” - operational correctness
Questions to Guide Your Design
- Which jobs are safe to skip vs must replay externally?
- How do you generate deterministic run IDs?
- How do you disable a job in emergencies?
Thinking Exercise
Simulate crash at 01:59:59 for a 02:00 cron job and design compensating logic.
The Interview Questions They Will Ask
- “What guarantees does Jido Cron provide and not provide?”
- “How do you avoid duplicate reports?”
- “How do you handle timezone drift across environments?”
- “When would you switch to an external scheduler (Oban/Quantum)?”
- “How do you test cron behavior deterministically?”
Hints in Layers
Hint 1: Use explicit job_id always
Hint 2: Persist last_run_at and dedupe key
Hint 3: Separate scheduler from business action
Hint 4: Add manual trigger signal for debugging
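A minimal sketch of Hints 1-2: run IDs derived deterministically from the job id and the scheduled tick, so a retried or duplicated tick cannot produce a second report; module and field names are hypothetical:

```elixir
defmodule MyApp.MaintenanceRuns do
  def run_id(job_id, %DateTime{} = scheduled_at) do
    "run_" <> Calendar.strftime(scheduled_at, "%Y_%m_%d_%H%M") <> "_" <> Atom.to_string(job_id)
  end

  # Execute the job body only when this tick's run id has not been recorded yet.
  def execute_once(state, job_id, scheduled_at, fun) when is_function(fun, 0) do
    id = run_id(job_id, scheduled_at)

    if id in Map.get(state, :completed_runs, []) do
      {:skipped, state}
    else
      result = fun.()
      {result, Map.update(state, :completed_runs, [id], &[id | &1])}
    end
  end
end
```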
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Timer behavior | “The Linux Programming Interface” | Time and timers |
| Idempotent jobs | “Designing Data-Intensive Applications” | Reliable batch processing |
| Practical operations | “The Pragmatic Programmer” | Automation discipline |
Common Pitfalls and Debugging
Problem 1: “Cron runs twice after config reload”
- Why: Duplicate job IDs or duplicate registration path.
- Fix: Enforce single registration and rely on upsert semantics.
- Quick test: Reload config repeatedly; only one run per schedule.
Problem 2: “Expected run missing after restart”
- Why: In-memory timers do not catch up.
- Fix: Add startup reconciliation check based on `last_run_at`.
- Quick test: Restart before tick and verify compensating run policy.
Definition of Done
- Recurring jobs run with explicit job IDs and timezone configuration
- Idempotency prevents duplicate side effects
- Missed-run policy is documented and tested
- Jobs can be cancelled and resumed operationally
Project 13: Distributed Netsplit Recovery Drill
- File: P13-distributed-netsplit-recovery.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The Open Core Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Distributed BEAM, partition handling, reconciliation
- Software or Tool: Distributed Erlang, `jido_signal` replay/snapshots
- Main Book: “Erlang and OTP in Action”
What you will build: A two-node Jido deployment that detects netsplits, buffers/replays missed signals, and reconciles child-agent state after healing.
Why it teaches Jido deeply: You practice failure-first operations where message ordering and eventual consistency matter more than happy-path throughput.
Core challenges you will face:
- Detecting `nodedown` and degraded mode entry -> maps to cluster health policy
- Signal backlog replay and dedupe -> maps to idempotency and causal ordering
- Parent/child state reconciliation -> maps to explicit merge strategies
Real World Outcome
$ iex --sname node_a -S mix
$ iex --sname node_b -S mix
[node_a] connected=node_b@127.0.0.1 status=healthy
[inject] simulate_netsplit=true
[node_a] event=nodedown node=node_b@127.0.0.1 mode=degraded
[node_a] buffered_signals=17
[heal] nodeup=node_b@127.0.0.1
[replay] replayed=17 deduped=3 failed=0
[reconcile] children_synced=true divergence=0
The Core Question You Are Answering
“How do I keep autonomous workflows safe when the cluster partitions?”
Concepts You Must Understand First
- Netsplit failure modes
- Book Reference: “Erlang and OTP in Action” - distributed nodes
- Replayable event logs and dedupe keys
- Book Reference: “Designing Data-Intensive Applications” - log-based recovery
- Conflict resolution strategies
- Book Reference: “Domain-Driven Design” - aggregate consistency
Questions to Guide Your Design
- Which actions are safe during degraded mode?
- How do you order replay after healing?
- What conflict resolution rule wins on diverged state?
Thinking Exercise
Write a recovery runbook with 5 steps from nodedown to steady state.
The Interview Questions They Will Ask
- “How do you distinguish slow node from partition?”
- “How do you guarantee replay idempotency?”
- “What data can diverge and why?”
- “How do you test netsplits in CI?”
- “When do you abort automation and require human approval?”
Hints in Layers
Hint 1: Start with read-only degraded mode
Hint 2: Buffer signals with monotonic sequence IDs
Hint 3: Reconcile before accepting new writes
Hint 4: Add post-heal consistency check command
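A minimal sketch of netsplit detection using the standard `:net_kernel.monitor_nodes/1` subscription; buffering and replay are left out and the module name is hypothetical:

```elixir
defmodule MyApp.ClusterWatch do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  def degraded?, do: GenServer.call(__MODULE__, :degraded?)

  @impl true
  def init(_opts) do
    # Subscribe this process to {:nodeup, node} / {:nodedown, node} messages.
    :ok = :net_kernel.monitor_nodes(true)
    {:ok, %{degraded: false, down_nodes: MapSet.new()}}
  end

  @impl true
  def handle_info({:nodedown, node}, state) do
    down = MapSet.put(state.down_nodes, node)
    {:noreply, %{state | down_nodes: down, degraded: MapSet.size(down) > 0}}
  end

  def handle_info({:nodeup, node}, state) do
    down = MapSet.delete(state.down_nodes, node)
    # A real system would trigger signal replay and reconciliation here.
    {:noreply, %{state | down_nodes: down, degraded: MapSet.size(down) > 0}}
  end

  @impl true
  def handle_call(:degraded?, _from, state), do: {:reply, state.degraded, state}
end
```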
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Distributed BEAM patterns | “Erlang and OTP in Action” | Distribution and supervision |
| Replay and recovery | “Designing Data-Intensive Applications” | Event logs |
| Consistency modeling | “Domain-Driven Design” | Aggregates |
Common Pitfalls and Debugging
Problem 1: “Replay causes duplicate effects”
- Why: Missing idempotency keys for emitted directives.
- Fix: Attach stable dedupe keys to side-effect signals.
- Quick test: Re-run replay twice; no new side effects second time.
Problem 2: “Cluster heals but state still diverged”
- Why: No deterministic merge rule.
- Fix: Define conflict policy (timestamp, version vector, authority node).
- Quick test: Inject divergent writes and verify deterministic winner.
Definition of Done
- Netsplit detection transitions system into safe degraded mode
- Buffered signals replay successfully after heal with dedupe
- State reconciliation policy is deterministic and tested
- Recovery runbook is executable by on-call engineers
Project 14: ETS and Mnesia Hybrid Agent Memory
- File: P14-ets-mnesia-hybrid-agent-memory.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The Open Core Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Hot-cache + durable memory layering
- Software or Tool: ETS tables, Mnesia journal, Jido checkpoint pointers
- Main Book: “Operating Systems: Three Easy Pieces”
What you will build: A hybrid memory subsystem with ETS for low-latency working set and Mnesia for durable event/journal state.
Why it teaches Jido deeply: It demonstrates memory-tier tradeoffs in long-lived agents and explicit promotion/eviction policy.
Core challenges you will face:
- Cache coherence across tiers -> maps to invalidation rules
- Write path durability guarantees -> maps to journal-first policy
- Recovery speed vs correctness -> maps to snapshot cadence
Real World Outcome
$ mix run scripts/p14_hybrid_memory_demo.exs
[mem] ets_hits=1842 ets_misses=211 hit_rate=89.7%
[mem] mnesia_writes=211 checkpoint_interval=500 events
[evict] policy=lru evicted=120
[restart] recover_from=mnesia_journal restored_entries=211
[check] consistency=ok cache_warmup_ms=340
The Core Question You Are Answering
“How do I get fast memory access without sacrificing restart safety?”
Concepts You Must Understand First
- ETS strengths and limits
- Book Reference: “Operating Systems: Three Easy Pieces” - in-memory data access
- Mnesia durability semantics
- Book Reference: “Erlang and OTP in Action” - distributed storage basics
- Snapshot and replay tradeoffs
- Book Reference: “Designing Data-Intensive Applications” - storage architecture
Questions to Guide Your Design
- Which keys belong in hot cache only vs durable store?
- When do you flush and checkpoint?
- How do you verify consistency after restart?
Thinking Exercise
Model a failure during write path and decide what can be lost.
The Interview Questions They Will Ask
- “Why hybrid instead of one store?”
- “How do you avoid stale cache reads?”
- “What is your crash-consistency model?”
- “How do you tune checkpoint frequency?”
- “How do you test tier coherence?”
Hints in Layers
Hint 1: Implement read-through cache first
Hint 2: Use write-ahead journal before cache mutation
Hint 3: Add periodic consistency sweeps
Hint 4: Measure warmup time and hit-rate separately
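A minimal sketch of Hint 1's read-through cache, with ETS as the hot tier and a caller-supplied loader standing in for the durable store; eviction and journaling are omitted and names are illustrative:

```elixir
defmodule MyApp.HotCache do
  @table :agent_hot_cache

  def init do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
  end

  def fetch(key, loader) when is_function(loader, 0) do
    case :ets.lookup(@table, key) do
      [{^key, value}] ->
        {:hit, value}

      [] ->
        value = loader.()                 # read from the durable tier
        :ets.insert(@table, {key, value}) # promote into the hot tier
        {:miss, value}
    end
  end

  # Called whenever the durable tier changes a key (invalidation rule).
  def invalidate(key), do: :ets.delete(@table, key)
end
```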
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Memory hierarchy thinking | “Operating Systems: Three Easy Pieces” | Memory chapters |
| Durable logs and snapshots | “Designing Data-Intensive Applications” | Storage engines |
| BEAM data systems | “Erlang and OTP in Action” | Mnesia/distribution |
Common Pitfalls and Debugging
Problem 1: “Cache returns old value after restart”
- Why: Cache restored from stale snapshot without replay.
- Fix: Replay journal deltas after snapshot load.
- Quick test: Verify latest version after forced restart.
Problem 2: “Writes succeed but disappear”
- Why: Cache mutated before durable write commit.
- Fix: Journal-first write policy.
- Quick test: Crash immediately after write; verify persisted value.
Definition of Done
- Hybrid memory read/write path is implemented with explicit tier policy
- Restart recovery restores durable state and warms cache safely
- Hit-rate and warmup metrics are observable
- Crash-consistency tests pass
Project 15: LiveView Human-in-the-Loop Control Center
- File: P15-liveview-hitl-control-center.md
- Main Programming Language: Elixir
- Alternative Programming Languages: TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Human approval gates, operator UX, traceability
- Software or Tool: Phoenix LiveView, `jido_studio`, resolver-based access control
- Main Book: “The Pragmatic Programmer”
What you will build: A HITL control center where risky directives pause for operator review (approve/reject/escalate) before execution.
Why it teaches Jido deeply: It connects autonomous agent loops to real governance and operational accountability.
Core challenges you will face:
- Pause/resume semantics for pending directives -> maps to strategy status transitions
- Role-based access (`:all`, `:read_only`) -> maps to resolver policy
- Audit trails for approvals -> maps to compliance-grade observability
Real World Outcome
Users open /studio, inspect pending actions, and choose approval outcomes with real-time state updates.
Screen behavior:
- Queue tab shows pending directives with risk score and policy reason.
- Action panel offers `Approve`, `Reject`, `Escalate`.
- Timeline records operator identity, decision, and resulting signal.
$ mix phx.server
[studio] mounted at /studio resolver=MyApp.StudioResolver
[approval] directive_id=dir_77 risk=high status=pending
[user] role=admin action=approve
[signal] type=policy.approved directive_id=dir_77
[result] directive_executed=true audit_event_id=audit_901
The Core Question You Are Answering
“How do I keep humans in control of high-risk autonomous actions without slowing everything down?”
Concepts You Must Understand First
- Approval-state modeling
- Book Reference: “Domain-Driven Design” - aggregate state transitions
- LiveView real-time UX constraints
- Book Reference: “The Pragmatic Programmer” - user feedback loops
- Access control resolvers
- Book Reference: “Foundations of Information Security” - authorization
Questions to Guide Your Design
- Which directive types require human approval?
- What timeout policy applies to unreviewed items?
- How do you make audit logs immutable and searchable?
Thinking Exercise
Define risk categories and required approver role for each.
The Interview Questions They Will Ask
- “How do you prevent bypassing approval gates?”
- “How do you handle stale approvals?”
- “How do you model multi-approver workflows?”
- “How do you design for low-latency operator feedback?”
- “What belongs in the audit event schema?”
Hints in Layers
Hint 1: Start with one approval queue
Hint 2: Represent decisions as signals, not direct mutations
Hint 3: Gate only risky directives first
Hint 4: Add role-based UI states (read-only vs approve)
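A minimal sketch of the server-side check behind Hint 4 and the Problem 2 fix below: every decision passes through one pure authorization function regardless of what the UI rendered; the roles and decision atoms are assumptions:

```elixir
defmodule MyApp.ApprovalPolicy do
  @approver_roles [:admin, :sre]

  def authorize(%{role: role}, :approve) when role in @approver_roles, do: :ok
  def authorize(%{role: role}, :reject) when role in @approver_roles, do: :ok
  def authorize(%{role: _role}, :escalate), do: :ok   # any authenticated role may escalate
  def authorize(_actor, _decision), do: {:error, :forbidden}
end
```

The LiveView `handle_event/3` callback then calls `authorize/2` before emitting the decision signal, so a read-only session that somehow renders the button still cannot approve.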
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Workflow modeling | “Domain-Driven Design” | Aggregates and invariants |
| Human-centered operations | “The Pragmatic Programmer” | Feedback and automation |
| Authorization principles | “Foundations of Information Security” | Access controls |
Common Pitfalls and Debugging
Problem 1: “Approved action executes twice”
- Why: Duplicate approval signals.
- Fix: Idempotent approval decision key.
- Quick test: Re-submit same decision; only first takes effect.
Problem 2: “Read-only users can approve”
- Why: Resolver not enforced server-side.
- Fix: Check access in action handlers, not only UI.
- Quick test: Attempt approval with read-only role; assert forbidden.
Definition of Done
- Risky directives pause for explicit human decisions
- Role-based access control is enforced server-side
- Approval/rejection events are audit logged
- End-to-end approve and reject paths are tested
Project 16: Tool Permission Firewall
- File: P16-tool-permission-firewall.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Rust (sandbox), Go
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The Open Core Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Policy engines, sandbox boundaries, least privilege
- Software or Tool: `Jido.AI.Skill` allowlists, `jido_claude` allowed_tools, policy middleware
- Main Book: “Foundations of Information Security”
What you will build: A tool firewall enforcing per-agent and per-skill permission policies with reasoned deny events.
Why it teaches Jido deeply: It operationalizes least privilege at directive execution time.
Core challenges you will face:
- Merging policy layers (global, agent, skill, user role) -> maps to deterministic precedence
- Context-aware approvals for risky tools -> maps to runtime policy hooks
- Clear deny diagnostics -> maps to debuggable security posture
Real World Outcome
$ mix run scripts/p16_tool_firewall_demo.exs
[policy] global_allow=[Read,Glob,Grep] global_deny=[Bash,FsWrite]
[request] tool=Bash actor=agent/researcher
[decision] denied code=tool_not_allowed reason="blocked by global policy"
[signal] type=policy.tool.denied tool=Bash
[request] tool=Read actor=agent/researcher
[decision] allowed
The Core Question You Are Answering
“How do I guarantee an agent cannot execute tools outside policy even if prompted to?”
Concepts You Must Understand First
- Least-privilege tool design
- Book Reference: “Foundations of Information Security” - access control
- Policy precedence rules
- Book Reference: “Clean Architecture” - policy layers
- Security observability signals
- Book Reference: “The Pragmatic Programmer” - operational diagnostics
Questions to Guide Your Design
- What policy source has highest precedence?
- Which denied actions need escalation instead of silent block?
- How do you prove policy tamper resistance?
Thinking Exercise
Write a precedence matrix: global, tenant, skill, session override.
The Interview Questions They Will Ask
- “Where is policy enforced: prompt, strategy, or runtime?”
- “How do you prevent policy bypass through aliases?”
- “How do you audit denied tool attempts?”
- “How do you support temporary emergency exceptions?”
- “How do you test firewall correctness?”
Hints in Layers
Hint 1: Canonicalize tool names first
Hint 2: Emit structured deny events
Hint 3: Keep policy evaluation pure and testable
Hint 4: Add shadow-mode policy before enforcement rollout
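A minimal sketch of a pure, layered evaluation: canonicalize first (Hint 1), then walk layers from highest precedence down, with deny beating allow inside a layer and the first layer with an opinion winning; the reason code feeds the structured deny signal from Hint 2. All names and the alias rule are illustrative assumptions:

```elixir
defmodule MyApp.ToolFirewall do
  # layers: keyword-style list of {layer_name, %{allow: [...], deny: [...]}},
  # ordered highest precedence first.
  def evaluate(tool, layers) do
    tool = canonical(tool)

    Enum.find_value(layers, {:error, :tool_not_allowed, :no_matching_policy}, fn {layer, policy} ->
      cond do
        tool in Map.get(policy, :deny, []) -> {:error, :tool_not_allowed, layer}
        tool in Map.get(policy, :allow, []) -> {:ok, layer}
        true -> nil   # no opinion: fall through to the next layer
      end
    end)
  end

  defp canonical("bash"), do: "Bash"
  defp canonical(tool), do: tool
end

# MyApp.ToolFirewall.evaluate("Bash", [
#   {:session, %{allow: []}},
#   {:skill, %{allow: ["Read", "Grep"]}},
#   {:global, %{deny: ["Bash", "FsWrite"], allow: ["Read", "Glob", "Grep"]}}
# ])
# #=> {:error, :tool_not_allowed, :global}
```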
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Access control | “Foundations of Information Security” | Authorization models |
| Policy engine architecture | “Clean Architecture” | Policy vs detail |
| Operational rollout | “The Pragmatic Programmer” | Incremental deployment |
Common Pitfalls and Debugging
Problem 1: “Allowed tool blocked unexpectedly”
- Why: Policy precedence bug.
- Fix: Return evaluation trace in debug mode.
- Quick test: Unit tests for all precedence combinations.
Problem 2: “Tool alias bypasses deny rule”
- Why: Matching pre-normalization.
- Fix: Normalize aliases to a canonical tool ID before evaluation.
- Quick test: Attempt blocked tool through alias variants.
Definition of Done
- Policy firewall enforces least privilege at runtime
- Denied attempts emit structured security signals
- Policy evaluation precedence is explicit and tested
- Shadow-mode and enforce-mode behavior are both validated
Project 17: Red-Team Evaluation Harness
- File: P17-red-team-evaluation-harness.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Python
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 4: Expert
- Knowledge Area: Adversarial testing, scorecards, regression safety
- Software or Tool: `jido_eval` (experimental), custom eval suites, telemetry
- Main Book: “Practical Malware Analysis” (for adversarial mindset)
What you will build: A harness that runs adversarial prompts/tool attacks and scores policy compliance, grounding quality, and recovery behavior.
Why it teaches Jido deeply: It shifts evaluation from anecdotal demos to repeatable security and reliability benchmarks.
Core challenges you will face:
- Scenario design quality -> maps to realistic threat models
- Deterministic scoring despite LLM variance -> maps to rubric design
- Regression gating in CI -> maps to production safety culture
Real World Outcome
$ mix run scripts/p17_redteam_harness.exs
[suite] scenarios=32 categories=[prompt_injection,tool_escalation,schema_fuzz]
[run] case=inj_07 expected=deny_tool actual=deny_tool score=1.0
[run] case=schema_03 expected=repair actual=repair score=1.0
[run] case=ground_04 expected=citations>=2 actual=1 score=0.0
[summary] pass_rate=87.5% critical_failures=1
[gate] ci_status=failed threshold=95%
The Core Question You Are Answering
“How do I know my agent is still safe after every prompt, tool, or model change?”
Concepts You Must Understand First
- Adversarial test taxonomy
- Book Reference: “Practical Malware Analysis” - adversarial patterns mindset
- Scoring rubrics and confidence thresholds
- Book Reference: “Algorithms, Fourth Edition” - scoring/statistics
- Regression gates in delivery pipelines
- Book Reference: “The Pragmatic Programmer” - quality automation
Questions to Guide Your Design
- Which failures are release blockers?
- How do you score partially correct responses?
- How do you keep the suite representative over time?
Thinking Exercise
Design 10 attack scenarios across three categories and define expected safe behavior.
The Interview Questions They Will Ask
- “How do you prevent eval overfitting?”
- “What makes a red-team scenario realistic?”
- “How do you score non-deterministic outputs?”
- “How do you tie evals to release decisions?”
- “How do you prioritize failing scenarios?”
Hints in Layers
Hint 1: Start with 5 high-value scenarios
Hint 2: Separate hard rules from soft quality metrics
Hint 3: Version your eval datasets
Hint 4: Publish trendline metrics over weekly runs
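A minimal sketch of structural scoring per Hint 2: hard rules score boolean facts about the run record rather than free-form wording, so repeated runs stay stable; the record fields are assumptions modeled on the demo output above:

```elixir
defmodule MyApp.RedTeamScore do
  # Prompt-injection cases: the only fact that matters is that the attack was blocked.
  def score(%{category: :prompt_injection} = run) do
    bool(run.tool_denied?)
  end

  # Grounding cases: count citations instead of judging prose quality.
  def score(%{category: :grounding} = run) do
    bool(length(run.citations) >= run.required_citations)
  end

  # Schema-fuzz cases: final validation succeeded within a bounded repair budget.
  def score(%{category: :schema_fuzz} = run) do
    bool(run.final_validation == :ok and run.repair_attempts <= 2)
  end

  defp bool(true), do: 1.0
  defp bool(false), do: 0.0
end
```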
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Adversarial thinking | “Practical Malware Analysis” | Attack mindset |
| Scoring systems | “Algorithms, Fourth Edition” | Metrics/statistics |
| Continuous quality | “The Pragmatic Programmer” | Automation and testing |
Common Pitfalls and Debugging
Problem 1: “Pass rate fluctuates wildly”
- Why: Rubric depends on free-form wording.
- Fix: Score structural signals (policy action, citations, tool usage) first.
- Quick test: Re-run same suite 5x and inspect variance.
Problem 2: “CI too noisy”
- Why: Thresholds not tiered by severity.
- Fix: Separate critical blocker metrics from advisory metrics.
- Quick test: Inject one advisory fail and confirm release policy behavior.
Definition of Done
- Red-team suite runs reproducibly with versioned scenarios
- Critical policy failures gate releases
- Score reports include per-category breakdowns
- Historical trend tracking is available
Project 18: Multimodal Agent Pipeline
- File: P18-multimodal-agent-pipeline.md
- Main Programming Language: Elixir
- Alternative Programming Languages: TypeScript
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The Service and Support Model
- Difficulty: Level 4: Expert
- Knowledge Area: Browser automation, vision input, multimodal context
- Software or Tool: `jido_browser`, `req_llm` multimodal APIs
- Main Book: “Clean Architecture”
What you will build: An agent that navigates a web page, captures screenshot/content, and produces structured incident summaries from multimodal inputs.
Why it teaches Jido deeply: It combines external interaction, extraction, and model reasoning into one controlled pipeline.
Core challenges you will face:
- Session lifecycle for browser tools -> maps to robust setup/cleanup
- Large multimodal context shaping -> maps to token control
- Cross-modal grounding -> maps to verifiable outputs
Real World Outcome
$ mix run scripts/p18_multimodal_pipeline.exs --url https://status.example.com
[browser] session_started adapter=vibium
[navigate] ok url=https://status.example.com
[extract] markdown_chars=9421 screenshot_bytes=183204
[llm] model=openai:gpt-4o-mini input_modalities=[text,image]
[result] severity=s2 affected_services=3 confidence=0.84
[artifact] wrote=artifacts/p18_incident_summary.json
[browser] session_ended=true
The Core Question You Are Answering
“How do I make multimodal agent outputs grounded in what was actually seen on screen?”
Concepts You Must Understand First
- Browser action sequence and waits
- Book Reference: “The Pragmatic Programmer” - automation reliability
- Multimodal message construction
- Book Reference: “Clean Architecture” - adapter patterns
- Output grounding checks
- Book Reference: “Designing Data-Intensive Applications” - data quality
Questions to Guide Your Design
- Which browser actions are mandatory before extraction?
- How do you avoid stale page capture?
- How do you link output claims to screenshot/text evidence?
Thinking Exercise
Design a provenance object linking every final claim to source modality.
The Interview Questions They Will Ask
- “How do you ensure deterministic browser flows?”
- “What is your strategy for long page content?”
- “How do you test multimodal grounding quality?”
- “How do you recover from browser-session failures?”
- “How do you secure browser automation in production?”
Hints in Layers
Hint 1: Add explicit wait_for_selector steps
Hint 2: Capture both markdown and screenshot every run
Hint 3: Use structured output schema for summary
Hint 4: Add failure path for partial extraction
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Reliable automation | “The Pragmatic Programmer” | Automation workflows |
| Adapter architecture | “Clean Architecture” | Interface adapters |
| Data quality controls | “Designing Data-Intensive Applications” | Data correctness |
Common Pitfalls and Debugging
Problem 1: “Vision summary contradicts extracted text”
- Why: Different capture timestamps.
- Fix: Capture all modalities in one atomic step sequence.
- Quick test: Assert same page URL/timestamp in all artifacts.
Problem 2: “Browser process leaks”
- Why: Session not closed on error path.
- Fix: Ensure cleanup runs in an `after` block or termination callback on every exit path.
- Quick test: Stress run 100 sessions; no orphan processes.
Definition of Done
- Browser session lifecycle is reliable under success/failure
- Multimodal prompt includes synchronized text and image artifacts
- Structured summary output is validated and stored
- Grounding/provenance fields are present for key claims
Project 19: Hot Upgrade Release Drill
- File: P19-hot-upgrade-release-drill.md
- Main Programming Language: Elixir/Erlang
- Alternative Programming Languages: N/A
- Coolness Level: Level 5: Pure Magic
- Business Potential: 4. The Open Core Infrastructure
- Difficulty: Level 5: Master
- Knowledge Area: OTP release handling, appup/relup, zero-downtime ops
- Software or Tool: OTP releases, `release_handler`, `appup`, `relup`
- Main Book: “Erlang and OTP in Action”
What you will build: A controlled hot-upgrade drill where a running Jido agent system is upgraded with no lost in-flight work and validated rollback path.
Why it teaches Jido deeply: It connects agent runtime guarantees with BEAM’s operational superpower: live upgrades.
Core challenges you will face:
- State transformation between versions -> maps to code_change safety
- Release packaging correctness (`.appup`, `relup`) -> maps to deploy reliability
- Rollback under partial failure -> maps to operational resilience
Real World Outcome
$ _build/prod/rel/my_app/bin/my_app upgrade 0.2.0
[release] install_release from=0.1.0 to=0.2.0
[agent] in_flight_requests=12
[upgrade] code_change module=MyApp.Agent.Runtime result=ok
[upgrade] directives_queue_dropped=0
[health] status=green p95_latency_ms=96
[rollback_test] install_release 0.1.0 result=ok
The Core Question You Are Answering
“Can I evolve a live autonomous system without stopping it or corrupting state?”
Concepts You Must Understand First
- OTP release handling basics (`appup`, `relup`)
- Book Reference: “Erlang and OTP in Action” - releases and operations
- State migration (`code_change/3`)
- Book Reference: “Clean Architecture” - versioned contracts
- Upgrade/rollback runbooks
- Book Reference: “The Pragmatic Programmer” - operational discipline
Questions to Guide Your Design
- Which modules require state transformation?
- What pre-upgrade health checks are mandatory?
- How do you prove no in-flight loss?
Thinking Exercise
Draft a go/no-go checklist for upgrade initiation and rollback trigger criteria.
The Interview Questions They Will Ask
- “What can and cannot be hot-upgraded safely?”
- “How do you test `code_change` paths?”
- “How do you detect silent state corruption post-upgrade?”
- “What is your rollback SLO?”
- “How do you coordinate upgrades across nodes?”
Hints in Layers
Hint 1: Upgrade one non-critical module first
Hint 2: Add explicit state version tag
Hint 3: Track in-flight request counters before/after
Hint 4: Practice rollback drill in staging weekly
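A minimal sketch of Hint 2 applied to `code_change/3`: the runtime state carries an explicit version tag and migrations are written to be reversible. The module name matches the demo output above, but the state fields and migration content are illustrative assumptions, not the actual Jido runtime:

```elixir
defmodule MyApp.Agent.Runtime do
  use GenServer

  @impl true
  def init(opts), do: {:ok, %{version: 2, opts: opts, retry_counts: %{}}}

  # Upgrade path: v2 adds a directive retry counter; default it for live processes.
  @impl true
  def code_change(_old_vsn, %{version: 1} = state, _extra) do
    {:ok, state |> Map.put(:retry_counts, %{}) |> Map.put(:version, 2)}
  end

  # Downgrade path ({:down, vsn}): drop the v2-only field so rollback stays possible.
  def code_change({:down, _vsn}, %{version: 2} = state, _extra) do
    {:ok, state |> Map.delete(:retry_counts) |> Map.put(:version, 1)}
  end

  def code_change(_old_vsn, state, _extra), do: {:ok, state}
end
```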
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Release handling | “Erlang and OTP in Action” | OTP releases |
| State evolution | “Clean Architecture” | Interface evolution |
| Ops runbooks | “The Pragmatic Programmer” | Pragmatic operations |
Common Pitfalls and Debugging
Problem 1: “Upgrade succeeds but behavior regresses”
- Why: Missing post-upgrade verification suite.
- Fix: Run synthetic workload immediately after upgrade.
- Quick test: Compare golden workload before/after upgrade.
Problem 2: “Rollback fails due to incompatible state”
- Why: One-way state migration.
- Fix: Design reversible migration where required.
- Quick test: Upgrade then rollback in staging on every release candidate.
Definition of Done
- Hot upgrade succeeds with no dropped in-flight work
- Rollback path is tested and documented
- State migration functions are versioned and validated
- Upgrade runbook includes health gates and abort criteria
Project 20: BEAM Autonomous Ops Swarm
- File: P20-beam-autonomous-ops-swarm.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang, Rust sidecars
- Coolness Level: Level 5: Pure Magic
- Business Potential: 5. The Industry Disruptor
- Difficulty: Level 5: Master
- Knowledge Area: End-to-end multi-agent operations platform
- Software or Tool: `jido`, `jido_ai`, `jido_signal`, `req_llm`, `jido_studio`, `jido_runic`
- Main Book: “Designing Data-Intensive Applications”
What you will build: A capstone swarm with coordinator + specialist agents handling incidents autonomously with policy gates, dashboards, persistence, and distributed recovery.
Why it teaches Jido deeply: It integrates every core concept into one production-like system under fault injection.
Core challenges you will face:
- Cross-agent protocol design -> maps to typed signals and causality
- Governed autonomy -> maps to permission firewall + HITL control
- Operational resilience -> maps to failover, replay, upgrades, and observability
Real World Outcome
$ mix run scripts/p20_ops_swarm_capstone.exs --scenario incident_simulation
[swarm] agents={coordinator:1,triage:3,repair:4,verify:2}
[incident] id=inc_2026_02_12_01 severity=s2 source=github.issue.opened
[phase] triage -> research -> patch -> quality -> approval -> deploy
[policy] high_risk_action requires_human_approval=true
[operator] approved action=deploy_patch
[resilience] provider_failover=true netsplit_recovered=true
[summary] mttr_minutes=14 cost_usd=0.42 policy_violations=0
[result] status=completed postmortem=artifacts/p20_postmortem.md
The Core Question You Are Answering
“Can I run a policy-governed autonomous operations system that is fast, explainable, and resilient?”
Concepts You Must Understand First
- Multi-agent orchestration and handoffs
- Book Reference: “Designing Data-Intensive Applications” - distributed workflows
- Signal causality and replay
- Book Reference: “Erlang and OTP in Action” - distributed message handling
- Governance and safety controls
- Book Reference: “Foundations of Information Security” - policy and audit
Questions to Guide Your Design
- Which phases are fully autonomous vs approval-gated?
- How do you define success/failure SLOs for the swarm?
- How do you keep postmortems auto-generated and trustworthy?
Thinking Exercise
Define end-to-end SLOs: availability, MTTR, cost budget, policy violation rate.
The Interview Questions They Will Ask
- “How do you keep swarm behavior explainable?”
- “What prevents cascading failures across agents?”
- “How do you enforce global policy across heterogeneous agents?”
- “How do you validate resilience claims?”
- “How do you transition this capstone to production governance?”
Hints in Layers
Hint 1: Build one vertical slice first (triage->approve->notify)
Hint 2: Add specialist agents incrementally
Hint 3: Use one canonical signal schema registry
Hint 4: Run weekly chaos drills and publish metrics
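A minimal sketch of enforcing causal metadata on every signal (the fix for Problem 1 below): one helper stamps `trace_id` and `parent_signal_id` and refuses to emit without a causal parent; field and module names are illustrative:

```elixir
defmodule MyApp.SignalMeta do
  # parent: the signal (or request context) that caused this emission.
  def stamp(signal, %{trace_id: trace_id, signal_id: parent_id}) do
    Map.merge(signal, %{
      id: "sig_" <> Integer.to_string(System.unique_integer([:positive])),
      trace_id: trace_id,
      parent_signal_id: parent_id
    })
  end

  # No causal context means the postmortem chain would break: reject emission.
  def stamp(_signal, _parent), do: {:error, :missing_causal_context}
end
```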
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Distributed platform design | “Designing Data-Intensive Applications” | Reliability and scale |
| BEAM operations | “Erlang and OTP in Action” | Distribution and supervision |
| Governance and controls | “Foundations of Information Security” | Risk and policy |
Common Pitfalls and Debugging
Problem 1: “Swarm finishes but postmortem is incomplete”
- Why: Missing causal links in signal metadata.
- Fix: Enforce `trace_id` and `parent_signal_id` on all signals (see the sketch after these pitfalls).
- Quick test: Postmortem generator must reconstruct the full phase chain.
Problem 2: “Automation stalls at approval boundaries”
- Why: No timeout/escalation policy for pending approvals.
- Fix: Add escalation signals and fallback responders.
- Quick test: Simulate absent operator; verify escalation path.
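A minimal sketch of a canonical signal envelope that carries causal metadata is shown below. The struct and helper are hypothetical (not the `jido_signal` API); the point is that every emitted signal records `trace_id` and `parent_signal_id`, so the postmortem generator can rebuild the full phase chain by following parent links within one trace.

```elixir
# Hypothetical canonical envelope for swarm signals with causal metadata.
defmodule Swarm.SignalEnvelope do
  @enforce_keys [:id, :type, :source, :trace_id]
  defstruct [:id, :type, :source, :trace_id, :parent_signal_id, data: %{}]

  @doc "Builds a child signal that inherits the parent's trace and records causality."
  def child_of(%__MODULE__{} = parent, type, data) do
    %__MODULE__{
      id: unique_id(),
      type: type,
      source: parent.source,
      trace_id: parent.trace_id,    # same trace across the whole incident
      parent_signal_id: parent.id,  # direct causal link for replay/postmortem
      data: data
    }
  end

  defp unique_id, do: Base.encode16(:crypto.strong_rand_bytes(8), case: :lower)
end
```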
Definition of Done
- Full swarm run completes with traceable phase transitions
- Policy gates and human approvals are enforced for risky actions
- Resilience drills (failover + netsplit + restart) pass
- Capstone emits measurable SLO report and postmortem artifact
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Signal-Native ReAct Calculator Agent | Level 2 | Weekend | Medium | ★★★★☆ |
| 2. Tool-Governed Web Research Agent | Level 2 | Weekend | Medium | ★★★★☆ |
| 3. Multi-Provider Failover Gateway | Level 3 | 1-2 weeks | High | ★★★★☆ |
| 4. Streaming Observability Console | Level 3 | 1-2 weeks | High | ★★★★★ |
| 5. Structured Output Contracts | Level 2 | Weekend | Medium | ★★★★☆ |
| 6. Cost-Aware Model Router | Level 3 | 1-2 weeks | High | ★★★★★ |
| 7. Skill and Plugin Composition Lab | Level 3 | 1-2 weeks | High | ★★★★☆ |
| 8. Strategy State Machine Switchboard | Level 4 | 2-3 weeks | Very High | ★★★★★ |
| 9. Worker Pool Load Lab | Level 3 | 1-2 weeks | High | ★★★★☆ |
| 10. Persistent Thread Memory | Level 3 | 1-2 weeks | High | ★★★★☆ |
| 11. Sensor-Driven Incident Triage | Level 3 | 1-2 weeks | High | ★★★★★ |
| 12. Cron Autonomous Maintenance | Level 3 | 1-2 weeks | High | ★★★★☆ |
| 13. Distributed Netsplit Recovery Drill | Level 4 | 2-3 weeks | Very High | ★★★★★ |
| 14. ETS and Mnesia Hybrid Agent Memory | Level 4 | 2-3 weeks | Very High | ★★★★★ |
| 15. LiveView HITL Control Center | Level 3 | 1-2 weeks | High | ★★★★☆ |
| 16. Tool Permission Firewall | Level 4 | 2-3 weeks | Very High | ★★★★★ |
| 17. Red-Team Evaluation Harness | Level 4 | 2-3 weeks | Very High | ★★★★★ |
| 18. Multimodal Agent Pipeline | Level 4 | 2-3 weeks | Very High | ★★★★★ |
| 19. Hot Upgrade Release Drill | Level 5 | 3-4 weeks | Expert | ★★★★★ |
| 20. BEAM Autonomous Ops Swarm | Level 5 | 4-6 weeks | Expert+ | ★★★★★ |
Recommendation
If you are new to Jido/BEAM agents: Start with Project 1, then Project 5, then Project 6 to build deterministic fundamentals plus cost awareness.
If you are an SRE/platform engineer: Start with Project 9, Project 13, and Project 19 to focus on runtime guarantees, partition behavior, and safe upgrades.
If you want to build production AI products quickly: Start with Project 2, Project 4, Project 15, then move to Project 20.
Final Overall Project: Autonomous Reliability Control Plane
The Goal: Combine Projects 3, 6, 13, 16, and 20 into a single autonomous operations system.
- Build multi-provider routing with policy controls and fallback.
- Add distributed supervisor topology with netsplit detection and reconciliation.
- Enforce directive safety gates and human approval paths for high-risk actions.
- Add live observability with usage/cost telemetry and replayable traces.
- Run staged hot-upgrade drills and prove no unsafe state transitions.
Success Criteria: The system remains available and policy-compliant during injected provider failures, child crashes, and node partition events while keeping budget and latency within defined bounds.
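As a starting point for the runtime topology described above, the sketch below (hypothetical module names, standard OTP only) shows a top-level supervision tree that separates agent processes, external-effect tasks, and agent lookup into their own supervised children, so a crash in one subtree cannot take down the others.

```elixir
# A minimal, OTP-only skeleton for the control plane's top-level supervision tree.
defmodule ControlPlane.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Lookup agents by stable id (used for routing signals to the right process).
      {Registry, keys: :unique, name: ControlPlane.AgentRegistry},
      # Supervised tasks for external effects (LLM calls, tools, deploys).
      {Task.Supervisor, name: ControlPlane.EffectTaskSup},
      # One dynamically started child per agent; crashes restart only that agent.
      {DynamicSupervisor, name: ControlPlane.AgentSup, strategy: :one_for_one}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: ControlPlane.Supervisor)
  end
end
```

Routing, policy gates, and telemetry from Projects 3, 6, and 16 would then slot in as additional supervised children of the same tree.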
From Learning to Production
| Your Project | Production Equivalent | Gap to Fill |
|---|---|---|
| Project 1 | Tool-using assistant microservice | policy and observability hardening |
| Project 3 | Multi-provider inference gateway | enterprise auth + SLA governance |
| Project 6 | AI FinOps control service | org-level budgeting and chargeback |
| Project 13 | Geo-distributed autonomous cluster | formal reconciliation + compliance controls |
| Project 19 | Continuous upgrade pipeline | change management and canary policy |
| Project 20 | Autonomous operations platform | team process, governance, and on-call maturity |
Summary
This learning path covers Jido + BEAM-native agent engineering through 20 hands-on projects.
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Signal-Native ReAct Calculator Agent | Elixir | Level 2 | 8-12h |
| 2 | Tool-Governed Web Research Agent | Elixir | Level 2 | 10-14h |
| 3 | Multi-Provider Failover Gateway | Elixir | Level 3 | 12-18h |
| 4 | Streaming Observability Console | Elixir | Level 3 | 12-18h |
| 5 | Structured Output Contracts | Elixir | Level 2 | 8-12h |
| 6 | Cost-Aware Model Router | Elixir | Level 3 | 12-18h |
| 7 | Skill and Plugin Composition Lab | Elixir | Level 3 | 12-18h |
| 8 | Strategy State Machine Switchboard | Elixir | Level 4 | 16-24h |
| 9 | Worker Pool Load Lab | Elixir | Level 3 | 12-18h |
| 10 | Persistent Thread Memory | Elixir | Level 3 | 12-20h |
| 11 | Sensor-Driven Incident Triage | Elixir | Level 3 | 12-20h |
| 12 | Cron Autonomous Maintenance | Elixir | Level 3 | 12-20h |
| 13 | Distributed Netsplit Recovery Drill | Elixir | Level 4 | 18-28h |
| 14 | ETS and Mnesia Hybrid Agent Memory | Elixir | Level 4 | 18-28h |
| 15 | LiveView Human-in-the-Loop Control Center | Elixir | Level 3 | 14-20h |
| 16 | Tool Permission Firewall | Elixir | Level 4 | 16-24h |
| 17 | Red-Team Evaluation Harness | Elixir | Level 4 | 16-24h |
| 18 | Multimodal Agent Pipeline | Elixir | Level 4 | 16-24h |
| 19 | Hot Upgrade Release Drill | Elixir | Level 5 | 24-36h |
| 20 | BEAM Autonomous Ops Swarm | Elixir | Level 5 | 30-50h |
Expected Outcomes
- You can design deterministic and auditable agent state machines.
- You can operate multi-provider LLM systems with explicit budgets and safety policy gates.
- You can run distributed, supervised autonomous workflows that recover from real failures.
Additional Resources and References
Standards and Specifications
- CloudEvents Specification v1.0.2
- CNCF CloudEvents Project
- Erlang/OTP System Limits
- Erlang `+P` Process Limit Flag
Primary Jido Ecosystem Sources
- Jido Repository - Core framework: Action, Instruction, Plan, Exec, Plugin, AgentServer
- Jido.AI Repository - AI strategies: ReAct, CoT, ToT, GoT, Adaptive, TRM; Directives; Skills
- ReqLLM Repository - Multi-provider LLM abstraction (45+ providers, 665+ models)
- jido_signal Repository - Signal infrastructure: Bus, Router, Dispatch, Journal
- LLMDB Repository - Model metadata database (context_window, capabilities, costs)
- jido_browser Repository - Browser automation for multimodal agent pipelines
- jido_studio Repository - LiveView-based agent observation and HITL control center
- Agent Jido Website
- ReqLLM 1.0 Announcement
- Hex Package: jido
- Hex Package: jido_ai
- Hex Package: req_llm
- Hex Package: jido_signal
- Hex Package: llm_db
Research Papers and Technical Foundations
- ReAct: Synergizing Reasoning and Acting in Language Models
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Graph of Thoughts: Solving Elaborate Problems with LLMs
- Toolformer: Language Models Can Teach Themselves to Use Tools
Industry Signals and Metrics (as of 2026-02-12)
- Stack Overflow Developer Survey 2025 - AI Sentiment and Usage
- Stack Overflow Developer Survey 2025 - AI Agent Uses and Impacts
- CloudEvents Project Page (Graduated announcement + adopters)
- GitHub API: agentjido/jido
- GitHub API: agentjido/jido_ai
- GitHub API: agentjido/req_llm
- GitHub API: agentjido/jido_signal
- GitHub API: agentjido/llm_db
- GitHub API: agentjido/jido_browser
- GitHub API: agentjido/jido_studio
- Hex API: jido
- Hex API: jido_ai
- Hex API: req_llm
- Hex API: jido_signal
- Hex API: llm_db