Sprint: JIDO Mastery - Real World Projects
Goal: Build a deep, Elixir-first command of the JIDO ecosystem by combining
req_llm(provider-agnostic model orchestration) andjido(pure-functional agent runtime). You will learn how to design deterministic agent logic, safe side-effect handling, workflow composition, and production-grade observability with BEAM strengths. By the end, you will be able to ship practical LLM workflows that are testable, policy-aware, resource-efficient, and resilient under retries, model outages, and multi-agent concurrency.
Introduction
- What is JIDO? A growing ecosystem around jido (core agent behavior) and req_llm (provider-agnostic LLM API access) for building practical AI products in Elixir.
- What problem does it solve today? It replaces duplicated HTTP glue, ad-hoc message formats, and side-effect-heavy agent code with standardized models, command contracts, and OTP supervision.
- What you will build across the projects: a portfolio of agentic systems from API routers to autonomous workflows, each with measurable behavior and clear failure handling.
- In scope: practical architecture, state and directive design, tool integration, streaming, testing, and production operations. Out of scope: deep tuning of every model, and building brand-new LLM providers.
Big-picture model:
┌──────────────────────────────────────────────────────────────┐
│ Product Intent / Policy │
│ (business rules, budgets, safety boundaries, success criteria)│
└───────────────────────────────┬──────────────────────────────┘
│
┌────────────────▼────────────────┐
│ jido Agents + Signals │
│ (pure commands, deterministic) │
└────────────────┬────────────────┘
│
┌────────────────▼────────────────┐
│ Workflow Engine │
│ (plans, routing, checkpoints) │
└────────────────┬────────────────┘
│
┌────────────────▼────────────────┐
│ req_llm Provider Layer │
│ (models, tools, output schemas) │
└────────────────┬────────────────┘
│
┌────────────────▼────────────────┐
│ Observability + Persistence │
│ (cost, tokens, traces, states) │
└─────────────────────────────────┘
Use this as your default debugging lens: if a behavior is surprising, locate the boundary that crossed from policy -> command, command -> directive, or directive -> execution.
How to Use This Guide
- Read the primer before any project. Keep a running notes file for each concept chapter because projects reference chapter names directly.
- Pick one of the learning paths by your role and start from the first project in that path.
- Validate progress after each project using its Definition of Done section and keep test evidence (transcripts, outputs, and decision logs).
- Alternate between two project modes:
- Build mode: implement the project.
- Hardening mode: run through failure scenarios, retry logic, and cost checks before moving forward.
Prerequisites & Background Knowledge
Essential Prerequisites (Must Have)
- Strong reading comprehension of Elixir syntax and data transformation.
- Comfortable with maps, structs, protocols, and pattern matching.
- Basic understanding of HTTP APIs and JSON payload design.
- Familiarity with command-line debugging and process-based systems.
- Recommended Reading: “The Little Elixir & OTP Guidebook” by Benjamin Tan Wei Hao - Ch. 1-6
Helpful But Not Required
- Supervision trees in Phoenix/BEAM applications (learn during Projects 6-8)
- Structured data validation (learn during Projects 1-4)
Self-Assessment Questions
- Can you explain how a function can be pure and still produce useful side-effectful behavior in an agent system?
- Can you identify where model provider differences leak into your product and where they should be abstracted?
- Can you propose a minimal checkpoint model for retries and timeouts across multiple steps?
Development Environment Setup
Required Tools:
- Elixir 1.17+
- Erlang/OTP 26+
- Git
- mix
Recommended Tools:
- PostgreSQL or SQLite for state logs
- curl for API smoke checks
- telemetry stack for tracing
Testing Your Setup:
$ elixir --version
$ mix local.hex
$ mix local.rebar
Expected output snippets:
Erlang/OTP 26+
Elixir 1.17+
Rebar and Hex: ok
Time Investment
- Simple projects: 4-8 hours each
- Moderate projects: 10-20 hours each
- Complex projects: 20-40 hours each
- Total sprint: ~10-14 weeks for all 8 projects and a review project
Important Reality Check
The hardest part is not calling LLM APIs; it is maintaining invariants: what is state, what is side effect, and what is guaranteed to be reproducible under retries.
Big Picture / Mental Model
The ecosystem split is simple:
- req_llm solves how to talk to different model providers without your application owning all protocol differences.
- jido solves how to think, decide, and route these model outputs through deterministic agents under OTP.
The two fit when agents emit structured plans and directives instead of raw imperative behavior.
+-------------------+      +--------------------+      +--------------------+
| req_llm Providers | ---> |   Agent Commands   | ---> | Runtime Directives |
| (openai,          |      |   cmd/2 + state    |      | spawn, emit,       |
|  anthropic,       |      +--------------------+      | schedule, stop     |
|  gemini, etc.)    |                │                 +---------+----------+
+-------------------+                │                           |
          │                          ▼                           |
          ▼                +--------------------+                |
+-------------------+      | Workflow + Signals | <--------------+
| Provider Sync     | ---> | routing + state    |
| Model metadata    |      +--------------------+
+-------------------+
This makes the system resilient to model churn:
- Model selection changes stay in provider layer.
- Business behavior stays in agent/action layer.
- Runtime behavior stays in directives and OTP policies.
Theory Primer
Chapter 1: Canonical Provider Abstraction with req_llm
Fundamentals
req_llm exposes one high-level path for different providers while preserving provider-specific capabilities where needed. It supports a model spec format and standardized request/response shapes. In practice, this means your product can start with two API clients and scale to many. The key idea is not “one universal API always works,” but “one universal contract plus deterministic provider translation.” The two-layer design lets you choose between the high-level helpers (generate_text/3, generate_object/4, stream_text/3) and direct low-level Req plugin use when you need custom HTTP behavior.
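A minimal sketch of the high-level path, assuming the documented generate_text/3 shape; the model spec, option values, and response field access are illustrative placeholders, so verify helper names against the ReqLLM docs:

```elixir
# Hedged sketch: one canonical call shape across providers.
# The model spec string and options are placeholders.
{:ok, response} =
  ReqLLM.generate_text(
    "anthropic:claude-3-5-haiku",              # <provider>:<model> spec
    "Summarize this support ticket in one sentence.",
    temperature: 0.2
  )

# Usage/cost metadata rides along with the canonical response
# (exact struct fields are an assumption — check ReqLLM.Response).
IO.inspect(response.usage)
```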
Deep Dive
Provider abstraction is an economic and operational contract. In production code, every direct model call creates three risk planes: compatibility, semantics, and cost control. Compatibility risk includes schema drift across vendors (parameter names, token accounting, response format). Semantic risk includes differences in tool-calling and JSON support. Cost risk includes hidden charges from tool usage or image generation. The package addresses all three with three mechanisms:
- Canonical models and model specs. By pinning each call to explicit model identifiers and context structures, you avoid hard-coded request payload assumptions.
- Provider metadata and sync. Model registry data (cost, limits, context windows, modalities) is synchronized and can power budget decisions and runtime guards.
- Canonical / provider-aware translation layers. Default options are translated into provider-specific field names and behavior, while your application writes against stable high-level terms.
The practical effect is visible in testability. You can test with fixtures, stubs, and cached responses because your call shape is stable. You can also route workloads by provider capabilities: for example, send tool-heavy extraction jobs to providers with stronger structured output support, while sending long-context summarization to models with favorable context windows. req_llm’s feature set in v1.5.1 also adds streaming adapters and explicit usage telemetry, so you can track both UX responsiveness and spend.
Failure modes to design for:
- Model downgrade or outage: fallback path from preferred provider to secondary.
- Invalid schema returns: structured output mode fails hard if schema assumptions are wrong.
- Timeout amplification: long prompts with strict synchronous calls can stall if connection pools are misconfigured.
- Key precedence ambiguity: wrong key source injection causing tests to use production keys.
A stable strategy is to treat provider differences as policy-managed capabilities, not as call-site conditionals.
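As a sketch of that strategy, here is a hedged fallback loop that treats the provider list as policy data; the candidate model specs and error handling are assumptions for illustration, not a built-in ReqLLM feature:

```elixir
# Hedged sketch: provider fallback as data-driven policy.
# Candidate model specs are illustrative placeholders.
defmodule Gateway.Fallback do
  @candidates ["openai:gpt-4o-mini", "anthropic:claude-3-5-haiku"]

  def generate(prompt, opts \\ []) do
    Enum.reduce_while(@candidates, {:error, :all_providers_failed}, fn spec, acc ->
      case ReqLLM.generate_text(spec, prompt, opts) do
        {:ok, response} -> {:halt, {:ok, spec, response}}
        {:error, _reason} -> {:cont, acc}
      end
    end)
  end
end
```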
Definitions & key terms
- Canonical model: standardized schema used for request and response.
- Model spec: <provider>:<model> shorthand plus optional config.
- Structured generation: map/object output constrained by schema.
- Streaming session: token-level transport that preserves immediate UX feedback.
Mental model diagram
User request
|
v
+---------------------+      +--------------------+
| req_llm high layer  | ---> | provider-specific  |
| generate_text/3     |      | translation        |
+---------------------+      +--------------------+
|
v
+----------------------------+
| canonical message / context |
| schema validation + retries |
+----------------------------+
|
v
+----------------------------+
| provider API + telemetry |
| usage, cost, tool metadata |
+----------------------------+
How it works
- Parse incoming model spec and messages.
- Resolve provider module and model metadata.
- Apply provider options and capability constraints.
- Generate request payload in canonical form, then translate.
- Execute request and decode into canonical response.
- Return deterministic response with usage/cost metadata.
Invariants: invalid spec fails early; structured mode cannot silently return malformed shape; streaming responses expose tokens and metrics. Failure modes: provider HTTP faults, schema decode failures, key resolution errors.
Minimal concrete example
$ mix deps.get
$ mix req_llm.gen "Summarize last 3000 chars" --model anthropic:claude-3-7
{agent: ok}
$ mix req_llm.gen "Generate JSON invoice line" --json --model openai:gpt-4o-mini
{"invoice_id":"...","total":12.34}
Common misconceptions
- “One provider abstraction means all providers behave identically.” They don’t; abstraction hides differences but must expose capability differences.
- “Structured outputs are always guaranteed.” Schema mode can still produce edge-case validation failures.
- “Streaming is only for UI.” Streaming is also a diagnostic channel for partial inference timing.
Check-your-understanding questions
- What does provider translation buy you that a raw HTTP client cannot?
- When would low-level Req usage be mandatory?
- How do you detect and react to usage cost drift over time?
Check-your-understanding answers
- Stable call contract and capability-aware options.
- Custom headers/auth flows, unusual endpoints, or strict request tracing.
- Compare usage metadata over history and route model/provider decisions with cost budgets.
Real-world applications
- LLM routers in SaaS support portals.
- Multi-provider fallback in enterprise copilots.
- API translation layers for tool-calling toolchains.
Where you’ll apply it
- Projects 1, 2, 3, 4.
References
- ReqLLM overview v1.5.1 — hexdocs.pm/req_llm/1.5.1/overview.html
- ReqLLM.Provider behavior — hexdocs.pm/req_llm/ReqLLM.Provider.html
- API adapter design in modern distributed systems (secondary): “Designing Data-Intensive Applications” by Kleppmann.
Key insight Provider abstraction succeeds when it reduces cognitive load without hiding required capability distinctions.
Summary
Without canonical contracts, each model call becomes bespoke technical debt. With req_llm, the contract is standardized, and capability differences become explicit strategy data.
Homework/Exercises to practice the concept
- List three providers and map which option names differ.
- Sketch a fallback matrix for latency, cost, and capability.
- Design a policy that blocks image generations after budget threshold.
Solutions to the homework/exercises
- Example names: max_tokens, max_completion_tokens, and image endpoint variant names.
- Use a table with score-based provider ranking per task type.
- Add middleware before request execution that checks cumulative cost before every generate_* call, as in the sketch below.
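A minimal sketch of that budget middleware; BudgetLedger is a hypothetical module standing in for your own cost accounting, and the cap is an illustrative value:

```elixir
# Hedged sketch: a budget gate in front of every generate_* call.
# BudgetLedger is hypothetical, not a ReqLLM API.
defmodule BudgetGuard do
  @monthly_cap_usd 50.0

  def generate(spec, prompt, opts \\ []) do
    if BudgetLedger.spent_usd() >= @monthly_cap_usd do
      {:error, :budget_exhausted}
    else
      ReqLLM.generate_text(spec, prompt, opts)
    end
  end
end
```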
Chapter 2: Typed Message, Tool, and Output Modeling
Fundamentals
LLM workflows break when teams mix free-text prompts, non-deterministic tool shapes, and ad-hoc JSON assumptions. req_llm offers canonical types (Context, Message, ContentPart, Tool, Response, Usage) and integrates with structured generation. This pushes your architecture toward contracts, which is especially important as multiple agents begin producing output that feeds other agents.
Deep Dive
Type modeling in agentic systems is not about syntax aesthetics; it is about blast-radius control. An untyped output from one model becomes a fault source in all downstream workers. A typed canonical layer solves this by moving uncertainty to validation boundaries.
In this ecosystem, typed modeling appears at two layers:
- Provider payload normalization in req_llm.
- Agent command/state semantics in jido.
For req_llm, schema-driven generation is your defense against “almost JSON.” This means declaring shape expectations before generation and letting the library validate decoded output. For long pipelines, the output model should be versioned because fields evolve while tasks remain stable.
For JIDO-style agents, message and signal typing is similarly central. Signals are envelopes for routing, audit, and observability. When each signal has a predictable envelope shape, supervisors, bus adapters, and external tools can interoperate without implicit assumptions.
Failure modes and patterns:
- Loose contracts create silent downstream failures.
- Overly strict contracts reject recoverable provider variants and trigger unnecessary retries.
- Unversioned schemas make migrations painful and break older persisted outputs.
Design pattern: dual-layer schema strategy.
- Internal schemas: strict and typed for business logic.
- External schemas: permissive ingestion to canonical projection layer.
This allows you to preserve compatibility and reliability while still improving contract quality over time.
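A sketch of the dual-layer strategy in plain Elixir, assuming illustrative field names: permissive ingestion of external keys and aliases, then projection into a strict internal struct:

```elixir
# Hedged sketch: permissive external ingestion -> strict internal contract.
defmodule Invoice do
  @enforce_keys [:invoice_id, :total_cents, :currency]
  defstruct [:invoice_id, :total_cents, :currency]

  # External layer: tolerate string keys and a known alias.
  def from_external(%{} = raw) do
    with id when is_binary(id) <- raw["invoice_id"] || raw["id"],
         cents when is_integer(cents) <- raw["total_cents"] || raw["total"],
         cur when cur in ["usd", "eur", "gbp"] <- raw["currency"] do
      {:ok, %__MODULE__{invoice_id: id, total_cents: cents, currency: cur}}
    else
      _ -> {:error, :schema_mismatch}
    end
  end
end
```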
Definitions & key terms
- Canonical schema: stable internal representation.
- Schema drift: incompatible change in field presence or meaning.
- Tool schema: declaration of function/tool input expected by model or agent.
- Envelope: wrapper metadata around content, source, and trace context.
Mental model diagram
Incoming payload
|
v
+------------------+ +--------------------+
| Typed request |---->| Validation boundary |
| schema + context | | canonical output |
+------------------+ +--------------------+
| |
v v
+------------------+ +--------------------+
| Tool call payload | | Signal / directive |
| model output | | execution metadata |
+------------------+ +--------------------+
How it works
- Define contracts for each stage before implementation.
- Encode expected output shape in req_llm schema options.
- Enforce validation at generation boundaries.
- Emit typed signals from agent actions.
- Translate typed outputs into agent state transitions.
Invariants: no stage may consume unknown/unvalidated payloads. Retry only idempotent steps. Failure modes include schema mismatch, unknown provider fields, and signal routing to non-existent handlers.
Minimal concrete example
Schema draft:
- invoice_id (required, string)
- total_cents (required, integer)
- currency (required, one of usd/eur/gbp)
Run:
$ mix req_llm.gen "Parse invoice" --json --model openai:gpt-4o-mini
{"invoice_id":"INV-1001","total_cents":1299,"currency":"usd"}
Common misconceptions
- “Schemas remove creativity.” They don’t; they constrain just the machine contract.
- “Validation is optional in prototypes.” It is optional only for throwaway scripts.
- “One schema per project is enough.” High maturity requires multiple versions.
Check-your-understanding questions
- Why should schema versions exist in long-lived workflows?
- Where is the best place to validate external tool results?
- What is the failure mode when strict schema mode is too rigid?
Check-your-understanding answers
- Persistence and replay become predictable.
- At boundary modules that own the external integration.
- Retry loops and user-visible false failures when model output format changes.
Real-world applications
- Invoice, incident, and procurement parsers.
- Policy-driven orchestration where signals route by typed tags.
- Contract tests for multi-agent tool pipelines.
Where you’ll apply it
- Projects 1, 2, 6, 7.
References
- ReqLLM structured outputs section — hexdocs.pm/req_llm/1.5.1/overview.html
- Jido Action model docs — jido Action guide
- “Domain-Driven Design” by Evans (context boundaries and contracts)
Key insight Strong interfaces are your leverage point: they let agents reason with less ambiguity and recover faster from drift.
Summary Use strict contracts where behavior matters, and resilient parsing at external boundaries where reality is noisy.
Homework/Exercises to practice the concept
- Draft two versions of an invoice output schema.
- Propose backward compatibility for a changed total field name.
- Design one signal envelope for human approval.
Solutions to the homework/exercises
- v1 (total_cents, currency), v2 (line_total, currency_code) with explicit migration.
- Keep aliases at the translation stage with deprecation logging.
- Include approval_id, actor, resource, and deadline_ms fields in the signal envelope.
Chapter 3: Pure Command + Directive Runtime in jido
Fundamentals
jido centers on the idea that an agent command should be pure state transition logic, while effects remain explicit directives. cmd/2 returns updated agent state and a directive list. This separation enables deterministic tests and predictable execution semantics.
Deep Dive
In raw GenServer code, logic and side effects often co-locate. That makes testing difficult because every branch may touch external systems. jido formalizes a command architecture where each action’s role is clear: transform state; emit intent. This is conceptually similar to Elm and Redux reducers paired with effect descriptors.
For practical systems, this has three payoffs:
- Determinism: same input -> same state change.
- Auditability: directives reveal external impact intent.
- Runtime swapability: directive execution policy can change without changing business logic.
A typical confusion is to treat directives as imperative code that must run inside action functions. In this ecosystem, directives are returned data. Execution occurs through runtime strategies (AgentServer, strategy modules, supervisors). This makes it easier to test with fixtures and to introduce policy changes.
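A minimal illustration of the pattern in plain Elixir; jido's concrete module names and directive structs may differ, so treat this as the shape of the contract, not the API:

```elixir
# Illustrative cmd/2 shape: pure state transition plus directives-as-data.
defmodule Counter do
  def cmd(%{count: n} = state, {:increment, by}) do
    new_state = %{state | count: n + by}
    # Directives describe intended effects; nothing is executed here.
    directives = [{:emit, :progress_event, %{count: new_state.count}}]
    {new_state, directives}
  end
end

# Deterministic: same input always yields the same pair.
{%{count: 1}, [{:emit, :progress_event, _}]} =
  Counter.cmd(%{count: 0}, {:increment, 1})
```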
Failure modes:
- State + directive ambiguity: writing external APIs directly inside action functions breaks determinism.
- Over-directed commands: returning too many directives without explicit ordering can hide flow bugs.
- Unhandled directive types: runtime may drop unknown directives silently if not registered.
Countermeasures:
- Keep each action small and composable.
- Classify directives by effect domain (emit/schedule/spawn/stop).
- Add compile-time and runtime guards for directive schema.
Definitions & key terms
- cmd/2: core operation returning {agent, directives}.
- Action: command unit with schema and run/2 behavior.
- Directive: external effect description.
- StateOp: in-command internal state update.
Mental model diagram
+-----------+ +-----------------+ +--------------------+
| Action fn | ---> | cmd/2 contract | ---> | Directives Pipeline |
| run/2 | | (state only) | | runtime interpreter |
+-----------+ +-----------------+ +--------------------+
|
v
+----------------------+
| External side effects |
+----------------------+
How it works
- Receive command and validate action schema.
- Compute next state from immutable rules.
- Construct directive list with no direct side effects.
- Return state + directives.
- Runtime strategy executes directives atomically or with policies.
Invariant: state changes are pure functions of input. Failure modes: invalid action params, missing directive handler, directive execution timeout.
Minimal concrete example
$ make run-counter
before: count=0
command increment(+1)
after: count=1 directive=[Emit(:progress_event)]
Common misconceptions
- “Directives are slow because they are indirect.” Indirection increases clarity and testability.
- “Pure logic means no useful side effects.” It means side effects are explicit and controlled.
- “Actions must be classes/modules only.” They are composable units with schema and deterministic behavior.
Check-your-understanding questions
- Why return directives instead of executing actions directly?
- What is the value of StateOps?
- How do you test action purity?
Check-your-understanding answers
- So runtime can control ordering, retries, and supervision.
- They represent safe, internal state transition operations.
- Snapshot-before/snapshot-after comparisons and property-style invariants.
Real-world applications
- Financial workflows with audit trails.
- Approval systems where actions should be reproducible.
- Internal tool orchestration with explicit side effects.
Where you’ll apply it
- Projects 5, 6, 7, 8.
References
- Jido Overview and core examples — hexdocs.pm/jido/2.0.0-rc.4/readme.html
- Jido Action docs — hexdocs.pm/jido/jido/Action.html
- “Functional Design” by Robert C. Martin (pure core + effect boundary)
Key insight The hardest production bugs disappear when you force state changes to be pure and effects to be declared.
Summary
cmd/2 is the seam between reasoning and action. Guard it well and most system behavior becomes inspectable.
Homework/Exercises to practice the concept
- Convert one side-effecting action into command+directive form.
- Define three directive types for one domain.
- Define test case for idempotence.
Solutions to the homework/exercises
- Move the API call out of the action and emit an HttpCall directive.
- Emit, Schedule, Stop.
- Run the command twice on the same state and verify the expected stable transitions, as in the test sketch below.
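A hedged ExUnit sketch for the idempotence exercise; Counter repeats the illustrative cmd/2 module from the Deep Dive so the test stands alone:

```elixir
# Hedged sketch: determinism test for a pure cmd/2 implementation.
defmodule Counter do
  def cmd(%{count: n} = state, {:increment, by}) do
    {%{state | count: n + by}, [{:emit, :progress_event}]}
  end
end

defmodule CounterTest do
  use ExUnit.Case, async: true

  test "same command on same state always yields the same transition" do
    assert {%{count: 1}, _} = Counter.cmd(%{count: 0}, {:increment, 1})
    assert {%{count: 1}, _} = Counter.cmd(%{count: 0}, {:increment, 1})
  end
end
```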
Chapter 4: OTP Runtime, State Machines, and Signals
Fundamentals
jido runs on BEAM/OTP, so you inherit supervision, fault recovery, and scheduling primitives. The differentiator is that agent behavior is structured before supervision: directives and signals are first-class messages and execution strategies are pluggable (Direct execution, FSM, custom strategies).
Deep Dive
In OTP systems, resilience is achieved by supervision and restart patterns; in agent systems, resilience is also about intent preservation. If an agent step crashes, should you replay from the start or checkpoint and resume? If messages are unordered, can causality still hold? jido addresses this by coupling immutable agent state with explicit signals and strategies.
A practical framework for workflow reliability has four layers:
- Agent state: pure domain state and transition invariants.
- Signal envelope: typed routing metadata.
- Strategy: execution policy including retries, queueing, and timing.
- Supervisor tree: process lifecycle.
In a multi-agent setup, signals are crucial. They carry operation intent across boundaries without forcing each agent to know every peer’s internals. This allows horizontal growth and easier policy enforcement.
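As an illustration, here is a signal envelope written as plain data; jido_signal's concrete struct fields may differ, and these CloudEvents-style keys are an assumption:

```elixir
# Illustrative signal envelope: routing metadata wraps the payload.
signal = %{
  type: "ticket.escalated",          # route key
  source: "agent://triage-1",        # originating agent
  correlation_id: "run-8f2c",        # trace continuity across agents
  data: %{ticket_id: 1001, priority: :high}
}
```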
Failure modes:
- Signal loss from misconfigured routes.
- Signal storms from unbounded retries.
- Unsupervised children that survive only because upstream state still references them.
Mitigation:
- Explicit backoff and max-retry budgets.
- Supervisor limits and child lifecycle tracing.
- Bounded queues and dead-letter/error policies.
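A hedged sketch of the first mitigation above, a bounded retry with exponential backoff; the budget and delay values are illustrative starting points:

```elixir
# Hedged sketch: retries with an explicit budget and exponential backoff.
defmodule RetryPolicy do
  @max_retries 2
  @base_delay_ms 250

  def run(fun, attempt \\ 0) do
    case fun.() do
      {:ok, result} ->
        {:ok, result}

      {:error, _reason} when attempt < @max_retries ->
        Process.sleep(@base_delay_ms * Integer.pow(2, attempt))
        run(fun, attempt + 1)

      {:error, reason} ->
        # Budget exhausted: surface a terminal, classifiable failure.
        {:error, {:retry_budget_exhausted, reason}}
    end
  end
end
```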
Definitions & key terms
- FSM strategy: finite-state-machine based execution style.
- Signal: portable message envelope for eventing/routing.
- Restart policy: how failed processes are recovered.
- Supervision tree: hierarchical ownership model of BEAM.
Mental model diagram
+----------------+
| AgentServer |
| cmd/state map |
+--------+-------+
|
+------+-------------------+
| strategy layer |
+------+-------------------+
|
+-----+------+ +-------------------+
| Directive | ---> | Signal Router |
| Execution | | (priority+route) |
+-----------+ +-------------------+
|
+-----+------+
| Supervisor |
| restart |
+------------+
How it works
- Receive a command and produce directives.
- Strategy chooses execution order (direct/fsm/custom).
- Signals route to other agents or worker adapters.
- On failure, supervisor applies strategy.
- Metrics and traces record each lifecycle transition.
Invariants: no uncontrolled process spawning and no indefinite loops; signals must be routeable. Failure modes: unbounded queue growth, invalid restart thresholds, poison messages.
Minimal concrete example
$ mix agent.harness.run --scenario timeout-retry
step1: command accepted
step2: worker timeout -> retry with delay
step3: max_retries=2 reached -> emit error directive
Common misconceptions
- “OTP is enough; no extra strategy needed.” Strategy is the policy layer between generic OTP and domain flow control.
- “Signals are just messages.” A signal is message + metadata + intended route semantics.
- “FSMs are overkill.” They are useful where retries and transitions must be visible.
Check-your-understanding questions
- What happens to command determinism when strategy changes?
- Why are signals better than direct pid calls for scale?
- Where do you enforce idempotency?
Check-your-understanding answers
- Keep command semantics stable; strategy can alter execution only.
- Signals keep topology decoupled and routeable.
- In state transitions and deduplication keys.
Real-world applications
- Asynchronous workflows in support desks.
- Multi-step automation that requires pause/resume behavior.
- Human-in-the-loop escalation chains.
Where you’ll apply it
- Projects 6, 7, 8.
References
- Jido Core Loop / Agents / Signals docs links in hexdocs
- OTP principles from “Designing for Scalability with Erlang/OTP” (secondary)
Key insight OTP gives failure recovery; jido’s strategy and signal layers give intentional agent behavior.
Summary Your systems become scalable when process control and business semantics are separated into explicit layers.
Homework/Exercises to practice the concept
- Draw a restart policy for a transient API timeout.
- Define route rules for two failure signals.
- Propose max queue depth and backoff values.
Solutions
- 1) retry, 2) wait-and-retry, 3) dead-letter after budget.
- :agent_busy and :downstream_timeout with different retry budgets.
- Start with a queue depth of 200; exponential backoff with a cap.
Chapter 5: Workflow Orchestration + Testing + Observability
Fundamentals
Observability and testability are non-negotiable when multiple models and agents coordinate. A workflow is only useful if you can answer: what happened, why, and how much it cost. jido workflows provide structure; req_llm provides request-level telemetry and usage metadata.
Deep Dive
Workflows become brittle when branching and tool calls are embedded in opaque functions. In this ecosystem, use explicit instruction graphs and bounded transitions. You then test three layers:
- Command tests: deterministic state transitions.
- Workflow tests: sequencing, timeouts, and failure branching.
- Runtime tests: directive execution and strategy semantics.
Cost observability is similar: collect token usage and tool usage per stage. If stream usage jumps after a code change, you have a regression. If a model route changes due to config, usage should be expected and visible.
Failure modes:
- Latent nondeterminism from non-typed inputs.
- Observability gaps where side effects happen without trace IDs.
- Test suites that only pass in happy-path mode.
Countermeasures:
- Tag each workflow run with correlation IDs.
- Assert telemetry events in tests (token counts, latency buckets).
- Inject fixture responses and run provider matrix tests.
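A hedged ExUnit sketch of the telemetry assertion idea using the standard :telemetry API; the event name and metadata keys are assumptions for illustration:

```elixir
# Hedged sketch: assert that a step boundary emits a telemetry event.
defmodule WorkflowTelemetryTest do
  use ExUnit.Case

  test "step boundary events carry usage and a correlation id" do
    parent = self()

    :telemetry.attach(
      "workflow-step-stop-test",
      [:workflow, :step, :stop],               # assumed event name
      fn _event, measurements, metadata, _cfg ->
        send(parent, {:step_stop, measurements, metadata})
      end,
      nil
    )

    # Stand-in for running the workflow step under test.
    :telemetry.execute([:workflow, :step, :stop], %{tokens: 42}, %{correlation_id: "run-1"})

    assert_receive {:step_stop, %{tokens: 42}, %{correlation_id: "run-1"}}
    :telemetry.detach("workflow-step-stop-test")
  end
end
```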
Definitions & key terms
- Instruction sequence: ordered action list executed by workflow engine.
- Workflow telemetry: execution-level events beyond model tokens.
- Fixture matrix: deterministic recorded responses for provider simulation.
- Correlation id: request-level link across stages.
Mental model diagram
Run Request
|
v
+----------------------+ +--------------------+
| Validation + Contract | ---> | Workflow Executor |
+----------------------+ +--------------------+
|
+-----------+-----------+
| |
[Branch A] [Branch B]
| |
+--------v---------+ +------v--------+
| req_llm call | | jido cmd |
+------------------+ +---------------+
v
Trace + Usage Ledger
How it works
- Define workflow graph and policy constraints.
- Validate each step's input/output contract.
- Execute with timeout and retry options.
- Emit events at step boundaries.
- Aggregate cost and state transitions for audits.
Invariants: every executed step must emit an event; costs must be logged; failed steps should be classifiable. Failure modes: missing events, lost trace context, provider timeout storms.
Minimal concrete example
$ mix test test/workflow_matrix_test.exs
test: workflow_rejects_invalid_payload ... ok
test: workflow_handles_llm_timeout ... ok
metrics: avg_cost_delta_ms +2ms
Common misconceptions
- “If it works locally, it works under load.” Workflows fail differently under concurrency.
- “Telemetry slows things down and should be deferred.” Light traces are essential for production behavior.
- “Integration tests replace unit tests.” They are complementary.
Check-your-understanding questions
- Why must workflow tests include timeout and retry cases?
- How does token telemetry influence architecture?
- Why version workflow outcomes?
Check-your-understanding answers
- They reproduce production failure modes.
- It reveals cost regressions and provider route shifts.
- For replay, rollback, and incident postmortems.
Real-world applications
- Customer support agent fleets.
- Data extraction + verification systems.
- Internal copilots with approval gates.
Where you’ll apply it
- Projects 1, 2, 3, 4, 6, 8.
References
- ReqLLM usage and streaming sections — ReqLLM 1.5.1
- Jido testing guides — jido testing
- “Site Reliability Engineering” by Beyer et al.
Key insight If a workflow is not observable and testable as a sequence, it is not production safe.
Summary Treat each run as a financial and reliability unit, not just an execution.
Homework/Exercises to practice the concept
- Build three traces for one workflow: success, timeout, retry.
- Add correlation ids to every step.
- Add a simple budget rule and observe behavior.
Solutions
- Sequence with success and two failure branches.
- Prefix logs with run id and action id.
- Rule: abort if cumulative token cost exceeds threshold.
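For the correlation-id solution above, a minimal sketch using Logger metadata; the metadata keys are illustrative, and your console formatter must include $metadata for them to print:

```elixir
# Hedged sketch: tag every log line in a run with a correlation id.
require Logger

run_id = "run-" <> Integer.to_string(System.unique_integer([:positive]))
Logger.metadata(run_id: run_id)

Logger.info("step started", action: :parse_invoice)
Logger.info("step finished", action: :parse_invoice, tokens: 42)
```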
Chapter 6: BEAM Production Tradeoffs: Concurrency, Isolation, and Throughput
Fundamentals
The BEAM runtime is strong at lightweight processes, fault isolation, and supervisor-driven recovery. jido + req_llm succeeds when architecture is built to exploit these properties: many short-lived agents, bounded concurrency, and isolation boundaries.
Deep Dive
This is where pragmatic architecture matters. A single long-running task with shared mutable state is fragile; many small isolated agents with explicit messages are safer. jido gives explicit state and directive models; req_llm can be configured with connection pools and stream handling for throughput control.
A mature design pattern uses five knobs:
- Concurrency caps to avoid provider saturation.
- Pooling strategy for HTTP clients.
- Backpressure in signals and queues.
- Hibernation / persistence boundaries for inactive agents.
- Route-level cost budgets before request dispatch.
You are balancing latency and reliability tradeoffs. Direct synchronous calls improve simplicity but block throughput; asynchronous streaming and queueing improve throughput but require stronger ordering semantics.
Failure patterns:
- Overcommitted pools causing request queuing and timeouts.
- Retry loops creating thundering herds.
- Starvation in shared scheduler when one project over-runs resources.
Use req_llm streaming and Finch pool configuration as operational controls, and use jido strategies to serialize sensitive transitions.
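A hedged sketch of a concurrency cap using Task.async_stream; the cap, timeout, and model spec are illustrative starting points, not tuned values:

```elixir
# Hedged sketch: bounded in-flight LLM calls with explicit timeout handling.
prompts = ["ticket 1", "ticket 2", "ticket 3"]

results =
  prompts
  |> Task.async_stream(
    fn prompt -> ReqLLM.generate_text("openai:gpt-4o-mini", prompt) end,
    max_concurrency: 8,      # cap to avoid provider saturation
    timeout: 30_000,         # per-call budget
    on_timeout: :kill_task   # convert stalls into explicit failures
  )
  |> Enum.map(fn
    {:ok, result} -> result                # ReqLLM's {:ok, _} | {:error, _}
    {:exit, :timeout} -> {:error, :timeout}
  end)
```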
Definitions & key terms
- Backpressure: explicit control to slow intake when downstream is saturated.
- Process isolation: independent state and failure boundary per BEAM process.
- Prefetch/concurrency cap: maximum in-flight work.
- Hibernation: memory/CPU optimization for idle agent processes.
Mental model diagram
Incoming jobs Concurrency queue Agent pools
| | |
+------------------->+-------------------->+
v |
+-----------+ +-------------------+
| Strategy | -----> | req_llm workers |
| policy | | Finch connections |
+-----------+ +-------------------+
| |
v v
+-------------------------------+
| Backpressure + Retry Controls |
+-------------------------------+
How it works
- Accept job and classify with policy (rate, priority).
- Route to bounded queue and agent strategy.
- Apply concurrency cap before dispatch.
- Execute in worker with timeout + backoff.
- Record metrics and release capacity.
Invariants: bounded memory usage, bounded queue, and bounded retries. Failure modes: unbounded queue, dropped signals, incorrect fairness.
Minimal concrete example
$ mix jido.ops.report --window 1m
throughput_ok: 142 req/min
timeouts: 1
queue_depth: 12/200
retry_count: 8
Common misconceptions
- “BEAM guarantees no bottlenecks.” It guarantees isolation, not infinite throughput.
- “More concurrency always means faster.” Without budget controls, it means more failures.
- “Streaming is cheap.” Streams require proper pool sizing and lifecycle control.
Check-your-understanding questions
- What does bounded concurrency prevent?
- Why are provider pools part of business reliability?
- Why do we need queue depth limits in agent systems?
Check-your-understanding answers
- Prevents collapse under bursty traffic.
- Provider limits directly impact success rates and latency budgets.
- To protect downstream and preserve fairness under load.
Real-world applications
- High-volume support and QA automation.
- Document processing with model-heavy stages.
- Asynchronous enterprise copilots used across teams.
Where you’ll apply it
- Projects 1, 3, 4, 7, 8.
References
- Finch and streaming notes in ReqLLM overview
- OTP/BEAM architecture references from Erlang/OTP runtime docs
- “Programming Elixir” by Dave Thomas (supervision semantics)
Key insight Throughput is a contract: every layer must expose and enforce limits.
Summary Use bounded concurrency and explicit scheduling as first-class design parameters, not tuning after launch.
Homework/Exercises to practice the concept
- Estimate safe max in-flight requests for provider pools.
- Simulate burst load and note queue depth behavior.
- Introduce a fallback route when queue is near capacity.
Solutions
- Start low, then raise via controlled load testing.
- Keep queue and error metrics visible during test.
- Route low-priority requests to delayed queue with backoff.
Glossary
- Action: A validated operation that returns an agent state transition and optional directives.
- Canonical model: Standardized internal representation for model requests and outputs.
- Directive: Declarative instruction describing a side effect.
- Directive strategy: Runtime policy that executes directives.
- FSM: Finite state machine execution style.
- Provider adapter: Module translating canonical request to provider-specific protocol.
- Signal: Structured event envelope used to route between agents and services.
- StreamResponse: Streaming response abstraction that yields tokens and usage metadata.
- Workflow: Ordered set of actions and instructions.
Why JIDO Matters
By 2025, AI systems had moved from single-shot prompts to orchestrated agentic loops. The ecosystem has responded with reusable, observable runtimes that can operate across providers and agents.
- Standards pressure: interoperability moved from a “nice-to-have” to a necessity as tool ecosystems grow.
- Cost pressure: providers vary in latency, output quality, and pricing; observability is required.
- Reliability pressure: agents execute with side effects; non-determinism must be bounded.
A practical old-vs-new view:
| Old Stack | New Stack |
|---|---|
| Prompt scripts only | Standardized LLM client + command agents |
| Provider-specific request code | Provider-agnostic + model metadata |
| Side effects inline | Pure cmd + explicit directives |
| Implicit logs | Trace-aware signals + cost telemetry |
| Best-effort retries | Strategy/rate-aware retries with policies |
Real-world stats and sources (with year):
- ReqLLM v1.5.1 documents 45 providers / 665+ models in a synced model registry with shared metadata, improving route choices and coverage planning. Source: ReqLLM overview, 2025 docs page.
- Jido package docs are published as v2.0.0-rc.4 and describe production-facing runtime primitives around AgentServer, directives, and stateful execution. Source: Jido home docs, 2025.
- Public ecosystem shift: official platform announcements show MCP support in major model providers by 2025, indicating pressure toward interoperable agent tooling and protocol-based tool access. Sources: Google Gemini MCP support announcement, 2025; Anthropic MCP announcement context, Nov 2024.
The practical conclusion: if your project depends on one provider or one side-effect pattern, you will rework often. A disciplined ecosystem-level design compounds once.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Provider-agnostic LLM abstraction | Use req_llm to normalize call patterns while preserving provider-specific capabilities and controls. |
| Typed contracts and schemas | Treat prompts, tool arguments, and outputs as explicit schemas with versioned boundaries. |
| Pure cmd and directive pattern | Keep state transitions deterministic, route side effects through explicit directive descriptions. |
| Signals, workflows, and orchestration | Model agent-to-agent communication and multi-step flow with routing, retries, and checkpoints. |
| Observability and testing | Measure usage, latency, and failure classes at both workflow and model boundaries. |
| BEAM-driven production controls | Use supervision, bounded concurrency, and process isolation as architectural primitives, not infrastructure afterthoughts. |
Project-to-Concept Map
| Project | Concepts Applied |
|---|---|
| Project 1: Multi-Model Gateway with req_llm | Provider-agnostic LLM abstraction, Typed contracts, BEAM production controls |
| Project 2: Schema-First Contract Extractor | Typed contracts and schemas, Observability and testing |
| Project 3: Streaming Copilot Operations Center | Provider-agnostic LLM abstraction, Observability and testing |
| Project 4: Provider Failover and Cost Router | Provider-agnostic LLM abstraction, BEAM production controls |
| Project 5: Deterministic Counter Agent | Pure cmd and directive pattern |
| Project 6: Workflow-First Orchestrator | Signals, workflows, and orchestration, Pure cmd and directive pattern |
| Project 7: Signal Mesh for Multi-Agent Delegation | Signals, workflows, and orchestration, BEAM-driven production controls |
| Project 8: Audit-Ready State Machine Agent | Pure cmd/directive, Observability/testing, Signals/workflows |
Deep Dive Reading by Concept
| Concept | Book and Chapter | Why This Matters |
|---|---|---|
| Provider-agnostic LLM abstraction | “Designing Data-Intensive Applications” by Martin Kleppmann - Ch. 3 | Helps evaluate tradeoffs in protocol layering and data contracts. |
| Typed contracts and schemas | “Domain-Driven Design” by Eric Evans - Ch. 4-6 | Prevents semantic drift when integrating LLM outputs with business logic. |
| Pure cmd and directive pattern | “Designing Elixir Systems with OTP” by James Edward Gray II and Bruce Tate - Ch. 5-8 | Grounds immutability and explicit side-effect boundaries in production systems. |
| Signals and workflow orchestration | “Building Microservices” by Sam Newman - Ch. 8 | Clarifies message-driven integration, routing, and failure domains. |
| Observability and testing | “Site Reliability Engineering” by Betsy Beyer et al. - Ch. 4 | Establishes practical measurement and reliability patterns for operations. |
| BEAM production controls | Elixir/Erlang docs (Supervision and tasks) - v1.17+ | Explains process isolation, supervision, and failure recovery models. |
Quick Start: Your First 48 Hours
Day 1:
- Read this guide up to Chapter 6.
- Install prerequisites and scaffold one mix app with jido and req_llm dependencies (see the deps sketch below).
- Start Project 1 and Project 5 to internalize two different abstraction styles.
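A minimal mix.exs deps sketch for that scaffold; the version requirements are assumptions based on the releases this guide cites, so check hex.pm for the current ones:

```elixir
# Hedged sketch: dependencies for the scaffold app (verify versions on hex.pm).
defp deps do
  [
    {:req_llm, "~> 1.5"},
    {:jido, "~> 2.0.0-rc.4"}
  ]
end
```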
Day 2:
- Validate Project 1 by dispatching same prompt across two providers.
- Validate Project 5 by proving command determinism with repeated runs.
- Log one failure and one retry event.
Day 3–4:
- Move to Project 2 and Project 6.
- Define schema contracts and workflow checkpoints.
End of weekend: you should be able to explain how directive-based designs stay testable at the boundary where pure functions end and the process runtime begins.
Recommended Learning Paths
Path 1: Product Builder (Recommended)
- Project 1 → Project 2 → Project 3 → Project 6 → Project 8
Path 2: Platform/LLM Operator
- Project 1 → Project 4 → Project 3 → Project 7 → Project 8
Path 3: Agent Architect
- Project 5 → Project 6 → Project 7 → Project 8
Success Metrics
- You can design and explain at least three provider-selection policies that include fallback and budget rules.
- At least 90% of project workflows include explicit failure handling paths with recorded outcomes.
- You can run a mixed test matrix for at least one workflow with happy-path, timeout, invalid payload, and directive-missing cases.
- You can route between at least two agents using typed signals and verify end-to-end trace continuity.
Project Overview Table
| # | Project | Package Focus | Difficulty | Time | Key Focus |
|---|---|---|---|---|---|
| 1 | Multi-Model Gateway with req_llm | req_llm | 2 | 1.5-2 weeks | Provider abstraction + unified contracts |
| 2 | Schema-First Contract Extractor | req_llm | 3 | 2 weeks | Structured outputs + schema contracts |
| 3 | Streaming Copilot Operations Center | req_llm | 3 | 1.5-2 weeks | Streaming + telemetry + UX feedback |
| 4 | Provider Failover and Cost Router | req_llm | 4 | 2-3 weeks | Cost/routing + key management |
| 5 | Deterministic Counter Agent | jido | 2 | 1.5 weeks | cmd/2 and directive separation |
| 6 | Workflow-First Orchestrator | jido | 4 | 2-3 weeks | Workflow engine + action sequencing |
| 7 | Signal Mesh for Multi-Agent Delegation | jido | 4 | 2-3 weeks | Signals, routing, and parent-child dynamics |
| 8 | Audit-Ready State Machine Agent | jido | 5 | 3-4 weeks | FSM strategies + observability + production hardening |
Project List
The following projects guide you from a provider-first architecture to agentic production systems.
Project 1: Multi-Model Gateway with req_llm
- File: P01-multi-model-gateway.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Gleam, F#
- Coolness Level: 4/5 (high)
- Business Potential: 4/5
- Difficulty: 2
- Knowledge Area: Protocol adaptation, LLM orchestration
- Software or Tool: req_llm, mix
- Main Book: The Little Elixir and OTP Guidebook
What you will build: A provider-agnostic inference gateway that normalizes prompts, logs standardized usage, and fails over between providers.
Why it teaches JIDO: It shows how stable provider contracts make agent workflows portable and controllable.
Core challenges you will face:
- Model portability -> map providers without branching logic at call sites.
- Rate and reliability policy -> define fallback and quotas.
- Usage accountability -> expose token and cost trace lines per request.
Project 2: Schema-First Contract Extractor
- File: P02-schema-first-contract-extractor.md
- Main Programming Language: Elixir
- Alternative Programming Languages: TypeScript, Python
- Coolness Level: 5/5
- Business Potential: 4/5
- Difficulty: 3
- Knowledge Area: Validation, structured outputs, contracts
- Software or Tool: req_llm, JSON tooling
- Main Book: Domain-Driven Design
What you will build: A contract parser that turns messy text into validated typed outputs with schema versioning and rejection pathways.
Why it teaches JIDO: It maps model outputs into deterministic agent inputs and demonstrates contract governance.
Core challenges you will face:
- Structured output strictness -> avoid false positives and silent drift.
- Schema evolution -> backward-compatible migration.
- Downstream safety -> prevent malformed payload propagation.
Project 3: Streaming Copilot Operations Center
- File: P03-streaming-copilot-operations-center.md
- Main Programming Language: Elixir
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: 4/5
- Business Potential: 5/5
- Difficulty: 3
- Knowledge Area: Streaming, observability, UX feedback
- Software or Tool: req_llm, dashboard tooling
- Main Book: Site Reliability Engineering
What you will build: A terminal-style operations copilot with real-time streaming, usage reporting, and cost-aware cutoffs.
Why it teaches JIDO: It teaches production telemetry and responsive workflows for model calls.
Core challenges you will face:
- Stream lifecycle -> avoid blocking token loops.
- Telemetry completeness -> collect usage even on streamed flows.
- Quality thresholds -> enforce response quality controls.
Project 4: Provider Failover and Cost Router
- File: P04-provider-failover-and-cost-router.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Kotlin, Rust
- Coolness Level: 4/5
- Business Potential: 5/5
- Difficulty: 4
- Knowledge Area: Reliability, cost governance, routing
- Software or Tool: req_llm, jido
- Main Book: Designing Data-Intensive Applications
What you will build: A policy engine that switches providers by latency, budget, failure history, and capability.
Why it teaches JIDO: It combines req_llm routing with jido policy-driven execution.
Core challenges you will face:
- Routing policy ambiguity -> choose deterministic tie-breakers.
- Budget enforcement -> protect operations from runaway tokens.
- Error class taxonomy -> differentiate retryable and terminal faults.
Project 5: Deterministic Counter Agent
- File: P05-deterministic-counter-agent.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Elixir scripts, Lua
- Coolness Level: 3/5
- Business Potential: 3/5
- Difficulty: 2
- Knowledge Area: cmd/2, StateOp, directives
- Software or Tool: jido, jido_action
- Main Book: Programming Elixir
What you will build: A pure action-based agent whose behavior is tested via command outputs and generated directives.
Why it teaches JIDO: It teaches the core cmd/2 mental model and why purity enables repeatability.
Core challenges you will face:
- Pure state design -> no side effects in action logic.
- Directive design -> clear effect contracts.
- Test determinism -> stable assertions under repeated commands.
Project 6: Workflow-First Orchestrator
- File: P06-workflow-first-orchestrator.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Go, Java
- Coolness Level: 4/5
- Business Potential: 5/5
- Difficulty: 4
- Knowledge Area: workflows, instruction sets, agent cooperation
- Software or Tool: jido, req_llm, req_llm tools
- Main Book: Monolith to Microservices
What you will build: A multi-step workflow engine using instructions and timeout-aware branching.
Why it teaches JIDO: It demonstrates how real workflows differ from linear scripts.
Core challenges you will face:
- Branching logic -> reproducible retries.
- Timeouts and fallback -> maintain checkpoint integrity.
- Composability -> reuse actions with clean schemas.
Project 7: Signal Mesh for Multi-Agent Delegation
- File: P07-signal-mesh-multi-agent-delegation.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Rust, Python
- Coolness Level: 5/5
- Business Potential: 5/5
- Difficulty: 4
- Knowledge Area: signals, routing, pub/sub, delegation
- Software or Tool: jido, jido_signal
- Main Book: Release It!
What you will build: A delegated agent mesh with typed signals, route policies, and parent-child lifecycle tracking.
Why it teaches JIDO: It shows how to scale from one agent to many without shared hidden coupling.
Core challenges you will face:
- Routing correctness -> deterministic destination logic.
- Event storms -> guardrails and dead-letter handling.
- Policy drift -> consistent behavior across many agents.
Project 8: Audit-Ready State Machine Agent
- File: P08-audit-ready-state-machine-agent.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Clojure, Scala
- Coolness Level: 5/5
- Business Potential: 4/5
- Difficulty: 5
- Knowledge Area: FSM strategy, supervision, auditing, enterprise controls
- Software or Tool: jido, telemetry, req_llm
- Main Book: Software Engineering at Google (casebook reading)
What you will build: A production-like multi-stage state machine agent with approval gates, audit trails, and failure recovery.
Why it teaches JIDO: It integrates all concepts into a realistic operational system.
Core challenges you will face:
- Compliance visibility -> every side effect must be accounted for.
- Concurrent updates -> safe transitions under load.
- Incident replayability -> deterministic trace and state snapshots.
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Multi-Model Gateway with req_llm | Medium | 1.5-2 weeks | High | ★★★★☆ |
| 2. Schema-First Contract Extractor | Medium-High | 2 weeks | High | ★★★★★ |
| 3. Streaming Copilot Operations Center | Medium | 1.5-2 weeks | High | ★★★★☆ |
| 4. Provider Failover and Cost Router | High | 2-3 weeks | Very High | ★★★★☆ |
| 5. Deterministic Counter Agent | Low-Medium | 1.5 weeks | Medium | ★★★★☆ |
| 6. Workflow-First Orchestrator | High | 2-3 weeks | Very High | ★★★★★ |
| 7. Signal Mesh for Multi-Agent Delegation | High | 2-3 weeks | Very High | ★★★★★ |
| 8. Audit-Ready State Machine Agent | Very High | 3-4 weeks | Very High | ★★★☆☆ |
Recommendation
If you are new to agentic systems: start with Project 5, then Project 1, then Project 2. This avoids heavy orchestration before you own invariants.
If you are a platform engineer: start with Projects 4, 7, and 8 to build routing and reliability patterns.
If you want production outcomes quickly: start with Projects 3, 6, and 8.
If you want to build robust LLM products: follow Projects 1, 2, 3, 4, then 6.
Final Overall Project: JIDO Incident Response Product
The Goal: Build a single integrated system combining all 8 patterns:
- Provider abstraction and contract parsing.
- Streaming ops UI.
- Cost routing and fallback.
- Directive-led deterministic core agent.
- Workflow orchestration with signal routing.
- Audit-grade FSM for human approvals.
Success Criteria: the final output must process 100+ sample tickets, produce reproducible logs, enforce cost caps, and resume correctly after simulated worker restarts.
From Learning to Production: What Is Next
| Your Project | Production Equivalent | Gap to Fill |
|---|---|---|
| Projects 1-4 | Provider gateway service | Multi-region failover policies |
| Projects 5-8 | Agent coordination platform | Centralized policy service, security hardening |
| Full sprint | Internal automation platform | SSO, multi-tenant quotas, release governance |
Suggested production upgrades:
- Add distributed tracing (OpenTelemetry).
- Add queue-based backlog + DLQ.
- Introduce per-team permission model and audit export.
- Add chaos testing for provider and process faults.
Summary
This learning path covers the JIDO ecosystem end-to-end through practical projects. You practiced provider abstractions, typed contracts, directive-driven agents, workflow orchestration, signal routing, observability, and production reliability under BEAM.
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Multi-Model Gateway with req_llm | Elixir | 2 | 1.5-2 weeks |
| 2 | Schema-First Contract Extractor | Elixir | 3 | 2 weeks |
| 3 | Streaming Copilot Operations Center | Elixir | 3 | 1.5-2 weeks |
| 4 | Provider Failover and Cost Router | Elixir | 4 | 2-3 weeks |
| 5 | Deterministic Counter Agent | Elixir | 2 | 1.5 weeks |
| 6 | Workflow-First Orchestrator | Elixir | 4 | 2-3 weeks |
| 7 | Signal Mesh for Multi-Agent Delegation | Elixir | 4 | 2-3 weeks |
| 8 | Audit-Ready State Machine Agent | Elixir | 5 | 3-4 weeks |
Expected Outcomes
- You can design deterministic, testable agent commands.
- You can route LLM usage safely across providers.
- You can produce operational telemetry and replay traces.
Additional Resources and References
Industry Analysis
- Google Gemini MCP support announcement (2025): blog.google
- Anthropic MCP announcement coverage (2024): techcrunch
Books
- The Little Elixir and OTP Guidebook — Benjamin Tan Wei Hao
- Programming Elixir — Dave Thomas
- Designing Data-Intensive Applications — Martin Kleppmann