Sprint: JIDO Mastery - Real World Projects
Goal: Build a deep, Elixir-first command of the JIDO ecosystem by combining
req_llm(provider-agnostic model orchestration) andjido(pure-functional agent runtime). You will learn how to design deterministic agent logic, safe side-effect handling, workflow composition, and production-grade observability with BEAM strengths. By the end, you will be able to ship practical LLM workflows that are testable, policy-aware, resource-efficient, and resilient under retries, model outages, and multi-agent concurrency.
Introduction
- What is JIDO? A growing ecosystem around jido (core agent behavior) and req_llm (provider-agnostic LLM API access) for building practical AI products in Elixir.
- What problem does it solve today? It replaces duplicated HTTP glue, ad-hoc message formats, and side-effect-heavy agent code with standardized models, command contracts, and OTP supervision.
- What you will build across the projects: a portfolio of agentic systems from API routers to autonomous workflows, each with measurable behavior and clear failure handling.
- In scope: practical architecture, state and directive design, tool integration, streaming, testing, and production operations. Out of scope: deep tuning of every model, and building brand-new LLM providers.
Big-picture model:
┌──────────────────────────────────────────────────────────────┐
│ Product Intent / Policy │
│ (business rules, budgets, safety boundaries, success criteria)│
└───────────────────────────────┬──────────────────────────────┘
│
┌────────────────▼────────────────┐
│ jido Agents + Signals │
│ (pure commands, deterministic) │
└────────────────┬────────────────┘
│
┌────────────────▼────────────────┐
│ Workflow Engine │
│ (plans, routing, checkpoints) │
└────────────────┬────────────────┘
│
┌────────────────▼────────────────┐
│ req_llm Provider Layer │
│ (models, tools, output schemas) │
└────────────────┬────────────────┘
│
┌────────────────▼────────────────┐
│ Observability + Persistence │
│ (cost, tokens, traces, states) │
└─────────────────────────────────┘
Use this as your default debugging lens: if a behavior is surprising, locate the boundary that crossed from policy -> command, command -> directive, or directive -> execution.
How to Use This Guide
- Read the primer before any project. Keep a running notes file for each concept chapter because projects reference chapter names directly.
- Pick one of the learning paths by your role and start from the first project in that path.
- Validate progress after each project using its Definition of Done section and keep test evidence (transcripts, outputs, and decision logs).
- Alternate between two project modes:
- Build mode: implement the project.
- Hardening mode: run through failure scenarios, retry logic, and cost checks before moving forward.
Prerequisites & Background Knowledge
Essential Prerequisites (Must Have)
- Strong reading comprehension of Elixir syntax and data transformation.
- Comfortable with maps, structs, protocols, and pattern matching.
- Basic understanding of HTTP APIs and JSON payload design.
- Familiarity with command-line debugging and process-based systems.
- Recommended Reading: “The Little Elixir & OTP Guidebook” by Benjamin Tan Wei Hao - Ch. 1-6
Helpful But Not Required
- Supervision trees in Phoenix/BEAM applications (learn during Projects 6-8)
- Structured data validation (learn during Projects 1-4)
Self-Assessment Questions
- Can you explain how a function can be pure and still produce useful side-effectful behavior in an agent system?
- Can you identify where model provider differences leak into your product and where they should be abstracted?
- Can you propose a minimal checkpoint model for retries and timeouts across multiple steps?
Development Environment Setup
Required Tools:
- Elixir 1.17+
- Erlang/OTP 26+
- Git
- mix
Recommended Tools:
- PostgreSQL or SQLite for state logs
- curl for API smoke checks
- telemetry stack for tracing
Testing Your Setup:
$ elixir --version
$ mix local.hex
$ mix local.rebar
Expected output snippets:
Erlang/OTP 26+
Elixir 1.17+
Rebar and Hex: ok
Time Investment
- Simple projects: 4-8 hours each
- Moderate projects: 10-20 hours each
- Complex projects: 20-40 hours each
- Total sprint: ~10-14 weeks for all 8 projects and a review project
Important Reality Check
The hardest part is not calling LLM APIs; it is maintaining invariants: what is state, what is side effect, and what is guaranteed to be reproducible under retries.
Big Picture / Mental Model
The ecosystem split is simple:
- req_llm solves how to talk to different model providers without your application owning all protocol differences.
- jido solves how to think, decide, and route these model outputs through deterministic agents under OTP.
The two fit when agents emit structured plans and directives instead of raw imperative behavior.
+-------------------+      +--------------------+      +--------------------+
| req_llm Providers | ---> |   Agent Commands   | ---> | Runtime Directives |
| (openai,          |      |   cmd/2 + state    |      | spawn, emit,       |
|  anthropic,       |      +--------------------+      | schedule, stop     |
|  gemini, etc.)    |                │                 +---------+----------+
+-------------------+                │                           |
          │                          ▼                           |
          ▼                +--------------------+                |
+-------------------+      | Workflow + Signals | <--------------+
| Provider Sync     | ---> | routing + state    |
| Model metadata    |      +--------------------+
+-------------------+
This makes the system resilient to model churn:
- Model selection changes stay in provider layer.
- Business behavior stays in agent/action layer.
- Runtime behavior stays in directives and OTP policies.
Theory Primer
Chapter 1: Canonical Provider Abstraction with req_llm
Fundamentals
req_llm exposes one high-level path for different providers while preserving provider-specific capabilities where needed. It supports a model spec format and standardized request/response shapes. In practice, this means your product can start with two API clients and scale to many. The key idea is not “one universal API always works,” but “one universal contract plus deterministic provider translation.” The two-layer design lets you choose between the high-level helpers (generate_text/3, generate_object/4, stream_text/3) and direct low-level Req plugin use when you need custom HTTP behavior.
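A minimal sketch of the high-level path, assuming the documented generate_text/3 shape; the model spec, option values, and response field access are illustrative placeholders, so verify helper names against the ReqLLM docs:

```elixir
# Hedged sketch: one canonical call shape across providers.
# The model spec string and options are placeholders.
{:ok, response} =
  ReqLLM.generate_text(
    "anthropic:claude-3-5-haiku",              # <provider>:<model> spec
    "Summarize this support ticket in one sentence.",
    temperature: 0.2
  )

# Usage/cost metadata rides along with the canonical response
# (exact struct fields are an assumption — check ReqLLM.Response).
IO.inspect(response.usage)
```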
Deep Dive
Provider abstraction is an economic and operational contract. In production code, every direct model call creates three risk planes: compatibility, semantics, and cost control. Compatibility risk includes schema drift across vendors (parameter names, token accounting, response format). Semantic risk includes differences in tool-calling and JSON support. Cost risk includes hidden charges from tool usage or image generation. The package addresses all three with three mechanisms:
- Canonical models and model specs. By pinning each call to explicit model identifiers and context structures, you avoid hard-coded request payload assumptions.
- Provider metadata and sync. Model registry data (cost, limits, context windows, modalities) is synchronized and can power budget decisions and runtime guards.
- Canonical / provider-aware translation layers. Default options are translated into provider-specific field names and behavior, while your application writes against stable high-level terms.
The practical effect is visible in testability. You can test with fixtures, stubs, and cached responses because your call shape is stable. You can also route workloads by provider capabilities: for example, send tool-heavy extraction jobs to providers with stronger structured output support, while sending long-context summarization to models with favorable context windows. req_llm’s feature set in v1.5.1 also adds streaming adapters and explicit usage telemetry, so you can track both UX responsiveness and spend.
Failure modes to design for:
- Model downgrade or outage: fallback path from preferred provider to secondary.
- Invalid schema returns: structured output mode fails hard if schema assumptions are wrong.
- Timeout amplification: long prompts with strict synchronous calls can stall if connection pools are misconfigured.
- Key precedence ambiguity: wrong key source injection causing tests to use production keys.
A stable strategy is to treat provider differences as policy-managed capabilities, not as call-site conditionals.
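As a sketch of that strategy, here is a hedged fallback loop that treats the provider list as policy data; the candidate model specs and error handling are assumptions for illustration, not a built-in ReqLLM feature:

```elixir
# Hedged sketch: provider fallback as data-driven policy.
# Candidate model specs are illustrative placeholders.
defmodule Gateway.Fallback do
  @candidates ["openai:gpt-4o-mini", "anthropic:claude-3-5-haiku"]

  def generate(prompt, opts \\ []) do
    Enum.reduce_while(@candidates, {:error, :all_providers_failed}, fn spec, acc ->
      case ReqLLM.generate_text(spec, prompt, opts) do
        {:ok, response} -> {:halt, {:ok, spec, response}}
        {:error, _reason} -> {:cont, acc}
      end
    end)
  end
end
```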
Definitions & key terms
- Canonical model: standardized schema used for request and response.
- Model spec: <provider>:<model> shorthand plus optional config.
- Structured generation: map/object output constrained by schema.
- Streaming session: token-level transport that preserves immediate UX feedback.
Mental model diagram
User request
|
v
+---------------------+      +--------------------+
| req_llm high layer  | ---> | provider-specific  |
| generate_text/3     |      | translation        |
+---------------------+      +--------------------+
|
v
+----------------------------+
| canonical message / context |
| schema validation + retries |
+----------------------------+
|
v
+----------------------------+
| provider API + telemetry |
| usage, cost, tool metadata |
+----------------------------+
How it works
- Parse incoming model spec and messages.
- Resolve provider module and model metadata.
- Apply provider options and capability constraints.
- Generate request payload in canonical form, then translate.
- Execute request and decode into canonical response.
- Return deterministic response with usage/cost metadata.
Invariants: invalid spec fails early; structured mode cannot silently return malformed shape; streaming responses expose tokens and metrics. Failure modes: provider HTTP faults, schema decode failures, key resolution errors.
Minimal concrete example
$ mix deps.get
$ mix req_llm.gen "Summarize last 3000 chars" --model anthropic:claude-3-7
{agent: ok}
$ mix req_llm.gen "Generate JSON invoice line" --json --model openai:gpt-4o-mini
{"invoice_id":"...","total":12.34}
Common misconceptions
- “One provider abstraction means all providers behave identically.” They don’t; abstraction hides differences but must expose capability differences.
- “Structured outputs are always guaranteed.” Schema mode can still produce edge-case validation failures.
- “Streaming is only for UI.” Streaming is also a diagnostic channel for partial inference timing.
Check-your-understanding questions
- What does provider translation buy you that a raw HTTP client cannot?
- When would low-level Req usage be mandatory?
- How do you detect and react to usage cost drift over time?
Check-your-understanding answers
- Stable call contract and capability-aware options.
- Custom headers/auth flows, unusual endpoints, or strict request tracing.
- Compare usage metadata over history and route model/provider decisions with cost budgets.
Real-world applications
- LLM routers in SaaS support portals.
- Multi-provider fallback in enterprise copilots.
- API translation layers for tool-calling toolchains.
Where you’ll apply it
- Projects 1, 2, 3, 4.
References
- ReqLLM overview v1.5.1 — hexdocs.pm/req_llm/1.5.1/overview.html
- ReqLLM.Provider behavior — hexdocs.pm/req_llm/ReqLLM.Provider.html
- API adapter design in modern distributed systems (secondary): “Designing Data-Intensive Applications” by Kleppmann.
Key insight Provider abstraction succeeds when it reduces cognitive load without hiding required capability distinctions.
Summary
Without canonical contracts, each model call becomes bespoke technical debt. With req_llm, the contract is standardized, and capability differences become explicit strategy data.
Homework/Exercises to practice the concept
- List three providers and map which option names differ.
- Sketch a fallback matrix for latency, cost, and capability.
- Design a policy that blocks image generations after budget threshold.
Solutions to the homework/exercises
- Example names: max_tokens, max_completion_tokens, and image endpoint variant names.
- Use a table with score-based provider ranking per task type.
- Add middleware before request execution that checks cumulative cost before every generate_* call, as in the sketch below.
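A minimal sketch of that budget middleware; BudgetLedger is a hypothetical module standing in for your own cost accounting, and the cap is an illustrative value:

```elixir
# Hedged sketch: a budget gate in front of every generate_* call.
# BudgetLedger is hypothetical, not a ReqLLM API.
defmodule BudgetGuard do
  @monthly_cap_usd 50.0

  def generate(spec, prompt, opts \\ []) do
    if BudgetLedger.spent_usd() >= @monthly_cap_usd do
      {:error, :budget_exhausted}
    else
      ReqLLM.generate_text(spec, prompt, opts)
    end
  end
end
```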
Chapter 2: Typed Message, Tool, and Output Modeling
Fundamentals
LLM workflows break when teams mix free-text prompts, non-deterministic tool shapes, and ad-hoc JSON assumptions. req_llm offers canonical types (Context, Message, ContentPart, Tool, Response, Usage) and integrates with structured generation. This pushes your architecture toward contracts, which is especially important as multiple agents begin producing output that feeds other agents.
Deep Dive
Type modeling in agentic systems is not about syntax aesthetics; it is about blast-radius control. An untyped output from one model becomes a fault source in all downstream workers. A typed canonical layer solves this by moving uncertainty to validation boundaries.
In this ecosystem, typed modeling appears at two layers:
- Provider payload normalization in req_llm.
- Agent command/state semantics in jido.
For req_llm, schema-driven generation is your defense against “almost JSON.” This means declaring shape expectations before generation and letting the library validate decoded output. For long pipelines, the output model should be versioned because fields evolve while tasks remain stable.
For JIDO-style agents, message and signal typing is similarly central. Signals are envelopes for routing, audit, and observability. When each signal has a predictable envelope shape, supervisors, bus adapters, and external tools can interoperate without implicit assumptions.
Failure modes and patterns:
- Loose contracts create silent downstream failures.
- Overly strict contracts reject recoverable provider variants and trigger unnecessary retries.
- Unversioned schemas make migrations painful and break older persisted outputs.
Design pattern: dual-layer schema strategy.
- Internal schemas: strict and typed for business logic.
- External schemas: permissive ingestion to canonical projection layer.
This allows you to preserve compatibility and reliability while still improving contract quality over time.
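A sketch of the dual-layer strategy in plain Elixir, assuming illustrative field names: permissive ingestion of external keys and aliases, then projection into a strict internal struct:

```elixir
# Hedged sketch: permissive external ingestion -> strict internal contract.
defmodule Invoice do
  @enforce_keys [:invoice_id, :total_cents, :currency]
  defstruct [:invoice_id, :total_cents, :currency]

  # External layer: tolerate string keys and a known alias.
  def from_external(%{} = raw) do
    with id when is_binary(id) <- raw["invoice_id"] || raw["id"],
         cents when is_integer(cents) <- raw["total_cents"] || raw["total"],
         cur when cur in ["usd", "eur", "gbp"] <- raw["currency"] do
      {:ok, %__MODULE__{invoice_id: id, total_cents: cents, currency: cur}}
    else
      _ -> {:error, :schema_mismatch}
    end
  end
end
```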
Definitions & key terms
- Canonical schema: stable internal representation.
- Schema drift: incompatible change in field presence or meaning.
- Tool schema: declaration of function/tool input expected by model or agent.
- Envelope: wrapper metadata around content, source, and trace context.
Mental model diagram
Incoming payload
|
v
+------------------+ +--------------------+
| Typed request |---->| Validation boundary |
| schema + context | | canonical output |
+------------------+ +--------------------+
| |
v v
+------------------+ +--------------------+
| Tool call payload | | Signal / directive |
| model output | | execution metadata |
+------------------+ +--------------------+
How it works
- Define contracts for each stage before implementation.
- Encode expected output shape in req_llm schema options.
- Enforce validation at generation boundaries.
- Emit typed signals from agent actions.
- Translate typed outputs into agent state transitions.
Invariants: no stage may consume unknown/unvalidated payloads. Retry only idempotent steps. Failure modes include schema mismatch, unknown provider fields, and signal routing to non-existent handlers.
Minimal concrete example
Schema draft:
- invoice_id (required, string)
- total_cents (required, integer)
- currency (required, one of usd/eur/gbp)
Run:
$ mix req_llm.gen "Parse invoice" --json --model openai:gpt-4o-mini
{"invoice_id":"INV-1001","total_cents":1299,"currency":"usd"}
Common misconceptions
- “Schemas remove creativity.” They don’t; they constrain just the machine contract.
- “Validation is optional in prototypes.” It is optional only for throwaway scripts.
- “One schema per project is enough.” High maturity requires multiple versions.
Check-your-understanding questions
- Why should schema versions exist in long-lived workflows?
- Where is the best place to validate external tool results?
- What is the failure mode when strict schema mode is too rigid?
Check-your-understanding answers
- Persistence and replay become predictable.
- At boundary modules that own the external integration.
- Retry loops and user-visible false failures when model output format changes.
Real-world applications
- Invoice, incident, and procurement parsers.
- Policy-driven orchestration where signals route by typed tags.
- Contract tests for multi-agent tool pipelines.
Where you’ll apply it
- Projects 1, 2, 6, 7.
References
- ReqLLM structured outputs section — hexdocs.pm/req_llm/1.5.1/overview.html
- Jido Action model docs — jido Action guide
- “Domain-Driven Design” by Evans (context boundaries and contracts)
Key insight Strong interfaces are your leverage point: they let agents reason with less ambiguity and recover faster from drift.
Summary Use strict contracts where behavior matters, and resilient parsing at external boundaries where reality is noisy.
Homework/Exercises to practice the concept
- Draft two versions of an invoice output schema.
- Propose backward compatibility for a changed total field name.
- Design one signal envelope for human approval.
Solutions to the homework/exercises
- v1 (total_cents, currency), v2 (line_total, currency_code) with explicit migration.
- Keep aliases at the translation stage with deprecation logging.
- Include approval_id, actor, resource, and deadline_ms fields in the signal envelope.
Chapter 3: Pure Command + Directive Runtime in jido
Fundamentals
jido centers on the idea that an agent command should be pure state transition logic, while effects remain explicit directives. cmd/2 returns updated agent state and a directive list. This separation enables deterministic tests and predictable execution semantics.
Deep Dive
In raw GenServer code, logic and side effects often co-locate. That makes testing difficult because every branch may touch external systems. jido formalizes a command architecture where each action’s role is clear: transform state; emit intent. This is conceptually similar to Elm and Redux reducers paired with effect descriptors.
For practical systems, this has three payoffs:
- Determinism: same input -> same state change.
- Auditability: directives reveal external impact intent.
- Runtime swapability: directive execution policy can change without changing business logic.
A typical confusion is to treat directives as imperative code that must run inside action functions. In this ecosystem, directives are returned data. Execution occurs through runtime strategies (AgentServer, strategy modules, supervisors). This makes it easier to test with fixtures and to introduce policy changes.
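A minimal illustration of the pattern in plain Elixir; jido's concrete module names and directive structs may differ, so treat this as the shape of the contract, not the API:

```elixir
# Illustrative cmd/2 shape: pure state transition plus directives-as-data.
defmodule Counter do
  def cmd(%{count: n} = state, {:increment, by}) do
    new_state = %{state | count: n + by}
    # Directives describe intended effects; nothing is executed here.
    directives = [{:emit, :progress_event, %{count: new_state.count}}]
    {new_state, directives}
  end
end

# Deterministic: same input always yields the same pair.
{%{count: 1}, [{:emit, :progress_event, _}]} =
  Counter.cmd(%{count: 0}, {:increment, 1})
```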
Failure modes:
- State + directive ambiguity: writing external APIs directly inside action functions breaks determinism.
- Over-directed commands: returning too many directives without explicit ordering can hide flow bugs.
- Unhandled directive types: runtime may drop unknown directives silently if not registered.
Countermeasures:
- Keep each action small and composable.
- Classify directives by effect domain (emit/schedule/spawn/stop).
- Add compile-time and runtime guards for directive schema.
Definitions & key terms
- cmd/2: core operation returning {agent, directives}.
- Action: command unit with schema and run/2 behavior.
- Directive: external effect description.
- StateOp: in-command internal state update.
Mental model diagram
+-----------+ +-----------------+ +--------------------+
| Action fn | ---> | cmd/2 contract | ---> | Directives Pipeline |
| run/2 | | (state only) | | runtime interpreter |
+-----------+ +-----------------+ +--------------------+
|
v
+----------------------+
| External side effects |
+----------------------+
How it works
- Receive command and validate action schema.
- Compute next state from immutable rules.
- Construct directive list with no direct side effects.
- Return state + directives.
- Runtime strategy executes directives atomically or with policies.
Invariant: state changes are pure functions of input. Failure modes: invalid action params, missing directive handler, directive execution timeout.
Minimal concrete example
$ make run-counter
before: count=0
command increment(+1)
after: count=1 directive=[Emit(:progress_event)]
Common misconceptions
- “Directives are slow because they are indirect.” Indirection increases clarity and testability.
- “Pure logic means no useful side effects.” It means side effects are explicit and controlled.
- “Actions must be classes/modules only.” They are composable units with schema and deterministic behavior.
Check-your-understanding questions
- Why return directives instead of executing actions directly?
- What is the value of StateOps?
- How do you test action purity?
Check-your-understanding answers
- So runtime can control ordering, retries, and supervision.
- They represent safe, internal state transition operations.
- Snapshot-before/snapshot-after comparisons and property-style invariants.
Real-world applications
- Financial workflows with audit trails.
- Approval systems where actions should be reproducible.
- Internal tool orchestration with explicit side effects.
Where you’ll apply it
- Projects 5, 6, 7, 8.
References
- Jido Overview and core examples — hexdocs.pm/jido/2.0.0-rc.4/readme.html
- Jido Action docs — hexdocs.pm/jido/jido/Action.html
- “Functional Design” by Robert C. Martin (pure core + effect boundary)
Key insight The hardest production bugs disappear when you force state changes to be pure and effects to be declared.
Summary
cmd/2 is the seam between reasoning and action. Guard it well and most system behavior becomes inspectable.
Homework/Exercises to practice the concept
- Convert one side-effecting action into command+directive form.
- Define three directive types for one domain.
- Define test case for idempotence.
Solutions to the homework/exercises
- Move the API call out of the action and emit an HttpCall directive.
- Emit, Schedule, Stop.
- Run the command twice on the same state and verify the expected stable transitions, as in the test sketch below.
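A hedged ExUnit sketch for the idempotence exercise; Counter repeats the illustrative cmd/2 module from the Deep Dive so the test stands alone:

```elixir
# Hedged sketch: determinism test for a pure cmd/2 implementation.
defmodule Counter do
  def cmd(%{count: n} = state, {:increment, by}) do
    {%{state | count: n + by}, [{:emit, :progress_event}]}
  end
end

defmodule CounterTest do
  use ExUnit.Case, async: true

  test "same command on same state always yields the same transition" do
    assert {%{count: 1}, _} = Counter.cmd(%{count: 0}, {:increment, 1})
    assert {%{count: 1}, _} = Counter.cmd(%{count: 0}, {:increment, 1})
  end
end
```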
Chapter 4: OTP Runtime, State Machines, and Signals
Fundamentals
jido runs on BEAM/OTP, so you inherit supervision, fault recovery, and scheduling primitives. The differentiator is that agent behavior is structured before supervision: directives and signals are first-class messages and execution strategies are pluggable (Direct execution, FSM, custom strategies).
Deep Dive
In OTP systems, resilience is achieved by supervision and restart patterns; in agent systems, resilience is also about intent preservation. If an agent step crashes, should you replay from the start or checkpoint and resume? If messages are unordered, can causality still hold? jido addresses this by coupling immutable agent state with explicit signals and strategies.
A practical framework for workflow reliability has four layers:
- Agent state: pure domain state and transition invariants.
- Signal envelope: typed routing metadata.
- Strategy: execution policy including retries, queueing, and timing.
- Supervisor tree: process lifecycle.
In a multi-agent setup, signals are crucial. They carry operation intent across boundaries without forcing each agent to know every peer’s internals. This allows horizontal growth and easier policy enforcement.
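As an illustration, here is a signal envelope written as plain data; jido_signal's concrete struct fields may differ, and these CloudEvents-style keys are an assumption:

```elixir
# Illustrative signal envelope: routing metadata wraps the payload.
signal = %{
  type: "ticket.escalated",          # route key
  source: "agent://triage-1",        # originating agent
  correlation_id: "run-8f2c",        # trace continuity across agents
  data: %{ticket_id: 1001, priority: :high}
}
```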
Failure modes:
- Signal loss from misconfigured routes.
- Signal storms from unbounded retries.
- Unsupervised children that survive only because upstream state still references them.
Mitigation:
- Explicit backoff and max-retry budgets.
- Supervisor limits and child lifecycle tracing.
- Bounded queues and dead-letter/error policies.
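A hedged sketch of the first mitigation above, a bounded retry with exponential backoff; the budget and delay values are illustrative starting points:

```elixir
# Hedged sketch: retries with an explicit budget and exponential backoff.
defmodule RetryPolicy do
  @max_retries 2
  @base_delay_ms 250

  def run(fun, attempt \\ 0) do
    case fun.() do
      {:ok, result} ->
        {:ok, result}

      {:error, _reason} when attempt < @max_retries ->
        Process.sleep(@base_delay_ms * Integer.pow(2, attempt))
        run(fun, attempt + 1)

      {:error, reason} ->
        # Budget exhausted: surface a terminal, classifiable failure.
        {:error, {:retry_budget_exhausted, reason}}
    end
  end
end
```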
Definitions & key terms
- FSM strategy: finite-state-machine based execution style.
- Signal: portable message envelope for eventing/routing.
- Restart policy: how failed processes are recovered.
- Supervision tree: hierarchical ownership model of BEAM.
Mental model diagram
+----------------+
| AgentServer |
| cmd/state map |
+--------+-------+
|
+------+-------------------+
| strategy layer |
+------+-------------------+
|
+-----+------+ +-------------------+
| Directive | ---> | Signal Router |
| Execution | | (priority+route) |
+-----------+ +-------------------+
|
+-----+------+
| Supervisor |
| restart |
+------------+
How it works
- Receive a command and produce directives.
- Strategy chooses execution order (direct/fsm/custom).
- Signals route to other agents or worker adapters.
- On failure, supervisor applies strategy.
- Metrics and traces record each lifecycle transition.
Invariants: no uncontrolled process spawning and no indefinite loops; signals must be routeable. Failure modes: unbounded queue growth, invalid restart thresholds, poison messages.
Minimal concrete example
$ mix agent.harness.run --scenario timeout-retry
step1: command accepted
step2: worker timeout -> retry with delay
step3: max_retries=2 reached -> emit error directive
Common misconceptions
- “OTP is enough; no extra strategy needed.” Strategy is the policy layer between generic OTP and domain flow control.
- “Signals are just messages.” A signal is message + metadata + intended route semantics.
- “FSMs are overkill.” They are useful where retries and transitions must be visible.
Check-your-understanding questions
- What happens to command determinism when strategy changes?
- Why are signals better than direct pid calls for scale?
- Where do you enforce idempotency?
Check-your-understanding answers
- Keep command semantics stable; strategy can alter execution only.
- Signals keep topology decoupled and routeable.
- In state transitions and deduplication keys.
Real-world applications
- Asynchronous workflows in support desks.
- Multi-step automation that requires pause/resume behavior.
- Human-in-the-loop escalation chains.
Where you’ll apply it
- Projects 6, 7, 8.
References
- Jido Core Loop / Agents / Signals docs links in hexdocs
- OTP principles from “Designing for Scalability with Erlang/OTP” (secondary)
Key insight OTP gives failure recovery; jido’s strategy and signal layers give intentional agent behavior.
Summary Your systems become scalable when process control and business semantics are separated into explicit layers.
Homework/Exercises to practice the concept
- Draw a restart policy for a transient API timeout.
- Define route rules for two failure signals.
- Propose max queue depth and backoff values.
Solutions
- 1) retry, 2) wait-and-retry, 3) dead-letter after budget.
- :agent_busy and :downstream_timeout with different retry budgets.
- Start with a queue depth of 200; exponential backoff with a cap.
Chapter 5: Workflow Orchestration + Testing + Observability
Fundamentals
Observability and testability are non-negotiable when multiple models and agents coordinate. A workflow is only useful if you can answer: what happened, why, and how much it cost. jido workflows provide structure; req_llm provides request-level telemetry and usage metadata.
Deep Dive
Workflows become brittle when branching and tool calls are embedded in opaque functions. In this ecosystem, use explicit instruction graphs and bounded transitions. You then test three layers:
- Command tests: deterministic state transitions.
- Workflow tests: sequencing, timeouts, and failure branching.
- Runtime tests: directive execution and strategy semantics.
Cost observability is similar: collect token usage and tool usage per stage. If stream usage jumps after a code change, you have a regression. If a model route changes due to config, usage should be expected and visible.
Failure modes:
- Latent nondeterminism from non-typed inputs.
- Observability gaps where side effects happen without trace IDs.
- Test suites that only pass in happy-path mode.
Countermeasures:
- Tag each workflow run with correlation IDs.
- Assert telemetry events in tests (token counts, latency buckets).
- Inject fixture responses and run provider matrix tests.
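A hedged ExUnit sketch of the telemetry assertion idea using the standard :telemetry API; the event name and metadata keys are assumptions for illustration:

```elixir
# Hedged sketch: assert that a step boundary emits a telemetry event.
defmodule WorkflowTelemetryTest do
  use ExUnit.Case

  test "step boundary events carry usage and a correlation id" do
    parent = self()

    :telemetry.attach(
      "workflow-step-stop-test",
      [:workflow, :step, :stop],               # assumed event name
      fn _event, measurements, metadata, _cfg ->
        send(parent, {:step_stop, measurements, metadata})
      end,
      nil
    )

    # Stand-in for running the workflow step under test.
    :telemetry.execute([:workflow, :step, :stop], %{tokens: 42}, %{correlation_id: "run-1"})

    assert_receive {:step_stop, %{tokens: 42}, %{correlation_id: "run-1"}}
    :telemetry.detach("workflow-step-stop-test")
  end
end
```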
Definitions & key terms
- Instruction sequence: ordered action list executed by workflow engine.
- Workflow telemetry: execution-level events beyond model tokens.
- Fixture matrix: deterministic recorded responses for provider simulation.
- Correlation id: request-level link across stages.
Mental model diagram
Run Request
|
v
+----------------------+ +--------------------+
| Validation + Contract | ---> | Workflow Executor |
+----------------------+ +--------------------+
|
+-----------+-----------+
| |
[Branch A] [Branch B]
| |
+--------v---------+ +------v--------+
| req_llm call | | jido cmd |
+------------------+ +---------------+
v
Trace + Usage Ledger
How it works
- Define workflow graph and policy constraints.
- Validate each step's input/output contract.
- Execute with timeout and retry options.
- Emit events at step boundaries.
- Aggregate cost and state transitions for audits.
Invariants: every executed step must emit an event; costs must be logged; failed steps should be classifiable. Failure modes: missing events, lost trace context, provider timeout storms.
Minimal concrete example
$ mix test test/workflow_matrix_test.exs
test: workflow_rejects_invalid_payload ... ok
test: workflow_handles_llm_timeout ... ok
metrics: avg_cost_delta_ms +2ms
Common misconceptions
- “If it works locally, it works under load.” Workflows fail differently under concurrency.
- “Telemetry slows things down and should be deferred.” Light traces are essential for production behavior.
- “Integration tests replace unit tests.” They are complementary.
Check-your-understanding questions
- Why must workflow tests include timeout and retry cases?
- How does token telemetry influence architecture?
- Why version workflow outcomes?
Check-your-understanding answers
- They reproduce production failure modes.
- It reveals cost regressions and provider route shifts.
- For replay, rollback, and incident postmortems.
Real-world applications
- Customer support agent fleets.
- Data extraction + verification systems.
- Internal copilots with approval gates.
Where you’ll apply it
- Projects 1, 2, 3, 4, 6, 8.
References
- ReqLLM usage and streaming sections — ReqLLM 1.5.1
- Jido testing guides — jido testing
- “Site Reliability Engineering” by Beyer et al.
Key insight If a workflow is not observable and testable as a sequence, it is not production safe.
Summary Treat each run as a financial and reliability unit, not just an execution.
Homework/Exercises to practice the concept
- Build three traces for one workflow: success, timeout, retry.
- Add correlation ids to every step.
- Add a simple budget rule and observe behavior.
Solutions
- Sequence with success and two failure branches.
- Prefix logs with run id and action id.
- Rule: abort if cumulative token cost exceeds threshold.
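For the correlation-id solution above, a minimal sketch using Logger metadata; the metadata keys are illustrative, and your console formatter must include $metadata for them to print:

```elixir
# Hedged sketch: tag every log line in a run with a correlation id.
require Logger

run_id = "run-" <> Integer.to_string(System.unique_integer([:positive]))
Logger.metadata(run_id: run_id)

Logger.info("step started", action: :parse_invoice)
Logger.info("step finished", action: :parse_invoice, tokens: 42)
```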
Chapter 6: BEAM Production Tradeoffs: Concurrency, Isolation, and Throughput
Fundamentals
The BEAM runtime is strong at lightweight processes, fault isolation, and supervisor-driven recovery. jido + req_llm succeeds when architecture is built to exploit these properties: many short-lived agents, bounded concurrency, and isolation boundaries.
Deep Dive
This is where pragmatic architecture matters. A single long-running task with shared mutable state is fragile; many small isolated agents with explicit messages are safer. jido gives explicit state and directive models; req_llm can be configured with connection pools and stream handling for throughput control.
A mature design pattern uses five knobs:
- Concurrency caps to avoid provider saturation.
- Pooling strategy for HTTP clients.
- Backpressure in signals and queues.
- Hibernation / persistence boundaries for inactive agents.
- Route-level cost budgets before request dispatch.
You are balancing latency and reliability tradeoffs. Direct synchronous calls improve simplicity but block throughput; asynchronous streaming and queueing improve throughput but require stronger ordering semantics.
Failure patterns:
- Overcommitted pools causing request queuing and timeouts.
- Retry loops creating thundering herds.
- Starvation in shared scheduler when one project over-runs resources.
Use req_llm streaming and Finch pool configuration as operational controls, and use jido strategies to serialize sensitive transitions.
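A hedged sketch of a concurrency cap using Task.async_stream; the cap, timeout, and model spec are illustrative starting points, not tuned values:

```elixir
# Hedged sketch: bounded in-flight LLM calls with explicit timeout handling.
prompts = ["ticket 1", "ticket 2", "ticket 3"]

results =
  prompts
  |> Task.async_stream(
    fn prompt -> ReqLLM.generate_text("openai:gpt-4o-mini", prompt) end,
    max_concurrency: 8,      # cap to avoid provider saturation
    timeout: 30_000,         # per-call budget
    on_timeout: :kill_task   # convert stalls into explicit failures
  )
  |> Enum.map(fn
    {:ok, result} -> result                # ReqLLM's {:ok, _} | {:error, _}
    {:exit, :timeout} -> {:error, :timeout}
  end)
```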
Definitions & key terms
- Backpressure: explicit control to slow intake when downstream is saturated.
- Process isolation: independent state and failure boundary per BEAM process.
- Prefetch/concurrency cap: maximum in-flight work.
- Hibernation: memory/CPU optimization for idle agent processes.
Mental model diagram
Incoming jobs Concurrency queue Agent pools
| | |
+------------------->+-------------------->+
v |
+-----------+ +-------------------+
| Strategy | -----> | req_llm workers |
| policy | | Finch connections |
+-----------+ +-------------------+
| |
v v
+-------------------------------+
| Backpressure + Retry Controls |
+-------------------------------+
How it works
- Accept job and classify with policy (rate, priority).
- Route to bounded queue and agent strategy.
- Apply concurrency cap before dispatch.
- Execute in worker with timeout + backoff.
- Record metrics and release capacity.
Invariants: bounded memory usage, bounded queue, and bounded retries. Failure modes: unbounded queue, dropped signals, incorrect fairness.
Minimal concrete example
$ mix jido.ops.report --window 1m
throughput_ok: 142 req/min
timeouts: 1
queue_depth: 12/200
retry_count: 8
Common misconceptions
- “BEAM guarantees no bottlenecks.” It guarantees isolation, not infinite throughput.
- “More concurrency always means faster.” Without budget controls, it means more failures.
- “Streaming is cheap.” Streams require proper pool sizing and lifecycle control.
Check-your-understanding questions
- What does bounded concurrency prevent?
- Why are provider pools part of business reliability?
- Why do we need queue depth limits in agent systems?
Check-your-understanding answers
- Prevents collapse under bursty traffic.
- Provider limits directly impact success rates and latency budgets.
- To protect downstream and preserve fairness under load.
Real-world applications
- High-volume support and QA automation.
- Document processing with model-heavy stages.
- Asynchronous enterprise copilots used across teams.
Where you’ll apply it
- Projects 1, 3, 4, 7, 8.
References
- Finch and streaming notes in ReqLLM overview
- OTP/BEAM architecture references from Erlang/OTP runtime docs
- “Programming Elixir” by Dave Thomas (supervision semantics)
Key insight Throughput is a contract: every layer must expose and enforce limits.
Summary Use bounded concurrency and explicit scheduling as first-class design parameters, not tuning after launch.
Homework/Exercises to practice the concept
- Estimate safe max in-flight requests for provider pools.
- Simulate burst load and note queue depth behavior.
- Introduce a fallback route when queue is near capacity.
Solutions
- Start low, then raise via controlled load testing.
- Keep queue and error metrics visible during test.
- Route low-priority requests to delayed queue with backoff.
Glossary
- Action: A validated operation that returns an agent state transition and optional directives.
- Canonical model: Standardized internal representation for model requests and outputs.
- Directive: Declarative instruction describing a side effect.
- Directive strategy: Runtime policy that executes directives.
- FSM: Finite state machine execution style.
- Provider adapter: Module translating canonical request to provider-specific protocol.
- Signal: Structured event envelope used to route between agents and services.
- StreamResponse: Streaming response abstraction that yields tokens and usage metadata.
- Workflow: Ordered set of actions and instructions.
Why JIDO Matters
By 2025, AI systems had moved from single-shot prompts to orchestrated agentic loops. The ecosystem has responded with reusable, observable runtimes that can operate across providers and agents.
- Standards pressure: interoperability moved from a “nice-to-have” to a necessity as tool ecosystems grow.
- Cost pressure: providers vary in latency, output quality, and pricing; observability is required.
- Reliability pressure: agents execute with side effects; non-determinism must be bounded.
A practical old-vs-new view:
| Old Stack | New Stack |
|---|---|
| Prompt scripts only | Standardized LLM client + command agents |
| Provider-specific request code | Provider-agnostic + model metadata |
| Side effects inline | Pure cmd + explicit directives |
| Implicit logs | Trace-aware signals + cost telemetry |
| Best-effort retries | Strategy/rate-aware retries with policies |
Real-world stats and sources (with year):
- ReqLLM v1.5.1 documents 45 providers / 665+ models in a synced model registry with shared metadata, improving route choices and coverage planning. Source: ReqLLM overview, 2025 docs page.
- Jido package docs are published as v2.0.0-rc.4 and describe production-facing runtime primitives around AgentServer, directives, and stateful execution. Source: Jido home docs, 2025.
- Public ecosystem shift: official platform announcements show MCP support in major model providers by 2025, indicating pressure toward interoperable agent tooling and protocol-based tool access. Sources: Google Gemini MCP support announcement, 2025; Anthropic MCP announcement context, Nov 2024.
The practical conclusion: if your project depends on one provider or one side-effect pattern, you will rework often. A disciplined ecosystem-level design compounds once.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Provider-agnostic LLM abstraction | Use req_llm to normalize call patterns while preserving provider-specific capabilities and controls. |
| Typed contracts and schemas | Treat prompts, tool arguments, and outputs as explicit schemas with versioned boundaries. |
| Pure cmd and directive pattern | Keep state transitions deterministic, route side effects through explicit directive descriptions. |
| Signals, workflows, and orchestration | Model agent-to-agent communication and multi-step flow with routing, retries, and checkpoints. |
| Observability and testing | Measure usage, latency, and failure classes at both workflow and model boundaries. |
| BEAM-driven production controls | Use supervision, bounded concurrency, and process isolation as architectural primitives, not infrastructure afterthoughts. |
Project-to-Concept Map
| Project | Concepts Applied |
|---|---|
| Project 1: Multi-Model Gateway with req_llm | Provider-agnostic LLM abstraction, Typed contracts, BEAM production controls |
| Project 2: Schema-First Contract Extractor | Typed contracts and schemas, Observability and testing |
| Project 3: Streaming Copilot Operations Center | Provider-agnostic LLM abstraction, Observability and testing |
| Project 4: Provider Failover and Cost Router | Provider-agnostic LLM abstraction, BEAM production controls |
| Project 5: Deterministic Counter Agent | Pure cmd and directive pattern |
| Project 6: Workflow-First Orchestrator | Signals, workflows, and orchestration, Pure cmd and directive pattern |
| Project 7: Signal Mesh for Multi-Agent Delegation | Signals, workflows, and orchestration, BEAM-driven production controls |
| Project 8: Audit-Ready State Machine Agent | Pure cmd/directive, Observability/testing, Signals/workflows |
Deep Dive Reading by Concept
| Concept | Book and Chapter | Why This Matters |
|---|---|---|
| Provider-agnostic LLM abstraction | “Designing Data-Intensive Applications” by Martin Kleppmann - Ch. 3 | Helps evaluate tradeoffs in protocol layering and data contracts. |
| Typed contracts and schemas | “Domain-Driven Design” by Eric Evans - Ch. 4-6 | Prevents semantic drift when integrating LLM outputs with business logic. |
| Pure cmd and directive pattern | “Designing Elixir Systems with OTP” by James Edward Gray II and Bruce Tate - Ch. 5-8 | Grounds immutability and explicit side-effect boundaries in production systems. |
| Signals and workflow orchestration | “Building Microservices” by Sam Newman - Ch. 8 | Clarifies message-driven integration, routing, and failure domains. |
| Observability and testing | “Site Reliability Engineering” by Betsy Beyer et al. - Ch. 4 | Establishes practical measurement and reliability patterns for operations. |
| BEAM production controls | Elixir/Erlang docs (Supervision and tasks) - v1.17+ | Explains process isolation, supervision, and failure recovery models. |
Quick Start: Your First 48 Hours
Day 1:
- Read this guide up to Chapter 6.
- Install prerequisites and scaffold one mix app with jido and req_llm dependencies (see the deps sketch below).
- Start Project 1 and Project 5 to internalize two different abstraction styles.
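A minimal mix.exs deps sketch for that scaffold; the version requirements are assumptions based on the releases this guide cites, so check hex.pm for the current ones:

```elixir
# Hedged sketch: dependencies for the scaffold app (verify versions on hex.pm).
defp deps do
  [
    {:req_llm, "~> 1.5"},
    {:jido, "~> 2.0.0-rc.4"}
  ]
end
```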
Day 2:
- Validate Project 1 by dispatching same prompt across two providers.
- Validate Project 5 by proving command determinism with repeated runs.
- Log one failure and one retry event.
Day 3–4:
- Move to Project 2 and Project 6.
- Define schema contracts and workflow checkpoints.
End of weekend: you should be able to explain how directive-based designs stay testable at the boundary where pure functions end and the process runtime begins.
Recommended Learning Paths
Path 1: Product Builder (Recommended)
- Project 1 → Project 2 → Project 3 → Project 6 → Project 8
Path 2: Platform/LLM Operator
- Project 1 → Project 4 → Project 3 → Project 7 → Project 8
Path 3: Agent Architect
- Project 5 → Project 6 → Project 7 → Project 8
Success Metrics
- You can design and explain at least three provider-selection policies that include fallback and budget rules.
- At least 90% of project workflows include explicit failure handling paths with recorded outcomes.
- You can run a mixed test matrix for at least one workflow with happy-path, timeout, invalid payload, and directive-missing cases.
- You can route between at least two agents using typed signals and verify end-to-end trace continuity.
Project Overview Table
| # | Project | Package Focus | Difficulty | Time | Key Focus |
|---|---|---|---|---|---|
| 1 | Multi-Model Gateway with req_llm | req_llm | 2 | 1.5-2 weeks | Provider abstraction + unified contracts |
| 2 | Schema-First Contract Extractor | req_llm | 3 | 2 weeks | Structured outputs + schema contracts |
| 3 | Streaming Copilot Operations Center | req_llm | 3 | 1.5-2 weeks | Streaming + telemetry + UX feedback |
| 4 | Provider Failover and Cost Router | req_llm | 4 | 2-3 weeks | Cost/routing + key management |
| 5 | Deterministic Counter Agent | jido | 2 | 1.5 weeks | cmd/2 and directive separation |
| 6 | Workflow-First Orchestrator | jido | 4 | 2-3 weeks | Workflow engine + action sequencing |
| 7 | Signal Mesh for Multi-Agent Delegation | jido | 4 | 2-3 weeks | Signals, routing, and parent-child dynamics |
| 8 | Audit-Ready State Machine Agent | jido | 5 | 3-4 weeks | FSM strategies + observability + production hardening |
Project List
The following projects guide you from a provider-first architecture to agentic production systems.
Project 1: Multi-Model Gateway with req_llm
- File: P01-multi-model-gateway.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Gleam, F#
- Coolness Level: 4/5 (high)
- Business Potential: 4/5
- Difficulty: 2
- Knowledge Area: Protocol adaptation, LLM orchestration
- Software or Tool: req_llm, mix
- Main Book: The Little Elixir and OTP Guidebook
What you will build: A provider-agnostic inference gateway that normalizes prompts, logs standardized usage, and fails over between providers.
Why it teaches JIDO: It shows how stable provider contracts make agent workflows portable and controllable.
Core challenges you will face:
- Model portability -> map providers without branching logic at call sites.
- Rate and reliability policy -> define fallback and quotas.
- Usage accountability -> expose token and cost trace lines per request.
Project 2: Schema-First Contract Extractor
- File: P02-schema-first-contract-extractor.md
- Main Programming Language: Elixir
- Alternative Programming Languages: TypeScript, Python
- Coolness Level: 5/5
- Business Potential: 4/5
- Difficulty: 3
- Knowledge Area: Validation, structured outputs, contracts
- Software or Tool: req_llm, JSON tooling
- Main Book: Domain-Driven Design
What you will build: A contract parser that turns messy text into validated typed outputs with schema versioning and rejection pathways.
Why it teaches JIDO: It maps model outputs into deterministic agent inputs and demonstrates contract governance.
Core challenges you will face:
- Structured output strictness -> avoid false positives and silent drift.
- Schema evolution -> backward-compatible migration.
- Downstream safety -> prevent malformed payload propagation.
Project 3: Streaming Copilot Operations Center
- File: P03-streaming-copilot-operations-center.md
- Main Programming Language: Elixir
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: 4/5
- Business Potential: 5/5
- Difficulty: 3
- Knowledge Area: Streaming, observability, UX feedback
- Software or Tool: req_llm, dashboard tooling
- Main Book: Site Reliability Engineering
What you will build: A terminal-style operations copilot with real-time streaming, usage reporting, and cost-aware cutoffs.
Why it teaches JIDO: It teaches production telemetry and responsive workflows for model calls.
Core challenges you will face:
- Stream lifecycle -> avoid blocking token loops.
- Telemetry completeness -> collect usage even on streamed flows.
- Quality thresholds -> enforce response quality controls.
Project 4: Provider Failover and Cost Router
- File: P04-provider-failover-and-cost-router.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Kotlin, Rust
- Coolness Level: 4/5
- Business Potential: 5/5
- Difficulty: 4
- Knowledge Area: Reliability, cost governance, routing
- Software or Tool: req_llm, jido
- Main Book: Designing Data-Intensive Applications
What you will build: A policy engine that switches providers by latency, budget, failure history, and capability.
Why it teaches JIDO: It combines req_llm routing with jido policy-driven execution.
Core challenges you will face:
- Routing policy ambiguity -> choose deterministic tie-breakers.
- Budget enforcement -> protect operations from runaway tokens.
- Error class taxonomy -> differentiate retryable and terminal faults.
Project 5: Deterministic Counter Agent
- File: P05-deterministic-counter-agent.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Elixir scripts, Lua
- Coolness Level: 3/5
- Business Potential: 3/5
- Difficulty: 2
- Knowledge Area: cmd/2, StateOp, directives
- Software or Tool: jido, jido_action
- Main Book: Programming Elixir
What you will build: A pure action-based agent whose behavior is tested via command outputs and generated directives.
Why it teaches JIDO: It teaches the core cmd/2 mental model and why purity enables repeatability.
Core challenges you will face:
- Pure state design -> no side effects in action logic.
- Directive design -> clear effect contracts.
- Test determinism -> stable assertions under repeated commands.
Project 6: Workflow-First Orchestrator
- File: P06-workflow-first-orchestrator.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Go, Java
- Coolness Level: 4/5
- Business Potential: 5/5
- Difficulty: 4
- Knowledge Area: workflows, instruction sets, agent cooperation
- Software or Tool: jido, req_llm, req_llm tools
- Main Book: Monolith to Microservices
What you will build: A multi-step workflow engine using instructions and timeout-aware branching.
Why it teaches JIDO: It demonstrates how real workflows differ from linear scripts.
Core challenges you will face:
- Branching logic -> reproducible retries.
- Timeouts and fallback -> maintain checkpoint integrity.
- Composability -> reuse actions with clean schemas.
Project 7: Signal Mesh for Multi-Agent Delegation
- File: P07-signal-mesh-multi-agent-delegation.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Rust, Python
- Coolness Level: 5/5
- Business Potential: 5/5
- Difficulty: 4
- Knowledge Area: signals, routing, pub/sub, delegation
- Software or Tool: jido, jido_signal
- Main Book: Release It!
What you will build: A delegated agent mesh with typed signals, route policies, and parent-child lifecycle tracking.
Why it teaches JIDO: It shows how to scale from one agent to many without shared hidden coupling.
Core challenges you will face:
- Routing correctness -> deterministic destination logic.
- Event storms -> guardrails and dead-letter handling.
- Policy drift -> consistent behavior across many agents.
Project 8: Audit-Ready State Machine Agent
- File: P08-audit-ready-state-machine-agent.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Clojure, Scala
- Coolness Level: 5/5
- Business Potential: 4/5
- Difficulty: 5
- Knowledge Area: FSM strategy, supervision, auditing, enterprise controls
- Software or Tool: jido, telemetry, req_llm
- Main Book: Software Engineering at Google (casebook reading)
What you will build: A production-like multi-stage state machine agent with approval gates, audit trails, and failure recovery.
Why it teaches JIDO: It integrates all concepts into a realistic operational system.
Core challenges you will face:
- Compliance visibility -> every side effect must be accounted for.
- Concurrent updates -> safe transitions under load.
- Incident replayability -> deterministic trace and state snapshots.
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Multi-Model Gateway with req_llm | Medium | 1.5-2 weeks | High | ★★★★☆ |
| 2. Schema-First Contract Extractor | Medium-High | 2 weeks | High | ★★★★★ |
| 3. Streaming Copilot Operations Center | Medium | 1.5-2 weeks | High | ★★★★☆ |
| 4. Provider Failover and Cost Router | High | 2-3 weeks | Very High | ★★★★☆ |
| 5. Deterministic Counter Agent | Low-Medium | 1.5 weeks | Medium | ★★★★☆ |
| 6. Workflow-First Orchestrator | High | 2-3 weeks | Very High | ★★★★★ |
| 7. Signal Mesh for Multi-Agent Delegation | High | 2-3 weeks | Very High | ★★★★★ |
| 8. Audit-Ready State Machine Agent | Very High | 3-4 weeks | Very High | ★★★☆☆ |
Recommendation
If you are new to agentic systems: start with Project 5, then Project 1, then Project 2. This avoids heavy orchestration before you own invariants.
If you are a platform engineer: start with Projects 4, 7, and 8 to build routing and reliability patterns.
If you want production outcomes quickly: start with Projects 3, 6, and 8.
If you want to build robust LLM products: follow Projects 1, 2, 3, 4, then 6.
Final Overall Project: JIDO Incident Response Product
The Goal: Build a single integrated system combining all 8 patterns:
- Provider abstraction and contract parsing.
- Streaming ops UI.
- Cost routing and fallback.
- Directive-led deterministic core agent.
- Workflow orchestration with signal routing.
- Audit-grade FSM for human approvals.
Success Criteria: the final output must process 100+ sample tickets, produce reproducible logs, enforce cost caps, and resume correctly after simulated worker restarts.
From Learning to Production: What Is Next
| Your Project | Production Equivalent | Gap to Fill |
|---|---|---|
| Projects 1-4 | Provider gateway service | Multi-region failover policies |
| Projects 5-8 | Agent coordination platform | Centralized policy service, security hardening |
| Full sprint | Internal automation platform | SSO, multi-tenant quotas, release governance |
Suggested production upgrades:
- Add distributed tracing (OpenTelemetry).
- Add queue-based backlog + DLQ.
- Introduce per-team permission model and audit export.
- Add chaos testing for provider and process faults.
Summary
This learning path covers the JIDO ecosystem end-to-end through practical projects. You practiced provider abstractions, typed contracts, directive-driven agents, workflow orchestration, signal routing, observability, and production reliability under BEAM.
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Multi-Model Gateway with req_llm | Elixir | 2 | 1.5-2 weeks |
| 2 | Schema-First Contract Extractor | Elixir | 3 | 2 weeks |
| 3 | Streaming Copilot Operations Center | Elixir | 3 | 1.5-2 weeks |
| 4 | Provider Failover and Cost Router | Elixir | 4 | 2-3 weeks |
| 5 | Deterministic Counter Agent | Elixir | 2 | 1.5 weeks |
| 6 | Workflow-First Orchestrator | Elixir | 4 | 2-3 weeks |
| 7 | Signal Mesh for Multi-Agent Delegation | Elixir | 4 | 2-3 weeks |
| 8 | Audit-Ready State Machine Agent | Elixir | 5 | 3-4 weeks |
Expected Outcomes
- You can design deterministic, testable agent commands.
- You can route LLM usage safely across providers.
- You can produce operational telemetry and replay traces.
Additional Resources and References
Industry Analysis
- Google Gemini MCP support announcement (2025): blog.google
- Anthropic MCP announcement coverage (2024): techcrunch
Books
- The Little Elixir and OTP Guidebook — Benjamin Tan Wei Hao
- Programming Elixir — Dave Thomas
- Designing Data-Intensive Applications — Martin Kleppmann