Project 8: LLM Structured Output

Build a system that uses Pydantic to define structured outputs for LLMs, ensuring the AI returns validated, type-safe data instead of arbitrary text.


Learning Objectives

By completing this project, you will:

  1. Understand the problem of unstructured LLM output - Why raw text responses are unreliable for production systems
  2. Master JSON Schema generation with Pydantic - Use model_json_schema() to create schemas that guide LLM responses
  3. Implement schema injection in prompts - Techniques for instructing LLMs to follow specific output formats
  4. Integrate with OpenAI and Anthropic APIs - Use function calling and structured output features
  5. Apply the Instructor library pattern - Understand how Instructor patches LLM clients for automatic validation
  6. Build retry and self-correction strategies - Handle malformed responses gracefully with automatic retries

Deep Theoretical Foundation

The Problem of Unstructured LLM Output

Large Language Models are fundamentally text generation systems. When you ask an LLM to "extract the person's name and age from this text," you might get:

Attempt 1: "The person's name is John and they are 30 years old."
Attempt 2: "Name: John, Age: 30"
Attempt 3: "John (30)"
Attempt 4: "I found that the individual named John is thirty years old."

All correct semantically, but none are reliably parseable by code. This unpredictability is catastrophic for production systems that need to:

  • Store extracted data in databases
  • Chain LLM outputs to other services
  • Validate business rules on extracted information
  • Provide consistent API responses

                      THE STRUCTURED OUTPUT PROBLEM

   ┌───────────────────┐
   │    LLM Prompt     │
   │  "Extract user    │
   │   info from text" │
   └─────────┬─────────┘
             │
             ▼
   ┌───────────────────┐
   │   LLM Response    │
   │  (Free-form text) │
   └─────────┬─────────┘
             │
    ┌────────┴────────┐
    │                 │
    ▼                 ▼
  "Name: John"    "The user
  "Age: 30"        John is 30"
    │                 │
    ▼                 ▼
   ┌──────────────────────────────────────────────────────────┐
   │                        YOUR CODE                         │
   │   regex? string parsing? prayer?                         │
   │                                                          │
   │   name = ???  # How do you reliably extract this?        │
   │   age = ???   # What if it says "thirty" instead of 30?  │
   └──────────────────────────────────────────────────────────┘

                           THE NIGHTMARE

The Solution: Schema-Guided Generation

The solution is to tell the LLM exactly what structure we expect, and have the LLM API enforce that structure:

                     THE STRUCTURED OUTPUT SOLUTION

   ┌───────────────────┐      ┌───────────────────┐
   │  Pydantic Model   │ ───► │   JSON Schema     │
   │                   │      │                   │
   │  class User:      │      │  {                │
   │    name: str      │      │    "properties":  │
   │    age: int       │      │      "name": {}   │
   │                   │      │      "age": {}    │
   └───────────────────┘      └─────────┬─────────┘
                                        │
                                        ▼
   ┌──────────────────────────────────────────────────────────┐
   │                         LLM API                          │
   │                                                          │
   │   - Prompt: "Extract user info from: 'John is 30'"       │
   │   - Schema: {"name": str, "age": int}                    │
   │   - Mode: JSON / Function Calling / Structured Output    │
   └──────────────────────────────────────────────────────────┘
                              │
                              ▼
                    {"name": "John", "age": 30}
                              │
                              ▼
   ┌──────────────────────────────────────────────────────────┐
   │                   Pydantic Validation                    │
   │                                                          │
   │   user = User.model_validate_json(response)              │
   │   # Guaranteed to have correct types!                    │
   └──────────────────────────────────────────────────────────┘

                          RELIABLE OUTPUT

JSON Schema Generation with model_json_schema()

Pydantic can generate JSON Schema from any model, which becomes the bridge between your Python types and the LLM:

import json

from pydantic import BaseModel, Field
from typing import Optional

class Person(BaseModel):
    """A person extracted from text."""
    name: str = Field(..., description="The person's full name")
    age: int = Field(..., ge=0, le=150, description="Age in years")
    occupation: Optional[str] = Field(None, description="Job or profession")

# Generate JSON Schema
schema = Person.model_json_schema()
print(json.dumps(schema, indent=2))

Output:

{
  "title": "Person",
  "description": "A person extracted from text.",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "The person's full name",
      "title": "Name"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150,
      "description": "Age in years",
      "title": "Age"
    },
    "occupation": {
      "type": "string",
      "description": "Job or profession",
      "title": "Occupation",
      "default": null
    }
  },
  "required": ["name", "age"]
}

Key Insights:

  1. Field descriptions become schema descriptions - The LLM reads these to understand what each field means
  2. Constraints are encoded - ge=0, le=150 becomes minimum and maximum
  3. Optional fields have defaults - The LLM knows it can omit them
  4. Types are enforced - age: int means the JSON must have an integer, not a string

Schema Injection Strategies

There are three main strategies for getting LLMs to produce structured output:

Strategy 1: System Prompt with Schema

The simplest approach: include the schema in the system prompt and ask for JSON:

import json

def build_prompt(schema: dict, user_prompt: str) -> list[dict]:
    return [
        {
            "role": "system",
            "content": f"""You are a helpful assistant that always responds with valid JSON.

Your response must conform to this JSON schema:
{json.dumps(schema, indent=2)}

Respond ONLY with valid JSON, no other text or explanation."""
        },
        {
            "role": "user",
            "content": user_prompt
        }
    ]

Pros:

  • Works with any LLM that can output JSON
  • Simple to implement

Cons:

  • LLM might still produce invalid JSON
  • No guarantee schema is followed exactly
  • Nested or complex schemas often fail
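
A minimal end-to-end sketch of this strategy, reusing build_prompt and the Person model from above (the model name is illustrative; note that nothing here guarantees valid JSON, which is exactly this strategy's weakness):

from openai import OpenAI

client = OpenAI()

def extract_person(text: str) -> Person:
    messages = build_prompt(
        Person.model_json_schema(),
        f"Extract person info from: {text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any JSON-capable chat model
        messages=messages,
    )
    # Raises pydantic.ValidationError if the model ignored the schema
    return Person.model_validate_json(response.choices[0].message.content)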

Strategy 2: OpenAI Function Calling

OpenAI's function calling feature was originally designed for tool use, but it works excellently for structured extraction:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Extract person info from: John Smith is 30 years old."}
    ],
    functions=[
        {
            "name": "extract_person",
            "description": "Extract person information from text",
            "parameters": Person.model_json_schema()
        }
    ],
    function_call={"name": "extract_person"}  # Force this function
)

# Response is in function_call.arguments as JSON string
person_json = response.choices[0].message.function_call.arguments
person = Person.model_validate_json(person_json)

Pros:

  • More reliable than raw JSON mode
  • LLM is "trained" to produce function arguments

Cons:

  • Function calling adds token overhead
  • Not all models support it
  • The functions/function_call parameters are deprecated in newer SDKs in favor of tools/tool_choice

Strategy 3: OpenAI Structured Outputs (Newest)

OpenAI's newest feature guarantees schema compliance:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # Must be this model or newer
    messages=[
        {"role": "user", "content": "Extract person info from: John Smith is 30."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person_response",
            "strict": True,
            "schema": Person.model_json_schema()
        }
    }
)

person = Person.model_validate_json(response.choices[0].message.content)

Pros:

  • Guaranteed valid JSON matching schema
  • Fastest and most reliable

Cons:

  • Only available on newest models
  • Some schema features not supported in strict mode

The Instructor Library Pattern

The Instructor library wraps OpenAI/Anthropic clients to automate the structured output pattern:

                      INSTRUCTOR ARCHITECTURE

   ┌──────────────────────────────────────────────────────────┐
   │                        Your Code                         │
   │                                                          │
   │   client = instructor.patch(OpenAI())                    │
   │                                                          │
   │   user = client.chat.completions.create(                 │
   │       model="gpt-4",                                     │
   │       response_model=User,  # <- Pydantic model!         │
   │       messages=[...]                                     │
   │   )                                                      │
   │   # user is a validated User instance                    │
   └──────────────────────────────────────────────────────────┘
                              │
                              ▼
   ┌──────────────────────────────────────────────────────────┐
   │                   Instructor Internals                   │
   │                                                          │
   │   1. Extract JSON Schema from User model                 │
   │   2. Choose strategy (function calling / JSON mode)      │
   │   3. Add schema to API call                              │
   │   4. Make API request                                    │
   │   5. Parse JSON response                                 │
   │   6. Validate with User.model_validate()                 │
   │   7. If validation fails, retry with error feedback      │
   │   8. Return validated User instance                      │
   └──────────────────────────────────────────────────────────┘
                              │
                              ▼
   ┌──────────────────────────────────────────────────────────┐
   │                        OpenAI API                        │
   └──────────────────────────────────────────────────────────┘

Instructor provides:

  • Automatic schema injection - No manual JSON Schema handling
  • Retry with feedback - If LLM produces invalid JSON, it retries with the error message
  • Multiple modes - Function calling, JSON mode, tool use
  • Streaming support - Partial objects as they generate
  • Validation hooks - Custom validators that trigger retries
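
In practice that looks like the following sketch (assuming the instructor package; patch wraps the client so that response_model becomes a valid argument):

import instructor
from openai import OpenAI
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# Patch the client; completions now accept response_model
client = instructor.patch(OpenAI())

user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    max_retries=2,  # re-prompt with error feedback on validation failure
    messages=[{"role": "user", "content": "John is 30"}],
)
assert isinstance(user, User)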

Retry and Self-Correction Strategies

LLMs are probabilistic - even with schemas, they sometimes produce invalid output. A robust system needs retry logic:

class RetryStrategy:
    """Configurable retry strategy for LLM structured output."""

    def __init__(
        self,
        max_retries: int = 3,
        include_error_in_retry: bool = True,
        exponential_backoff: bool = True
    ):
        self.max_retries = max_retries
        self.include_error_in_retry = include_error_in_retry
        self.exponential_backoff = exponential_backoff

The self-correction pattern:

                        SELF-CORRECTION LOOP

   ┌─────────────┐
   │  Attempt 1  │
   └──────┬──────┘
          │
          ▼
   ┌─────────────────┐     ┌─────────────────┐
   │  LLM Response   │ ──► │    Validate     │
   │  {"name": "John"│     │    with         │
   │   "age": "30"}  │     │    Pydantic     │
   └─────────────────┘     └────────┬────────┘
                                    │
                              ✗ Invalid!
                    "age should be int, got str"
                                    │
                                    ▼
   ┌──────────────────────────────────────────────────────────┐
   │   Attempt 2 (with error feedback)                        │
   │                                                          │
   │   System: "Your previous response had errors:            │
   │   - age: Input should be a valid integer                 │
   │   Please fix and try again."                             │
   └──────────────────────────────────────────────────────────┘
                                    │
                                    ▼
   ┌─────────────────┐     ┌─────────────────┐
   │  LLM Response   │ ──► │    Validate     │
   │  {"name": "John"│     │    with         │
   │   "age": 30}    │     │    Pydantic     │
   └─────────────────┘     └────────┬────────┘
                                    │
                                ✓ Valid!
                                    │
                                    ▼
                           Return User instance
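
The loop in the diagram compresses to a few lines. A minimal sketch, where call_llm is a hypothetical callable that sends the messages and returns the raw completion text:

from pydantic import ValidationError

def extract_with_correction(call_llm, messages: list[dict],
                            max_attempts: int = 3) -> Person:
    """Validate each attempt; on failure, feed Pydantic's error text back."""
    for _ in range(max_attempts):
        raw = call_llm(messages)
        try:
            return Person.model_validate_json(raw)
        except ValidationError as e:
            messages = messages + [
                {"role": "assistant", "content": raw},
                {"role": "user", "content": (
                    f"Your previous response had errors:\n{e}\n"
                    "Please fix them and respond with valid JSON only."
                )},
            ]
    raise RuntimeError("No valid response after retries")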

Handling Complex Nested Types with LLMs

Complex schemas present challenges for LLMs. Consider:

from pydantic import BaseModel, Field
from typing import List, Optional, Literal

class Address(BaseModel):
    street: str
    city: str
    country: str = Field(..., description="ISO 3166-1 alpha-2 country code")
    postal_code: Optional[str] = None

class ContactMethod(BaseModel):
    type: Literal["email", "phone", "social"]
    value: str
    is_primary: bool = False

class Person(BaseModel):
    name: str
    age: int
    addresses: List[Address] = Field(default_factory=list)
    contacts: List[ContactMethod] = Field(default_factory=list)
    metadata: dict = Field(default_factory=dict)

This schema has:

  • Nested objects (Address, ContactMethod)
  • Lists of objects
  • Literal types for enums
  • Optional fields with defaults
  • Arbitrary dict fields

Strategies for Complex Schemas:

  1. Break into steps - Extract simple fields first, then complex ones:
    # Step 1: Extract basic info
    basic_info = extract(BasicPerson, text)
    
    # Step 2: Extract addresses with context
    addresses = extract(List[Address], text, context=basic_info)
    
    # Step 3: Combine
    full_person = Person(**basic_info.model_dump(), addresses=addresses)
    
  2. Use description heavily - LLMs rely on descriptions for context:
    class Address(BaseModel):
        """A physical mailing address. Extract from the text any mention
        of where the person lives or works."""
    
        street: str = Field(..., description="Street address including number")
        city: str = Field(..., description="City name, not abbreviated")
    
  3. Provide examples - Include example outputs in the prompt:
    EXAMPLES = """
    Example input: "John lives at 123 Main St in NYC"
    Example output: {"street": "123 Main St", "city": "New York City", "country": "US"}
    """
    
  4. Simplify when possible - Use flatter schemas if nesting isn't essential

Comparing LLM Providers for Structured Output

Provider           Method              Reliability  Speed    Cost
─────────────────  ──────────────────  ───────────  ───────  ────
OpenAI GPT-4o      Structured Outputs  Highest      Fast     $$$
OpenAI GPT-4       Function Calling    High         Medium   $$$
OpenAI GPT-3.5     JSON Mode           Medium       Fast     $
Anthropic Claude   Tool Use            High         Medium   $$
Local (Ollama)     JSON Mode           Variable     Depends  Free

Token Efficiency Considerations

Structured output has token overhead:

                         TOKEN BREAKDOWN

   Standard Prompt:
   ┌──────────────────────────────────────────────────────────┐
   │ System: "You are a helpful assistant"         ~10 tokens │
   │ User: "Extract name and age from: John is 30" ~15 tokens │
   │ Response: "Name: John, Age: 30"               ~10 tokens │
   └──────────────────────────────────────────────────────────┘
   Total: ~35 tokens

   Structured Output:
   ┌──────────────────────────────────────────────────────────┐
   │ System: "You are a helpful assistant that outputs JSON"  │
   │ System: + JSON Schema (varies)            ~50-200 tokens │
   │ User: "Extract name and age from: John is 30" ~15 tokens │
   │ Response: {"name": "John", "age": 30}         ~15 tokens │
   └──────────────────────────────────────────────────────────┘
   Total: ~100-250 tokens

   Trade-off: 3-7x more tokens for guaranteed structure
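
To measure the overhead for a specific schema rather than guessing, a small sketch using the tiktoken library (cl100k_base is the GPT-4-era encoding; newer models use o200k_base):

import json
import tiktoken

def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Exact token count under a given tiktoken encoding."""
    return len(tiktoken.get_encoding(encoding_name).encode(text))

schema_json = json.dumps(Person.model_json_schema())
print(f"Schema adds ~{count_tokens(schema_json)} tokens to every request")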

Project Specification

Functional Requirements

Build a structured LLM output system that:

  1. Defines extraction schemas with Pydantic - Multiple domains (people, events, products)
  2. Supports multiple LLM providers - OpenAI, Anthropic, with a consistent interface
  3. Implements retry with self-correction - Automatic retries with error feedback
  4. Handles complex nested types - Lists, nested objects, optional fields
  5. Provides validation feedback - Clear error messages when extraction fails
  6. Supports streaming - Partial results for long extractions

Use Cases to Implement

Use Case 1: Document Entity Extraction

Extract structured entities from documents:

class Entity(BaseModel):
    """An entity mentioned in the document."""
    name: str = Field(..., description="Entity name as it appears in text")
    type: Literal["person", "organization", "location", "date", "product"]
    context: str = Field(..., description="Sentence where entity appears")
    confidence: float = Field(..., ge=0, le=1)

class DocumentAnalysis(BaseModel):
    """Complete analysis of a document."""
    summary: str = Field(..., max_length=500)
    entities: List[Entity]
    key_topics: List[str]
    sentiment: Literal["positive", "negative", "neutral"]

Use Case 2: Structured Data Transformation

Convert unstructured text to database-ready records:

class ProductListing(BaseModel):
    """A product extracted from a listing description."""
    title: str = Field(..., max_length=200)
    price: float = Field(..., ge=0)
    currency: str = Field("USD", pattern=r'^[A-Z]{3}$')
    category: str
    features: List[str] = Field(default_factory=list)
    in_stock: bool = True

class ProductCatalog(BaseModel):
    """Multiple products from a catalog page."""
    products: List[ProductListing]
    source_url: Optional[str] = None

Use Case 3: Conversational Response Structuring

Structure chatbot responses for downstream processing:

class Intent(BaseModel):
    """Detected user intent."""
    category: Literal["question", "command", "feedback", "other"]
    action: Optional[str] = Field(None, description="Specific action requested")
    entities: dict = Field(default_factory=dict)

class StructuredResponse(BaseModel):
    """A chatbot response with structured metadata."""
    text: str = Field(..., description="Response text to show user")
    intent: Intent
    follow_up_questions: List[str] = Field(default_factory=list)
    requires_human: bool = False
    confidence: float = Field(..., ge=0, le=1)

CLI Interface

# Extract entities from text
$ llm-extract --schema entities --input document.txt --output entities.json

# Extract from stdin with custom schema
$ cat document.txt | llm-extract --schema-file custom_schema.py --model Person

# Interactive mode with streaming
$ llm-extract --interactive --schema chat_response

# Batch processing
$ llm-extract --schema products --input-dir listings/ --output-dir extracted/

API Interface

from structured_llm import StructuredLLM, RetryConfig

# Initialize with configuration
llm = StructuredLLM(
    provider="openai",
    model="gpt-4o",
    retry_config=RetryConfig(max_retries=3)
)

# Simple extraction
person = llm.extract(
    schema=Person,
    text="John Smith is a 30-year-old software engineer from NYC."
)

# Batch extraction
products = llm.extract_many(
    schema=ProductListing,
    texts=listing_texts,
    concurrency=5
)

# With custom prompt
analysis = llm.extract(
    schema=DocumentAnalysis,
    text=document,
    system_prompt="You are an expert document analyst...",
    examples=[
        ("Example input...", {"summary": "...", "entities": [...]})
    ]
)

Solution Architecture

Component Design

┌──────────────────────────────────────────────────────────────────┐
│                         StructuredLLM                            │
│                      (Main Entry Point)                          │
│                                                                  │
│  - extract(schema, text) -> Model                                │
│  - extract_many(schema, texts) -> List[Model]                    │
│  - stream(schema, text) -> AsyncIterator[PartialModel]           │
└──────────────────────────────────────────────────────────────────┘
                               │
        ┌──────────────────────┼──────────────────────┐
        ▼                      ▼                      ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│  SchemaBuilder    │ │   PromptBuilder   │ │   RetryHandler    │
│                   │ │                   │ │                   │
│ - to_json_schema  │ │ - build_system    │ │ - with_retries    │
│ - to_function     │ │ - build_user      │ │ - format_error    │
│ - from_pydantic   │ │ - inject_schema   │ │ - should_retry    │
└───────────────────┘ └───────────────────┘ └───────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────────┐
│                        LLM Providers                             │
│  ┌─────────────────┐  ┌──────────────────┐  ┌─────────────────┐  │
│  │ OpenAIProvider  │  │ AnthropicProvider│  │  OllamaProvider │  │
│  │                 │  │                  │  │                 │  │
│  │ - function_call │  │ - tool_use       │  │ - json_mode     │  │
│  │ - structured    │  │                  │  │                 │  │
│  │ - json_mode     │  │                  │  │                 │  │
│  └─────────────────┘  └──────────────────┘  └─────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────────┐
│                        ResponseParser                            │
│                                                                  │
│  - parse_json(response) -> dict                                  │
│  - validate(dict, schema) -> Model | ValidationError             │
│  - extract_from_function_call(response) -> dict                  │
└──────────────────────────────────────────────────────────────────┘

Provider Abstraction

from abc import ABC, abstractmethod
from typing import AsyncIterator, Optional, Type, TypeVar

import openai
from pydantic import BaseModel

T = TypeVar('T', bound=BaseModel)

class LLMProvider(ABC):
    """Abstract base class for LLM providers."""

    @abstractmethod
    def complete(
        self,
        messages: list[dict],
        schema: dict,
        **kwargs
    ) -> str:
        """Get completion from LLM."""
        pass

    @abstractmethod
    async def stream(
        self,
        messages: list[dict],
        schema: dict,
        **kwargs
    ) -> AsyncIterator[str]:
        """Stream completion from LLM."""
        pass

    @property
    @abstractmethod
    def supports_function_calling(self) -> bool:
        """Whether this provider supports function calling."""
        pass

    @property
    @abstractmethod
    def supports_structured_output(self) -> bool:
        """Whether this provider supports strict structured output."""
        pass


class OpenAIProvider(LLMProvider):
    """OpenAI API provider."""

    def __init__(self, model: str = "gpt-4o", api_key: Optional[str] = None):
        self.model = model
        self.client = openai.OpenAI(api_key=api_key)

    @property
    def supports_function_calling(self) -> bool:
        return True

    @property
    def supports_structured_output(self) -> bool:
        return "gpt-4o" in self.model  # Only latest models

    def complete(self, messages: list[dict], schema: dict, **kwargs) -> str:
        if self.supports_structured_output:
            return self._complete_structured(messages, schema, **kwargs)
        elif self.supports_function_calling:
            return self._complete_function_call(messages, schema, **kwargs)
        else:
            return self._complete_json_mode(messages, schema, **kwargs)
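
The mode-specific helpers are left to each implementation; a sketch of the structured-output path on OpenAIProvider, mirroring Strategy 3 above (the schema name is an arbitrary label; note that strict mode additionally requires additionalProperties: false and every property listed as required):

def _complete_structured(self, messages: list[dict], schema: dict, **kwargs) -> str:
    """Strict structured outputs: the API itself guarantees schema compliance."""
    response = self.client.chat.completions.create(
        model=self.model,
        messages=messages,
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "extraction",  # arbitrary label, required by the API
                "strict": True,
                "schema": schema,
            },
        },
        **kwargs,
    )
    return response.choices[0].message.content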

Retry Handler Design

import json
import time
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

from pydantic import ValidationError

T = TypeVar('T')

@dataclass
class RetryConfig:
    """Configuration for retry behavior."""
    max_retries: int = 3
    initial_delay: float = 0.5
    exponential_base: float = 2.0
    include_error_feedback: bool = True
    max_delay: float = 30.0

class RetryHandler:
    """Handles retries with self-correction feedback."""

    def __init__(self, config: RetryConfig, provider: LLMProvider):
        self.config = config
        self.provider = provider

    def with_retries(
        self,
        func: Callable[[Optional[list[dict]]], T],
        on_error: Optional[Callable[[Exception, int], list[dict]]] = None
    ) -> T:
        """Execute function with retries."""
        last_error = None
        messages = None

        for attempt in range(self.config.max_retries + 1):
            try:
                if attempt > 0 and last_error and on_error:
                    # Add error feedback to messages
                    messages = on_error(last_error, attempt)

                return func(messages)

            except (ValidationError, json.JSONDecodeError) as e:
                last_error = e

                if attempt < self.config.max_retries:
                    delay = min(
                        self.config.initial_delay * (self.config.exponential_base ** attempt),
                        self.config.max_delay
                    )
                    time.sleep(delay)

        raise last_error

    def format_validation_error(self, error: ValidationError) -> str:
        """Format validation error for LLM feedback."""
        lines = ["Your previous response had validation errors:"]
        for err in error.errors():
            field_path = ".".join(str(p) for p in err["loc"])
            lines.append(f"- {field_path}: {err['msg']}")
        lines.append("\nPlease fix these issues and try again.")
        return "\n".join(lines)
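
A hypothetical wiring of the handler: the attempt closure validates with Pydantic, and on_error turns the last failure into feedback messages (Person and base_messages stand in for whatever the caller built):

def extract_person(handler: RetryHandler, base_messages: list[dict]) -> Person:
    schema = Person.model_json_schema()

    def attempt(messages: Optional[list[dict]] = None) -> Person:
        raw = handler.provider.complete(messages or base_messages, schema)
        return Person.model_validate_json(raw)  # may raise ValidationError

    def on_error(error: Exception, attempt_no: int) -> list[dict]:
        # Assumes the failure was a ValidationError; a JSONDecodeError
        # could be formatted into feedback the same way.
        feedback = handler.format_validation_error(error)
        return base_messages + [{"role": "user", "content": feedback}]

    return handler.with_retries(attempt, on_error=on_error)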

Project Structure

structured_llm/
โ”œโ”€โ”€ src/
โ”‚   โ””โ”€โ”€ structured_llm/
โ”‚       โ”œโ”€โ”€ __init__.py
โ”‚       โ”œโ”€โ”€ client.py           # Main StructuredLLM class
โ”‚       โ”œโ”€โ”€ schemas.py          # Schema building utilities
โ”‚       โ”œโ”€โ”€ prompts.py          # Prompt construction
โ”‚       โ”œโ”€โ”€ retry.py            # Retry handling
โ”‚       โ”œโ”€โ”€ streaming.py        # Streaming support
โ”‚       โ”‚
โ”‚       โ”œโ”€โ”€ providers/
โ”‚       โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚       โ”‚   โ”œโ”€โ”€ base.py         # Abstract provider
โ”‚       โ”‚   โ”œโ”€โ”€ openai.py       # OpenAI implementation
โ”‚       โ”‚   โ”œโ”€โ”€ anthropic.py    # Anthropic implementation
โ”‚       โ”‚   โ””โ”€โ”€ ollama.py       # Ollama implementation
โ”‚       โ”‚
โ”‚       โ””โ”€โ”€ examples/
โ”‚           โ”œโ”€โ”€ entities.py     # Entity extraction schemas
โ”‚           โ”œโ”€โ”€ products.py     # Product extraction schemas
โ”‚           โ””โ”€โ”€ chat.py         # Chat response schemas
โ”‚
โ”œโ”€โ”€ tests/
โ”‚   โ”œโ”€โ”€ test_schemas.py
โ”‚   โ”œโ”€โ”€ test_retry.py
โ”‚   โ”œโ”€โ”€ test_providers.py
โ”‚   โ””โ”€โ”€ test_integration.py
โ”‚
โ”œโ”€โ”€ examples/
โ”‚   โ”œโ”€โ”€ simple_extraction.py
โ”‚   โ”œโ”€โ”€ batch_processing.py
โ”‚   โ””โ”€โ”€ streaming_example.py
โ”‚
โ”œโ”€โ”€ pyproject.toml
โ””โ”€โ”€ README.md

Phased Implementation Guide

Phase 1: Core Schema Infrastructure (2-3 hours)

Goal: Build the foundation for schema handling.

  1. Create base Pydantic models for extraction:
    # src/structured_llm/schemas.py
    from pydantic import BaseModel, Field
    from typing import Type, Any
    import json
    
    def model_to_json_schema(model: Type[BaseModel]) -> dict:
        """Convert Pydantic model to JSON Schema for LLM."""
        schema = model.model_json_schema()
        # Clean up schema for LLM consumption
        return _clean_schema(schema)
    
    def _clean_schema(schema: dict) -> dict:
        """Remove Pydantic-specific noise that confuses LLMs:
        inline $defs, drop auto-generated titles, etc."""
        # see the sketch below the Phase 1 checkpoint
        return schema  # placeholder until you implement the cleanup
    
  2. Create example extraction schemas:
    # src/structured_llm/examples/entities.py
    class Person(BaseModel):
        """A person extracted from text."""
        name: str = Field(..., description="Full name")
        age: Optional[int] = Field(None, ge=0, le=150)
        occupation: Optional[str] = None
    
  3. Write tests for schema generation:
    def test_simple_schema():
        schema = model_to_json_schema(Person)
        assert schema["properties"]["name"]["type"] == "string"
        assert "age" in schema["properties"]
    

Checkpoint: Can generate clean JSON schemas from Pydantic models.
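
One possible implementation of the _clean_schema stub above, as a sketch: it inlines $defs references so the LLM sees a single self-contained object, and drops the auto-generated title keys (assumes models are not recursive):

def _clean_schema(schema: dict) -> dict:
    defs = schema.pop("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            if "$ref" in node:
                # "#/$defs/Address" -> the Address sub-schema, resolved in turn
                return resolve(defs[node["$ref"].split("/")[-1]])
            return {k: resolve(v) for k, v in node.items() if k != "title"}
        if isinstance(node, list):
            return [resolve(v) for v in node]
        return node

    return resolve(schema)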

Phase 2: Prompt Construction (2 hours)

Goal: Build reliable prompt templates.

  1. Create prompt builder (a sketch of build_system_prompt follows this list):
    # src/structured_llm/prompts.py
    class PromptBuilder:
        def __init__(self, schema: dict, examples: list = None):
            self.schema = schema
            self.examples = examples or []
    
        def build_system_prompt(self) -> str:
            """Build system prompt with schema injection."""
            pass
    
        def build_extraction_prompt(self, text: str) -> str:
            """Build user prompt for extraction."""
            pass
    
        def build_retry_prompt(self, error: str) -> str:
            """Build prompt for retry with error feedback."""
            pass
    
  2. Test prompt construction with different schemas.
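
One way build_system_prompt might look (a sketch; the exact wording is yours to tune):

import json

def build_system_prompt(self) -> str:
    """Schema injection plus optional few-shot examples."""
    parts = [
        "You are a precise extraction engine.",
        "Respond ONLY with valid JSON conforming to this JSON schema:",
        json.dumps(self.schema, indent=2),
    ]
    for input_text, output in self.examples:
        parts.append(f"Example input: {input_text}")
        parts.append(f"Example output: {json.dumps(output)}")
    return "\n\n".join(parts)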

Checkpoint: Prompts correctly include schema and examples.

Phase 3: OpenAI Provider (3-4 hours)

Goal: Implement OpenAI integration with multiple modes.

  1. Create abstract provider base:
    # src/structured_llm/providers/base.py
    class LLMProvider(ABC):
        @abstractmethod
        def complete(self, messages: list, schema: dict) -> str:
            pass
    
  2. Implement OpenAI provider with three modes:
    • JSON mode (basic)
    • Function calling
    • Structured outputs (if model supports)
  3. Test with real API calls:
    def test_openai_extraction():
        provider = OpenAIProvider(model="gpt-4o")
        result = provider.complete(
            messages=[{"role": "user", "content": "John is 30"}],
            schema=Person.model_json_schema()
        )
        person = Person.model_validate_json(result)
        assert person.name == "John"
    

Checkpoint: Can extract structured data via OpenAI.

Phase 4: Retry and Self-Correction (2-3 hours)

Goal: Handle failures gracefully.

  1. Implement retry handler with exponential backoff
  2. Add error feedback for self-correction
  3. Create validation error formatter
  4. Test retry behavior:
    def test_retry_on_validation_error():
        # Mock LLM to return invalid then valid
        handler = RetryHandler(RetryConfig(max_retries=2), mock_provider)
        result = handler.with_retries(mock_extraction)
        assert result is not None
    

Checkpoint: System recovers from malformed responses.

Phase 5: Main Client API (2-3 hours)

Goal: Create the unified StructuredLLM interface.

  1. Implement main client (a sketch of extract follows this list):
    # src/structured_llm/client.py
    class StructuredLLM:
        def __init__(
            self,
            provider: str = "openai",
            model: str = "gpt-4o",
            retry_config: RetryConfig = None
        ):
            self.provider = self._create_provider(provider, model)
            self.retry = RetryHandler(retry_config or RetryConfig(), self.provider)
    
        def extract(
            self,
            schema: Type[T],
            text: str,
            system_prompt: str = None
        ) -> T:
            """Extract structured data from text."""
            pass
    
        def extract_many(
            self,
            schema: Type[T],
            texts: list[str],
            concurrency: int = 5
        ) -> list[T]:
            """Extract from multiple texts in parallel."""
            pass
    
  2. Add high-level convenience methods
  3. Write comprehensive integration tests
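
A sketch of how extract might compose the pieces built in earlier phases (PromptBuilder and RetryHandler as defined above; build_extraction_prompt is the Phase 2 method):

def extract(self, schema: Type[T], text: str, system_prompt: str = None) -> T:
    json_schema = schema.model_json_schema()
    builder = PromptBuilder(json_schema)
    base = [
        {"role": "system",
         "content": system_prompt or builder.build_system_prompt()},
        {"role": "user", "content": builder.build_extraction_prompt(text)},
    ]

    def attempt(messages=None):
        raw = self.provider.complete(messages or base, json_schema)
        return schema.model_validate_json(raw)

    return self.retry.with_retries(attempt)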

Checkpoint: Can use simple API for extractions.

Phase 6: Additional Providers and Streaming (3-4 hours)

Goal: Support more providers and streaming.

  1. Implement Anthropic provider (see the tool-use sketch after this list):
    # src/structured_llm/providers/anthropic.py
    class AnthropicProvider(LLMProvider):
        def complete(self, messages, schema):
            # Use tool_use for structured output
            pass
    
  2. Add streaming support:
    async def stream(
        self,
        schema: Type[T],
        text: str
    ) -> AsyncIterator[PartialModel[T]]:
        """Stream partial results as they generate."""
        pass
    
  3. Create CLI tool for command-line usage
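
A sketch of the tool-use approach referenced in step 1 (assuming the official anthropic SDK; the model name is an example, and stream plus the capability properties are omitted):

import json
import anthropic

class AnthropicProvider(LLMProvider):
    def __init__(self, model: str = "claude-3-5-sonnet-latest"):
        self.model = model
        self.client = anthropic.Anthropic()

    def complete(self, messages: list[dict], schema: dict, **kwargs) -> str:
        response = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            tools=[{
                "name": "extract",
                "description": "Record the extracted data",
                "input_schema": schema,
            }],
            tool_choice={"type": "tool", "name": "extract"},  # force the tool
            messages=messages,
        )
        # The structured data arrives as the forced tool call's input
        block = next(b for b in response.content if b.type == "tool_use")
        return json.dumps(block.input)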

Checkpoint: Full-featured structured LLM system.


Testing Strategy

Unit Tests

# tests/test_schemas.py
import pytest
from typing import Optional

from pydantic import BaseModel, Field, ValidationError
from structured_llm.schemas import model_to_json_schema

class TestSchemaGeneration:
    def test_simple_model(self):
        class Simple(BaseModel):
            name: str
            age: int

        schema = model_to_json_schema(Simple)
        assert schema["type"] == "object"
        assert "name" in schema["properties"]
        assert schema["properties"]["age"]["type"] == "integer"

    def test_nested_model(self):
        class Address(BaseModel):
            city: str

        class Person(BaseModel):
            name: str
            address: Address

        schema = model_to_json_schema(Person)
        # Verify nested schema is properly included
        assert "address" in schema["properties"]

    def test_optional_fields(self):
        class WithOptional(BaseModel):
            required: str
            optional: Optional[str] = None

        schema = model_to_json_schema(WithOptional)
        assert "required" in schema.get("required", [])
        assert "optional" not in schema.get("required", [])

    def test_field_descriptions(self):
        class WithDescriptions(BaseModel):
            name: str = Field(..., description="The person's name")

        schema = model_to_json_schema(WithDescriptions)
        assert schema["properties"]["name"]["description"] == "The person's name"


# tests/test_retry.py
class TestRetryHandler:
    def test_succeeds_first_try(self):
        config = RetryConfig(max_retries=3)
        handler = RetryHandler(config, mock_provider)

        attempts = []
        def succeeding_func(messages=None):
            attempts.append(1)
            return {"name": "John", "age": 30}

        result = handler.with_retries(succeeding_func)
        assert len(attempts) == 1
        assert result["name"] == "John"

    def test_retries_on_validation_error(self):
        config = RetryConfig(max_retries=3)
        handler = RetryHandler(config, mock_provider)

        attempts = []
        def failing_then_succeeding(messages=None):
            attempts.append(1)
            if len(attempts) < 2:
                raise ValidationError(...)
            return {"name": "John", "age": 30}

        result = handler.with_retries(failing_then_succeeding)
        assert len(attempts) == 2

    def test_exhausts_retries(self):
        config = RetryConfig(max_retries=2)
        handler = RetryHandler(config, mock_provider)

        def always_failing(messages=None):
            raise ValidationError(...)

        with pytest.raises(ValidationError):
            handler.with_retries(always_failing)

    def test_error_feedback_included(self):
        config = RetryConfig(include_error_feedback=True)
        handler = RetryHandler(config, mock_provider)

        error_messages = []
        def capture_messages(error, attempt):
            error_messages.append(handler.format_validation_error(error))
            return [{"role": "system", "content": error_messages[-1]}]

        # ... test that error messages are properly formatted


# tests/test_prompts.py
class TestPromptBuilder:
    def test_schema_injection(self):
        schema = {"properties": {"name": {"type": "string"}}}
        builder = PromptBuilder(schema)

        system_prompt = builder.build_system_prompt()
        assert "name" in system_prompt
        assert "string" in system_prompt

    def test_examples_included(self):
        builder = PromptBuilder(
            schema={},
            examples=[("Input text", {"output": "value"})]
        )

        prompt = builder.build_system_prompt()
        assert "Input text" in prompt
        assert "output" in prompt

Integration Tests

# tests/test_integration.py
import pytest
from structured_llm import StructuredLLM
from structured_llm.examples.entities import Person, DocumentAnalysis

@pytest.mark.integration
class TestOpenAIIntegration:
    """Tests that require actual API calls."""

    @pytest.fixture
    def client(self):
        return StructuredLLM(provider="openai", model="gpt-4o-mini")

    def test_simple_extraction(self, client):
        person = client.extract(
            schema=Person,
            text="John Smith is a 30-year-old software engineer."
        )

        assert isinstance(person, Person)
        assert person.name == "John Smith"
        assert person.age == 30
        assert person.occupation == "software engineer"

    def test_missing_optional_fields(self, client):
        person = client.extract(
            schema=Person,
            text="Someone named Alice was mentioned."
        )

        assert person.name == "Alice"
        assert person.age is None  # Not mentioned

    def test_complex_nested_extraction(self, client):
        analysis = client.extract(
            schema=DocumentAnalysis,
            text="""
            Apple Inc. announced today that CEO Tim Cook will present
            the new iPhone at their Cupertino headquarters. Analysts
            expect strong sales despite economic headwinds.
            """
        )

        assert len(analysis.entities) > 0
        assert any(e.type == "organization" for e in analysis.entities)
        assert any(e.type == "person" for e in analysis.entities)

    def test_batch_extraction(self, client):
        texts = [
            "John is 25.",
            "Mary is 30.",
            "Bob is 45."
        ]

        people = client.extract_many(
            schema=Person,
            texts=texts,
            concurrency=3
        )

        assert len(people) == 3
        assert all(isinstance(p, Person) for p in people)

    def test_retry_on_malformed_response(self, client):
        # This tests the retry mechanism with a tricky prompt
        # that might produce invalid output on first try

        person = client.extract(
            schema=Person,
            text="The age is thirty and name is 123"  # Tricky!
        )

        # Should eventually succeed
        assert isinstance(person, Person)


@pytest.mark.integration
class TestAnthropicIntegration:
    @pytest.fixture
    def client(self):
        return StructuredLLM(provider="anthropic", model="claude-3-sonnet")

    def test_simple_extraction(self, client):
        person = client.extract(
            schema=Person,
            text="Jane Doe is 28 years old."
        )

        assert person.name == "Jane Doe"
        assert person.age == 28

Mock Tests for Offline Development

# tests/test_with_mocks.py
from unittest.mock import Mock, patch
from structured_llm import StructuredLLM

class TestWithMocks:
    def test_provider_called_correctly(self):
        mock_provider = Mock()
        mock_provider.complete.return_value = '{"name": "Test", "age": 25}'

        with patch('structured_llm.client.OpenAIProvider', return_value=mock_provider):
            client = StructuredLLM()
            person = client.extract(Person, "Test is 25")

        mock_provider.complete.assert_called_once()
        args = mock_provider.complete.call_args
        assert "Test is 25" in str(args)

    def test_schema_included_in_request(self):
        mock_provider = Mock()
        mock_provider.complete.return_value = '{"name": "Test", "age": 25}'

        with patch('structured_llm.client.OpenAIProvider', return_value=mock_provider):
            client = StructuredLLM()
            client.extract(Person, "Some text")

        call_kwargs = mock_provider.complete.call_args.kwargs
        assert "schema" in call_kwargs
        assert "name" in call_kwargs["schema"]["properties"]

Common Pitfalls and Debugging

Pitfall 1: Schema Too Complex for LLM

Problem: LLM fails to produce valid JSON for deeply nested or complex schemas.

Symptom:

ValidationError: 5 validation errors for ComplexModel
  nested.deeply.field1: Field required
  nested.deeply.field2: Input should be a valid string
  ...

Solution: Break extraction into steps or simplify schema:

# Instead of one complex schema
class ComplexPerson(BaseModel):
    name: str
    addresses: List[Address]
    employment_history: List[Job]
    education: List[Degree]

# Use simpler sequential extraction
basic_info = client.extract(BasicPerson, text)
addresses = client.extract(List[Address], text)
# Combine afterward

Pitfall 2: Field Descriptions Not Clear Enough

Problem: LLM misunderstands what a field should contain.

Symptom: Extracted values are technically valid but semantically wrong.

Solution: Add detailed descriptions and examples:

# BAD
class Product(BaseModel):
    price: float

# GOOD
class Product(BaseModel):
    price: float = Field(
        ...,
        description="Price in USD as a decimal number (e.g., 29.99). "
                    "Do NOT include currency symbol or thousands separators.",
        examples=[29.99, 149.00, 9.50]
    )

Pitfall 3: Optional Fields Treated as Required

Problem: LLM returns null/empty for required fields or omits optional fields entirely.

Symptom:

ValidationError: 1 validation error for Person
  occupation: Field required

Solution: Be explicit about optionality in descriptions:

class Person(BaseModel):
    name: str = Field(..., description="REQUIRED: The person's full name")
    occupation: Optional[str] = Field(
        None,
        description="OPTIONAL: Job title if mentioned, otherwise omit or set to null"
    )

Pitfall 4: JSON Parsing Failures

Problem: LLM includes markdown formatting or extra text around JSON.

Symptom:

json.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Example bad output:

Here's the extracted data:
```json
{"name": "John"}
```

Solution: Strip markdown and extract JSON:

import json
import re

def extract_json(response: str) -> str:
    """Extract JSON from potentially wrapped response."""
    # Try direct parse first
    try:
        json.loads(response)
        return response
    except json.JSONDecodeError:
        pass

    # Look for JSON in code blocks
    json_match = re.search(r'```(?:json)?\s*([\s\S]*?)\s*```', response)
    if json_match:
        return json_match.group(1)

    # Look for JSON object/array
    json_match = re.search(r'(\{[\s\S]*\}|\[[\s\S]*\])', response)
    if json_match:
        return json_match.group(1)

    raise ValueError(f"Could not extract JSON from: {response[:100]}...")

Pitfall 5: Retry Loop Never Succeeds

Problem: Same validation error occurs on every retry.

Symptom: Max retries exceeded with identical errors.

Solution: Ensure error feedback is actually reaching the LLM:

def build_retry_prompt(self, original_messages: list, error: ValidationError) -> list:
    """Build prompt that includes error feedback."""
    error_feedback = {
        "role": "user",
        "content": f"""Your previous response was invalid.

Errors:
{self.format_validation_error(error)}

Please provide a corrected response that fixes these issues.
Remember to:
1. Return ONLY valid JSON
2. Include all required fields
3. Use correct data types (integers for ages, strings for names, etc.)
"""
    }

    return original_messages + [error_feedback]

Pitfall 6: Rate Limiting and Timeout Issues

Problem: API calls fail due to rate limits or timeouts.

Symptom:

openai.RateLimitError: Rate limit exceeded

Solution: Implement backoff and retry for rate limits:

from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(5),
    wait=wait_exponential(multiplier=1, min=1, max=60),
    retry=retry_if_exception_type(openai.RateLimitError)
)
def call_with_retry(self, *args, **kwargs):
    return self.client.chat.completions.create(*args, **kwargs)

Pitfall 7: Token Limit Exceeded

Problem: Schema + prompt + response exceeds the model's context window.

Symptom:

openai.BadRequestError: This model's maximum context length is 8192 tokens

Solution: Estimate and manage token usage:

def estimate_tokens(self, text: str) -> int:
    """Rough token estimate (4 chars per token average)."""
    return len(text) // 4

def check_token_budget(self, schema: dict, text: str, max_tokens: int = 8000):
    schema_tokens = self.estimate_tokens(json.dumps(schema))
    text_tokens = self.estimate_tokens(text)
    overhead = 500  # System prompt, formatting

    available_for_response = max_tokens - schema_tokens - text_tokens - overhead

    if available_for_response < 500:
        raise ValueError(
            f"Input too long. Schema: {schema_tokens}, Text: {text_tokens}, "
            f"Only {available_for_response} tokens left for response."
        )
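
The 4-characters-per-token heuristic is convenient but can be off by a wide margin for code or non-English text. For exact counts against OpenAI models, the tiktoken library works; a minimal sketch, assuming tiktoken is installed:

import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Exact token count using the model's own tokenizer."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))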

Extensions and Challenges

Extension 1: Streaming Partial Results

Implement streaming for long extractions to show progress. The sketch below leans on two helpers: a prompt builder (build_messages, hypothetical here) and a lenient parser (parse_partial_json, sketched after this block):

import json
from typing import AsyncIterator, Generic, Type, TypeVar

from pydantic import BaseModel

T = TypeVar('T', bound=BaseModel)

class PartialModel(Generic[T]):
    """Represents a partially extracted model."""
    def __init__(self, model_class: Type[T], partial_data: dict):
        self.model_class = model_class
        self.partial_data = partial_data
        self.is_complete = False

    def get_partial(self) -> dict:
        return self.partial_data

    def finalize(self) -> T:
        return self.model_class.model_validate(self.partial_data)

async def stream_extraction(
    self,
    schema: Type[T],
    text: str
) -> AsyncIterator[PartialModel[T]]:
    """Stream partial results as they generate."""
    buffer = ""
    partial_data = {}
    messages = self.build_messages(schema, text)  # hypothetical prompt builder

    async for chunk in self.provider.stream(messages, schema):
        buffer += chunk

        # Try to parse the JSON accumulated so far; skip chunks that
        # leave the buffer in an unparseable state
        try:
            partial_data = parse_partial_json(buffer)
            yield PartialModel(schema, partial_data)
        except (ValueError, json.JSONDecodeError):
            continue

    # Final complete result
    final = PartialModel(schema, partial_data)
    final.is_complete = True
    yield final
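
parse_partial_json does the nontrivial work here. A naive sketch, good enough for flat objects: it closes unbalanced braces so fully streamed fields become readable early. A production version would track strings and arrays too, or use a streaming JSON parser:

import json

def parse_partial_json(buffer: str) -> dict:
    """Best-effort parse of possibly incomplete JSON (naive sketch)."""
    try:
        return json.loads(buffer)
    except json.JSONDecodeError:
        # Drop a trailing comma and close any unbalanced braces; this
        # still raises for buffers cut off mid-string, which the caller
        # treats as "no new partial result yet"
        repaired = buffer.rstrip().rstrip(',')
        repaired += '}' * (repaired.count('{') - repaired.count('}'))
        return json.loads(repaired)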

Extension 2: Schema Evolution and Versioning

Handle schema changes gracefully:

class SchemaRegistry:
    """Manage multiple versions of extraction schemas."""

    def __init__(self):
        self.schemas: dict[str, dict[str, type]] = {}

    def register(self, name: str, version: str, schema: type):
        if name not in self.schemas:
            self.schemas[name] = {}
        self.schemas[name][version] = schema

    def get(self, name: str, version: str = "latest") -> type:
        if version == "latest":
            # Sort numerically, not lexicographically, so "10.0" ranks above "2.0"
            versions = sorted(
                self.schemas[name],
                key=lambda v: tuple(int(part) for part in v.split("."))
            )
            version = versions[-1]
        return self.schemas[name][version]

    def migrate(self, data: dict, from_version: str, to_version: str) -> dict:
        """Migrate extracted data between schema versions."""
        pass

# Usage
registry = SchemaRegistry()
registry.register("person", "1.0", PersonV1)
registry.register("person", "2.0", PersonV2)  # Added new fields

schema = registry.get("person", "latest")
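
The migrate hook is left unimplemented above. As one hypothetical example, if the 2.0 schema only added an optional middle_name field, the 1.0 → 2.0 step could be:

def migrate_person_1_to_2(data: dict) -> dict:
    """Hypothetical migration: 2.0 adds an optional `middle_name` field."""
    migrated = dict(data)
    migrated.setdefault("middle_name", None)  # default for the new field
    return migrated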

Extension 3: Extraction with Confidence Scores

Add confidence scoring to extractions:

from typing import Generic, Optional, TypeVar

from pydantic import BaseModel, Field

T = TypeVar('T', bound=BaseModel)

class ExtractionResult(BaseModel, Generic[T]):
    """Extraction result with confidence metadata."""
    data: T
    confidence: float = Field(..., ge=0, le=1)
    extraction_notes: Optional[str] = None
    fields_uncertain: list[str] = Field(default_factory=list)

class ConfidenceAwareExtractor:
    def extract_with_confidence(
        self,
        schema: Type[T],
        text: str
    ) -> ExtractionResult[T]:
        # First extraction
        data = self.extract(schema, text)

        # Ask LLM to rate confidence
        confidence_schema = ConfidenceRating
        rating = self.extract(
            confidence_schema,
            f"Rate your confidence in this extraction:\n{data.model_dump_json()}"
        )

        return ExtractionResult(
            data=data,
            confidence=rating.overall_confidence,
            fields_uncertain=rating.uncertain_fields
        )
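
ConfidenceRating is referenced above but never defined in the snippet. A plausible shape, matching the two attributes the extractor reads:

class ConfidenceRating(BaseModel):
    """Assumed companion schema for the confidence-rating pass."""
    overall_confidence: float = Field(..., ge=0, le=1)
    uncertain_fields: list[str] = Field(default_factory=list)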

Extension 4: Multi-LLM Consensus

Use multiple LLMs and take consensus:

class ConsensusExtractor:
    """Extract with multiple LLMs and take consensus."""

    def __init__(self, providers: list[LLMProvider]):
        self.providers = providers

    def extract_consensus(
        self,
        schema: Type[T],
        text: str,
        min_agreement: float = 0.66
    ) -> T:
        results = []
        for provider in self.providers:
            try:
                result = provider.extract(schema, text)
                results.append(result)
            except Exception:
                continue

        if not results:
            raise ValueError("All providers failed")

        # Find consensus (simplified - real impl would be smarter)
        return self._find_consensus(results, min_agreement)

    def _find_consensus(self, results: list[T], min_agreement: float) -> T:
        # Compare results field by field
        # Return most common values
        pass
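
For scalar fields, the consensus step can be a simple per-field majority vote. A minimal sketch, assuming all field values are hashable:

from collections import Counter

def _find_consensus(self, results: list[T], min_agreement: float) -> T:
    """Per-field majority vote over the successful extractions (sketch)."""
    schema = type(results[0])
    consensus = {}
    for field_name in schema.model_fields:
        values = [getattr(r, field_name) for r in results]
        most_common, count = Counter(values).most_common(1)[0]
        if count / len(values) < min_agreement:
            raise ValueError(f"No consensus on field '{field_name}'")
        consensus[field_name] = most_common
    return schema.model_validate(consensus)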

Extension 5: Extraction Pipeline DSL

Create a domain-specific language for complex extraction pipelines:

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ExtractionStep:
    schema: type
    depends_on: Optional[list[str]] = None
    condition: Optional[Callable] = None

class ExtractionPipeline:
    """Define multi-step extraction pipelines."""

    def __init__(self):
        self.steps: dict[str, ExtractionStep] = {}

    def add_step(self, name: str, step: ExtractionStep):
        self.steps[name] = step
        return self

    def run(self, text: str) -> dict[str, BaseModel]:
        results = {}

        for name, step in self._topological_sort():
            # Check condition
            if step.condition and not step.condition(results):
                continue

            # Build context from dependencies
            context = {
                dep: results[dep].model_dump()
                for dep in (step.depends_on or [])
            }

            # Extract (self.client is assumed to be injected via the constructor)
            results[name] = self.client.extract(
                step.schema,
                text,
                context=context
            )

        return results

# Usage
pipeline = ExtractionPipeline()
pipeline.add_step("basic", ExtractionStep(BasicInfo))
pipeline.add_step("details", ExtractionStep(
    DetailedInfo,
    depends_on=["basic"],
    condition=lambda r: r["basic"].needs_details
))
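
The _topological_sort helper is not shown above; the standard library's graphlib (Python 3.9+) can supply the ordering. A minimal sketch:

from graphlib import TopologicalSorter

def _topological_sort(self):
    """Yield (name, step) pairs so every step runs after its dependencies."""
    graph = {name: set(step.depends_on or []) for name, step in self.steps.items()}
    for name in TopologicalSorter(graph).static_order():
        yield name, self.steps[name]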

Real-World Connections

Where This Pattern Appears

  1. AI-Powered Data Entry - Extracting form data from documents
  2. Chatbot Response Structuring - Making chatbot outputs machine-readable
  3. Content Classification - Categorizing content with consistent schemas
  4. API Response Generation - Using LLMs to generate structured API responses
  5. ETL Pipelines - Transforming unstructured to structured data

Industry Examples

  • Anthropic Claude Tool Use - Structured outputs via tool definitions
  • OpenAI Structured Outputs - Native JSON Schema compliance
  • LangChain Output Parsers - Framework for structured LLM outputs
  • Instructor Library - Popular library for Pydantic + LLM integration
  • Outlines - Constrained text generation with regex/JSON

Production Considerations

  1. Cost Management
    • Schema injection adds tokens (and cost)
    • Cache frequent extractions
    • Use cheaper models for simple schemas
  2. Latency
    • Retries add latency
    • Consider async/parallel extraction
    • Streaming for user-facing applications
  3. Reliability
    • Always have fallback behavior (see the sketch after this list)
    • Log failed extractions for analysis
    • Monitor extraction success rates
  4. Security
    • Validate extracted data before use
    • Donโ€™t trust LLM output for security decisions
    • Sanitize inputs to prevent prompt injection
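
A minimal sketch of the first two reliability points: wrap extraction so failures are logged for later analysis and a caller-supplied fallback is returned instead of raising. client.extract stands in for whatever extraction entry point the rest of the project uses:

import logging

logger = logging.getLogger(__name__)

def extract_with_fallback(client, schema, text, fallback=None):
    """Return a fallback value instead of raising, and log the failure."""
    try:
        return client.extract(schema, text)
    except Exception as exc:
        # Keep enough context to analyze failed extractions offline
        logger.warning("Extraction failed for %s: %s", schema.__name__, exc)
        return fallback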

Self-Assessment Checklist

Core Understanding

  • Can I explain why structured LLM output is important for production systems?
  • Can I describe the difference between JSON mode, function calling, and structured outputs?
  • Can I explain how Pydanticโ€™s JSON Schema generation works?
  • Can I describe the self-correction retry pattern?

Implementation Skills

  • Can I generate a clean JSON Schema from a Pydantic model?
  • Can I build prompts that reliably produce structured output?
  • Can I implement retry logic with error feedback?
  • Can I handle complex nested schemas?

Provider Knowledge

  • Can I integrate with OpenAIโ€™s structured output features?
  • Can I use Anthropicโ€™s tool use for structured extraction?
  • Can I abstract providers behind a common interface?

Production Readiness

  • Can I handle edge cases (empty input, no matches, partial data)?
  • Can I manage token budgets for complex schemas?
  • Can I implement streaming for long extractions?
  • Can I test structured extraction without API calls?

Mastery Indicators

  • System handles all validation errors gracefully
  • Extraction succeeds on complex real-world text
  • Provider abstraction allows easy switching
  • Tests cover both success and failure cases
  • Documentation is comprehensive
