Agent Configuration¶

An agent configuration defines the LLM provider and the model settings used by AI-powered workflow steps.

Basic Configuration¶

agents/default.yaml

kind: "AgentModel"
version: "v1"
metadata:
  name: "default"
spec:
  provider: "ollama"
  model: "llama3"
  api_base: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 1024

Provider Configurations¶

Ollama (Local)¶

kind: "AgentModel"
version: "v1"
metadata:
  name: "local"
spec:
  provider: "ollama"
  model: "llama3"
  api_base: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 2048

Environment setup:

# .env
LITELLM_OLLAMA_BASE_URL=http://localhost:11434

OpenAI¶

kind: "AgentModel"
version: "v1"
metadata:
  name: "openai"
spec:
  provider: "openai"
  model: "gpt-4o"
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.7
  max_tokens: 4096

Environment setup:

# .env
OPENAI_API_KEY=sk-...

Anthropic¶

kind: "AgentModel"
version: "v1"
metadata:
  name: "claude"
spec:
  provider: "anthropic"
  model: "claude-3-5-sonnet-20241022"
  api_key: "${ANTHROPIC_API_KEY}"
  temperature: 0.7
  max_tokens: 4096

Environment setup:

# .env
ANTHROPIC_API_KEY=sk-ant-...

LiteLLM Proxy¶

For routing through a LiteLLM proxy:

kind: "AgentModel"
version: "v1"
metadata:
  name: "proxy"
spec:
  provider: "litellm"
  model: "gpt-4"                    # Model name configured in LiteLLM
  api_base: "${LITELLM_PROXY_URL}"
  api_key: "${LITELLM_MASTER_KEY}"

Configuration Properties¶

Property	Type	Required	Description
`provider`	string	Yes	Provider name: `ollama`, `openai`, `anthropic`, `litellm`
`model`	string	Yes	Model identifier
`api_base`	string	For ollama/litellm	API endpoint URL
`api_key`	string	For openai/anthropic	API authentication key
`temperature`	float	No	Sampling randomness (0.0-2.0, default: 0.7)
`max_tokens`	integer	No	Maximum response length in tokens
`top_p`	float	No	Nucleus sampling threshold (0.0-1.0)
`timeout`	integer	No	Request timeout in seconds
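The rules in the table can be checked before a workflow runs. The sketch below is a hypothetical validator written from the table alone, not tuvl's actual validation code; the names `REQUIRED_BY_PROVIDER` and `validate_spec` are assumptions made for illustration.

```python
# Hypothetical validator: the rules mirror the properties table above,
# they are not taken from tuvl's source.
REQUIRED_BY_PROVIDER = {
    "ollama": ["api_base"],
    "litellm": ["api_base"],
    "openai": ["api_key"],
    "anthropic": ["api_key"],
}


def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec looks valid."""
    problems = []
    provider = spec.get("provider")
    if provider not in REQUIRED_BY_PROVIDER:
        problems.append(f"unknown provider: {provider!r}")
    if not spec.get("model"):
        problems.append("model is required")
    # Provider-specific required fields from the "Required" column.
    for key in REQUIRED_BY_PROVIDER.get(provider, []):
        if not spec.get(key):
            problems.append(f"{key} is required for provider {provider!r}")
    temperature = spec.get("temperature", 0.7)  # documented default
    if not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    top_p = spec.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        problems.append("top_p must be between 0.0 and 1.0")
    return problems
```

Calling `validate_spec` on the `spec` mapping of a loaded preset returns every problem at once, which tends to be friendlier than failing on the first one.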

Using Agents in Workflows¶

Reference an agent preset by name, or specify the model inline:

Using Agent Preset¶

- id: "classify"
  kind: "agent"
  agent:
    model: "default"    # Uses agents/default.yaml
    prompt: "..."

Inline Model Specification¶

- id: "classify"
  kind: "agent"
  agent:
    model: "ollama/llama3"    # LiteLLM format
    prompt: "..."
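One way to picture the difference between the two forms: a `model` value that contains a slash is an inline LiteLLM model string, and anything else is a preset name that maps to a file under `agents/`. This rule is an assumption inferred from the two examples above, and `classify_model_ref` is a hypothetical helper, not part of tuvl.

```python
def classify_model_ref(ref: str) -> tuple[str, str]:
    """Classify a step's `model` value as a preset name or an inline model string.

    Assumption: inline strings always contain "/" and preset names never do.
    """
    if "/" in ref:
        # Inline form such as "ollama/llama3" is passed through unchanged.
        return ("inline", ref)
    # Preset form such as "default" points at agents/default.yaml.
    return ("preset", f"agents/{ref}.yaml")
```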

LiteLLM Model Strings¶

tuvl uses LiteLLM's `provider/model` format for model identifiers:

Provider	Format	Example
Ollama	`ollama/{model}`	`ollama/llama3`
OpenAI	`openai/{model}`	`openai/gpt-4o`
Anthropic	`anthropic/{model}`	`anthropic/claude-3-5-sonnet-20241022`
Azure	`azure/{deployment}`	`azure/gpt-4-deployment`
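The formats in the table all follow the same prefix pattern, so a model string can be split into its provider and model parts at the first slash. The helpers below are a minimal sketch with hypothetical names; they do not call LiteLLM and only illustrate the string layout.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split "provider/model" at the first slash."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


def join_model_string(provider: str, name: str) -> str:
    """Build a model string from its two parts."""
    return f"{provider}/{name}"
```

Splitting at the first slash only keeps any later slashes inside the model part intact.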

Agent Step Configuration¶

Full agent step specification:

- id: "analyze"
  kind: "agent"
  agent:
    model: "ollama/llama3"

    # Prompts
    system: |
      You are a helpful assistant that analyzes data.
      Always respond with valid JSON.

    prompt: |
      Analyze this customer:
      Name: {{ name }}
      Company: {{ company }}

      Return JSON: {"score": 1-100, "tags": ["tag1", "tag2"]}

    # Output handling
    output:
      format: json              # json | text | signal
      map:
        score: customer_score   # Map LLM keys to context
        tags: customer_tags
      signal_from: score        # Use for routing

    # Error handling
    retry:
      attempts: 3
      on: [parse_error, timeout, rate_limit]
      backoff: 2

    timeout: 30

Output Formats¶

JSON Format¶

agent:
  prompt: 'Return JSON: {"decision": "approve" | "reject"}'
  output:
    format: json
    map:
      decision: approval_decision

The LLM response is parsed as JSON and mapped to context keys.
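The parse-and-map step can be sketched in a few lines of Python. This is an illustration of the behavior described above, not tuvl's implementation; `apply_output_map` is a hypothetical name, and skipping keys that are absent from the response is an assumption.

```python
import json


def apply_output_map(raw_response: str, output_map: dict[str, str], context: dict) -> dict:
    """Parse the LLM response as JSON and copy mapped keys into a new context."""
    data = json.loads(raw_response)  # raises json.JSONDecodeError on invalid JSON
    updated = dict(context)  # leave the caller's context untouched
    for llm_key, context_key in output_map.items():
        if llm_key in data:  # assumption: keys missing from the response are skipped
            updated[context_key] = data[llm_key]
    return updated
```

With the map from the example, a response of `{"decision": "approve"}` stores `"approve"` under `approval_decision`.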

Text Format¶

agent:
  prompt: "Summarize this document in one paragraph."
  output:
    format: text
    map:
      response: summary

The raw text response is stored in the context key that `response` maps to (here, `summary`).

Signal Format¶

agent:
  prompt: "Respond with one word: approve, reject, or review"
  output:
    format: signal

The response is used directly as the routing signal.

Retry Configuration¶

Handle transient errors with retries:

agent:
  retry:
    attempts: 3        # Total attempts (including first)
    on:                # Error types to retry
      - parse_error    # JSON parsing failed
      - timeout        # Request timed out
      - rate_limit     # Rate limit exceeded
      - server_error   # 5xx response
    backoff: 2         # Exponential backoff multiplier

With `backoff: 2`, the wait before each retry doubles:

Attempt 1: Immediate
Attempt 2: Wait 2 seconds
Attempt 3: Wait 4 seconds
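The schedule above is consistent with a wait of `backoff ** (attempt - 1)` seconds before every attempt after the first. That formula, including the implied base delay of one second, is an assumption inferred from the listed values; `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(attempts: int, backoff: float) -> list[float]:
    """Return the wait in seconds before each attempt, starting with attempt 1."""
    return [
        0.0 if attempt == 1 else float(backoff) ** (attempt - 1)
        for attempt in range(1, attempts + 1)
    ]


print(backoff_delays(3, 2))  # [0.0, 2.0, 4.0]
```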

Multiple Agent Presets¶

Define different presets for different use cases:

agents/fast.yaml

kind: "AgentModel"
version: "v1"
metadata:
  name: "fast"
spec:
  provider: "ollama"
  model: "mistral"
  api_base: "http://localhost:11434"
  temperature: 0.3
  max_tokens: 512

agents/creative.yaml

kind: "AgentModel"
version: "v1"
metadata:
  name: "creative"
spec:
  provider: "openai"
  model: "gpt-4o"
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.9
  max_tokens: 2048

Use in workflows:

- id: "quick_check"
  kind: "agent"
  agent:
    model: "fast"
    prompt: "..."

- id: "write_copy"
  kind: "agent"
  agent:
    model: "creative"
    prompt: "..."

Best Practices¶

1. Use Presets for Common Configs¶

# Define once
# agents/default.yaml
spec:
  provider: "ollama"
  model: "llama3"
  temperature: 0.7

# Use everywhere
agent:
  model: "default"

2. Keep API Keys in Environment¶

# Good
api_key: "${OPENAI_API_KEY}"

# Never do this
api_key: "sk-actual-key-here"
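The `${VAR}` placeholders are resolved from the environment when the configuration is loaded. The sketch below shows one way such substitution can work; it is not tuvl's loader, `expand_env` is a hypothetical name, and failing on an unset variable is an assumption.

```python
import os
import re

# Matches ${NAME} where NAME is a valid environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace each ${NAME} in value with the matching environment variable."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Assumption: an unset variable is an error rather than an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)
```

Failing loudly on a missing variable surfaces a forgotten `.env` entry at load time instead of as an authentication error later.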

3. Set Appropriate Timeouts¶

# Short for simple tasks
timeout: 15

# Longer for complex analysis
timeout: 60

4. Use Structured Output¶

# Good - clear JSON structure
prompt: |
  Return JSON: {"category": "A" | "B" | "C"}

# Harder to parse
prompt: |
  What category is this?

5. Handle All Outcomes¶

agent:
  output:
    signal_from: decision
routes:
  approve: "process"
  reject: "notify"
  error: "manual_review"  # Always handle errors
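To see why the `error` route matters, consider how a signal might be matched against `routes`. The sketch below is hypothetical: `next_step` is not a tuvl function, and both the whitespace and case normalization and the fallback to `error` for unknown signals are assumptions.

```python
def next_step(signal: object, routes: dict[str, str]) -> str:
    """Pick the next step id for a signal, falling back to the error route."""
    key = str(signal).strip().lower()  # LLM output often carries stray whitespace or casing
    if key in routes:
        return routes[key]
    # Assumption: anything unrecognized is treated as an error outcome.
    return routes["error"]


routes = {"approve": "process", "reject": "notify", "error": "manual_review"}
print(next_step(" Approve\n", routes))  # process
print(next_step("maybe", routes))       # manual_review
```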

Troubleshooting¶

Connection Refused (Ollama)¶

Connection refused: http://localhost:11434

Ensure Ollama is running: `ollama serve`
Check the port: `curl http://localhost:11434/api/version`

Invalid API Key¶

AuthenticationError: Invalid API key

Verify the key in your .env file
Check for extra whitespace or newlines
Ensure the key has proper permissions

Rate Limiting¶

RateLimitError: Rate limit exceeded

Add a retry configuration that includes `rate_limit` and a backoff
Consider using multiple API keys
Implement request queuing

JSON Parse Errors¶

JSONDecodeError: Extra data

Improve prompts to request clean JSON
Add "Return ONLY valid JSON" to the system prompt
Include `parse_error` in the retry `on` list
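`Extra data` is the message Python's `json` module produces when valid JSON is followed by more text, which is common when a model adds commentary around its answer. If you post-process responses yourself, the standard library's `json.JSONDecoder.raw_decode` can pull out the first JSON object and ignore the rest. This is a sketch of that general technique with a hypothetical helper name, not something tuvl is documented to do.

```python
import json


def first_json_object(text: str) -> dict:
    """Return the first JSON object embedded in text, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and tolerates trailing text.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in response")
```

This recovers responses such as `Sure! {"score": 87} Hope that helps.`, but a tighter prompt remains the better fix because it avoids the stray text in the first place.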

Next Steps¶

Workflows — Using agents in workflows
Nodes — Combining agents with code
Examples — Complete examples