Agent Configuration

Agents define LLM providers and their settings for AI-powered workflow steps.

Basic Configuration

agents/default.yaml
kind: "AgentModel"
version: "v1"
metadata:
  name: "default"
spec:
  provider: "ollama"
  model: "llama3"
  api_base: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 1024

Provider Configurations

Ollama (Local)

kind: "AgentModel"
version: "v1"
metadata:
  name: "local"
spec:
  provider: "ollama"
  model: "llama3"
  api_base: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 2048

Environment setup:

# .env
LITELLM_OLLAMA_BASE_URL=http://localhost:11434

OpenAI

kind: "AgentModel"
version: "v1"
metadata:
  name: "openai"
spec:
  provider: "openai"
  model: "gpt-4o"
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.7
  max_tokens: 4096

Environment setup:

# .env
OPENAI_API_KEY=sk-...

Anthropic

kind: "AgentModel"
version: "v1"
metadata:
  name: "claude"
spec:
  provider: "anthropic"
  model: "claude-3-5-sonnet-20241022"
  api_key: "${ANTHROPIC_API_KEY}"
  temperature: 0.7
  max_tokens: 4096

Environment setup:

# .env
ANTHROPIC_API_KEY=sk-ant-...

LiteLLM Proxy

For routing through a LiteLLM proxy:

kind: "AgentModel"
version: "v1"
metadata:
  name: "proxy"
spec:
  provider: "litellm"
  model: "gpt-4"                    # Model name configured in LiteLLM
  api_base: "${LITELLM_PROXY_URL}"
  api_key: "${LITELLM_MASTER_KEY}"

Configuration Properties

| Property | Type | Required | Description |
|---|---|---|---|
| provider | string | Yes | Provider name: ollama, openai, anthropic, litellm |
| model | string | Yes | Model identifier |
| api_base | string | For ollama/litellm | API endpoint URL |
| api_key | string | For openai/anthropic | API authentication key |
| temperature | float | No | Randomness (0.0-2.0, default: 0.7) |
| max_tokens | integer | No | Maximum response length |
| top_p | float | No | Nucleus sampling (0.0-1.0) |
| timeout | integer | No | Request timeout in seconds |
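
The rules in the table can be checked before a preset is loaded. The following is a minimal sketch that transcribes the table literally; `validate_spec` is a hypothetical helper, not part of tuvl, and tuvl's own validation may be looser (for example, if it can take the Ollama base URL from the environment when api_base is omitted).

```python
from typing import Any

# Rules transcribed from the properties table; the helper itself is hypothetical.
PROVIDERS = {"ollama", "openai", "anthropic", "litellm"}
NEEDS_API_BASE = {"ollama", "litellm"}
NEEDS_API_KEY = {"openai", "anthropic"}


def validate_spec(spec: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an AgentModel spec (empty if none)."""
    problems: list[str] = []
    provider = spec.get("provider")
    if provider not in PROVIDERS:
        problems.append(f"provider must be one of {sorted(PROVIDERS)}")
    if not spec.get("model"):
        problems.append("model is required")
    if provider in NEEDS_API_BASE and not spec.get("api_base"):
        problems.append(f"api_base is required for {provider}")
    if provider in NEEDS_API_KEY and not spec.get("api_key"):
        problems.append(f"api_key is required for {provider}")
    temperature = spec.get("temperature", 0.7)  # documented default
    if not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    top_p = spec.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        problems.append("top_p must be between 0.0 and 1.0")
    return problems
```

An empty list means the spec satisfies every rule in the table; each string otherwise names one violated rule.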

Using Agents in Workflows

An agent step can either reference a preset by name or specify the model inline:

Using Agent Preset

- id: "classify"
  kind: "agent"
  agent:
    model: "default"    # Uses agents/default.yaml
    prompt: "..."

Inline Model Specification

- id: "classify"
  kind: "agent"
  agent:
    model: "ollama/llama3"    # LiteLLM format
    prompt: "..."

LiteLLM Model Strings

tuvl uses the LiteLLM format for model identifiers: a provider prefix, a slash, then the model (or deployment) name.

| Provider | Format | Example |
|---|---|---|
| Ollama | ollama/{model} | ollama/llama3 |
| OpenAI | openai/{model} | openai/gpt-4o |
| Anthropic | anthropic/{model} | anthropic/claude-3-5-sonnet-20241022 |
| Azure | azure/{deployment} | azure/gpt-4-deployment |
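
To make the format concrete, here is a small sketch that splits a model string into its provider prefix and model name. `split_model_string` is a hypothetical helper for illustration; it says nothing about how tuvl or LiteLLM parse these strings internally.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split 'provider/model' on the first slash only.

    Splitting on the first slash keeps any later slashes in the model name intact.
    """
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name
```

A bare name such as `default` contains no slash and raises `ValueError` here, which fits the idea that slash-free names refer to presets rather than inline model strings (an assumption, not documented behavior).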

Agent Step Configuration

Full agent step specification:

- id: "analyze"
  kind: "agent"
  agent:
    model: "ollama/llama3"

    # Prompts
    system: |
      You are a helpful assistant that analyzes data.
      Always respond with valid JSON.

    prompt: |
      Analyze this customer:
      Name: {{ name }}
      Company: {{ company }}

      Return JSON: {"score": 1-100, "tags": ["tag1", "tag2"]}

    # Output handling
    output:
      format: json              # json | text | signal
      map:
        score: customer_score   # Map LLM keys to context
        tags: customer_tags
      signal_from: score        # Use for routing

    # Error handling
    retry:
      attempts: 3
      on: [parse_error, timeout, rate_limit]
      backoff: 2

    timeout: 30

Output Formats

JSON Format

agent:
  prompt: 'Return JSON: {"decision": "approve" | "reject"}'
  output:
    format: json
    map:
      decision: approval_decision

The LLM response is parsed as JSON, and each key listed under map is copied to the named context key (here, decision is stored as approval_decision).
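
The parse-and-map step described above might look roughly like this. It is a sketch under assumptions: `apply_output_map` is a hypothetical name, and skipping keys that are missing from the response is a choice made here, not documented tuvl behavior.

```python
import json
from typing import Any


def apply_output_map(
    raw: str, mapping: dict[str, str], context: dict[str, Any]
) -> dict[str, Any]:
    """Parse an LLM response as JSON and copy mapped keys into a new context."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    updated = dict(context)  # leave the caller's context untouched
    for llm_key, context_key in mapping.items():
        if llm_key in data:  # assumption: missing keys are skipped
            updated[context_key] = data[llm_key]
    return updated
```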

Text Format

agent:
  prompt: "Summarize this document in one paragraph."
  output:
    format: text
    map:
      response: summary

The raw text response is stored under the mapped context key (here, summary).

Signal Format

agent:
  prompt: "Respond with one word: approve, reject, or review"
  output:
    format: signal

The response is used directly as the routing signal.
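
Models often add capitalization, whitespace, or trailing punctuation to one-word answers. The sketch below shows one way such a response could be cleaned before it is compared with route names; `normalize_signal` is a hypothetical helper, and whether tuvl normalizes signals at all is not stated here.

```python
def normalize_signal(response: str) -> str:
    """Trim whitespace and surrounding punctuation, then lowercase."""
    return response.strip().strip(".!\"'").lower()
```

With this helper, `" Approve.\n"` and `"approve"` produce the same signal.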

Retry Configuration

Handle transient errors with retries:

agent:
  retry:
    attempts: 3        # Total attempts (including first)
    on:                # Error types to retry
      - parse_error    # JSON parsing failed
      - timeout        # Request timed out
      - rate_limit     # Rate limit exceeded
      - server_error   # 5xx response
    backoff: 2         # Exponential backoff multiplier

With backoff: 2:

  • Attempt 1: Immediate
  • Attempt 2: Wait 2 seconds
  • Attempt 3: Wait 4 seconds
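
The schedule above is consistent with waiting `backoff ** (attempt - 1)` seconds before each retry. That formula is an assumption inferred from the three listed values, not a documented one; the sketch below just makes the arithmetic explicit.

```python
def backoff_delays(attempts: int, backoff: float) -> list[float]:
    """Seconds to wait before each attempt; the first attempt is immediate."""
    return [
        0.0 if attempt == 1 else backoff ** (attempt - 1)
        for attempt in range(1, attempts + 1)
    ]
```

Under this assumption, `backoff_delays(3, 2)` yields waits of 0, 2, and 4 seconds, matching the list above.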

Multiple Agent Presets

Define different presets for different use cases:

agents/fast.yaml
kind: "AgentModel"
version: "v1"
metadata:
  name: "fast"
spec:
  provider: "ollama"
  model: "mistral"
  temperature: 0.3
  max_tokens: 512

agents/creative.yaml
kind: "AgentModel"
version: "v1"
metadata:
  name: "creative"
spec:
  provider: "openai"
  model: "gpt-4o"
  temperature: 0.9
  max_tokens: 2048

Use in workflows:

- id: "quick_check"
  kind: "agent"
  agent:
    model: "fast"
    prompt: "..."

- id: "write_copy"
  kind: "agent"
  agent:
    model: "creative"
    prompt: "..."

Best Practices

1. Use Presets for Common Configs

# Define once
# agents/default.yaml
spec:
  provider: "ollama"
  model: "llama3"
  temperature: 0.7

# Use everywhere
agent:
  model: "default"

2. Keep API Keys in Environment

# Good
api_key: "${OPENAI_API_KEY}"

# Never do this
api_key: "sk-actual-key-here"
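
For readers curious how a `${VAR}` placeholder can be resolved at load time, here is a minimal sketch. `expand_env` is a hypothetical helper, and failing loudly on an unset variable is a design choice made here; tuvl's actual substitution rules may differ.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in value with the matching environment variable."""
    source = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            # Fail loudly rather than sending an empty key to the provider.
            raise KeyError(f"environment variable {name} is not set")
        return source[name]

    return _PLACEHOLDER.sub(replace, value)
```

Raising on a missing variable surfaces a misconfigured .env file at startup instead of as an authentication error on the first request.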

3. Set Appropriate Timeouts

# Short for simple tasks
timeout: 15

# Longer for complex analysis
timeout: 60

4. Use Structured Output

# Good - clear JSON structure
prompt: |
  Return JSON: {"category": "A" | "B" | "C"}

# Harder to parse
prompt: |
  What category is this?

5. Handle All Outcomes

agent:
  output:
    signal_from: decision
routes:
  approve: "process"
  reject: "notify"
  error: "manual_review"  # Always handle errors
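
The value of an `error` route is easiest to see in a lookup sketch. `resolve_route` is a hypothetical helper, and treating any unrecognized signal as `error` is an assumption about how such a fallback could work, not a description of tuvl's router.

```python
def resolve_route(signal: str, routes: dict[str, str]) -> str:
    """Return the next step for a signal, falling back to the error route."""
    if signal in routes:
        return routes[signal]
    if "error" in routes:
        return routes["error"]  # assumption: unknown signals count as errors
    raise KeyError(f"no route for signal {signal!r} and no error route defined")
```

Without the `error` entry, an unexpected model answer such as `maybe` would have nowhere to go.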

Troubleshooting

Connection Refused (Ollama)

Connection refused: http://localhost:11434
  • Ensure Ollama is running: ollama serve
  • Check the port: curl http://localhost:11434/api/version

Invalid API Key

AuthenticationError: Invalid API key
  • Verify the key in your .env file
  • Check for extra whitespace or newlines
  • Ensure the key has proper permissions

Rate Limiting

RateLimitError: Rate limit exceeded
  • Add retry configuration with backoff
  • Consider using multiple API keys
  • Implement request queuing

JSON Parse Errors

JSONDecodeError: Extra data
  • Improve prompts to request clean JSON
  • Add "Return ONLY valid JSON" to system prompt
  • Use retry with parse_error handling
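
"Extra data" is the error Python's `json.loads` raises when valid JSON is followed by more text. If prompt changes are not enough, the response can be parsed leniently with the standard library's `JSONDecoder.raw_decode`, which stops at the end of the first JSON value. This is a sketch of that technique; `first_json_value` is a hypothetical helper, and tuvl is not known to do this itself.

```python
import json
from typing import Any


def first_json_value(text: str) -> Any:
    """Return the first JSON object or array embedded in text."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                # raw_decode parses one value starting at index and
                # ignores whatever follows it.
                value, _end = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next bracket
    raise ValueError("no JSON object or array found in response")
```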
