Agent Configuration¶
Agents define LLM providers and their settings for AI-powered workflow steps.
Basic Configuration¶
agents/default.yaml
kind: "AgentModel"
version: "v1"
metadata:
  name: "default"
spec:
  provider: "ollama"
  model: "llama3"
  api_base: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 1024
Provider Configurations¶
Ollama (Local)¶
kind: "AgentModel"
version: "v1"
metadata:
  name: "local"
spec:
  provider: "ollama"
  model: "llama3"
  api_base: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 2048
Environment setup: start the Ollama server (ollama serve) and confirm it is reachable at the api_base URL.
OpenAI¶
kind: "AgentModel"
version: "v1"
metadata:
  name: "openai"
spec:
  provider: "openai"
  model: "gpt-4o"
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.7
  max_tokens: 4096
Environment setup: set OPENAI_API_KEY in your environment or .env file so the ${OPENAI_API_KEY} placeholder resolves.
Anthropic¶
kind: "AgentModel"
version: "v1"
metadata:
  name: "claude"
spec:
  provider: "anthropic"
  model: "claude-3-5-sonnet-20241022"
  api_key: "${ANTHROPIC_API_KEY}"
  temperature: 0.7
  max_tokens: 4096
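Environment setup: set ANTHROPIC_API_KEY in your environment or .env file so the ${ANTHROPIC_API_KEY} placeholder resolves.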
LiteLLM Proxy¶
For routing through a LiteLLM proxy:
kind: "AgentModel"
version: "v1"
metadata:
  name: "proxy"
spec:
  provider: "litellm"
  model: "gpt-4"  # Model name configured in LiteLLM
  api_base: "${LITELLM_PROXY_URL}"
  api_key: "${LITELLM_MASTER_KEY}"
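The provider configurations above reference secrets and URLs through ${NAME} placeholders. How tuvl resolves them is not shown here; the following is a minimal sketch of one plausible approach, assuming plain substitution from environment variables. The function name expand_placeholders is hypothetical and not part of tuvl.

```python
import os
import re

# Matches ${NAME}, where NAME looks like an environment variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env=None):
    """Replace each ${NAME} in value with env[NAME] (hypothetical helper).

    Raises KeyError when a referenced variable is not set, so a missing key
    is reported at load time rather than as an authentication failure later.
    """
    env = os.environ if env is None else env

    def _lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_lookup, value)
```

Passing env explicitly, as in expand_placeholders("${LITELLM_PROXY_URL}/v1", {"LITELLM_PROXY_URL": "http://proxy:4000"}), keeps the helper easy to test without touching the real environment.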
Configuration Properties¶
| Property | Type | Required | Description |
|---|---|---|---|
| provider | string | Yes | Provider name: ollama, openai, anthropic, litellm |
| model | string | Yes | Model identifier |
| api_base | string | For ollama/litellm | API endpoint URL |
| api_key | string | For openai/anthropic | API authentication key |
| temperature | float | No | Randomness (0.0-2.0, default: 0.7) |
| max_tokens | integer | No | Maximum response length |
| top_p | float | No | Nucleus sampling (0.0-1.0) |
| timeout | integer | No | Request timeout in seconds |
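The rules in the table can be checked before a workflow runs. The sketch below is illustrative only: validate_spec is a hypothetical helper, not a tuvl function, and it encodes just the constraints the table states (required fields per provider and the documented numeric ranges).

```python
KNOWN_PROVIDERS = ("ollama", "openai", "anthropic", "litellm")


def validate_spec(spec):
    """Return a list of problems found in an AgentModel spec (empty if none)."""
    problems = []
    provider = spec.get("provider")
    if provider not in KNOWN_PROVIDERS:
        problems.append(f"unknown provider: {provider!r}")
    if not spec.get("model"):
        problems.append("model is required")
    # api_base is listed as required for ollama and litellm.
    if provider in ("ollama", "litellm") and not spec.get("api_base"):
        problems.append(f"api_base is required for {provider}")
    # api_key is listed as required for openai and anthropic.
    if provider in ("openai", "anthropic") and not spec.get("api_key"):
        problems.append(f"api_key is required for {provider}")
    temperature = spec.get("temperature", 0.7)  # documented default
    if not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    top_p = spec.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        problems.append("top_p must be between 0.0 and 1.0")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in a preset file at once.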
Using Agents in Workflows¶
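Reference an agent preset by name, or specify the model inline.
Using Agent Preset¶
Set model to the preset's metadata.name, for example model: "fast".
Inline Model Specification¶
Set model to a LiteLLM model string, for example model: "ollama/llama3".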
LiteLLM Model Strings¶
tuvl uses LiteLLM format for model identifiers:
| Provider | Format | Example |
|---|---|---|
| Ollama | ollama/{model} | ollama/llama3 |
| OpenAI | openai/{model} | openai/gpt-4o |
| Anthropic | anthropic/{model} | anthropic/claude-3-5-sonnet-20241022 |
| Azure | azure/{deployment} | azure/gpt-4-deployment |
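Because a step's model value can hold either a preset name or a provider/model string in the format above, a loader has to tell the two apart. The sketch below assumes the presence of a "/" is the distinguishing rule; that rule and the parse_model_ref name are assumptions for illustration, not documented tuvl behavior.

```python
def parse_model_ref(value):
    """Classify a model value as an inline LiteLLM string or a preset name.

    Assumption: anything shaped like "provider/model" is inline, and
    everything else is treated as the name of a preset.
    """
    provider, sep, model = value.partition("/")
    if sep and provider and model:
        return {"kind": "inline", "provider": provider, "model": model}
    return {"kind": "preset", "name": value}
```

For example, parse_model_ref("ollama/llama3") is classified as inline with provider "ollama", while parse_model_ref("fast") is classified as a preset.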
Agent Step Configuration¶
Full agent step specification:
- id: "analyze"
  kind: "agent"
  agent:
    model: "ollama/llama3"
    # Prompts
    system: |
      You are a helpful assistant that analyzes data.
      Always respond with valid JSON.
    prompt: |
      Analyze this customer:
      Name: {{ name }}
      Company: {{ company }}
      Return JSON: {"score": 1-100, "tags": ["tag1", "tag2"]}
    # Output handling
    output:
      format: json  # json | text | signal
      map:
        score: customer_score  # Map LLM keys to context
        tags: customer_tags
      signal_from: score  # Use for routing
    # Error handling
    retry:
      attempts: 3
      on: [parse_error, timeout, rate_limit]
      backoff: 2
    timeout: 30
Output Formats¶
JSON Format¶
agent:
  prompt: 'Return JSON: {"decision": "approve" | "reject"}'
  output:
    format: json
    map:
      decision: approval_decision
The LLM response is parsed as JSON, and each key listed under map is copied to the named context key (here, decision is stored as approval_decision).
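The parse-and-map step can be pictured as follows. This is a minimal sketch, assuming map is a plain dictionary from LLM response keys to context keys; apply_output_map is a hypothetical name and tuvl's actual implementation may differ.

```python
import json


def apply_output_map(raw_response, mapping, context):
    """Parse raw_response as JSON and copy mapped keys into a new context.

    json.JSONDecodeError (a ValueError subclass) propagates on invalid JSON,
    which corresponds to the parse_error retry condition.
    """
    parsed = json.loads(raw_response)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    updated = dict(context)  # leave the caller's context untouched
    for llm_key, context_key in mapping.items():
        if llm_key in parsed:
            updated[context_key] = parsed[llm_key]
    return updated
```

With the configuration above, a response of {"decision": "approve"} would add approval_decision with the value "approve" to the context.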
Text Format¶
agent:
  prompt: "Summarize this document in one paragraph."
  output:
    format: text
    map:
      response: summary
The raw text response is stored in the mapped context key (summary in this example).
Signal Format¶
With format: signal, the response is used directly as the routing signal.
Retry Configuration¶
Handle transient errors with retries:
agent:
  retry:
    attempts: 3  # Total attempts (including first)
    on:  # Error types to retry
      - parse_error  # JSON parsing failed
      - timeout  # Request timed out
      - rate_limit  # Rate limit exceeded
      - server_error  # 5xx response
    backoff: 2  # Exponential backoff multiplier
With backoff: 2 and attempts: 3, the wait before each attempt is:
- Attempt 1: Immediate
- Attempt 2: Wait 2 seconds
- Attempt 3: Wait 4 seconds
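The schedule above matches a delay of backoff ** (attempt - 1) seconds before every attempt after the first. The sketch below reproduces that schedule and wraps a call in a retry loop; both function names are hypothetical, and Python's built-in TimeoutError stands in for tuvl's timeout error type.

```python
import time


def backoff_delays(attempts, backoff):
    """Seconds to wait before each attempt; the first attempt is immediate."""
    return [0 if n == 1 else backoff ** (n - 1) for n in range(1, attempts + 1)]


def run_with_retry(call, attempts=3, backoff=2,
                   retry_on=(TimeoutError,), sleep=time.sleep):
    """Invoke call() up to `attempts` times, sleeping between failures.

    Only exceptions listed in retry_on are retried; the last failure is
    re-raised once the attempts are exhausted.
    """
    for attempt, delay in enumerate(backoff_delays(attempts, backoff), start=1):
        if delay:
            sleep(delay)
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise
```

The sleep parameter is injectable so the loop can be exercised in tests without real waiting.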
Multiple Agent Presets¶
Define different presets for different use cases:
agents/fast.yaml
kind: "AgentModel"
version: "v1"
metadata:
  name: "fast"
spec:
  provider: "ollama"
  model: "mistral"
  api_base: "http://localhost:11434"  # Required for ollama
  temperature: 0.3
  max_tokens: 512
agents/creative.yaml
kind: "AgentModel"
version: "v1"
metadata:
  name: "creative"
spec:
  provider: "openai"
  model: "gpt-4o"
  api_key: "${OPENAI_API_KEY}"  # Required for openai
  temperature: 0.9
  max_tokens: 2048
Use in workflows:
- id: "quick_check"
  agent:
    model: "fast"
    prompt: "..."
- id: "write_copy"
  agent:
    model: "creative"
    prompt: "..."
Best Practices¶
1. Use Presets for Common Configs¶
# Define once
# agents/default.yaml
spec:
  provider: "ollama"
  model: "llama3"
  temperature: 0.7
# Use everywhere
agent:
  model: "default"
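2. Keep API Keys in Environment¶
Reference keys through placeholders such as api_key: "${OPENAI_API_KEY}" instead of hardcoding them in YAML files.
3. Set Appropriate Timeouts¶
Set timeout (in seconds) so that a stalled request fails and can be retried instead of blocking the workflow.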
4. Use Structured Output¶
# Good - clear JSON structure
prompt: |
  Return JSON: {"category": "A" | "B" | "C"}
# Harder to parse
prompt: |
  What category is this?
5. Handle All Outcomes¶
agent:
  output:
    signal_from: decision
  routes:
    approve: "process"
    reject: "notify"
    error: "manual_review"  # Always handle errors
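The value of handling every outcome is easiest to see in the routing lookup itself. The sketch below assumes that a signal with no matching route falls back to the error route; that fallback rule and the next_step name are assumptions for illustration, since the routing semantics are not specified here.

```python
def next_step(signal, routes):
    """Pick the next step id for a routing signal (hypothetical helper).

    Assumption: a signal without its own route is sent to the "error"
    route, so an unexpected LLM answer still lands somewhere.
    """
    if signal in routes:
        return routes[signal]
    if "error" in routes:
        return routes["error"]
    raise KeyError(f"no route for signal {signal!r} and no error route defined")
```

With the routes above, "approve" leads to "process", while an unexpected value such as "maybe" leads to "manual_review" instead of leaving the workflow without a next step.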
Troubleshooting¶
Connection Refused (Ollama)¶
- Ensure Ollama is running: ollama serve
- Check the port: curl http://localhost:11434/api/version
Invalid API Key¶
- Verify the key in your .env file
- Check for extra whitespace or newlines
- Ensure the key has proper permissions
Rate Limiting¶
- Add retry configuration with backoff
- Consider using multiple API keys
- Implement request queuing
JSON Parse Errors¶
- Improve prompts to request clean JSON
- Add "Return ONLY valid JSON" to the system prompt
- Use retry with parse_error handling