Introducing Sutra: A Declarative Framework for AI Agents
Over the past few months, I’ve been working on Sutra, a declarative framework for building AI agents. The core idea? Define your agent’s behavior in YAML, not Python.
The Problem
Building AI agents typically involves:
- Writing complex orchestration code
- Managing state manually
- Handling errors and retries
- Debugging opaque execution flows
- Struggling with parallel execution
Every agent becomes a custom snowflake with its own patterns and quirks.
The Solution: Declarative, Not Imperative
Sutra takes a different approach. Instead of writing procedural code, you declare your agent’s structure:
```yaml
name: research_agent
version: "1.0"
format: advanced
entry: analyze

llm_profiles:
  default:
    model: gemini/gemini-2.5-flash
    temperature: 0.7
    timeout: 60

nodes:
  - id: analyze
    type: llm
    profile: default
    prompt: "Analyze: {{ input }}"
    edges:
      - target: search

  - id: search
    type: tool
    tool: web_search
    args:
      query: "{{ analyze }}"
    edges:
      - target: synthesize

  - id: synthesize
    type: llm
    profile: default
    prompt: |
      Based on the analysis and search results:
      Analysis: {{ analyze }}
      Results: {{ search }}
      Provide a comprehensive summary.
    edges:
      - target: END
```
That’s it. No orchestration code, no state management, no manual error handling.
Key Features
1. Graph-Based Execution
Sutra uses a Directed Acyclic Graph (DAG) for execution. This enables:
- Parallel execution of independent nodes
- Conditional routing based on outputs
- Fan-out/fan-in patterns with convergence nodes
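To make the parallel-execution claim concrete, here is a minimal sketch of how a DAG executor can run independent nodes concurrently. This is not Sutra's actual scheduler; it assumes each node is an async callable that receives the results accumulated so far.

```python
import asyncio

async def run_dag(nodes, deps):
    """Execute a DAG of async callables, running independent nodes in parallel.

    nodes: dict of node_id -> async callable(results_so_far)
    deps:  dict of node_id -> set of node_ids it depends on
    """
    results = {}
    pending = set(nodes)
    while pending:
        # Every node whose dependencies are all satisfied can run in this wave.
        ready = [n for n in pending if deps.get(n, set()) <= set(results)]
        if not ready:
            raise ValueError("cycle or unsatisfiable dependency in graph")
        outputs = await asyncio.gather(*(nodes[n](results) for n in ready))
        results.update(zip(ready, outputs))
        pending -= set(ready)
    return results
```

Here `a` and `b` would run in the same wave, while a node depending on both waits for the wave to finish, which is exactly the fan-out/fan-in shape the YAML expresses declaratively.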
2. Built-in Observability
Every execution is traced:
```python
result = await graph.ainvoke(state)
print(result["metrics"])
# {
#   "llm_calls": 3,
#   "tool_calls": 1,
#   "total_tokens": 1547,
#   "execution_time": 4.2
# }
```
3. Template System
Jinja2 templates for dynamic prompts:
```yaml
prompt: |
  You are analyzing {{ input_type }}.
  Data: {{ previous_step }}
  {% if urgent %}
  URGENT: Respond within 24 hours
  {% endif %}
```
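Rendering works the way any Jinja2 template does: prior step outputs become template variables. A quick illustration using the `jinja2` library directly, with variable names taken from the snippet above:

```python
from jinja2 import Template

# Render the prompt the way a template engine would at node-execution time,
# passing prior step outputs and flags as template variables.
prompt = Template(
    "You are analyzing {{ input_type }}.\n"
    "Data: {{ previous_step }}\n"
    "{% if urgent %}URGENT: Respond within 24 hours{% endif %}"
)
text = prompt.render(input_type="logs", previous_step="3 anomalies found", urgent=True)
```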
4. Multi-Agent Coordination
Agents can spawn sub-agents:
```yaml
- id: coordinator
  type: llm
  profile: default
  prompt: "Coordinate the research"
  edges:
    - target: specialist_agent
      condition: "'complex' in coordinator"
```
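The `condition` strings read like Python expressions over node outputs, where each node id resolves to that node's output. A deliberately naive sketch of how such an expression could be evaluated (not Sutra's actual mechanism, and real frameworks should use a proper expression sandbox rather than `eval`):

```python
def edge_matches(condition, outputs):
    """Evaluate a condition string with node outputs exposed as variables.

    outputs: dict of node_id -> node output (e.g. {"coordinator": "..."})
    Builtins are stripped to limit what an expression can reach, but this
    is illustrative only, not a real sandbox.
    """
    return bool(eval(condition, {"__builtins__": {}}, dict(outputs)))
```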
Real-World Example
Here’s a production agent that processes research queries with parallel execution:
```yaml
nodes:
  - id: classify
    type: llm
    prompt: "Classify query: {{ input }}"
    edges:
      - target: search_academic
        condition: "'academic' in classify"
      - target: search_web
        condition: "'web' in classify"
      - target: search_both
        condition: "'both' in classify"

  - id: search_academic
    type: tool
    tool: scholar_search
    args:
      query: "{{ input }}"
    edges:
      - target: merge

  - id: search_web
    type: tool
    tool: web_search
    args:
      query: "{{ input }}"
    edges:
      - target: merge

  - id: merge
    type: convergence
    convergence:
      policy: lenient
      timeout: 60
      depends_on:
        - search_academic
        - search_web
    edges:
      - target: synthesize
```
The framework handles:
- Parallel execution of searches
- Waiting for all results at convergence
- Error handling if one fails (lenient policy)
- Automatic timeout management
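The essence of a lenient convergence node can be sketched in a few lines of asyncio. This is an illustration of the policy, not Sutra's implementation: each fan-in branch gets a timeout, and under `lenient` the node keeps whatever succeeded instead of failing the whole run.

```python
import asyncio

async def converge(branches, policy="lenient", timeout=60):
    """Fan-in over concurrent branches.

    'lenient': drop failed/timed-out branches, return the successes.
    'strict':  re-raise the first branch failure.
    """
    results = await asyncio.gather(
        *(asyncio.wait_for(b, timeout) for b in branches),
        return_exceptions=True,  # collect failures instead of aborting
    )
    if policy == "strict":
        for r in results:
            if isinstance(r, BaseException):
                raise r
    return [r for r in results if not isinstance(r, BaseException)]
```

With `lenient`, one failed search still leaves the synthesize step with usable results from the other branch.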
Architecture Insights
Under the hood, Sutra uses:
- LangGraph for graph execution
- Protocol-based adapters for LLM providers
- Registry pattern for tools and LLMs
- Advanced state management for complex flows
The builder pattern separates:
- Specification (YAML) - What to do
- Registry (Python) - Available resources
- Builder (Framework) - How to wire it up
- Executor (LangGraph) - Runtime execution
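The registry/builder split can be sketched as follows. All names here are hypothetical, not Sutra's API; the point is that the spec stays pure data while the registry supplies the concrete callables the builder wires together.

```python
class Registry:
    """Maps names used in the YAML spec to concrete Python callables."""

    def __init__(self):
        self._tools = {}

    def register_tool(self, name, fn):
        self._tools[name] = fn

    def tool(self, name):
        return self._tools[name]


def build_graph(spec, registry):
    """Resolve each tool node in a parsed spec against the registry.

    Returns a node_id -> callable mapping plus the flat edge list;
    a real builder would hand these to the executor.
    """
    callables = {}
    edges = []
    for node in spec["nodes"]:
        if node["type"] == "tool":
            callables[node["id"]] = registry.tool(node["tool"])
        for edge in node.get("edges", []):
            edges.append((node["id"], edge["target"]))
    return callables, edges
```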
Lessons Learned
Building Sutra taught me:
- Declarative > Imperative - YAML specs are easier to debug than nested Python
- Observability is critical - Without tracing, debugging agents is nearly impossible
- Convergence is hard - Fan-in patterns need careful timeout management
- Sub-agents complicate things - Parent-child coordination needs careful design
- Schema validation saves time - Catch errors before execution
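The last lesson is cheap to act on. A minimal sketch of pre-execution validation over a parsed spec (hypothetical checks, not Sutra's validator): verify the entry node exists and every edge targets a known node or `END`.

```python
def validate_spec(spec):
    """Return a list of spec errors; an empty list means the spec passes.

    Catches two common mistakes before any node runs: an entry point
    that names no node, and edges pointing at undefined nodes.
    """
    ids = {n["id"] for n in spec.get("nodes", [])}
    errors = []
    if spec.get("entry") not in ids:
        errors.append(f"entry node {spec.get('entry')!r} is not defined")
    for node in spec.get("nodes", []):
        for edge in node.get("edges", []):
            target = edge["target"]
            if target != "END" and target not in ids:
                errors.append(f"node {node['id']!r} targets unknown node {target!r}")
    return errors
```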
Current Status
Sutra powers the konf.dev agentic platform and is being battle-tested with production workloads. The framework handles:
- Multi-agent research coordination
- Parallel document processing
- Conditional workflows
- Error recovery and retry
- Checkpointing and resume
What’s Next
Upcoming features:
- Enhanced YAML validation with better error messages
- Visual graph editor
- More built-in tools
- Streaming support
- Memory management improvements
Try It Yourself
Check out the GitHub repo for:
- Full documentation
- 10+ runnable examples
- Production showcases
- Contributing guidelines
Quick start:
```bash
pip install sutra
```
Then create your first agent in 5 minutes following the User Guide.
Sutra is part of the konf.dev ecosystem - building production infrastructure for agentic AI.