
Sibyl Agent Harness: Architecture Vision

Status: Design Document
Date: 2026-01-01
Author: Nova + Bliss


Executive Summary

Transform Sibyl from a knowledge graph into a Collective Intelligence Runtime — an orchestration platform where AI agents collaborate through shared memory, parallel execution, and coordinated development workflows.

Core Insight: The Claude Agent SDK provides the primitives (subagents, hooks, tools). Sibyl provides the coordination layer (task locking, knowledge sharing, progress tracking). Git worktrees enable true parallel development without conflicts.


1. Vision: The 1000x Engineer's Toolkit

Imagine describing a feature and watching a fleet of specialized agents:

  1. Planning Agent breaks it into tasks with dependencies
  2. Spec Agent generates detailed specifications
  3. Implementation Agents (2-8) work in parallel worktrees
  4. Testing Agent writes and runs tests
  5. Review Agent performs code review
  6. Integration Agent merges worktrees, resolves conflicts
  7. Documentation Agent updates docs

All coordinated through Sibyl's knowledge graph, with human oversight at key checkpoints.

Why This Works Now

Capability           | Source                 | Status
---------------------|------------------------|------------------------
Agent execution      | Claude Agent SDK       | Production-ready
Task coordination    | Sibyl (existing)       | Extend for agents
Parallel development | Git worktrees          | Battle-tested
Knowledge sharing    | Sibyl graph            | Existing
Web UI               | Sibyl web (Next.js 16) | Extend with Agents page

2. System Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                           SIBYL WEB UI                                       │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │Dashboard │  │  Tasks   │  │  Graph   │  │ AGENTS   │  │ Settings │       │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘       │
│                                    ▲                                         │
│                                    │ WebSocket (real-time updates)           │
└────────────────────────────────────┼─────────────────────────────────────────┘

┌────────────────────────────────────┼─────────────────────────────────────────┐
│                           SIBYL API                                          │
│  ┌─────────────────────────────────┼───────────────────────────────────────┐ │
│  │             ORCHESTRATOR        │                                       │ │
│  │  ┌─────────┐  ┌─────────┐  ┌────┴────┐  ┌─────────┐  ┌─────────┐       │ │
│  │  │ Planner │  │Scheduler│  │ Monitor │  │ Merger  │  │ Router  │       │ │
│  │  └─────────┘  └─────────┘  └─────────┘  └─────────┘  └─────────┘       │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
│                                     │                                         │
│  ┌─────────────────────────────────┼───────────────────────────────────────┐ │
│  │      ON-DEMAND AGENT SPAWNING   │                                       │ │
│  │  ┌─────────┐  ┌─────────┐  ┌────┴────┐                                  │ │
│  │  │Claude 1 │  │Claude 2 │  │Claude N │  (spawned as needed)             │ │
│  │  │+subagnts│  │+subagnts│  │+subagnts│                                  │ │
│  │  │worktree │  │worktree │  │worktree │                                  │ │
│  │  │  /t001  │  │  /t002  │  │  /t00N  │                                  │ │
│  │  └─────────┘  └─────────┘  └─────────┘                                  │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
│                                     │                                         │
│  ┌──────────────────────────────────┼──────────────────────────────────────┐ │
│  │                 HUMAN-IN-THE-LOOP LAYER                                  │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                   │ │
│  │  │Approval Queue│  │ Review Phase │  │  Chat Bridge │                   │ │
│  │  │  (unified)   │  │  (on request)│  │  (per agent) │                   │ │
│  │  └──────────────┘  └──────────────┘  └──────────────┘                   │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
│                                     │                                         │
│  ┌──────────────────────────────────┼──────────────────────────────────────┐ │
│  │                    COORDINATION LAYER                                    │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │ │
│  │  │ Task Locking │  │  Event Bus   │  │ Checkpoints  │  │  Heartbeats  │ │ │
│  │  │   (Redis)    │  │   (Redis)    │  │   (Redis)    │  │   (Redis)    │ │ │
│  │  └──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘ │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────┼─────────────────────────────────────────┐
│                        PERSISTENCE LAYER                                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                        │
│  │   FalkorDB   │  │    Redis     │  │  Worktrees   │                        │
│  │   (Graph)    │  │   (State)    │  │   (Git)      │                        │
│  └──────────────┘  └──────────────┘  └──────────────┘                        │
└──────────────────────────────────────────────────────────────────────────────┘

3. Claude Agent SDK Integration

3.1 Agent Execution Model

Each agent runs as a separate Claude Agent SDK instance with:

python
from collections.abc import AsyncIterator
from pathlib import Path

from claude_agent_sdk import AgentDefinition, ClaudeAgentOptions, HookMatcher, query

async def create_worker_agent(
    task: Task,
    worktree_path: Path,
    sibyl_tools: list
) -> AsyncIterator[Message]:
    """Create a worker agent for a specific task."""

    options = ClaudeAgentOptions(
        # Isolated workspace
        cwd=str(worktree_path),

        # Tools available to agent
        allowed_tools=["Read", "Write", "Edit", "Bash", "Grep", "Glob", *sibyl_tools],

        # MCP servers (Sibyl for knowledge, custom for project)
        mcp_servers={
            "sibyl": create_sibyl_mcp_server(task.organization_id),
            "project": get_project_mcp_server(task.project_id),
        },

        # Hooks for coordination
        hooks={
            "PreToolUse": [
                HookMatcher(matcher="Bash", hooks=[validate_bash_command]),
                HookMatcher(matcher="Write", hooks=[track_file_changes]),
            ],
            "PostToolUse": [
                HookMatcher(hooks=[emit_progress_event]),
            ],
        },

        # Session management
        enable_file_checkpointing=True,

        # System prompt with task context
        system_prompt=build_agent_prompt(task),
    )

    async for message in query(
        prompt=build_task_prompt(task),
        options=options
    ):
        yield message

3.2 Agent Types & Specialization

python
AGENT_DEFINITIONS = {
    "planner": AgentDefinition(
        description="Breaks features into implementable tasks",
        prompt="You are a senior software architect...",
        tools=["Read", "Grep", "Glob", "mcp__sibyl__add"],
        model="opus"  # Use Opus for planning
    ),

    "implementer": AgentDefinition(
        description="Implements code changes",
        prompt="You are a senior developer...",
        tools=["Read", "Write", "Edit", "Bash", "Grep", "Glob"],
        model="sonnet"  # Sonnet for speed on implementation
    ),

    "tester": AgentDefinition(
        description="Writes and runs tests",
        prompt="You are a QA engineer...",
        tools=["Read", "Write", "Edit", "Bash"],
        model="sonnet"
    ),

    "reviewer": AgentDefinition(
        description="Reviews code for quality and security",
        prompt="You are a code reviewer...",
        tools=["Read", "Grep", "Glob"],
        model="opus"  # Opus for deep analysis
    ),

    "integrator": AgentDefinition(
        description="Merges worktrees and resolves conflicts",
        prompt="You are a git expert...",
        tools=["Bash", "Read", "Write"],
        model="sonnet"
    ),
}

3.3 Inter-Agent Communication via Sibyl

Agents don't talk directly. They communicate through Sibyl:

python
# Agent adds a note to a task
await sibyl_client.add(
    entity_type="note",
    content="Found 3 edge cases that need handling...",
    task=current_task_id,
    author_type="agent",
    author_name="implementer-001"
)

# Another agent discovers this via search
results = await sibyl_client.search(
    query="edge cases",
    entity_type="note",
    task=current_task_id
)

4. Multi-Agent Orchestration

4.1 On-Demand Agent Spawning

Agents are created on-demand (not pooled) via two paths:

Path 1: Orchestrator-Initiated

  • Orchestrator analyzes task graph and spawns workers as needed
  • Automatically scales based on parallelizable work
  • Terminates agents when work completes

Path 2: User-Initiated

  • User clicks "Start Agent" with a prompt
  • Direct chat session with new Claude instance
  • Can work on ad-hoc tasks or assigned Sibyl tasks

python
class AgentSpawner:
    """Create Claude instances on-demand."""

    async def spawn_for_task(
        self,
        task: Task,
        agent_type: str = "implementer"
    ) -> AgentInstance:
        """Orchestrator spawns agent for specific task."""
        worktree = await self.worktree_manager.create(task.id)

        return await self._create_agent(
            agent_type=agent_type,
            worktree=worktree,
            task=task,
            spawn_source="orchestrator"
        )

    async def spawn_for_user(
        self,
        prompt: str,
        project_id: str,
        task_id: str | None = None
    ) -> AgentInstance:
        """User spawns agent directly with a prompt."""
        # Optional: attach to existing task
        task = await self.sibyl.get_task(task_id) if task_id else None

        # Create worktree if working on code
        worktree = None
        if task or self._prompt_needs_worktree(prompt):
            worktree = await self.worktree_manager.create(
                task.id if task else f"adhoc-{uuid4().hex[:8]}"
            )

        return await self._create_agent(
            agent_type="general",
            worktree=worktree,
            task=task,
            initial_prompt=prompt,
            spawn_source="user"
        )

4.2 Agent Organization

Agents are organized hierarchically by Organization → Project:

Organization (Acme Corp)
├── Project: auth-service
│   ├── 🤖 implementer-001 (task: OAuth2)
│   ├── 🤖 tester-001 (task: Auth tests)
│   └── 🤖 adhoc-abc (user prompt)
├── Project: frontend-app
│   └── 🤖 implementer-002 (task: Dashboard)
└── Project: shared-libs
    └── (no active agents)

This enables:

  • Project-scoped views in the UI
  • Resource limits per project (max agents, budget)
  • Cross-project awareness for the orchestrator
  • Isolation between projects' worktrees
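
The per-project resource limit mentioned above could be enforced with a small registry keyed by organization and project. This is a minimal in-memory sketch, not part of Sibyl today: `ProjectAgentRegistry` and its default limit of 4 are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectAgentRegistry:
    """Hypothetical registry: tracks active agents per (org, project) with a cap."""

    max_agents_per_project: int = 4  # assumed default, not taken from the design
    _active: dict = field(default_factory=dict)

    def register(self, org_id: str, project_id: str, agent_id: str) -> bool:
        """Record an agent; return False when the project is already at its limit."""
        agents = self._active.setdefault((org_id, project_id), set())
        if agent_id in agents:
            return True  # already registered, nothing to do
        if len(agents) >= self.max_agents_per_project:
            return False
        agents.add(agent_id)
        return True

    def unregister(self, org_id: str, project_id: str, agent_id: str) -> None:
        """Forget an agent once it terminates."""
        self._active.get((org_id, project_id), set()).discard(agent_id)

    def active(self, org_id: str, project_id: str) -> list:
        """Return the project's active agent IDs, sorted for stable display."""
        return sorted(self._active.get((org_id, project_id), set()))
```

A spawner would call `register` before creating a worktree and refuse the spawn when it returns False, which keeps one busy project from starving the others.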

4.3 Orchestrator Design

The Orchestrator manages on-demand agent creation and coordination:

python
class AgentOrchestrator:
    """Central coordinator for multi-agent execution."""

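    def __init__(self, org_id: str, project_id: str, repo_path: Path, worktree_base: Path):
        self.org_id = org_id
        self.project_id = project_id
        self.agents: dict[str, AgentInstance] = {}
        # WorktreeManager (section 5.2) needs the repo path and the worktree base directory
        self.worktree_manager = WorktreeManager(repo_path, worktree_base)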

    async def plan_feature(self, feature_description: str) -> list[Task]:
        """Use planning agent to break feature into tasks."""
        planner = await self.spawn_agent("planner")

        tasks = await planner.execute(
            prompt=f"Break this feature into implementable tasks:\n\n{feature_description}",
            structured_output=TaskListSchema
        )

        # Create tasks in Sibyl with dependencies
        for task in tasks:
            await self.sibyl.add(entity_type="task", **task.dict())

        return tasks

    async def execute_parallel(self, tasks: list[Task], max_agents: int = 4):
        """Execute independent tasks in parallel."""

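        # Only tasks without dependencies are ready to start immediately
        ready_tasks = [t for t in tasks if not t.depends_on]

        # Run at most max_agents of them concurrently; the remaining tasks
        # are left for a later scheduling pass
        async with asyncio.TaskGroup() as tg:
            for task in ready_tasks[:max_agents]:
                tg.create_task(self.execute_task(task))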

    async def execute_task(self, task: Task) -> TaskResult:
        """Execute a single task with a dedicated agent."""

        # 1. Claim task in Sibyl
        claimed = await self.claim_task(task.id)
        if not claimed:
            return TaskResult(status="already_claimed")

        # 2. Create isolated worktree
        worktree = await self.worktree_manager.create(task.id)

        # 3. Spawn agent
        agent = await self.spawn_agent(
            agent_type="implementer",
            worktree=worktree,
            task=task
        )

        # 4. Execute with progress tracking
        result: TaskResult | None = None  # stays None if the agent raises before finishing
        try:
            result = await agent.execute()

            # 5. Run tests
            if result.success:
                test_result = await self.run_tests(worktree)
                if not test_result.passed:
                    result = await agent.fix_tests(test_result.failures)

            # 6. Submit for review
            await self.submit_for_review(task, result)

            return result

        except Exception as e:
            await self.handle_failure(task, e)
            raise
        finally:
            # 7. Update task status
            await self.complete_task(task.id, result)
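
`execute_parallel` above only starts tasks that have no dependencies. One way to also run dependent tasks as their prerequisites finish is sketched below. It is an illustration under stated assumptions: `run_in_dependency_order` is a hypothetical helper, the dependency map is acyclic, and every dependency also appears as a key.

```python
import asyncio


async def run_in_dependency_order(depends_on, run_task, max_agents=4):
    """Run each task after its dependencies finish, at most max_agents at once.

    depends_on maps task ID -> set of prerequisite task IDs.
    run_task is an async callable taking a task ID.
    Returns task IDs in completion order.
    """
    done_events = {task_id: asyncio.Event() for task_id in depends_on}
    slots = asyncio.Semaphore(max_agents)  # caps concurrently running agents
    completed = []

    async def worker(task_id):
        # Wait for every prerequisite before taking an agent slot
        for dep in depends_on[task_id]:
            await done_events[dep].wait()
        async with slots:
            await run_task(task_id)
        completed.append(task_id)
        done_events[task_id].set()

    await asyncio.gather(*(worker(task_id) for task_id in depends_on))
    return completed
```

Waiting on prerequisites before acquiring the semaphore matters: a blocked task never holds a slot, so the limit cannot cause a deadlock among ready tasks.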

4.4 Agent Lifecycle

┌──────────────────────────────────────────────────────────────────────┐
│                          AGENT LIFECYCLE                              │
├──────────────────────────────────────────────────────────────────────┤
│                                                                       │
│  [IDLE] ──claim task──> [INITIALIZING] ──worktree ready──> [WORKING] │
│                                                                  │    │
│                              ┌──────────────────────────────────┘    │
│                              │                                        │
│                              ▼                                        │
│  [BLOCKED] <──needs help── [WORKING] ──done──> [REVIEWING]           │
│      │                        │                      │                │
│      │                        │                      ▼                │
│      └──unblocked──> [WORKING]│              [INTEGRATING]           │
│                               │                      │                │
│                               │                      ▼                │
│                               └──failed──> [FAILED] [COMPLETED]      │
│                                                                       │
└──────────────────────────────────────────────────────────────────────┘
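
The lifecycle diagram can be encoded as an explicit transition table so that invalid state changes are rejected early. This is a sketch of one possible encoding: the state names come from the diagram, while `AgentState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical names.

```python
from enum import Enum


class AgentState(str, Enum):
    IDLE = "idle"
    INITIALIZING = "initializing"
    WORKING = "working"
    BLOCKED = "blocked"
    REVIEWING = "reviewing"
    INTEGRATING = "integrating"
    COMPLETED = "completed"
    FAILED = "failed"


# Each state maps to the states the diagram allows it to move to
ALLOWED_TRANSITIONS = {
    AgentState.IDLE: {AgentState.INITIALIZING},          # claim task
    AgentState.INITIALIZING: {AgentState.WORKING},       # worktree ready
    AgentState.WORKING: {
        AgentState.BLOCKED,                              # needs help
        AgentState.REVIEWING,                            # done
        AgentState.FAILED,                               # failed
    },
    AgentState.BLOCKED: {AgentState.WORKING},            # unblocked
    AgentState.REVIEWING: {AgentState.INTEGRATING},
    AgentState.INTEGRATING: {AgentState.COMPLETED},
    AgentState.COMPLETED: set(),                         # terminal
    AgentState.FAILED: set(),                            # terminal
}


def transition(current: AgentState, target: AgentState) -> AgentState:
    """Return the new state, or raise ValueError if the diagram forbids the move."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```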

4.6 Human-in-the-Loop System

The Human-in-the-Loop (HITL) layer ensures human oversight at critical points while allowing autonomous operation for routine work.

4.6.1 Unified Approval Queue

All agent requests flow through a single, tabbed queue interface:

┌─────────────────────────────────────────────────────────────────────────────┐
│  🔔 Approval Queue                                          [Mark All Read] │
├─────────────────────────────────────────────────────────────────────────────┤
│  [All (5)] [⚠️ Pending (3)] [✅ Approved] [❌ Denied] [⏳ Expired]           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  ⚠️ APPROVAL REQUIRED                                    2 min ago      ││
│  │  Agent: implementer-001 · Project: auth-service                         ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  Action: Execute bash command                                            ││
│  │  ┌─────────────────────────────────────────────────────────────────────┐││
│  │  │  rm -rf node_modules && pnpm install                                │││
│  │  └─────────────────────────────────────────────────────────────────────┘││
│  │  Risk: 🟡 Medium (destructive command pattern)                           ││
│  │  Context: Agent is fixing dependency conflict                            ││
│  │                                                                          ││
│  │  [✅ Approve] [❌ Deny] [✏️ Edit Command] [💬 Ask Agent] [View Context]  ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  🔍 REVIEW REQUESTED                                     5 min ago      ││
│  │  Agent: implementer-002 · Project: frontend-app                         ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  Round Complete: "Implement dashboard charts"                            ││
│  │  Files Changed: 4 (+234, -12)                                           ││
│  │  Tests: ✅ 12 passed                                                    ││
│  │                                                                          ││
│  │  [👀 Review Changes] [✅ Approve & Continue] [🔄 Request Changes]        ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  💬 QUESTION                                             8 min ago      ││
│  │  Agent: planner-001 · Project: auth-service                              ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  "Should I use OAuth2 or OpenID Connect for the SSO implementation?     ││
│  │   OAuth2 is simpler but OIDC provides identity verification."           ││
│  │                                                                          ││
│  │  [Type your response...]                                     [Send]     ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

4.6.2 Approval Types

python
class ApprovalType(StrEnum):
    """Types of human approval requests."""

    # Risk-based triggers
    DESTRUCTIVE_COMMAND = "destructive_command"    # rm -rf, force push, etc.
    SENSITIVE_FILE = "sensitive_file"              # .env, secrets, creds
    EXTERNAL_API = "external_api"                  # Calling external services
    COST_THRESHOLD = "cost_threshold"              # Approaching budget limit

    # Agent-requested
    REVIEW_PHASE = "review_phase"                  # Agent completed a round
    QUESTION = "question"                          # Agent needs clarification
    SCOPE_CHANGE = "scope_change"                  # Work exceeds original task

    # System-initiated
    MERGE_CONFLICT = "merge_conflict"              # Needs human resolution
    TEST_FAILURE = "test_failure"                  # Tests failed after changes
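
The risk-based triggers need some way to map an agent's action to an approval type. A pattern-based classifier for bash commands is one simple option, sketched here. The patterns are illustrative assumptions rather than a vetted policy, `classify_bash_command` is a hypothetical helper, and the enum repeats only two members from above so the block is self-contained.

```python
import re
from enum import Enum
from typing import Optional


class ApprovalType(str, Enum):
    # Subset of the enum above, repeated so this sketch runs on its own
    DESTRUCTIVE_COMMAND = "destructive_command"
    SENSITIVE_FILE = "sensitive_file"


# Illustrative patterns only; a real deployment would need a reviewed list
DESTRUCTIVE_PATTERNS = (
    re.compile(r"\brm\s+-\w*[rf]"),                   # rm -r, rm -f, rm -rf
    re.compile(r"\bgit\s+push\b.*(--force|\s-f\b)"),  # force push
    re.compile(r"\bgit\s+reset\s+--hard\b"),
)
SENSITIVE_PATTERNS = (
    re.compile(r"(^|[\s/])\.env\b"),                  # .env files
    re.compile(r"\b(id_rsa|credentials|secrets?)\b", re.IGNORECASE),
)


def classify_bash_command(command: str) -> Optional[ApprovalType]:
    """Return the approval type a command should trigger, or None if it can run."""
    if any(p.search(command) for p in DESTRUCTIVE_PATTERNS):
        return ApprovalType.DESTRUCTIVE_COMMAND
    if any(p.search(command) for p in SENSITIVE_PATTERNS):
        return ApprovalType.SENSITIVE_FILE
    return None
```

A `PreToolUse` hook such as `validate_bash_command` could call this function and enqueue an approval request whenever the result is not None.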

4.6.3 Review Phase

Agents can explicitly request human review at natural breakpoints:

python
async def request_review_phase(
    agent_id: str,
    round_description: str,
    changes_summary: dict
) -> ApprovalResult:
    """Agent requests human review of completed work."""

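    # Look up the requesting agent (agent_registry is a hypothetical lookup,
    # not defined elsewhere in this document)
    agent = await agent_registry.get(agent_id)

    # Collect context from the agent's worktree
    diff_stats = await git_diff_stats(agent.worktree.path)
    test_results = await run_tests(agent.worktree.path)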

    # Create approval request
    request = ApprovalRequest(
        type=ApprovalType.REVIEW_PHASE,
        agent_id=agent_id,
        project_id=agent.project_id,
        title=f"Review: {round_description}",
        summary=changes_summary,
        metadata={
            "files_changed": diff_stats.files,
            "insertions": diff_stats.insertions,
            "deletions": diff_stats.deletions,
            "test_results": test_results.to_dict(),
        },
        actions=["approve_continue", "request_changes", "view_diff", "chat"],
        timeout_minutes=60,  # Auto-expire if no response
    )

    await approval_queue.enqueue(request)

    # Agent pauses until approval
    response = await approval_queue.wait_for_response(request.id)

    if response.action == "request_changes":
        # Agent receives feedback and continues
        return ApprovalResult(
            approved=False,
            feedback=response.message,
            continue_work=True
        )

    return ApprovalResult(approved=True)

4.6.4 Agent-Initiated Checkpoints

Agents can proactively pause for human input:

python
# In agent system prompt
CHECKPOINT_INSTRUCTIONS = """
Request human review when:
1. You've completed a logical unit of work (e.g., finished implementing a feature)
2. You're about to make a significant architectural decision
3. You're unsure about requirements or approach
4. You've hit an unexpected error or blocker
5. Tests are failing and you're not sure why

Use the request_review tool with a clear summary of:
- What you've accomplished
- What you're planning next
- Any questions or concerns
"""

# Agent tool for requesting review
@tool
async def request_review(
    summary: str,
    questions: list[str] | None = None,
    blocking: bool = True
) -> ReviewResponse:
    """Request human review of current progress.

    Args:
        summary: What you've accomplished and current state
        questions: Optional list of questions for the reviewer
        blocking: If True, pause until human responds
    """
    return await hitl.request_review_phase(
        round_description=summary,
        questions=questions,
        blocking=blocking
    )

4.6.5 Approval Flow

Agent Action
     │
     ▼
┌────────────┐     No      ┌───────────┐
│ Needs      │────────────►│ Execute   │
│ Approval?  │             │ Directly  │
└────────────┘             └───────────┘
     │ Yes
     ▼
┌────────────────┐
│ Create Request │
│ + Enqueue      │
└────────────────┘
     │
     ▼
┌────────────────┐     Timeout     ┌───────────────┐
│ Wait for       │────────────────►│ Auto-Deny or  │
│ Human Response │                 │ Escalate      │
└────────────────┘                 └───────────────┘
     │
     ▼
┌────────────┐
│ Response   │
│ Received   │
└────────────┘
     │
     ├──► Approved ──► Continue Execution
     │
     ├──► Denied ──► Log + Skip Action
     │
     ├──► Edit ──► Execute Modified Action
     │
     └──► Chat ──► Agent Receives Message ──► Retry Decision
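
The wait-with-timeout step of this flow can be sketched with `asyncio.wait_for`. This is a minimal illustration, assuming the human's response arrives on an `asyncio.Future`. The outcome strings and `resolve_approval` are hypothetical names, and a timeout is treated as auto-deny here (the diagram also allows escalation).

```python
import asyncio

# Maps the human's response to the next step shown in the flow above
OUTCOMES = {
    "approved": "continue_execution",
    "denied": "log_and_skip",
    "edit": "execute_modified_action",
    "chat": "retry_decision",
}


async def resolve_approval(response: "asyncio.Future[str]", timeout_seconds: float) -> str:
    """Wait for a human response; fall back to auto-deny when the timeout expires."""
    try:
        action = await asyncio.wait_for(response, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return "auto_deny"
    return OUTCOMES[action]
```

Note that `asyncio.wait_for` cancels the pending future on timeout, so a late response cannot be delivered to the same request afterwards.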

4.6.6 Notification System

python
class ApprovalNotifier:
    """Multi-channel notification for pending approvals."""

    async def notify(self, request: ApprovalRequest):
        # In-app notification (always)
        await self.websocket_broadcast(request)

        # Browser notification if tab not focused
        if self.user_settings.browser_notifications:
            await self.send_browser_notification(request)

        # Email for high-priority or stale requests
        if request.priority == "high" or request.age_minutes > 30:
            await self.send_email(request)

        # Slack/Discord integration (if configured)
        if self.integrations.slack:
            await self.send_slack(request)

    async def escalate_stale(self, request: ApprovalRequest):
        """Escalate requests that have been pending too long."""
        if request.age_minutes > 60:
            # Notify additional team members
            await self.notify_team(request, escalation_level=1)

        if request.age_minutes > 120:
            # Auto-pause agent to avoid wasted compute
            await self.pause_agent(request.agent_id)

5. Git Worktree Strategy

5.1 Worktree Isolation Model

~/dev/sibyl/                        # Main repo (main branch)
├── .git/                           # Shared git database
└── ...

~/.sibyl-worktrees/                 # Agent worktrees (kept outside the main repo to avoid nesting)
├── task_abc123/                    # Agent 1's workspace
│   ├── .venv/                      # Isolated Python env
│   ├── node_modules/               # Isolated Node deps
│   └── ...                         # Full repo checkout
├── task_def456/                    # Agent 2's workspace
└── integration/                    # Merge staging area

5.2 Worktree Manager

python
import subprocess
from datetime import UTC, datetime
from pathlib import Path


class WorktreeManager:
    """Manage git worktrees for parallel agent execution."""

    def __init__(self, repo_path: Path, worktree_base: Path):
        self.repo = repo_path
        self.base = worktree_base
        self.base.mkdir(parents=True, exist_ok=True)

    async def create(self, task_id: str) -> Worktree:
        """Create isolated worktree for a task."""
        branch = f"agent/{task_id}"
        path = self.base / task_id

        # Fetch latest from origin
        await self._git(["fetch", "origin", "main"])

        # Create worktree with new branch from origin/main
        await self._git([
            "worktree", "add", "-b", branch,
            str(path), "origin/main"
        ])

        # Install dependencies
        await self._install_deps(path)

        return Worktree(
            task_id=task_id,
            branch=branch,
            path=path,
            created_at=datetime.now(UTC)
        )

    async def cleanup(self, task_id: str):
        """Remove worktree after task completion."""
        path = self.base / task_id
        branch = f"agent/{task_id}"

        await self._git(["worktree", "remove", "--force", str(path)])
        await self._git(["worktree", "prune"])

        # Delete branch if merged
        try:
            await self._git(["branch", "-d", branch])
        except subprocess.CalledProcessError:
            pass  # Branch not merged or doesn't exist

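    async def check_conflicts(self, branch_a: str, branch_b: str) -> bool:
        """Pre-check if branches would conflict, without touching any working tree."""
        # Uses the legacy three-argument form of `git merge-tree`, which prints
        # conflict markers in its output when the two sides conflict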
        merge_base = await self._git(["merge-base", branch_a, branch_b])
        result = await self._git(
            ["merge-tree", merge_base.strip(), branch_a, branch_b],
            check=False
        )
        return "<<<<<<<" in result
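
`WorktreeManager` relies on a `_git` helper that is not shown. One plausible shape for it, built on `asyncio.create_subprocess_exec`, is sketched below. The names `run_command` and `git` are hypothetical; the error type matches the `subprocess.CalledProcessError` that `cleanup` catches.

```python
import asyncio
import subprocess
from pathlib import Path


async def run_command(args: list, cwd: Path, check: bool = True) -> str:
    """Run a command without blocking the event loop and return its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *args,
        cwd=str(cwd),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    if check and proc.returncode != 0:
        # Same exception type that synchronous subprocess.run(check=True) raises
        raise subprocess.CalledProcessError(proc.returncode, args, stdout, stderr)
    return stdout.decode()


async def git(repo: Path, args: list, check: bool = True) -> str:
    """Run `git <args>` inside the given repository."""
    return await run_command(["git", *args], cwd=repo, check=check)
```

With `check=False` the caller receives stdout even on a non-zero exit, which is what `check_conflicts` needs to inspect the merge output.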

5.3 Integration Strategy

python
class IntegrationManager:
    """Coordinate merging of agent worktrees."""

    async def integrate_task(self, task_id: str, target: str = "main"):
        """Integrate a completed task's branch."""
        branch = f"agent/{task_id}"

        # 1. Pre-check for conflicts
        has_conflicts = await self.worktrees.check_conflicts(branch, target)

        if has_conflicts:
            # Create merge request for human review
            return await self.create_conflict_review(task_id, branch, target)

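        # 2. Rebase onto target
        # Note: git refuses to check out a branch that is already checked out in
        # another worktree, so this step likely needs to run inside the agent's
        # worktree (or after that worktree has been removed)
        await self._git(["checkout", branch])
        await self._git(["rebase", target])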

        # 3. Run full test suite
        test_result = await self.run_tests()
        if not test_result.passed:
            return await self.create_test_failure_review(task_id, test_result)

        # 4. Fast-forward merge
        await self._git(["checkout", target])
        await self._git(["merge", "--ff-only", branch])

        # 5. Cleanup
        await self.worktrees.cleanup(task_id)

        return IntegrationResult(status="success", commit=await self._git(["rev-parse", "HEAD"]))

    async def integrate_batch(self, task_ids: list[str], target: str = "main"):
        """Integrate multiple tasks in dependency order."""

        # Build dependency graph
        tasks = [await self.sibyl.get_task(tid) for tid in task_ids]
        sorted_tasks = topological_sort(tasks)

        for task in sorted_tasks:
            result = await self.integrate_task(task.id, target)
            if result.status != "success":
                return result  # Stop on first failure

        return IntegrationResult(status="success", count=len(task_ids))
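
`integrate_batch` calls a `topological_sort` that is not defined in this document. The standard library's `graphlib` offers one way to get that ordering. The sketch below works on a plain mapping of task ID to prerequisite IDs, which is an assumed simplification of the `Task` objects, and `topological_order` is a hypothetical name.

```python
from graphlib import CycleError, TopologicalSorter


def topological_order(depends_on: dict) -> list:
    """Return task IDs so that every task appears after its prerequisites.

    depends_on maps task ID -> set of prerequisite task IDs.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())
```

For example, `topological_order({"ui": {"api"}, "api": {"schema"}, "schema": set()})` yields `schema`, then `api`, then `ui`. Catching `CycleError` before integration starts avoids discovering a circular dependency halfway through a batch.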

6. Task Coordination via Sibyl

6.1 Extended Task Model

python
class Task(Entity):
    # ... existing fields ...

    # Agent coordination (NEW)
    assigned_agent: str | None = Field(default=None, description="Agent ID currently working")
    claimed_at: datetime | None = Field(default=None, description="When agent claimed task")
    heartbeat_at: datetime | None = Field(default=None, description="Last agent heartbeat")

    # Worktree tracking (NEW)
    worktree_path: str | None = Field(default=None, description="Path to agent's worktree")
    worktree_branch: str | None = Field(default=None, description="Git branch name")

    # Multi-agent collaboration (NEW)
    collaborators: list[str] = Field(default_factory=list, description="Other agents involved")
    handoff_history: list[dict] = Field(default_factory=list, description="Agent handoff log")

    # Checkpointing (NEW)
    last_checkpoint: dict | None = Field(default=None, description="Last saved progress state")

6.2 Optimistic Task Claiming

python
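async def claim_task(org_id: str, task_id: str, agent_id: str) -> Task | None:
    """Claim a task with a single conditional write.

    The status and assignment checks run in the same query as the update, so,
    assuming the graph executes the write atomically, only one of two racing
    agents matches the task and the other gets None.
    """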

    result = await graph.execute_write_org("""
        MATCH (t:Task {uuid: $task_id})
        WHERE t.status IN ['todo', 'backlog']
          AND (t.assigned_agent IS NULL OR t.assigned_agent = '')
        SET t.status = 'doing',
            t.assigned_agent = $agent_id,
            t.claimed_at = datetime(),
            t.heartbeat_at = datetime()
        RETURN t
    """, org_id, task_id=task_id, agent_id=agent_id)

    if result:
        # Emit event for UI update
        await emit_task_event(org_id, task_id, "task_claimed", agent_id)
        return Task.from_node(result[0])

    return None
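
The claim above is a compare-and-set: the update only applies while the task is still unassigned. The same behavior can be demonstrated without a database. This in-memory sketch uses a lock to stand in for the atomicity the graph query is assumed to provide, and `InMemoryTaskStore` is a hypothetical class for illustration only.

```python
import threading
from typing import Optional


class InMemoryTaskStore:
    """Hypothetical stand-in for the graph: one owner slot per task."""

    def __init__(self, task_ids):
        self._assigned = {task_id: None for task_id in task_ids}
        self._lock = threading.Lock()

    def claim(self, task_id: str, agent_id: str) -> bool:
        """Assign the task only if it exists and nobody owns it yet."""
        with self._lock:  # check and set happen as one step
            if task_id not in self._assigned or self._assigned[task_id] is not None:
                return False
            self._assigned[task_id] = agent_id
            return True

    def owner(self, task_id: str) -> Optional[str]:
        return self._assigned.get(task_id)
```

When several agents race for the same task, exactly one `claim` call returns True and the rest return False, which mirrors `claim_task` returning a `Task` to the winner and None to everyone else.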

6.3 Agent Health Monitoring

python
class AgentHealthMonitor:
    """Detect dead agents and reclaim their tasks."""

    HEARTBEAT_INTERVAL = 30  # seconds
    STALE_THRESHOLD = 90     # 3 missed heartbeats
    DEAD_THRESHOLD = 300     # 5 minutes

    async def run_health_check_loop(self):
        """Background task to monitor agent health."""
        while True:
            await asyncio.sleep(self.HEARTBEAT_INTERVAL)

            # Find tasks with no heartbeat for DEAD_THRESHOLD (5 minutes, written as PT5M below)
            stale_tasks = await self.graph.execute_read_org("""
                MATCH (t:Task)
                WHERE t.status = 'doing'
                  AND t.heartbeat_at < datetime() - duration('PT5M')
                RETURN t.uuid as task_id, t.assigned_agent as agent_id
            """, self.org_id)

            for task in stale_tasks:
                await self.reclaim_task(task["task_id"], task["agent_id"])

    async def reclaim_task(self, task_id: str, dead_agent_id: str):
        """Return task to queue after agent death."""
        await self.graph.execute_write_org("""
            MATCH (t:Task {uuid: $task_id})
            SET t.status = 'todo',
                t.assigned_agent = NULL,
                t.reclaim_reason = 'agent_dead',
                t.reclaimed_at = datetime()
        """, self.org_id, task_id=task_id)

        await emit_task_event(self.org_id, task_id, "task_abandoned", dead_agent_id)
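
The monitor defines both a stale and a dead threshold, but the query only acts on the dead one. A small classifier makes the three-way distinction explicit, for example to show a warning in the UI before a task is reclaimed. This is a sketch: the threshold values are copied from the class above, and `classify_heartbeat` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

STALE_THRESHOLD = timedelta(seconds=90)   # 3 missed heartbeats at 30s intervals
DEAD_THRESHOLD = timedelta(seconds=300)   # 5 minutes


def classify_heartbeat(heartbeat_at: datetime, now: datetime) -> str:
    """Return 'healthy', 'stale', or 'dead' based on the heartbeat's age."""
    age = now - heartbeat_at
    if age >= DEAD_THRESHOLD:
        return "dead"      # task should be reclaimed
    if age >= STALE_THRESHOLD:
        return "stale"     # worth a warning, not yet reclaimed
    return "healthy"
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the function, keeps the check deterministic and easy to test.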

7. Web UI: Agents Page

7.1 Navigation Addition

typescript
// apps/web/src/components/layout/sidebar.tsx
const NAVIGATION = [
  // ... existing items ...
  { name: "Agents", href: "/agents", icon: Bot }, // NEW
];

7.2 Agent Dashboard Layout (Project-Organized)

┌─────────────────────────────────────────────────────────────────────────────┐
│  🤖 Agents                                              [+ Start Agent]     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  Organization: Acme Corp                                                ││
│  │  ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐  ┌────────────────────────────┐   ││
│  │  │ 🟢 3 │ │ 🟡 1 │ │ 🔴 0 │ │ ⏸ 2 │  │ 🔔 3 Pending Approvals    │   ││
│  │  │Active│ │Waiting│ │Failed│ │Paused│  └────────────────────────────┘   ││
│  │  └──────┘ └──────┘ └──────┘ └──────┘                                    ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  📁 auth-service                                      2 agents active   ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  ┌─────────────────────────────────────────────────────────────────────┐││
│  │  │  🤖 implementer-001                              [Pause] [Stop]     │││
│  │  │  Task: "Add OAuth2 authentication"                                  │││
│  │  │  Status: 🟢 Working · 23m elapsed · 47% complete                    │││
│  │  │  ┌─────────────────────────────────────────────────────────────────┐│││
│  │  │  │ Latest: "Writing auth middleware... ▊"                          ││││
│  │  │  └─────────────────────────────────────────────────────────────────┘│││
│  │  │  [View Logs] [Open Chat] [View Diff]                                │││
│  │  └─────────────────────────────────────────────────────────────────────┘││
│  │  ┌─────────────────────────────────────────────────────────────────────┐││
│  │  │  🤖 tester-001 · 🟡 Waiting for implementer-001                     │││
│  │  └─────────────────────────────────────────────────────────────────────┘││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  📁 frontend-app                                      1 agent active    ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  ┌─────────────────────────────────────────────────────────────────────┐││
│  │  │  🤖 implementer-002 · 🔍 Review requested · [Review Now]            │││
│  │  └─────────────────────────────────────────────────────────────────────┘││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  📁 shared-libs                                       0 agents          ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  No active agents. [+ Start Agent]                                      ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  Activity Feed (all projects)                           [Expand]       ││
│  │  ─────────────────────────────────────────────────────────────────────  ││
│  │  10:45:23  auth-service/implementer-001  ✏️  Modified: oauth.py        ││
│  │  10:45:21  auth-service/implementer-001  📝  Created: tokens.py        ││
│  │  10:44:12  frontend-app/implementer-002  🔍  Requested review          ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  Resource Usage                                                         ││
│  │  ─────────────────────────────────────────────────────────────────────  ││
│  │  Tokens:  45,230 / 100,000 (45%)  │  Cost: $2.45 / $10.00 budget       ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

7.3 Start Agent Dialog

┌─────────────────────────────────────────────────────────────────────────────┐
│  Start New Agent                                                     [✕]   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Project:    [auth-service ▼]                                               │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  What would you like the agent to work on?                              ││
│  │                                                                          ││
│  │  │ Implement rate limiting middleware for the API endpoints.            ││
│  │  │ Use Redis for distributed rate tracking. Add tests.                  ││
│  │  │                                                                      ││
│  │  └──────────────────────────────────────────────────────────────────────││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  Options:                                                                    │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  □ Attach to existing task:  [Select task... ▼]                         ││
│  │  ☑ Create worktree (isolated git branch)                                ││
│  │  □ Request review after each round                                      ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│                                          [Cancel]  [🚀 Start Agent]         │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

7.4 Approval Queue Page

┌─────────────────────────────────────────────────────────────────────────────┐
│  🔔 Approval Queue                                      [Mark All Read]     │
├─────────────────────────────────────────────────────────────────────────────┤
│  [All (5)] [⚠️ Pending (3)] [✅ Approved (12)] [❌ Denied (2)] [⏳ Expired] │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  ⚠️ APPROVAL REQUIRED · auth-service                     2 min ago      ││
│  │  Agent: implementer-001                                                  ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  Action: Execute bash command                                            ││
│  │  ┌─────────────────────────────────────────────────────────────────────┐││
│  │  │  rm -rf node_modules && pnpm install                                │││
│  │  └─────────────────────────────────────────────────────────────────────┘││
│  │  Risk: 🟡 Medium                                                         ││
│  │  Context: Agent is fixing dependency conflict after adding new package   ││
│  │                                                                          ││
│  │  [✅ Approve] [❌ Deny] [✏️ Edit] [💬 Ask Agent] [📋 View Full Context]  ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  🔍 REVIEW REQUESTED · frontend-app                      5 min ago      ││
│  │  Agent: implementer-002                                                  ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  Round Complete: "Implement dashboard charts component"                  ││
│  │                                                                          ││
│  │  Summary:                                                                ││
│  │  • Created ChartCard component with responsive layout                    ││
│  │  • Added bar, line, and pie chart types using Recharts                   ││
│  │  • Integrated with React Query for data fetching                         ││
│  │                                                                          ││
│  │  Files: 4 changed (+234, -12)  │  Tests: ✅ 12 passed                    ││
│  │                                                                          ││
│  │  [👀 Review Diff] [✅ Approve & Continue] [🔄 Request Changes]           ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  💬 QUESTION · auth-service                              8 min ago      ││
│  │  Agent: planner-001                                                      ││
│  │  ───────────────────────────────────────────────────────────────────── ││
│  │  "Should I use OAuth2 or OpenID Connect for SSO? OAuth2 is simpler      ││
│  │   but OIDC provides identity verification out of the box."              ││
│  │                                                                          ││
│  │  Your response:                                                          ││
│  │  ┌─────────────────────────────────────────────────────────────────────┐││
│  │  │                                                                     │││
│  │  └─────────────────────────────────────────────────────────────────────┘││
│  │                                                                  [Send] ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

7.5 Agent Chat Interface

┌─────────────────────────────────────────────────────────────────────────────┐
│  🤖 implementer-001 · "Add OAuth2 authentication"          [⏸ Pause] [✕]  │
├───────────────────────────────────┬─────────────────────────────────────────┤
│                                   │                                          │
│  Chat                             │  Workspace                               │
│  ───────────────────────────────  │  ───────────────────────────────────────│
│                                   │                                          │
│  🤖 Agent:                        │  apps/api/src/auth/oauth.py              │
│  I've analyzed the codebase and   │  ─────────────────────────────────────  │
│  found the existing auth module.  │  1  from fastapi import Depends          │
│  I'll extend it with OAuth2.      │  2  from .tokens import create_token     │
│                                   │  3                                        │
│  Creating `oauth.py` with:        │  4  class OAuth2Handler:                 │
│  - Google provider                │  5      def __init__(self, config):      │
│  - GitHub provider                │  6          self.config = config         │
│  - Token refresh flow             │  7          ...                          │
│                                   │                                          │
│  🤖 Agent is typing...            │  [📁 Files] [📝 Diff] [🖥️ Terminal]     │
│  ─────────────────────────────────│                                          │
│                                   │                                          │
│  You:                             │  ┌─────────────────────────────────────┐│
│  │                             │  │  │  Changes (3 files)                  ││
│  └─────────────────────────────┘  │  │  ✚ oauth.py (+145)                  ││
│                                   │  │  ✚ tokens.py (+67)                  ││
│  [Send] [Pause Agent]             │  │  ~ middleware.py (+12, -3)          ││
│                                   │  └─────────────────────────────────────┘│
└───────────────────────────────────┴──────────────────────────────────────────┘

7.6 Real-time Updates via WebSocket

typescript
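// apps/web/src/hooks/useAgentEvents.ts
import { useEffect } from "react";
import { useQueryClient } from "@tanstack/react-query";
// `toast` comes from the app's notification library; its import is omitted here.
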
export function useAgentEvents(orgId: string) {
  const queryClient = useQueryClient();

  useEffect(() => {
    const ws = new WebSocket(`/api/ws/agents/${orgId}`);

    ws.onmessage = (event) => {
      const data = JSON.parse(event.data);

      switch (data.type) {
        case "agent_status":
          queryClient.invalidateQueries({ queryKey: ["agents", data.agentId] });
          break;
        case "task_progress":
          queryClient.invalidateQueries({ queryKey: ["tasks", data.taskId] });
          break;
        case "approval_required":
          toast.warning(`Agent ${data.agentId} needs approval`);
          break;
        case "agent_output":
          // Stream to chat interface
          break;
      }
    };

    return () => ws.close();
  }, [orgId, queryClient]);
}
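
The hook only shows the consuming side. A minimal sketch of the producing side, assuming the server sends one JSON text frame per event with a `type` field plus camelCase keys such as `agentId` and `taskId`, as the hook reads them. The helper name `build_agent_event` is hypothetical:

```python
import json

# The four event types the client hook switches on.
KNOWN_EVENT_TYPES = {
    "agent_status",
    "task_progress",
    "approval_required",
    "agent_output",
}


def build_agent_event(event_type: str, **fields: object) -> str:
    """Serialize one event as the JSON text the client passes to JSON.parse."""
    if event_type not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    # "type" is written last so a stray field cannot overwrite it.
    return json.dumps({**fields, "type": event_type})


frame = build_agent_event("agent_status", agentId="implementer-001", status="working")
print(frame)
```

Rejecting unknown types at the sender keeps the client's `switch` exhaustive without a default branch.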

8. Persistence & Recovery

The Agent Harness must survive system reboots without losing state or work-in-progress.

8.1 State Layers

┌─────────────────────────────────────────────────────────────────────────────┐
│                           STATE PERSISTENCE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  DURABLE STATE (FalkorDB + PostgreSQL)                                  ││
│  │  ─────────────────────────────────────────────────────────────────────  ││
│  │  • Agent records (id, type, project, task, status, config)              ││
│  │  • Approval queue items (pending, history)                              ││
│  │  • Session checkpoints (conversation state, tool history)               ││
│  │  • Worktree registry (path, branch, task mapping)                       ││
│  │  • Task assignments (claimed_by, heartbeat_at)                          ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  EPHEMERAL STATE (Redis)                                                ││
│  │  ─────────────────────────────────────────────────────────────────────  ││
│  │  • Active WebSocket connections                                         ││
│  │  • Real-time event streams                                              ││
│  │  • Rate limiting counters                                               ││
│  │  • Distributed locks (rebuilt on startup)                               ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │  FILESYSTEM STATE (Git Worktrees)                                       ││
│  │  ─────────────────────────────────────────────────────────────────────  ││
│  │  • Code changes (committed and uncommitted)                             ││
│  │  • Branch state (survives reboot)                                       ││
│  │  • Stashed work (auto-stash before shutdown if needed)                  ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

8.2 Agent State Model

python
from datetime import datetime
from enum import StrEnum

from pydantic import BaseModel, Field


# Defined before AgentRecord, which uses it as a field type.
class AgentStatus(StrEnum):
    """Agent lifecycle states (persisted)."""

    INITIALIZING = "initializing"    # Setting up worktree
    WORKING = "working"              # Actively executing
    PAUSED = "paused"                # User paused
    WAITING_APPROVAL = "waiting_approval"  # Blocked on human
    WAITING_DEPENDENCY = "waiting_dependency"  # Blocked on another task
    RESUMING = "resuming"            # Recovering after restart
    COMPLETED = "completed"          # Finished successfully
    FAILED = "failed"                # Error state
    TERMINATED = "terminated"        # User stopped


class AgentRecord(BaseModel):
    """Persistent agent state stored in database."""

    id: str = Field(description="Unique agent identifier")
    organization_id: str
    project_id: str
    agent_type: str  # implementer, tester, reviewer, etc.
    spawn_source: str  # "orchestrator" or "user"

    # Assignment
    task_id: str | None = None
    worktree_path: str | None = None
    worktree_branch: str | None = None

    # Lifecycle
    status: AgentStatus  # one of the AgentStatus values above
    created_at: datetime
    started_at: datetime | None = None
    last_heartbeat: datetime | None = None
    completed_at: datetime | None = None

    # Session (for resume)
    session_id: str | None = None  # Claude Agent SDK session ID
    checkpoint_id: str | None = None  # Last checkpoint for resume
    conversation_turns: int = 0

    # Context
    initial_prompt: str
    system_prompt_hash: str  # Detect if prompt changed

    # Cost tracking
    tokens_used: int = 0
    cost_usd: float = 0.0
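
The model names the lifecycle states but not which moves between them are legal. A minimal sketch of a transition guard follows. The edge set is an assumption inferred from the recovery and shutdown flows in this section, not a specified part of the design, and `can_transition` is a hypothetical helper:

```python
# Hypothetical transition table keyed by the persisted status strings.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "initializing": {"working", "failed", "terminated"},
    "working": {
        "paused", "waiting_approval", "waiting_dependency",
        "resuming", "completed", "failed", "terminated",
    },
    "paused": {"working", "resuming", "failed", "terminated"},
    "waiting_approval": {"working", "paused", "failed", "terminated"},
    "waiting_dependency": {"working", "paused", "resuming", "failed", "terminated"},
    "resuming": {"working", "failed", "terminated"},
    # Terminal states have no outgoing edges.
    "completed": set(),
    "failed": set(),
    "terminated": set(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


print(can_transition("working", "paused"))
print(can_transition("completed", "working"))
```

Checking updates against a table like this would catch accidental writes such as reviving a completed agent.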

8.3 Session Checkpointing

Leverage Claude Agent SDK's session management for resumability:

python
class AgentCheckpointManager:
    """Persist and restore agent sessions."""

    async def checkpoint(self, agent: AgentInstance) -> str:
        """Save agent state for later resume."""

        checkpoint = AgentCheckpoint(
            agent_id=agent.id,
            session_id=agent.session_id,
            timestamp=datetime.now(UTC),

            # Conversation state
            conversation_history=agent.get_conversation_history(),
            pending_tool_calls=agent.get_pending_tools(),

            # Work state
            files_modified=await self._get_modified_files(agent.worktree),
            uncommitted_changes=await self._get_uncommitted_diff(agent.worktree),

            # Progress
            current_step=agent.current_step,
            completed_steps=agent.completed_steps,

            # Approval state
            pending_approval_id=agent.pending_approval_id,
        )

        # Store in database
        await self.db.save_checkpoint(checkpoint)

        # Update agent record
        await self.db.update_agent(agent.id, checkpoint_id=checkpoint.id)

        return checkpoint.id

    async def restore(self, agent_id: str) -> AgentInstance:
        """Restore agent from last checkpoint."""

        record = await self.db.get_agent(agent_id)
        checkpoint = await self.db.get_checkpoint(record.checkpoint_id)

        # Verify worktree still exists
        if not Path(record.worktree_path).exists():
            raise WorktreeMissingError(f"Worktree gone: {record.worktree_path}")

        # Recreate agent with session resume
        agent = await self.spawner.create_agent(
            agent_type=record.agent_type,
            worktree_path=record.worktree_path,
            task_id=record.task_id,
            # Resume from checkpoint
            resume_session_id=checkpoint.session_id,
            conversation_history=checkpoint.conversation_history,
        )

        # Restore approval wait if needed
        if checkpoint.pending_approval_id:
            await self.approval_queue.reattach(
                checkpoint.pending_approval_id,
                agent
            )

        return agent
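
The checkpoint code calls `_get_modified_files` without showing it. One possible shape, as a sketch only: it is synchronous for brevity, it parses `git status --porcelain` (v1) output, and the function names are hypothetical stand-ins for the private helper.

```python
import subprocess
from pathlib import Path


def parse_porcelain(output: str) -> list[str]:
    """Extract file paths from `git status --porcelain` (v1) output.

    Limitation: paths that git C-quotes (unusual characters) are returned
    with their quotes intact.
    """
    paths: list[str] = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # every entry looks like "XY <path>"
        status, entry = line[:2], line[3:]
        # Renames and copies are reported as "old -> new"; keep the new path.
        if ("R" in status or "C" in status) and " -> " in entry:
            entry = entry.split(" -> ", 1)[1]
        paths.append(entry)
    return paths


def get_modified_files(worktree: Path) -> list[str]:
    """List changed and untracked files in a worktree."""
    result = subprocess.run(
        ["git", "-C", str(worktree), "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)
```

Keeping the parser separate from the subprocess call makes it testable without a repository on disk.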

8.4 Startup Recovery Flow

python
class AgentHarnessRecovery:
    """Recover agent harness state after system restart."""

    async def recover(self):
        """Full recovery sequence on startup."""

        logger.info("🔄 Starting Agent Harness recovery...")

        # 1. Rebuild ephemeral state
        await self._rebuild_locks()
        await self._clear_stale_connections()

        # 2. Audit worktrees
        worktree_status = await self._audit_worktrees()

        # 3. Find agents that need recovery
        agents_to_recover = await self.db.query_agents(
            status__in=[
                AgentStatus.WORKING,
                AgentStatus.WAITING_APPROVAL,
                AgentStatus.WAITING_DEPENDENCY,
                AgentStatus.PAUSED,
            ]
        )

        logger.info(f"📋 Found {len(agents_to_recover)} agents to recover")

        # 4. Recover each agent
        for record in agents_to_recover:
            try:
                await self._recover_agent(record, worktree_status)
            except Exception as e:
                logger.error(f"❌ Failed to recover agent {record.id}: {e}")
                await self._mark_agent_failed(record.id, str(e))

        # 5. Resume approval queue processing
        await self.approval_queue.resume()

        # 6. Notify orchestrator of recovered state
        await self.orchestrator.on_recovery_complete()

        logger.info("✅ Agent Harness recovery complete")

    async def _recover_agent(
        self,
        record: AgentRecord,
        worktree_status: dict
    ):
        """Recover a single agent."""

        # Check worktree health
        wt = worktree_status.get(record.worktree_path)
        if not wt or not wt.exists:
            # Worktree missing - cannot recover
            await self._mark_agent_failed(
                record.id,
                "Worktree missing after restart"
            )
            return

        if record.status == AgentStatus.PAUSED:
            # Leave paused, just update registry
            logger.info(f"⏸️  Agent {record.id} remains paused")
            return

        if record.status == AgentStatus.WAITING_APPROVAL:
            # Re-register with approval queue, don't resume yet
            await self.approval_queue.reattach_waiter(record.id)
            logger.info(f"⏳ Agent {record.id} waiting for approval")
            return

        # Mark as resuming only for agents that will actually restart, so
        # paused and approval-blocked agents keep their persisted status
        await self.db.update_agent(record.id, status=AgentStatus.RESUMING)

        # Resume active agent
        agent = await self.checkpoint_manager.restore(record.id)

        # Inject recovery context
        await agent.inject_message(
            role="system",
            content=f"""
            [SYSTEM RECOVERY]
            The system was restarted. You are resuming from your last checkpoint.

            Your worktree is intact at: {record.worktree_path}
            Last known state: {record.status}
            Time since last activity: {datetime.now(UTC) - record.last_heartbeat}

            Please verify your work state and continue from where you left off.
            """
        )

        # Resume execution
        await self.orchestrator.resume_agent(agent)
        logger.info(f"▶️  Agent {record.id} resumed")

    async def _audit_worktrees(self) -> dict[str, WorktreeHealth]:
        """Check health of all registered worktrees."""

        registered = await self.db.get_all_worktrees()
        results = {}

        for wt in registered:
            path = Path(wt.path)
            results[wt.path] = WorktreeHealth(
                exists=path.exists(),
                branch_exists=await self._check_branch(wt.branch),
                has_uncommitted=await self._has_uncommitted(path) if path.exists() else False,
                last_commit=await self._get_last_commit(path) if path.exists() else None,
            )

        return results

8.5 Graceful Shutdown

python
class AgentHarnessShutdown:
    """Graceful shutdown with state preservation."""

    async def shutdown(self, timeout_seconds: int = 30):
        """Gracefully shut down all agents."""

        logger.info("🛑 Initiating graceful shutdown...")

        # 1. Stop accepting new work
        await self.orchestrator.stop_accepting()

        # 2. Pause all active agents
        active_agents = await self.db.query_agents(status=AgentStatus.WORKING)

        for agent in active_agents:
            try:
                # Request agent to pause at next safe point
                await agent.request_pause(reason="system_shutdown")

                # Wait for acknowledgment (with timeout)
                await asyncio.wait_for(
                    agent.wait_for_pause(),
                    timeout=timeout_seconds / len(active_agents)
                )

                # Checkpoint
                await self.checkpoint_manager.checkpoint(agent)

                # Commit any uncommitted work
                await self._safe_commit(agent.worktree, "WIP: System shutdown checkpoint")

            except asyncio.TimeoutError:
                logger.warning(f"⚠️  Agent {agent.id} didn't pause in time, force checkpointing")
                await self.checkpoint_manager.checkpoint(agent)

        # 3. Update all agent statuses
        await self.db.bulk_update_agents(
            [a.id for a in active_agents],
            status=AgentStatus.PAUSED,
            paused_reason="system_shutdown"
        )

        # 4. Persist approval queue state
        await self.approval_queue.persist()

        logger.info("✅ Graceful shutdown complete")
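
The shutdown loop pauses agents one at a time and splits the timeout evenly between them, so one slow agent uses up only its own slice while fast agents leave theirs unused. An alternative sketch runs all pause requests concurrently under a single shared deadline. `pause_all` and the shape of its `pausers` argument are assumptions for illustration:

```python
import asyncio
from collections.abc import Awaitable, Callable


async def pause_all(
    pausers: dict[str, Callable[[], Awaitable[None]]],
    timeout_seconds: float,
) -> tuple[set[str], set[str]]:
    """Run every pause coroutine concurrently under one deadline.

    Returns (paused_ids, timed_out_or_failed_ids).
    """
    if not pausers:
        return set(), set()  # asyncio.wait rejects an empty collection

    tasks = {asyncio.create_task(fn()): agent_id for agent_id, fn in pausers.items()}
    done, pending = await asyncio.wait(tasks.keys(), timeout=timeout_seconds)

    # Stop waiting on stragglers; the caller can force-checkpoint them.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    paused = {
        tasks[t] for t in done if not t.cancelled() and t.exception() is None
    }
    return paused, set(tasks.values()) - paused
```

Agents in the second set would go through the force-checkpoint path that the sequential version uses on `asyncio.TimeoutError`.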

8.6 Data Models for Persistence

python
# New database tables/graph nodes needed

class WorktreeRecord(BaseModel):
    """Persistent worktree registry."""

    id: str
    task_id: str
    agent_id: str | None
    path: str
    branch: str
    base_commit: str  # Commit worktree was created from
    created_at: datetime
    last_used: datetime
    status: str  # active, orphaned, merged, deleted


class ApprovalRecord(BaseModel):
    """Persistent approval queue item."""

    id: str
    organization_id: str
    project_id: str
    agent_id: str
    type: ApprovalType
    priority: str

    # Request details
    title: str
    summary: str
    metadata: dict
    actions: list[str]

    # Lifecycle
    created_at: datetime
    expires_at: datetime | None
    responded_at: datetime | None
    response: str | None  # approved, denied, edited
    response_by: str | None  # user who responded
    response_message: str | None


class AgentCheckpoint(BaseModel):
    """Snapshot of agent state for resume."""

    id: str
    agent_id: str
    session_id: str
    timestamp: datetime

    # Conversation
    conversation_history: list[dict]  # Serialized messages
    pending_tool_calls: list[dict]

    # Work state
    files_modified: list[str]
    uncommitted_changes: str  # Git diff

    # Progress
    current_step: str | None
    completed_steps: list[str]

    # Blocking state
    pending_approval_id: str | None
    waiting_for_task_id: str | None

9. Development Phases

Phase 1: Foundation

Epic: Core Agent Infrastructure

| Task | Description | Priority |
| --- | --- | --- |
| Extend Task model | Add agent fields (assigned_agent, heartbeat_at, etc.) | Critical |
| AgentRecord model | Persistent agent state in FalkorDB | Critical |
| Worktree manager | Create/cleanup worktrees for tasks | Critical |
| Worktree registry | Persistent tracking of all worktrees | Critical |
| Agent SDK integration | Basic query() wrapper with Sibyl tools | Critical |
| Task claiming | Optimistic locking for agent task claims | High |
| Heartbeat system | Agent health monitoring + reclaim | High |

Phase 2: Orchestration & Persistence

Epic: Multi-Agent Coordination

| Task | Description | Priority |
| --- | --- | --- |
| Orchestrator service | Central coordinator process | Critical |
| On-demand spawning | Create agents via orchestrator or user | Critical |
| Agent checkpointing | Save conversation + work state | Critical |
| Session resume | Restore agents from checkpoints | Critical |
| Dependency resolution | Execute tasks in correct order | High |
| Event bus | Redis-based event streaming | High |
| Graceful shutdown | Pause agents, checkpoint, persist queue | High |
| Startup recovery | Audit worktrees, resume agents | High |
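
Dependency resolution amounts to a topological ordering of the task graph. A minimal sketch using the standard library's `graphlib`, where each batch holds tasks that could run in parallel worktrees. The function name and the input shape (task mapped to the set of tasks it depends on) are assumptions:

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; each task's dependencies sit in earlier batches."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


deps = {"spec": set(), "impl": {"spec"}, "test": {"impl"}, "docs": {"impl"}}
print(execution_batches(deps))
```

For this graph, `test` and `docs` land in the same batch because both depend only on `impl`.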

Phase 3: Human-in-the-Loop

Epic: Approval System

| Task | Description | Priority |
| --- | --- | --- |
| ApprovalRecord model | Persistent queue items | Critical |
| Approval queue service | Enqueue, wait, respond | Critical |
| Hook integration | Pre-tool approval triggers | Critical |
| Review phase tool | Agent-initiated review requests | High |
| Notification system | WebSocket + browser + email | High |
| Approval queue UI | Tabbed interface with actions | High |
| Stale request handling | Escalation + auto-pause | Medium |

Phase 4: Web UI

Epic: Agent Management Interface

| Task | Description | Priority |
| --- | --- | --- |
| Agents page | Project-organized dashboard | Critical |
| Start Agent dialog | User-initiated agent spawning | Critical |
| Agent chat | Real-time conversation with agents | Critical |
| Activity feed | Cross-agent action stream | High |
| Resource metrics | Token usage, costs, timing | Medium |
| Agent controls | Pause, resume, stop, view logs | Medium |

Phase 5: Git Integration

Epic: Worktree & Merge Operations

| Task | Description | Priority |
| --- | --- | --- |
| Merge orchestration | Rebase/merge worktrees to main | Critical |
| Conflict detection | Pre-check with git merge-tree | High |
| Conflict resolution UI | Human-assisted conflict handling | High |
| PR creation | Auto-create PRs from agent work | High |
| Test integration | Run tests before merge | High |
| Cross-project agents | Manage agents across projects | Medium |

10. Risk Mitigation

Technical Risks

| Risk | Mitigation |
| --- | --- |
| Agent runaway costs | Budget limits with auto-pause |
| Merge conflicts | Pre-check with git merge-tree |
| Dead agents | Heartbeat + auto-reclaim |
| Context explosion | Hierarchical compression, subagent isolation |
| Security | Sandbox mode, file path restrictions |
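
The budget mitigation can be as small as a running total that signals when to pause. A minimal sketch, assuming a hypothetical `BudgetGuard` that the orchestrator would consult after each model call:

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    limit_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> bool:
        """Add a cost; return True when the agent should be auto-paused."""
        self.spent_usd += cost_usd
        return self.spent_usd >= self.limit_usd


guard = BudgetGuard(limit_usd=10.00)
print(guard.record(2.45))
print(guard.record(8.00))
```

The guard only reports; pausing the agent stays with whoever owns the agent lifecycle.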

Operational Risks

| Risk | Mitigation |
| --- | --- |
| User overwhelm | Graduated autonomy levels |
| Trust issues | Transparent logs, approval gates |
| Debugging complexity | Event replay, detailed traces |
| Cost unpredictability | Real-time cost dashboard |

11. Success Metrics

| Metric | Target | Measurement |
| --- | --- | --- |
| Task completion rate | >85% | Tasks completed without human intervention |
| Time to first value | <5 min | From feature description to first code |
| Conflict rate | <10% | Merges requiring manual resolution |
| User satisfaction | >4/5 | Post-session ratings |
| Cost efficiency | <$1/task | Average cost per completed task |

This document represents the vision for Sibyl Agent Harness. Implementation details may evolve as we learn from early prototypes.

Released under the MIT License.