Capability System

Capabilities define what Iris can do. Each capability is a TOML file that specifies the task prompt and expected output format.

Location: src/agents/capabilities/*.toml

Design Philosophy

Separation of Concerns

  • TOML files define the task and output type
  • Agent handles execution and tool calling
  • Types define the structured response schema

This separation allows:

  • Non-programmers to modify task instructions
  • Easy experimentation with different prompts
  • Version control of prompt engineering
  • Compile-time embedding for portability

LLM-Driven Structure

Capabilities don't rigidly enforce structure — they guide the LLM. For example:

  • Commit messages: JSON with specific fields (emoji, title, message)
  • Reviews: Markdown with suggested sections, but Iris decides final structure
  • PRs: Markdown with flexibility for project-specific conventions

The LLM adapts to project needs while following general guidelines.

Capability Structure

A capability TOML has four fields:

toml
name = "capability_name"
description = "Short description of what this capability does"
output_type = "OutputTypeName"

task_prompt = """
Multi-line prompt that instructs Iris...
"""

Output Types

Each output type maps to a variant of the StructuredResponse enum in src/agents/iris.rs:

rust
pub enum StructuredResponse {
    CommitMessage(GeneratedMessage),       // JSON: { emoji, title, message, completion_message }
    PullRequest(MarkdownPullRequest),      // Markdown wrapper: { content: String }
    Changelog(MarkdownChangelog),          // Markdown wrapper: { content: String }
    ReleaseNotes(MarkdownReleaseNotes),    // Markdown wrapper: { content: String }
    Review(crate::types::Review),          // Structured: { summary, metadata, findings[], stats }
    SemanticBlame(String),                 // Plain text
    PlainText(String),                     // Fallback
}

Strict structured types (GeneratedMessage, Review) define schemas Iris must populate field-by-field. Markdown wrappers carry a single content: String field and let Iris choose the layout. The internal verify capability returns a private Critique struct (see Critic Verification); it never appears in StructuredResponse.

Built-in Capabilities

1. Commit (commit.toml)

Purpose: Generate commit messages from staged changes

Output: GeneratedMessage

rust
#[derive(Serialize, Deserialize, JsonSchema)]
pub struct GeneratedMessage {
    pub emoji: Option<String>,            // Single gitmoji or null
    pub title: String,                    // Subject line (max 72 chars)
    pub message: String,                  // Body (may be empty)
    #[serde(default)]
    pub completion_message: Option<String>, // Short UI status, e.g. "Auth refactor ready."
}

commit.toml instructs Iris to always populate completion_message, even though the field is optional in the schema, so the Studio TUI has a one-line status to show when generation finishes.
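How these fields might combine into a final commit message can be sketched as follows. This is a minimal illustration, not git-iris's actual formatter: the struct is a simplified copy without the serde derives, and `title_within_limit` and `render_commit` are hypothetical helper names. Only the field names and the 72-character subject limit come from the definition above.

```rust
/// Simplified stand-in for `GeneratedMessage` (serde/schemars derives omitted).
#[derive(Debug, Clone)]
pub struct GeneratedMessage {
    pub emoji: Option<String>,
    pub title: String,
    pub message: String,
    pub completion_message: Option<String>,
}

/// Subject-line limit quoted in the struct comments above.
pub const MAX_TITLE_CHARS: usize = 72;

/// Hypothetical check: counts characters rather than bytes,
/// so titles with multi-byte characters are measured fairly.
pub fn title_within_limit(msg: &GeneratedMessage) -> bool {
    msg.title.chars().count() <= MAX_TITLE_CHARS
}

/// Hypothetical renderer (layout is an assumption): optional emoji prefix,
/// the title, then a blank line and the body when the body is non-empty.
pub fn render_commit(msg: &GeneratedMessage) -> String {
    let subject = match msg.emoji.as_deref() {
        Some(emoji) if !emoji.is_empty() => format!("{} {}", emoji, msg.title),
        _ => msg.title.clone(),
    };
    if msg.message.trim().is_empty() {
        subject
    } else {
        format!("{}\n\n{}", subject, msg.message.trim_end())
    }
}
```

In conventional mode the emoji is null, so the same renderer would emit the title unchanged.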

Key instructions:

  • Start with git_diff() for change evidence
  • Use project_docs(doc_type="context") when repository conventions or product framing matter
  • Treat project_docs(doc_type="context") as a compact snapshot; use targeted doc types for full files
  • Adapt context strategy based on changeset size
  • Use parallel_analyze for very large changes

Style adaptation:

  • Gitmoji mode: Set emoji field from gitmoji list
  • Conventional mode: Set emoji to null, use conventional prefixes
  • Presets: Apply personality while maintaining structure

2. Review (review.toml)

Purpose: Analyze code changes and emit a structured, parseable review

Output: Review — a strict JSON shape, not a markdown wrapper. The capability's output_type = "Review" (see src/agents/capabilities/review.toml:3).

rust
pub struct Review {
    pub summary: String,
    pub metadata: ReviewMetadata,   // risk_level, strategy, specialist_passes, coverage_notes
    pub findings: Vec<Finding>,     // id, severity, confidence, file, line range, category, body, suggested_fix, evidence
    pub stats: ReviewStats,         // files_reviewed, findings_count, critical/high/medium/low counts
    pub parse_failed: bool,         // set when from_unstructured() rescues raw text
}

Finding.confidence is an integer 0–100. Findings below DEFAULT_MIN_FINDING_CONFIDENCE = 70 are hidden from visible_findings() and from inline GitHub comments. Category is a 12-variant enum: security, performance, error_handling, complexity, abstraction, duplication, testing, style, api_contract, concurrency, documentation, other. Severity and RiskLevel accept critical/high/medium/low.
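The confidence cut-off can be sketched as a simple filter. The `Finding` struct below is a reduced stand-in with only two fields, and `visible_findings` is written as a free function for brevity; the real one may well be a method with a different signature.

```rust
/// Threshold named in the text above.
pub const DEFAULT_MIN_FINDING_CONFIDENCE: u8 = 70;

/// Simplified stand-in for a review finding; the real struct carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub id: String,
    pub confidence: u8, // 0-100
}

/// Keep only findings at or above the threshold, preserving their order.
/// (Assumed shape; shown as a free function for illustration.)
pub fn visible_findings(findings: &[Finding], min_confidence: u8) -> Vec<&Finding> {
    findings
        .iter()
        .filter(|finding| finding.confidence >= min_confidence)
        .collect()
}
```

A finding with confidence 69 is dropped and one with exactly 70 is kept, which matches "below 70 are hidden".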

Key instructions (from review.toml):

  • Use git_diff(detail="summary") first; escalate to repo_map, file_read, static_analysis, git_show, or parallel_analyze based on size and risk.
  • Only report findings with confidence ≥ 70; do not duplicate issues a configured linter or type-checker already catches.
  • Cite an exact file:start_line (and end_line) on a changed line; supply suggested_fix when feasible and evidence references for non-trivial claims.
  • Set metadata.risk_level, name your strategy, list specialist_passes you ran (or delegated through parallel_analyze), and record coverage_notes.
  • Return a JSON object matching the schema — never markdown. If there are no actionable issues, return findings: [] and zero counts in stats.
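The "zero counts in stats" rule is easiest to see when the counters are derived from the findings themselves. The sketch below assumes per-severity field names (`critical`, `high`, `medium`, `low`) and a hypothetical `tally_stats` helper; neither is confirmed by the schema summary above.

```rust
/// Severity levels named in the text above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Critical,
    High,
    Medium,
    Low,
}

/// Simplified stand-in for `ReviewStats`; the per-severity field names are assumptions.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ReviewStats {
    pub files_reviewed: usize,
    pub findings_count: usize,
    pub critical: usize,
    pub high: usize,
    pub medium: usize,
    pub low: usize,
}

/// Hypothetical helper: derive the counters from the severities of the reported findings.
pub fn tally_stats(files_reviewed: usize, severities: &[Severity]) -> ReviewStats {
    let mut stats = ReviewStats {
        files_reviewed,
        findings_count: severities.len(),
        ..ReviewStats::default()
    };
    for severity in severities {
        match severity {
            Severity::Critical => stats.critical += 1,
            Severity::High => stats.high += 1,
            Severity::Medium => stats.medium += 1,
            Severity::Low => stats.low += 1,
        }
    }
    stats
}
```

With an empty findings list every counter except `files_reviewed` stays at zero, which is the shape the prompt asks for when there is nothing actionable.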

3. Pull Request (pr.toml)

Purpose: Generate PR descriptions from branch changes

Output: MarkdownPullRequest

Suggested sections:

  • Summary
  • Changes
  • Test Plan
  • Breaking Changes (if any)
  • Screenshots/Demos (if applicable)

Key instructions:

  • Use git_diff(from="<default-branch>", to="HEAD") for full branch context
  • Analyze entire feature branch, not just latest commit
  • Include migration/upgrade notes for breaking changes
  • Suggest testing approach

4. Changelog (changelog.toml)

Purpose: Generate changelog entries in Keep a Changelog format

Output: MarkdownChangelog

Structure:

markdown
## [Version] - YYYY-MM-DD

### Added

- New features

### Changed

- Enhancements to existing features

### Deprecated

- Features marked for removal

### Removed

- Deleted features

### Fixed

- Bug fixes

### Security

- Security patches

Key instructions:

  • Group changes by category
  • Be specific about what changed
  • Include migration notes if needed
  • Focus on user-facing impact

5. Release Notes (release_notes.toml)

Purpose: Generate user-facing release documentation

Output: MarkdownReleaseNotes

Suggested sections:

  • Highlights
  • Breaking Changes
  • New Features
  • Improvements
  • Bug Fixes
  • Performance
  • Upgrade Instructions

Key instructions:

  • Write for end users, not developers
  • Highlight impact and benefits
  • Include version numbers and dates
  • Provide upgrade path for breaking changes

6. Chat (chat.toml)

Purpose: Interactive conversation with Iris in Studio

Output: Varies (text or tool calls)

Special features:

  • Access to content update tools (update_commit, update_pr, update_review)
  • Can read and modify current Studio content
  • Freeform conversation for exploration

7. Semantic Blame (semantic_blame.toml)

Purpose: Explain the history and reasoning behind code

Output: SemanticBlame (plain text)

Key instructions:

  • Read git log for the file/region
  • Analyze commit messages and diffs
  • Explain why the code evolved this way
  • Connect changes to broader project goals

8. Verify (verify.toml) — Critic Verification

Purpose: Internal critic pass that checks a generated artifact against repository evidence.

Output: Critique — a private struct in iris.rs with fields requires_revision: bool, issues: Vec<CritiqueIssue> (title, body, severity), revision_prompt: String, confidence: u8. Critique is not part of StructuredResponse; it's consumed inside verify_response_if_enabled and used to decide whether to regenerate the artifact.

How it runs. After execute_output_type produces a StructuredResponse, execute_task passes the result to verify_response_if_enabled. The critic runs when it is enabled (Config.critic_enabled defaults to true) and the (capability, output_type) pair matches review, pr, changelog, or release_notes. In that case Iris loads verify.toml, runs execute_with_agent::<Critique> against the serialized artifact and the original user prompt, and then:

  • If requires_revision is false (or true but issues and revision_prompt are both empty), returns the original artifact.
  • Otherwise builds a revision prompt with the critic's issues and instruction appended and calls execute_output_type exactly once more.

commit / GeneratedMessage uses the same mechanism only when the invocation explicitly opts in with gen --critic.

What the critic flags. Unsupported claims, asserted risks without code verification, review findings citing the wrong file or line, PR/changelog/release note text that overstates scope, and missing caveats when an inference is presented as fact. It deliberately skips wording preferences and style choices that match repository conventions.

The critic is a safety net: any error inside the pass (capability load failure, schema mismatch, network error) is logged as a warning and the original artifact is returned unchanged. To opt out, set critic_enabled = false in the Git-Iris config.
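The decision flow described in this section can be sketched with closures standing in for the two LLM calls. Everything here is illustrative: `Critique` is reduced to plain strings, `needs_regeneration` and `verify_once` are hypothetical names, and falling back to the original when the regeneration itself fails is an assumption, since the text only specifies the fallback for errors inside the critic pass.

```rust
/// Simplified stand-in for the private `Critique` struct (issues reduced to titles).
#[derive(Debug, Clone, Default)]
pub struct Critique {
    pub requires_revision: bool,
    pub issues: Vec<String>,
    pub revision_prompt: String,
    pub confidence: u8,
}

/// True only when the critic asks for a revision and supplies something to act on.
pub fn needs_regeneration(critique: &Critique) -> bool {
    critique.requires_revision
        && !(critique.issues.is_empty() && critique.revision_prompt.is_empty())
}

/// Hypothetical driver: `critic` and `regenerate` stand in for the LLM calls.
pub fn verify_once<C, R>(original: String, critic: C, regenerate: R) -> String
where
    C: FnOnce(&str) -> Result<Critique, String>,
    R: FnOnce(&Critique) -> Result<String, String>,
{
    let critique = match critic(&original) {
        Ok(critique) => critique,
        Err(err) => {
            // Safety net: log a warning and keep the original artifact.
            eprintln!("warning: critic pass failed: {}", err);
            return original;
        }
    };
    if !needs_regeneration(&critique) {
        return original;
    }
    // Exactly one regeneration attempt; keeping the original on failure is an assumption.
    regenerate(&critique).unwrap_or(original)
}
```

Because both closures are `FnOnce`, the type system itself rules out a second regeneration pass in this sketch.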

Creating a Custom Capability

Step 1: Create the TOML File

Create src/agents/capabilities/my_capability.toml:

toml
name = "my_capability"
description = "What my capability does"
output_type = "MyOutputType"

task_prompt = """
Instructions for Iris on how to complete this task.

## Tools Available
- `git_diff()` - Get changes
- `file_read()` - Read files
- `code_search()` - Search for patterns

## Output Requirements
Describe the expected structure...
"""

Step 2: Define the Output Type

In src/types/my_output.rs:

rust
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct MyOutputType {
    pub summary: String,
    pub details: Vec<String>,
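    // `Metadata` is assumed to be another type in your crate that also derives
    // Serialize, Deserialize, and JsonSchema.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub metadata: Option<Metadata>,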
}

Step 3: Add to StructuredResponse Enum

In src/agents/iris.rs:

rust
pub enum StructuredResponse {
    // ... existing variants
    MyOutput(MyOutputType),
}

impl fmt::Display for StructuredResponse {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // ... existing matches
            StructuredResponse::MyOutput(output) => {
                write!(f, "{}", output.summary)
            }
        }
    }
}

Step 4: Embed the Capability

In src/agents/iris.rs, add the constant:

rust
const CAPABILITY_MY_CAPABILITY: &str = include_str!("capabilities/my_capability.toml");

And update the loader:

rust
fn load_capability_config(&self, capability: &str) -> Result<(String, String)> {
    let content = match capability {
        // ... existing capabilities
        "my_capability" => CAPABILITY_MY_CAPABILITY,
        _ => { /* fallback */ }
    };
    // ...
}

Step 5: Handle Execution

In execute_output_type() (src/agents/iris.rs:1068-1129) — the inner dispatch function called by execute_task — add a match arm:

rust
match output_type {
    // ... existing types
    "MyOutputType" => {
        let response = self
            .execute_with_agent::<MyOutputType>(system_prompt, user_prompt)
            .await?;
        Ok(StructuredResponse::MyOutput(response))
    }
    // ...
}

execute_task itself just loads the capability, injects style instructions, calls execute_output_type, then runs verify_response_if_enabled for the critic pass — you don't need to touch it for a new output type unless you want the critic to gate your new artifact (add the (capability, output_type) pair to should_run_critic if you do).
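For orientation, a gate like should_run_critic might look roughly like the sketch below. The signature, the exact output-type strings for the markdown wrappers, and the choice to let the global switch override an explicit opt-in are all assumptions; only the list of gated capabilities and the commit opt-in rule come from the Critic Verification section.

```rust
/// Hypothetical gate mirroring the rules described in Critic Verification.
pub fn should_run_critic(
    critic_enabled: bool,
    explicit_opt_in: bool, // e.g. the `--critic` flag on `gen`
    capability: &str,
    output_type: &str,
) -> bool {
    // Assumption: the global switch wins over any per-invocation opt-in.
    if !critic_enabled {
        return false;
    }
    match (capability, output_type) {
        ("review", "Review")
        | ("pr", "MarkdownPullRequest")
        | ("changelog", "MarkdownChangelog")
        | ("release_notes", "MarkdownReleaseNotes") => true,
        // Commit messages are only checked when the caller asks for it.
        ("commit", "GeneratedMessage") => explicit_opt_in,
        // A custom capability would be added as one more arm:
        ("my_capability", "MyOutputType") => true,
        _ => false,
    }
}
```

Matching on the pair rather than on the capability name alone keeps a capability from being gated when it is run with an unexpected output type.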

Step 6: Test

bash
cargo build
cargo run -- my-capability

Prompt Engineering Best Practices

1. Context Gathering

Instruct Iris to gather the highest-signal evidence first, then pull repo docs when they materially change the answer:

toml
task_prompt = """
## Context Gathering
`project_docs(doc_type="context")` returns a compact snapshot of README and agent instructions.
Start with `git_diff()` for code evidence, then call `project_docs` when conventions, terminology, or workflow rules matter.
"""

2. Tool Guidance

List available tools with clear purposes:

toml
## Tools Available
- `git_diff()` - Get staged changes with relevance scores
- `git_log(count=5)` - Recent commits for style reference
- `file_read(path, start_line, num_lines)` - Read file contents

3. Size-Based Strategy

Guide Iris on how to handle different changeset sizes:

toml
## Context Strategy by Size
- **Small** (≤3 files): Consider all changes
- **Large** (>10 files): Focus on high-relevance files
- **Huge** (>20 files): Use `parallel_analyze`
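These thresholds are guidance that the LLM interprets, not code, but writing them as a function makes the boundaries explicit. In this sketch `ChangesetSize` and `classify` are hypothetical names, and the `Medium` tier for 4 to 10 files is an assumption, since the prompt above does not name that range.

```rust
/// Hypothetical size tiers; `Medium` is not named in the prompt and is assumed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangesetSize {
    Small,
    Medium,
    Large,
    Huge,
}

/// Map a changed-file count onto the tiers from the prompt above.
pub fn classify(files_changed: usize) -> ChangesetSize {
    match files_changed {
        0..=3 => ChangesetSize::Small,   // consider all changes
        4..=10 => ChangesetSize::Medium, // unnamed range (assumption)
        11..=20 => ChangesetSize::Large, // focus on high-relevance files
        _ => ChangesetSize::Huge,        // hand off to `parallel_analyze`
    }
}
```

Note that a changeset of more than 20 files satisfies both the Large and Huge rules as written; the sketch resolves the overlap by letting the more specific tier win.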

4. Output Requirements

Be explicit about format:

toml
## Output Requirements
- **Subject line**: Imperative mood, max 72 chars
- **Body**: Wrap at 72 chars, explain WHY not what
- **Plain text only**: No markdown, no code fences

5. Avoid Uncertainty

Instruct Iris to be definitive:

toml
## Writing Guidelines
- **NEVER use speculative language**: Avoid "likely", "probably", "seems"
- If unsure, use tools to investigate
- State facts definitively

6. Style Flexibility

Allow preset injection:

toml
## Style Adaptation
If STYLE INSTRUCTIONS are provided, prioritize that style.
A cosmic preset means cosmic language. Express the style!

This enables users to inject personality via presets.
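On the Rust side, injection could be as simple as appending a labelled section to the task prompt. The helper name `with_style` and the exact section layout are assumptions for illustration; the only detail taken from the prompt above is the STYLE INSTRUCTIONS label.

```rust
/// Hypothetical helper: append preset text under a STYLE INSTRUCTIONS heading.
/// Returns the prompt unchanged when no preset (or only whitespace) is supplied.
pub fn with_style(task_prompt: &str, style_instructions: Option<&str>) -> String {
    match style_instructions.map(str::trim).filter(|style| !style.is_empty()) {
        Some(style) => format!(
            "{}\n\n## STYLE INSTRUCTIONS\n{}\n",
            task_prompt.trim_end(),
            style
        ),
        None => task_prompt.to_string(),
    }
}
```

Leaving the prompt untouched when no preset is set keeps the capability's default voice intact.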

Advanced Patterns

Conditional Tool Calls

Instruct Iris to adapt:

toml
If the changeset is large (>20 files or >1000 lines):
  - Use `parallel_analyze` to distribute analysis
  - Example: parallel_analyze({ "tasks": ["Analyze auth/", "Review API/"] })
Otherwise:
  - Use `git_diff()` and `file_read()` directly

Multi-Stage Analysis

Guide a workflow:

toml
1. Call `git_diff()` to see what changed
2. Identify the primary affected subsystem
3. Call `code_search()` to find related patterns
4. Call `file_read()` for detailed context
5. Synthesize findings into a coherent summary

Project-Specific Adaptation

Use project docs:

toml
When `project_docs(doc_type="context")` is relevant:
- Follow any commit conventions from AGENTS.md
- Use terminology from README
- Respect project style guide

Validation and Recovery

All JSON outputs go through schema validation:

  1. Schema generation: schemars::schema_for! creates a JSON schema from the Rust type
  2. Prompt injection: the schema is added to the prompt as a constraint
  3. Response parsing: extract_json_from_response() finds the JSON in the response
  4. Sanitization: sanitize_json_response() fixes control characters
  5. Validation: validate_and_parse() attempts recovery if parsing fails

See Output Validation for details.
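The parsing and sanitization steps can be illustrated with a std-only sketch. These are deliberately naive stand-ins with hypothetical names (`extract_json_candidate`, `escape_control_chars`), not the real extract_json_from_response() or sanitize_json_response(), whose behavior may differ.

```rust
/// Naive extraction: take everything from the first '{' to the last '}'.
/// Both delimiters are single-byte ASCII, so the slice bounds are valid.
pub fn extract_json_candidate(response: &str) -> Option<&str> {
    let start = response.find('{')?;
    let end = response.rfind('}')?;
    if end < start {
        return None;
    }
    Some(&response[start..=end])
}

/// Escape raw control characters that appear inside JSON string literals,
/// since unescaped tabs or newlines there make the document invalid.
pub fn escape_control_chars(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    let mut in_string = false;
    let mut escaped = false;
    for ch in json.chars() {
        if !in_string {
            if ch == '"' {
                in_string = true;
            }
            out.push(ch);
            continue;
        }
        if escaped {
            // Character following a backslash: copy it through untouched.
            out.push(ch);
            escaped = false;
            continue;
        }
        match ch {
            '\\' => {
                escaped = true;
                out.push(ch);
            }
            '"' => {
                in_string = false;
                out.push(ch);
            }
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if c.is_control() => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```

Whitespace between tokens is left alone because JSON permits it there; only characters inside string literals are rewritten.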

Debugging Capabilities

Run with --debug to see:

  • Which capability is loaded
  • The full prompt sent to the LLM
  • Tool calls made by Iris
  • JSON extraction and validation steps
  • Token usage statistics

bash
git-iris gen --debug

Color-coded output shows:

  • 🔵 Blue — Phase transitions
  • 🟢 Green — Successful operations
  • 🟡 Yellow — Warnings
  • 🔴 Red — Errors

Best Practices Summary

DO:

  • Start with git_diff() or the primary change evidence
  • Use project_docs(doc_type="context") as a compact conventions snapshot
  • Provide clear tool descriptions
  • Guide size-based strategies
  • Allow style flexibility
  • Be explicit about output format

DON'T:

  • Hardcode project-specific details
  • Over-constrain markdown structure
  • Assume file locations
  • Use speculative language
  • Ignore relevance scores

Released under the Apache 2.0 License.