Agent Catalog User Guide

Agent Catalog targets three (non-mutually-exclusive) types of users:

Agent Builders

Those responsible for creating prompts and agents.

Tool Builders

Those responsible for creating tools.

Agent Analysts

Those responsible for analyzing agent performance.

In this short guide, we detail the workflow each type of user follows when using Agent Catalog. We assume that you have already installed the agentc package. If you have not, please refer to the Installation page.

Metrics Driven Development

The Agent Catalog package is not just a tool/prompt catalog; it is a foundation for building agents with metrics-driven development. Agent builders will follow this workflow:

  1. Sample Downloading: Download the starter agent from the templates/starter_agent directory.

  2. Agent Building: The sample agent is meant to be a reference for building your own agents. You will need to modify the agent to fit your use case.

    • Agent Catalog integrates with agent applications in two main areas: i) providing tools and prompts to the agent framework via agentc.Provider instances, and ii) providing auditing capabilities to the agent via agentc.Auditor instances. The sample agent demonstrates how to use both of these classes (see the sketch at the end of this workflow).

    • Agent Catalog providers will always return plain ol’ Python functions. SQL++ tools, semantic search tools, and HTTP request tools undergo some code generation (in the traditional sense, not using LLMs) to yield Python functions that will easily slot into any agent framework. Python tools indexed by agentc will be returned as-is.

      Note

      Users must ensure that these tools already exist in the agent application’s Git repository, or that the Python source code tied to the tool can be easily imported using Python’s import statement.

  3. Prompt Building: Follow the steps outlined in the Couchbase-Backed Agent Catalogs section to create prompts.

    • In a multi-team setting, you can also use agentc find prompt to see if other team members have already created prompts that address your use case.

    • To accelerate prompt building, you can specify your tool requirements in the prompt. This will allow Agent Catalog to automatically fetch the tools you need when the prompt is executed.

  4. Agent Execution: Run your agent! Depending on how your agentc.Auditor instances are configured, you should see logs in the ./agent-activity directory and/or in the agent_activity scope of your Couchbase instance.
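
To make the Provider/Auditor integration from step 2 concrete, here is a minimal sketch of how the two classes might be wired into an agent application. The class names agentc.Provider and agentc.Auditor come from this guide, but the constructor arguments and the get_tools_for call are illustrative assumptions; consult the starter agent in templates/starter_agent and the API reference for the exact signatures.

  import agentc

  # A Provider hands tools and prompts to your agent framework. Its arguments
  # are omitted here; this is an assumption-laden sketch, not the definitive
  # constructor signature.
  provider = agentc.Provider()

  # An Auditor records agent activity, either locally (./agent-activity) or in
  # the agent_activity scope of a Couchbase instance, depending on how it is
  # configured.
  auditor = agentc.Auditor()

  # Hypothetical retrieval call: tools come back as plain Python functions, so
  # they slot directly into your agent framework of choice.
  tools = provider.get_tools_for("find direct flights between two airports")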

Couchbase-Backed Agent Catalogs

The catalog (currently) versions two types of items: tools and prompts. Both tool builders and prompt builders (i.e., agent builders) will follow this workflow:

  1. Template Downloading: Use the agentc add command to automatically download the template of your choice.

  2. Tool/Prompt Creation: Fill out the template with the necessary information.

  3. Versioning: All tools and prompts must be versioned. Agent Catalog currently integrates with Git (using your repository's current Git SHA) to version each item. You must be in a Git repository to use Agent Catalog.

  4. Indexing: Use the command below to index your tools/prompts:

    agentc index [DIRECTORY] --prompts/--no-prompts --tools/--no-tools
    

    [DIRECTORY] refers to the directory containing your tools/prompts. This command creates a local catalog; your items will appear in the newly created ./agent-catalog folder.

    Note

    When using the agentc index command for the first time, Agent Catalog will download an embedding model from HuggingFace (by default, the sentence-transformers/all-MiniLM-L12-v2 model) onto your machine (by default, in the .model-cache folder). Subsequent runs will use this downloaded model (and thus, be faster).

  5. Publishing: By default, the agentc index command allows you to index tools/prompts associated with a dirty Git repository.

    1. To publish your items to a Couchbase instance, you must first commit your changes (to Git) and run the agentc index command on a clean Git repository (git status should report a clean working tree, with no changes to tracked files).

      Tip

      If you’ve made minor changes to your repository and don’t want to use an entirely new commit ID before publishing, add your files to Git with git add $MY_FILES and amend your changes to the last commit with git commit --amend!

    2. Next, you must add your Couchbase connection string, username, and password to the environment. The most straightforward way to do this is by running the following commands:

      export AGENT_CATALOG_CONN_STRING=couchbase://localhost
      export AGENT_CATALOG_USERNAME=Administrator
      export AGENT_CATALOG_PASSWORD=password
      
    3. Use the command below to publish your items to your Couchbase instance.

      agentc publish [[tool|prompt]] --bucket [BUCKET_NAME]
      

      This will create a new scope named agent_catalog in the specified bucket, which will contain all of your items. (A short Python sketch for inspecting this scope appears at the end of this workflow.)

    4. Note that Agent Catalog isn’t meant for the “publish once and forget” case. You are encouraged to run the agentc publish command as often as you like to keep your items up-to-date.
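
As a sanity check after publishing, the sketch below (using the Couchbase Python SDK) lists the collections that agentc publish created under the agent_catalog scope. It reuses the environment variables exported in step 2; the bucket name is a placeholder for the bucket you published to.

  import os

  from couchbase.auth import PasswordAuthenticator
  from couchbase.cluster import Cluster
  from couchbase.options import ClusterOptions

  # Reuse the same credentials that agentc publish reads from the environment.
  cluster = Cluster(
      os.environ["AGENT_CATALOG_CONN_STRING"],
      ClusterOptions(
          PasswordAuthenticator(
              os.environ["AGENT_CATALOG_USERNAME"],
              os.environ["AGENT_CATALOG_PASSWORD"],
          )
      ),
  )

  # "BUCKET_NAME" is a placeholder -- substitute the bucket you published to.
  bucket = cluster.bucket("BUCKET_NAME")
  for scope in bucket.collections().get_all_scopes():
      if scope.name == "agent_catalog":
          print([collection.name for collection in scope.collections])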

Assessing Agent Quality

The Agent Catalog package also provides a foundation for analyzing agent performance. Agent analysts will follow this workflow:

  1. Log Access: Your first step is to access the logs captured by agentc.Auditor. For logs sent to Couchbase, you can find them in the agent_activity.raw_logs collection of your Couchbase instance. For logs stored locally, you can find them in the ./agent-activity directory. We recommend the former, as it allows for easy ad-hoc analysis through Couchbase Query and/or Couchbase Analytics.

  2. Log Transformations: For users with Couchbase Analytics enabled, we provide four views (expressed as Couchbase Analytics UDFs) to help you get started with conversation-based agents. All UDFs below belong to the agent_activity scope.

    Sessions (sid, start_t, vid, msgs)

    The Sessions view provides one record per session (alt. conversation). Each session record contains:

    1. the session ID sid,

    2. the session start time start_t,

    3. the catalog version vid, and

    4. a list of messages msgs.

    The msgs field details all events that occurred during the session (e.g., the user’s messages, the response to the user, the internal “thinking” performed by the agent, the agent’s transitions between tasks, etc…). The latest session can be found by applying the filter below (a Python example of running this query appears after these view descriptions):

    WHERE sid = [[MY_BUCKET]].agent_activity.LastSession()
    

    Exchanges (sid, question, answer, walk)

    The Exchanges view provides one record per exchange (i.e., the period between a user question and an assistant response) in a given session. Each exchange record contains:

    1. the session ID sid,

    2. the user’s question question,

    3. the agent’s answer answer, and

    4. the agent’s walk walk (e.g., the messages sent to the LLMs, the tools executed, etc…).

    This view is commonly used as input into frameworks like Ragas.

    ToolCalls (sid, vid, tool_calls)

    The ToolCalls view provides one record per session (alt. conversation). Each record contains:

    1. the session ID sid,

    2. the catalog version vid, and

    3. a list of tool calls tool_calls.

    The tool_calls field details all information around an LLM tool call (e.g., the tool name, the tool-call arguments, and the tool result).

    Walks (vid, msgs, sid)

    The Walks view provides one record per session (alt. conversation). This view is essentially the Sessions view, in which msgs contains only task transitions.
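
As a starting point for ad-hoc analysis, the sketch below runs one of these views from Python through the Couchbase Analytics service and applies the LastSession() filter shown above. The connection setup mirrors the environment variables used for publishing; the bucket name is a placeholder, and the exact FROM-clause form for invoking the view UDFs may differ, so treat the query itself as illustrative.

  import os

  from couchbase.auth import PasswordAuthenticator
  from couchbase.cluster import Cluster
  from couchbase.options import ClusterOptions

  cluster = Cluster(
      os.environ["AGENT_CATALOG_CONN_STRING"],
      ClusterOptions(
          PasswordAuthenticator(
              os.environ["AGENT_CATALOG_USERNAME"],
              os.environ["AGENT_CATALOG_PASSWORD"],
          )
      ),
  )

  # Fetch the messages of the most recent session. "BUCKET_NAME" is a
  # placeholder for the bucket that holds your agent_activity scope.
  result = cluster.analytics_query(
      """
      SELECT s.sid, s.start_t, s.vid, s.msgs
      FROM `BUCKET_NAME`.agent_activity.Sessions() AS s
      WHERE s.sid = `BUCKET_NAME`.agent_activity.LastSession()
      """
  )
  for row in result:
      print(row)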

The next two steps are under active development!

  1. Log Analysis: Once you have a grasp of how your agent is working, you’ll want to move from qualitative inspection to quantitative evaluation. A good starting point is Ragas, where you can use the Analytics service to serve “datasets” to the Ragas evaluate function [1] (a hedged sketch follows this list).

  2. Log Visualization: Users are free to define their own views from the steps above and visualize their results using dashboards like Tableau or Grafana [2].
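
To illustrate the Log Analysis step, here is one possible shape for serving the Exchanges view to Ragas. The question and answer fields come from the view description above; deriving a contexts column from the walk field, the choice of metric, and the exact evaluate signature are assumptions that may need adjusting for your Ragas version (Ragas will also need an LLM/embedding backend configured at runtime).

  from datasets import Dataset
  from ragas import evaluate
  from ragas.metrics import answer_relevancy

  # Suppose `exchanges` holds rows from the Exchanges view, fetched with
  # cluster.analytics_query(...) as in the earlier sketch. The row below is a
  # stand-in for illustration.
  exchanges = [
      {
          "question": "What flights leave SFO today?",
          "answer": "There are three direct flights this afternoon.",
          "walk": ["<llm message>", "<tool call>", "<tool result>"],
      },
  ]

  # Ragas expects columns such as "question", "answer", and "contexts"; here we
  # (by assumption) flatten each exchange's walk into its contexts.
  dataset = Dataset.from_dict(
      {
          "question": [e["question"] for e in exchanges],
          "answer": [e["answer"] for e in exchanges],
          "contexts": [[str(step) for step in e["walk"]] for e in exchanges],
      }
  )

  scores = evaluate(dataset, metrics=[answer_relevancy])
  print(scores)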