Agent Catalog Concepts

Agent Catalog targets three (non-mutually-exclusive) types of users:

Agent Builders

Those responsible for creating prompts and agents.

Tool Builders

Those responsible for creating tools.

Agent Analysts

Those responsible for analyzing agent performance.

In this short guide, we detail the workflow each type of user follows when using Agent Catalog. We assume that you have already installed the agentc package. If you have not, please refer to the Installation page.

Metrics Driven Development

The Agent Catalog package is not just a tool/prompt catalog; it is a foundation for building agents using metrics-driven development. Agent builders will follow this workflow:

  1. Sample Downloading: Download the starter agent from the templates/with_langgraph directory.

  2. Project Initializing: Initialize your project as a Git repository and invoke the Agent Catalog initialization command. Don't forget to set up the appropriate environment variables!

    $ # Initialize a new Git repository with an initial commit.
    $ git init
    $ git add *; git add .gitignore .env.example .pre-commit-config.yaml
    $ git commit -m "Initial commit"
    
    $ # Initialize your local and remote Agent Catalog instances.
    $ cp .env.example .env; vi .env
    $ agentc init
    
  3. Agent Building: The sample agent system is meant to be a reference for building your own agents. You will need to modify the agent to fit your use case.

    • Agent Catalog integrates with agent applications in two main areas: i) by providing tools and prompts to the agent framework via agentc.Catalog instances, and ii) by providing analytics capabilities to the agent via agentc.Span instances. The sample agent system demonstrates how to use both of these classes; a minimal sketch follows this list.

    • Agent Catalog providers will always return plain ol' Python functions. SQL++ tools, semantic search tools, and HTTP request tools undergo some code generation (in the traditional sense, not using LLMs) to yield Python functions that will easily slot into any agent framework. Python tools indexed by agentc index will be returned as-is.

      Note

      Users must ensure that these tools already exist in the agent application's Git repository, or that the Python source code tied to the tool can be easily imported using Python's import statement.
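
    To make this integration concrete, below is a minimal sketch of how the two classes fit together. The constructor arguments, the span name, and the find parameters are illustrative assumptions rather than the definitive API; consult the sample agent system for exact usage.

    import agentc

    # Assumption: Catalog() reads your Couchbase connection details from the .env file.
    catalog = agentc.Catalog()

    # Assumption: spans are created from the catalog and named per application run.
    span = catalog.Span(name="my_agent_run")

    # Assumption: find() performs a (semantic) search over your indexed tools.
    # The results are plain Python functions that slot into your agent framework.
    tools = catalog.find("tool", query="analyze the sentiment of some text")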

  4. Prompt Building: Follow the steps outlined in the Couchbase-Backed Agent Catalogs section to create prompts.

    • In a multi-team setting, you can also use agentc find prompt to see if other team members have already created prompts that address your use case.

    • To accelerate prompt building, you can specify your tool requirements in the prompt. This allows Agent Catalog to automatically fetch the tools you need when the prompt is executed, as sketched below.
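
    For example, a prompt file with tool requirements might look like the sketch below. The field names here are illustrative assumptions; use the agentc add command to obtain the canonical template.

    # Assumption: field names are illustrative, not the canonical schema.
    record_kind: prompt
    name: sentiment_summary_prompt
    description: Summarize the overall sentiment of a conversation.
    tools:
      - name: positive_sentiment_analysis_tool
    content: >
      Given the conversation so far, analyze the user's sentiment
      and summarize whether they are satisfied.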

  5. Agent Execution: Run your agent system! Depending on how your agentc.Span instances are configured, you should see logs in the ./agent-activity directory and/or in the agent_activity.logs collection of your Couchbase instance.

  6. Agent Tuning: Make changes to your agent system and "register" them with Git + the agentc command line tool. This sample app illustrates how to set up agentc index + agentc publish as post-commit hooks. Try this out yourself by running the commands below:

    $ # This command only needs to be run once.
    $ pre-commit install --hook-type post-commit --hook-type pre-commit
    
    $ # Commit your changes. `agentc index` + `agentc publish` will run after `git commit`.
    $ git add [CHANGED_FILES]
    $ git commit -m "My changes"
    

    For changes that are small and don't warrant a new commit, these hooks will also apply to git commit --amend.

    $ git add [CHANGED_FILES]
    $ git commit --amend
    

    All logs your agent system generates are bound to the Git SHA generated by git commit, so you can easily see the changes you've made using git diff [GIT_SHA_IN_LOGS].

Couchbase-Backed Agent Catalogs

The catalog (currently) versions two types of items: tools and prompts. Both tool builders and prompt builders (i.e., agent builders) will follow this workflow:

  1. Repository Cloning: Grab the Git repository + Couchbase bucket that your team is working on and run git clone + agentc init --no-db. If you have already run the steps in the section above (i.e., you are a one-developer team), skip this step.

    $ git clone [MY_TEAMS_APP_REPOSITORY]
    $ agentc init --no-db
    
    $ # Install your post-commit hooks to automatically run "index" + "publish".
    $ pre-commit install --hook-type post-commit --hook-type pre-commit
    
  2. Tool Creation: For users with existing Python tools, simply decorate your functions with the agentc.catalog.tool decorator.

    import agentc
    
    @agentc.catalog.tool
    def positive_sentiment_analysis_tool(text_to_analyze: str) -> float:
        """ Using the given text, return a number between 0 and 1.
            A value of 0 means the text is not positive.
            A value of 1 means the text is positive.
        A value of 0.5 means the text is slightly positive. """
        ...
    

    For users who want to leverage our suite of declarative tools (i.e., semantic search, OpenAPI spec, and SQL++ tools), use the agentc add command (see here) to automatically download the template of your choice.
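
    As a rough illustration, a declarative tool (here, a semantic search tool) is described entirely in YAML. The sketch below uses assumed field names for illustration only; the downloaded template is the authoritative reference.

    # Assumption: field names are illustrative, not the canonical schema.
    record_kind: semantic_search
    name: search_product_reviews
    description: Find product reviews relevant to the user's question.
    # Assumption: the tool is backed by a Couchbase collection with a vector index.
    collection: product_reviews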

  3. Prompt Creation: Prompts in Agent Catalog must be authored in YAML. Similar to our suite of declarative tools, use the agentc add command to automatically download the template of your choice.

  4. Indexing: Agent Catalog will be unaware of any changes you make until you run agentc index, which will crawl a set of directories for tools and prompts. For workflows that have agentc index installed as a post-commit hook, you should not have to run this command manually; nonetheless, we show the agentc index command below for some "behind-the-scenes" clarity.

    $ agentc index [DIRECTORY] --prompts/--no-prompts --tools/--no-tools
    

    [DIRECTORY] refers to the directory containing your tools/prompts.

    Note

    By default, files and directories ignored by Git via .gitignore will also be ignored by agentc index. To accommodate situations where a file should be ignored by agentc index but not by Git, developers can specify an .agentcignore file (similar to a .gitignore file). Because Agent Catalog will "run" all Python files found during agentc index to discover Python tools, an .agentcignore file is necessary to keep executable scripts from being run during indexing.
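
    For instance, an .agentcignore that keeps standalone executable scripts out of indexing might look like the following (the paths here are illustrative):

    # Exclude executable scripts and notebooks from `agentc index`.
    scripts/
    notebooks/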

  5. Publishing: Indexing will populate your local catalog with tools and prompts versioned by Git. To make your local catalog available as a snapshot that can be JOINed with the logs your agent application generates, use the agentc publish command. For workflows that have agentc publish installed as a post-commit hook, you should not have to run this command manually; nonetheless, we show the agentc publish command below for some "behind-the-scenes" clarity.

    $ # Don't forget to modify your ".env" file appropriately!
    $ agentc publish
    

    agentc publish does not accept local catalogs indexed from a dirty Git repository; therefore, make sure that git status reveals no uncommitted changes before running agentc index [DIRECTORY] + agentc publish.
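
    In practice, a manual index-and-publish cycle might therefore look like:

    $ git status   # Verify a clean working tree first.
    $ agentc index [DIRECTORY]
    $ agentc publish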

  6. Prompt/Tool Tuning: Changes to your prompts (and less often, your tools) should be registered using Git + the agentc command line tool. If you install agentc index and agentc publish as post-commit hooks, you will run the following standard Git commands:

    $ # Commit your changes. `agentc index` + `agentc publish` will run after `git commit`.
    $ git add [CHANGED_FILES]
    $ git commit -m "My changes"
    

    For changes that are small and don't warrant a new commit, these hooks will also apply to git commit --amend.

    $ git add [CHANGED_FILES]
    $ git commit --amend
    

    Again, all logs your agent system generates are bound to the Git SHA generated by git commit, so you can easily see the changes you've made using git diff [GIT_SHA_IN_LOGS].

Assessing Agent Quality

The Agent Catalog package also provides a foundation for analyzing agent system performance over a series of Git-backed changes. Agent analysts will follow this workflow:

  1. Log Access: Your first step is to get access to the logs captured by agentc.Span. For logs sent to Couchbase, you can find them in the agent_activity.logs collection of your Couchbase instance. For logs stored locally, you can find them in the ./agent-activity directory. We recommend the former, as it allows for easy ad-hoc analysis through Couchbase Query and/or Couchbase Analytics.

  2. Log Transformations: Next, you'll want to explore your logs. We provide a set of non-materialized views (expressed as both Analytics Service Views and Query Service UDFs) to help you get started. All views belong to the agent_activity scope and can be queried using SQL++ as shown below:

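    -- Analytics Service view (queried like a collection):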
    SELECT logs_view.* FROM `[MY_BUCKET]`.agent_activity.`[VIEW_NAME]` AS logs_view;
    
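    -- Query Service UDF (note the parentheses):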
    SELECT logs_view.* FROM `[MY_BUCKET]`.agent_activity.`[VIEW_NAME]`() AS logs_view;
    

    where [MY_BUCKET] is your Agent Catalog bucket and [VIEW_NAME] is one of the views given here. Using the couchbase package, you can author the following to access these logs directly (the first snippet uses the Analytics Service, the second the Query Service):

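    # Option 1: query the Analytics Service view with analytics_query().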
    import couchbase.auth
    import couchbase.cluster
    import couchbase.options
    
    auth = couchbase.auth.PasswordAuthenticator(
        username="Administrator",
        password="password"
    )
    cluster = couchbase.cluster.Cluster(
        "couchbase://127.0.0.1",
        options=couchbase.options.ClusterOptions(auth)
    )
    
    bucket_name = "[MY_BUCKET]"
    view_name = "[VIEW_NAME]"
    query = cluster.analytics_query(f"""
        FROM
            `{bucket_name}`.agent_activity.{view_name} l
        SELECT
            l.*;
    """)
    for result in query:
        print(result)
    
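    # Option 2: call the Query Service UDF (note the parentheses) with query().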
    import couchbase.auth
    import couchbase.cluster
    import couchbase.options
    
    auth = couchbase.auth.PasswordAuthenticator(
        username="Administrator",
        password="password"
    )
    cluster = couchbase.cluster.Cluster(
        "couchbase://127.0.0.1",
        options=couchbase.options.ClusterOptions(auth)
    )
    
    bucket_name = "[MY_BUCKET]"
    view_name = "[VIEW_NAME]"
    query = cluster.query(f"""
        FROM
            `{bucket_name}`.agent_activity.{view_name}() l
        SELECT
            l.*;
    """)
    for result in query:
        print(result)
    
  3. Log Analysis: Once you have a grasp of how your application is working, you'll want to move into the realm of the "quantitative". This area should be tailored to your specific application, as there are no "one-evaluation-fits-all" solutions. To get you started, our LangGraph sample application here illustrates some evaluations for a route planner.