Agent Catalog Record Entries

To date, Agent Catalog supports six types of records: four types of tools and two types of prompts.

Tool Catalog Records

Tools are explicit actions that an agent can take to accomplish a task. Agent Catalog currently supports four types of tools: Python function tools, SQL++ query tools, semantic search tools, and HTTP request tools.

Python Function Tools

The most generic tool is the Python function tool, which is associated with a function in a .py file. To mark a function as a tool for Agent Catalog, you must use the @tool decorator.

#
# The following file is a template for a Python tool.
#
from agentc import tool
from pydantic import BaseModel


# Although Python uses duck-typing, the specification of models greatly improves the response quality of LLMs.
# It is highly recommended that all tools specify the models of their bound functions using Pydantic or dataclasses.
class SalesModel(BaseModel):
    input_sources: list[str]
    sales_formula: str


# Only functions decorated with "tool" will be indexed.
# All other functions / module members will be ignored by the indexer.
@tool
def compute_sales_for_this_week(sales_model: SalesModel) -> float:
    """A description for the function bound to the tool. This is mandatory for tools."""

    # The implementation of the tool (given below) is *not* indexed.
    # The indexer only cares about the name, function signature, and description.
    return 1.0 * 0.99 + 2.00 % 6.0


# You can also specify the name of the tool explicitly, as well as any annotations you wish to attach.
# In this example, the description is still taken from the docstring.
@tool(name="compute_sales_for_the_month", annotations={"type": "sales"})
def compute_sales_for_the_month(sales_model: SalesModel) -> float:
    """A description for the function bound to the tool. This is mandatory for tools."""

    return 1.0 * 0.99 + 2.00 % 6.0
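
The comments in the template note that only a tool's name, function signature, and description are indexed. As a rough illustration of what that metadata looks like, the sketch below extracts those pieces from a plain function using only the standard library. This is not Agent Catalog's indexer: `describe_function` is a hypothetical helper, the docstring is invented for the example, and the model is a dataclass rather than a Pydantic model so that the snippet has no third-party dependencies.

```python
import inspect
from dataclasses import dataclass, fields


@dataclass
class SalesModel:
    input_sources: list[str]
    sales_formula: str


def compute_sales_for_this_week(sales_model: SalesModel) -> float:
    """Compute the total sales for the current week."""
    return 0.0  # Placeholder body; the implementation is irrelevant here.


def describe_function(func) -> dict:
    """Hypothetical helper: collect the metadata an indexer could read."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {
            name: parameter.annotation.__name__
            for name, parameter in signature.parameters.items()
        },
        "returns": signature.return_annotation.__name__,
    }


metadata = describe_function(compute_sales_for_this_week)
model_fields = [field.name for field in fields(SalesModel)]
print(metadata)
print(model_fields)
```

Nothing in the function body is inspected, which mirrors the point made in the template: a clear name, typed signature, and docstring carry all of the information a catalog search can use.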

SQL++ Query Tools

SQL++ is the query language used by Couchbase to interact with the data stored in the cluster. To create a SQL++ query tool, you must author a .sqlpp file with a header that details various metadata. If you are importing an existing SQL++ query, simply prepend the header to the query.

--
-- The following file is a template for a (Couchbase) SQL++ query tool.
--

-- All SQL++ query tools are specified using a valid SQL++ (.sqlpp) file.
-- The tool metadata must be specified with YAML inside a multi-line C-style comment.
/*
# The name of the tool must be a valid Python identifier (e.g., no spaces).
# This field is mandatory, and will be used as the name of a Python function.
name: find_high_order_item_customers_between_date

# A description for the function bound to this tool.
# This field is mandatory, and will be used in the docstring of a Python function.
description: >
    Given a date range, find the customers that have placed orders where the total number of items is more than 1000.

# The inputs used to resolve the named parameters in the SQL++ query below.
# Inputs are described using a JSON object that follows the JSON schema standard.
# This field is mandatory, and will be used to build a Pydantic model.
# See https://json-schema.org/learn/getting-started-step-by-step for more info.
input: >
    {
      "type": "object",
      "properties": {
        "orderdate_start": { "type": "string" },
        "orderdate_end": { "type": "string" }
      }
    }

# The outputs used to describe the structure of the SQL++ query result.
# Outputs are described using a JSON object that follows the JSON schema standard.
# This field is optional, and will be used to build a Pydantic model.
# We recommend using the 'INFER' command to build a JSON schema from your query results.
# See https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/infer.html.
# In the future, we will INFER the output schema automatically for you.
# output: >
#     {
#       "type": "array",
#       "items": {
#         "type": "object",
#         "properties": {
#           "cust_id": { "type": "string" },
#           "first_name": { "type": "string" },
#           "last_name": { "type": "string" },
#           "item_cnt": { "type": "integer" }
#         }
#       }
#     }

# As a supplement to the tool similarity search, users can optionally specify search annotations.
# The values of these annotations MUST be strings (e.g., not 'true', but '"true"').
# This field is optional, and does not have to be present.
annotations:
  gdpr_2016_compliant: "false"
  ccpa_2019_compliant: "true"

# The "secrets" field defines search keys that will be used to query a "secrets" manager.
# Note that these values are NOT the secrets themselves; rather, they are used to look up secrets.
secrets:

    # All Couchbase tools (e.g., semantic search, SQL++) must specify conn_string, username, and password.
    - couchbase:
        conn_string: CB_CONN_STRING
        username: CB_USERNAME
        password: CB_PASSWORD
*/

SELECT
  c.cust_id,
  c.name.first AS first_name,
  c.name.last  AS last_name,
  COUNT(*)     AS item_cnt
FROM
  customers AS c,
  orders    AS o,
  o.items   AS i
WHERE
  -- Parameters specified in the input field of the tool metadata above correspond to named parameters here.
  -- The '$' syntax is used to denote a named parameter.
  -- See https://docs.couchbase.com/server/current/n1ql/n1ql-rest-api/exnamed.html for more details.
  ( o.orderdate BETWEEN $orderdate_start AND $orderdate_end ) AND
  c.cust_id = o.cust_id
GROUP BY
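  -- Every non-aggregate expression in the SELECT clause must also appear as a group key.
  c.cust_id,
  c.name.first,
  c.name.last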
HAVING
  COUNT(*) > 1000;
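
Because the properties of the input schema must line up with the named parameters in the query, a quick consistency check can catch typos before indexing. The sketch below is one way to do that with the standard library; `find_named_parameters` is a hypothetical helper and not part of Agent Catalog. It assumes a simple regular expression is good enough, so a `$name` that appears inside a string literal or a comment would be counted as well.

```python
import json
import re

# The "input" field from the tool metadata (a JSON schema held as a string).
input_schema = json.loads("""
{
  "type": "object",
  "properties": {
    "orderdate_start": { "type": "string" },
    "orderdate_end": { "type": "string" }
  }
}
""")

# The relevant fragment of the SQL++ query.
query = "( o.orderdate BETWEEN $orderdate_start AND $orderdate_end ) AND c.cust_id = o.cust_id"


def find_named_parameters(sqlpp: str) -> set[str]:
    """Hypothetical helper: collect '$name' placeholders (skips '$1'-style positional ones)."""
    return set(re.findall(r"\$([A-Za-z_][A-Za-z0-9_]*)", sqlpp))


declared = set(input_schema["properties"])
used = find_named_parameters(query)

undeclared = used - declared  # Parameters in the query with no matching input.
unused = declared - used      # Inputs that the query never references.
print(sorted(used), sorted(undeclared), sorted(unused))
```

Both difference sets are empty for the template above, meaning every named parameter has a matching input property and vice versa.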

Semantic Search Tools

Semantic search tools are used to search for text that is semantically similar to some query text. To create a semantic search tool, you must author a .yaml file with the record_kind field populated with semantic_search.

#
# The following file is a template for a (Couchbase) semantic search tool.
#
record_kind: semantic_search

# The name of the tool must be a valid Python identifier (e.g., no spaces).
# This field is mandatory, and will be used as the name of a Python function.
name: search_for_relevant_products

# A description for the function bound to this tool.
# This field is mandatory, and will be used in the docstring of a Python function.
description: >
  Find product descriptions that are closely related to a collection of tags.

# The inputs used to build a comparable representation for a semantic search.
# Inputs are described using a JSON object that follows the JSON schema standard.
# This field is mandatory, and will be used to build a Pydantic model.
# See https://json-schema.org/learn/getting-started-step-by-step for more info.
input: >
  {
    "type": "object",
    "properties": {
      "search_tags": {
        "type": "array",
        "items": { "type": "string" }
      }
    }
  }

# As a supplement to the tool similarity search, users can optionally specify search annotations.
# The values of these annotations MUST be strings (e.g., not 'true', but '"true"').
# This field is optional, and does not have to be present.
annotations:
  gdpr_2016_compliant: "false"
  ccpa_2019_compliant: "true"

# The "secrets" field defines search keys that will be used to query a "secrets" manager.
# Note that these values are NOT the secrets themselves; rather, they are used to look up secrets.
secrets:

  # All Couchbase tools (e.g., semantic search, SQL++) must specify conn_string, username, and password.
  - couchbase:
      conn_string: CB_CONN_STRING
      username: CB_USERNAME
      password: CB_PASSWORD

# Couchbase semantic search tools always involve a vector search.
vector_search:

  # A bucket, scope, and collection must be specified.
  # Semantic search across multiple collections is currently not supported.
  bucket: my-bucket
  scope: my-scope
  collection: my-collection

  # All semantic search operations require that a (FTS) vector index is built.
  # In the future, we will relax this constraint.
  index: my-vector-index

  # The vector_field refers to the field the vector index (above) was built on.
  # In the future, we will relax the constraint that an index exists on this field.
  vector_field: vec

  # The text_field is the field name used in the tool output (i.e., the results).
  # In the future, we will support multi-field tool outputs for semantic search.
  text_field: text

  # The embedding model used to generate the vector_field.
  # This embedding model field value is directly passed to sentence transformers.
  # In the future, we will add support for other types of embedding models.
  embedding_model: sentence-transformers/all-MiniLM-L12-v2

  # The number of candidates (i.e., the K value) to request when performing a vector top-k search.
  # This field is optional, and defaults to k=3 if not specified.
  num_candidates: 3
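
To make the roles of vector_field, text_field, and num_candidates concrete, here is a minimal in-memory sketch of a top-k vector search. It is only an illustration under several assumptions: the vectors are hand-written toy values rather than output from an embedding model, cosine similarity is assumed as the metric, and the search runs in plain Python instead of against a Couchbase vector index. The `top_k` helper is hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], documents: list[dict], k: int = 3) -> list[str]:
    """Hypothetical helper: return the text of the k documents closest to query_vec."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, doc["vec"]),
        reverse=True,
    )
    return [doc["text"] for doc in ranked[:k]]


# "vec" plays the role of vector_field and "text" the role of text_field.
documents = [
    {"text": "red running shoes", "vec": [0.9, 0.1, 0.0]},
    {"text": "blue rain jacket", "vec": [0.1, 0.9, 0.0]},
    {"text": "trail running socks", "vec": [0.8, 0.2, 0.1]},
    {"text": "stainless steel kettle", "vec": [0.0, 0.1, 0.9]},
]

query_vec = [1.0, 0.0, 0.0]  # Stand-in for the embedded search tags.
results = top_k(query_vec, documents, k=2)
print(results)
```

In the real tool, the query vector would come from running the tool input through the configured embedding_model, and k would be taken from num_candidates.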

HTTP Request Tools

HTTP request tools are used to interact with external services via REST API calls. How to interface with these external services is described in a standard OpenAPI spec (see here for more details). To create an HTTP request tool, you must author a .yaml file with the record_kind field populated with http_request. One tool is generated per specified endpoint.

#
# The following file is a template for a set of HTTP request tools.
#
record_kind: http_request

# As a supplement to the tool similarity search, users can optionally specify search annotations.
# The values of these annotations MUST be strings (e.g., not 'true', but '"true"').
# This field is optional, and does not have to be present.
annotations:
  gdpr_2016_compliant: "false"
  ccpa_2019_compliant: "true"

# HTTP requests must be specified using an OpenAPI spec.
open_api:

  # The path relative to the tool-calling code.
  # The OpenAPI spec can either be in JSON or YAML.
  filename: path_to_openapi_spec.json

  # A URL denoting where to retrieve the OpenAPI spec.
  # The filename or the url must be specified (not both).
  # url: http://url_to_openapi_spec/openapi.json

  # The OpenAPI operations to index as tools are specified below.
  # This field is mandatory, and each operation is validated against the spec on index.
  operations:

    # All operations must specify a path and a method.
    # 1. The path corresponds to an OpenAPI path object.
    # 2. The method corresponds to GET/POST/PUT/PATCH/DELETE/HEAD/OPTIONS/TRACE.
    # See https://swagger.io/specification/#path-item-object for more information.
    - path: /users/create
      method: post
    - path: /users/delete/{user_id}
      method: delete

To learn more about generating your OpenAPI spec, check out the schema here. For an example OpenAPI spec used in the travel-sample agent, see here.
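
Since each listed operation is validated against the spec, it can help to run a similar check locally before indexing. The following sketch shows the idea on a toy spec held in a dictionary. It is not Agent Catalog's validator: `check_operations` is a hypothetical helper, and the spec and its operationId values are invented for the example.

```python
# A toy OpenAPI document; only the parts needed for the check are included.
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/users/create": {"post": {"operationId": "create_user"}},
        "/users/delete/{user_id}": {"delete": {"operationId": "delete_user"}},
    },
}

# The entries under "operations" in the tool file, plus one bad entry.
operations = [
    {"path": "/users/create", "method": "post"},
    {"path": "/users/delete/{user_id}", "method": "delete"},
    {"path": "/users/list", "method": "get"},  # Deliberately absent from the spec.
]


def check_operations(spec: dict, operations: list[dict]) -> list[str]:
    """Hypothetical helper: return one message per operation missing from the spec."""
    problems = []
    for operation in operations:
        path_item = spec.get("paths", {}).get(operation["path"])
        if path_item is None:
            problems.append(f"unknown path: {operation['path']}")
        elif operation["method"].lower() not in path_item:
            problems.append(
                f"{operation['method'].upper()} not defined for {operation['path']}"
            )
    return problems


problems = check_operations(spec, operations)
print(problems)
```

Each operation that passes the check corresponds to one generated tool, so the two valid entries above would yield two tools.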

Prompt Catalog Records

Prompts in Agent Catalog are more than just a hunk of text: they also contain metadata that helps developers build agents faster. These prompts can be part of a larger workflow (e.g., as a nested prompt within an agent framework) or serve as a standalone set of instructions. Agent Catalog currently supports two types of prompts: raw prompts and Jinja templated prompts.

Raw Prompts

Raw prompts are static, predefined text-based instructions used to guide your agent’s actions. Raw prompts are written directly as plain text without any dynamic elements.

---
#
# The following file is a template for a raw prompt.
#
# The content in between the '---' lines must be valid YAML.
record_kind: raw_prompt

# The name of the prompt must be a valid Python identifier (e.g., no spaces).
# This field is mandatory, and will be used when searching for prompts by name.
name: route_finding_prompt

# A description of the prompt's purpose (e.g., where this prompt will be used).
# This field is mandatory, and will be used (indirectly) when performing semantic search for prompts.
description: >
    Instructions on how to find routes between airports.

# As a supplement to the description similarity search, users can optionally specify search annotations.
# The values of these annotations MUST be strings (e.g., not 'true', but '"true"').
# This field is optional, and does not have to be present.
annotations:
  organization: "sequoia"

# A prompt is _generally_ (more often than not) associated with a small collection of tools.
# This field is used at provider time to search the catalog for tools.
# This field is optional, and does not have to be present.
tools:
  # Tools can be specified using the same parameters found in Provider.get_tools_for.
  # For instance, we can condition on the tool name...
  - name: "find_indirect_routes"

  # ...the tool name and some annotations...
  - name: "find_direct_routes"
    annotations: gdpr_2016_compliant = "true"

  # ...or even a semantic search via the tool description.
  - query: "finding flights by name"
    limit: 2

# The text below the '---' represents the prompt in its entirety.
---
Goal:
Your goal is to find a sequence of routes between the source and destination airport.

Examples:
...

Instructions:
Try to find direct routes first between the source airport and the destination airport.
If there are no direct routes, then find a one-layover route.
If there are no such routes, then try another source airport that is close.
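
A prompt file therefore has two parts: YAML metadata between the '---' lines, followed by the prompt text. The sketch below shows one way such a file could be split using only the standard library. `split_prompt_file` is a hypothetical helper and not how Agent Catalog necessarily parses these files; the returned metadata string could then be handed to a YAML parser.

```python
def split_prompt_file(text: str) -> tuple[str, str]:
    """Hypothetical helper: split a prompt file into (YAML metadata, prompt body)."""
    lines = text.splitlines()
    markers = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(markers) < 2 or markers[0] != 0:
        raise ValueError("expected the file to start with a '---' delimited metadata block")
    # Only the first two markers delimit the metadata; later ones belong to the body.
    start, end = markers[0], markers[1]
    metadata = "\n".join(lines[start + 1:end])
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


prompt_file = """---
record_kind: raw_prompt
name: route_finding_prompt
description: >
    Instructions on how to find routes between airports.
---
Goal:
Your goal is to find a sequence of routes between the source and destination airport.
"""

metadata, body = split_prompt_file(prompt_file)
print(metadata)
print(body)
```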

Jinja Templated Prompts

In contrast to raw prompts, Jinja templated prompts enable users to author dynamic prompts that can be rendered at runtime. If you are working without an agent framework, Jinja templated prompts might make more sense for your use case.

---
#
# The following file is a template for a Jinja2 prompt.
#
# The content in between the '---' lines must be valid YAML.
record_kind: jinja_prompt

# The name of the prompt must be a valid Python identifier (e.g., no spaces).
# This field is mandatory, and will be used when searching for prompts by name.
name: route_finding_prompt

# A description of the prompt's purpose (e.g., where this prompt will be used).
# This field is mandatory, and will be used (indirectly) when performing semantic search for prompts.
description: >
    Instructions on how to find routes between two specific airports.

# As a supplement to the description similarity search, users can optionally specify search annotations.
# The values of these annotations MUST be strings (e.g., not 'true', but '"true"').
# This field is optional, and does not have to be present.
annotations:
  organization: "sequoia"

# A prompt is _generally_ (more often than not) associated with a small collection of tools.
# This field is used at provider time to search the catalog for tools.
# This field is optional, and does not have to be present.
tools:
  # Tools can be specified using the same parameters found in Provider.get_tools_for.
  # For instance, we can condition on the tool name...
  - name: "find_indirect_routes"

  # ...the tool name and some annotations...
  - name: "find_direct_routes"
    annotations: gdpr_2016_compliant = "true"

  # ...or even a semantic search via the tool description.
  - query: "finding flights by name"
    limit: 2

# The text below the '---' represents the prompt in its entirety.
---
Goal:
Your goal is to find a sequence of routes between the source and destination airport.

Source Airport:
{{ source_airport }}

Destination Airport:
{{ destination_airport }}

Examples:
...

Instructions:
Try to find direct routes first between the source airport and the destination airport.
If there are no direct routes, then find a one-layover route.
If there are no such routes, then try another source airport that is close.
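
To illustrate what rendering at runtime means for a templated prompt, the sketch below fills in the two placeholders with the jinja2 package directly. It assumes jinja2 is installed and makes no claim about how Agent Catalog itself renders prompts; the airport codes are arbitrary example values.

```python
from jinja2 import Template

# The prompt text below the second '---' of the template file (shortened).
prompt_body = """Goal:
Your goal is to find a sequence of routes between the source and destination airport.

Source Airport:
{{ source_airport }}

Destination Airport:
{{ destination_airport }}
"""

# Each keyword argument fills the placeholder of the same name.
rendered = Template(prompt_body).render(
    source_airport="SFO",
    destination_airport="LHR",
)
print(rendered)
```

The same template can be rendered repeatedly with different values, which is what distinguishes it from a raw prompt whose text is fixed at authoring time.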