Connect Honeyhive to OpenClaw on Operator.io
Honeyhive is an AI observability and evaluation platform for analyzing LLM apps. It helps teams monitor, debug, and improve AI system reliability faster.
Automate Honeyhive with AI
On Operator, an OpenClaw agent pilots Honeyhive for you. It reads your message, plans the steps, and runs them in Honeyhive, using actions such as Add datapoints to dataset, Compare Experiment Runs, and Compare Runs Events.
Your agent reaches Honeyhive directly or through connectors like Composio MCP and Pipedream MCP, which handle sign-in and token refresh for you, so there is nothing to wire up and no API keys to paste.
What your agent can do with Honeyhive
Your agent can call any of these Honeyhive actions by name as part of a larger task. Ask for the outcome you want and it picks the right ones.
Add datapoints to dataset
Tool to add datapoints to a dataset. Use when you need to append multiple entries with specified input, ground truth, and history mappings.
Compare Experiment Runs
Tool to retrieve experiment comparison between two evaluation runs. Use when you need to analyze the differences in metrics, datapoints, and events between two runs.
Compare Runs Events
Tool to compare events between two experiment runs side-by-side. Use when analyzing differences in model behavior, performance metrics, or outputs between evaluation runs. Returns matched event pairs with their respective data from both...
Batch Create Datapoints
Tool to create multiple datapoints in a single batch operation. Use when you need to bulk-import events into a dataset or create many datapoints at once. Supports filtering by date range, event IDs, or custom criteria. Efficient for migr...
Create Batch Model Events
Tool to create multiple model events in a single request. Use when you need to log a batch of event interactions to HoneyHive.
Create Batch Tool Events
Tool to log a batch of external API calls as tool events. Use when you need to record multiple tool events in one request, after you have gathered all the event data.
Create Configuration
Creates a new configuration in HoneyHive for managing LLM or pipeline settings. Use this to define reusable configurations with specific models, prompts, and parameters that can be deployed across different environments (dev, staging, pr...
Create Datapoint
Tool to create a new datapoint with input-output pairs. Use when you need to add a single datapoint with inputs, ground truth, conversation history, and metadata.
Create Dataset
Tool to create a dataset. Use when you need to initialize a new dataset within a project.
Create Event
Tool to create a new event in HoneyHive to track execution of different parts of your application. Use when you need to log a model call, tool execution, or chain step. Events can be grouped into sessions and nested hierarchically using...
Create Metric
Tool to create a new metric in HoneyHive. Use when you need to define how to evaluate model outputs, whether through code (PYTHON), AI evaluation (LLM), human review (HUMAN), or combining multiple metrics (COMPOSITE). Important: LLM metr...
Create Model Event
Tool to create a new model event to log LLM call data. Use when you need to track a single model interaction including messages, responses, usage, and metadata.
Create Tool
Creates a new tool definition in a HoneyHive project. Use this to register functions or plugins that can be invoked and tracked within HoneyHive. Tools are defined with a JSON Schema for their parameters, allowing HoneyHive to validate i...
Delete Datapoint
Tool to delete a specific datapoint by its ID. Use when you need to remove a datapoint from HoneyHive after confirming its identifier.
Delete Dataset
Tool to delete a dataset by ID. Use when you need to remove a dataset after confirming its ID.
End Evaluation Run
Tool to update an evaluation run's status and metadata. Use to mark a run as completed after finishing evaluations, or update run properties like name, metadata, configuration, and associated event/datapoint IDs.
Get Configurations
Tool to retrieve a list of configurations. Use when you need to fetch all configurations for a specific project before making changes.
Get Datasets
Retrieve datasets from HoneyHive for a specified project. Use this tool when you need to list all datasets within a project, find datasets by type (evaluation or fine-tuning), or retrieve a specific dataset by its ID. Returns dataset de...
Get Events
Tool to query events with filters and projections from HoneyHive. Use this action when you need to retrieve events with lightweight filtering (limit 1000 results). For bulk exports or more complex queries, use the Retrieve Events action...
Get Events By Session ID
Tool to retrieve the complete tree of nested events for a specific session. Use when you need to analyze all events (model calls, tool calls, chains) that occurred within a session, including their hierarchical relationships, inputs, out...
Get Events Chart
Tool to retrieve charting and analytics data for events over time. Use when you need aggregated metrics (duration, cost, token usage) grouped by time buckets or fields. Supports percentile analysis (p50, p95, p99) for latency monitoring...
Get Metrics
Retrieves all metrics associated with a HoneyHive project. Returns a list of metrics including their configuration (name, type, description, thresholds, evaluator details) and metadata (creation/update timestamps, sampling settings). Use...
Get Projects
Tool to retrieve all projects in the HoneyHive account. Use when you need to list available projects, get project IDs for use in other API calls, or search for a specific project by name.
Get Evaluation Run Details
Tool to get details of an evaluation run by its UUID. Use when you need to check the status, configuration, results, or metadata of a specific evaluation run.
Get Run Metrics
Tool to get event metrics for an experiment run. Use when you need to retrieve metrics computed on events within a specific experiment run. Returns an array of event objects with their associated metrics, which can be filtered by date ra...
Get Evaluation Runs
Tool to retrieve a list of evaluation runs from HoneyHive. Use when you need to: list all evaluation runs for analysis; find runs by status, name, or dataset; get specific runs by their IDs; paginate through large sets of evaluation...
Get Runs Schema
Tool to retrieve the schema for experiment runs in HoneyHive. Use when you need to understand available fields, datasets, and mappings for experiment runs.
Get Session
Retrieve a complete session tree by session ID from HoneyHive. Use this tool to fetch the full session hierarchy including all nested events (model calls, tool calls, chains) with their inputs, outputs, durations, and metadata. Returns a...
List Tools
Tool to list all available HoneyHive tools. Use when you need to discover which functions or plugins are registered for use.
Retrieve Datapoint
Retrieve a specific datapoint by its ID from HoneyHive. Use this tool when you need the full details of a single datapoint, including its inputs, ground truth, conversation history, linked datasets, and metadata. Prerequisites: You need...
Retrieve Datapoints
Retrieve datapoints from a HoneyHive project. Use this tool to fetch evaluation datapoints containing inputs, ground truth, and metadata. Supports filtering by specific datapoint IDs or dataset name. Commonly used to: review existing t...
Retrieve Events
Retrieve and export events from a HoneyHive project. Use this tool to query traced events (model calls, tool calls, sessions, chains) with optional filters by event_type, metadata, feedback scores, or date range. Returns events with thei...
Retrieve Experiment Result
Tool to retrieve the result of a specific experiment run. Use when you need the status, metrics, and datapoint-level details of a completed experiment.
Start Evaluation Run
Creates a new evaluation run to group and track multiple session events for analysis. Use this action when you want to: compare model performance across multiple sessions; create evaluation batches for quality assurance; link existin...
Start Session
Start a new HoneyHive session for tracing and observability. Use this tool to initiate a tracking session that groups together related model, tool, and chain events. Returns a session_id that should be used to link subsequent events to t...
Update Configuration
Tool to update an existing HoneyHive configuration. Use when you need to modify a configuration's name, provider, model parameters, environments, or other settings. You must provide the configuration ID (obtainable via Get Configurations...
Update Datapoint
Update an existing datapoint by ID. Use this to modify any combination of inputs, ground_truth, history, metadata, linked_datasets, or linked_evals for a datapoint. Requires a valid datapoint ID obtained from retrieve_datapoints or add_d...
Update Dataset
Tool to update an existing dataset. Use when you need to modify a dataset's details (name, description, datapoints, linked evaluations, or metadata) after confirming its ID.
Update Event
Update an existing HoneyHive event by ID. Use to attach feedback, metrics, metadata, outputs, config, user properties, or update duration on events created via start_session or batch event creation. At least one optional field must be pr...
Update Metric
Tool to update an existing metric. Use when you need to modify a metric’s properties after creation. Ensure you retrieve the metric first to verify its current state.
Update Project
Updates an existing HoneyHive project's name or description. Use this action to modify project metadata after creation. You must provide the project_id and at least one field to update (name or description). To find project IDs, use the...
Update Tool
Tool to update an existing tool in HoneyHive. Use when you need to modify a tool's name, description, parameters, or type after confirming its ID. At least one optional field must be provided alongside the required tool ID.
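Several of the actions above (Create Event, Get Events By Session ID, Get Session) describe events that are grouped into sessions and nested hierarchically. As a rough illustration of that shape only, the Python sketch below nests a flat list of events under their parents. The field names (event_id, parent_id, event_type, event_name), the sample records, and the build_session_tree and render helpers are assumptions made for this example, not HoneyHive's documented schema, and the code does not call HoneyHive or Operator.

```python
from collections import defaultdict
from typing import Any

# Hypothetical flat event records. Field names are illustrative assumptions,
# not HoneyHive's actual event schema.
EVENTS = [
    {"event_id": "s1", "parent_id": None, "event_type": "session", "event_name": "support-chat"},
    {"event_id": "c1", "parent_id": "s1", "event_type": "chain", "event_name": "answer-question"},
    {"event_id": "t1", "parent_id": "c1", "event_type": "tool", "event_name": "search-docs"},
    {"event_id": "m1", "parent_id": "c1", "event_type": "model", "event_name": "draft-reply"},
]


def build_session_tree(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Nest flat events under their parents and return the root events."""
    # Group events by the ID of their parent (None marks a root event).
    children = defaultdict(list)
    for event in events:
        children[event["parent_id"]].append(event)

    def attach(event: dict[str, Any]) -> dict[str, Any]:
        # Copy the event and recursively attach its children, in input order.
        node = dict(event)
        node["children"] = [attach(child) for child in children[event["event_id"]]]
        return node

    return [attach(root) for root in children[None]]


def render(node: dict[str, Any], depth: int = 0) -> list[str]:
    """Return one indented line per event, parents before children."""
    lines = [f"{'  ' * depth}{node['event_type']}: {node['event_name']}"]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


tree = build_session_tree(EVENTS)
print("\n".join(render(tree[0])))
```

In this sketch a session is just the root of the tree, and model, tool, and chain events hang beneath it, which is the kind of structure an agent would walk when asked to summarize or debug a single session.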
How to connect Honeyhive
You authorize Honeyhive once from your dashboard. Operator holds the connection and refreshes the access tokens on its own, so your agent keeps working with Honeyhive without you signing in again. The same setup unlocks every other app your agent can reach, so you only do it once.
When you are ready, the get started guide walks through standing up your OpenClaw agent.
Common questions about Honeyhive
- How do I connect Honeyhive to Operator?
- You authorize Honeyhive once from your Operator dashboard. Operator holds the connection and refreshes the access token for you, so your agent keeps working with Honeyhive without you signing in again.
- Can my agent run Honeyhive as part of a larger task?
- Yes. It can call Honeyhive mid-task, hand it the input, and use what comes back in the next step. A job that involves generating, classifying, or analyzing something can route through Honeyhive without you stitching the calls together yourself.
- Do I need to write code or manage Honeyhive API keys?
- No code and no API keys. You authorize Honeyhive through a normal sign-in and Operator handles the connection, so there is nothing to wire up or host.
- Can my agent use Honeyhive together with my other apps?
- Yes. The same agent reaches every app you connect, so it can move between Honeyhive and tools like Datarobot, Chatbotkit, and Griptape in one job, reading from one and acting in another without you wiring anything between them.
Put your agent on Honeyhive
Sign in, connect Honeyhive, and hand your agent the work. Your first week is free.