Ashr Labs TypeScript SDK

A TypeScript SDK for the Ashr Labs platform. Two independent products share this SDK and one API key:

Testing Platform: generate datasets, run your agent against test scenarios offline, compare expected and actual behavior, and submit graded results. This is the EvalRunner / RunBuilder / Agent / comparators surface.
Observability (separate product): trace your agent's production behavior (LLM calls, tool invocations, retrieval steps, latency, errors). This is the client.trace() / span / generation surface. Traces are stored in Postgres and rendered in the Observability panel of the Ashr Labs dashboard. Requires the observability feature flag to be enabled for your tenant.

No external dependencies: the SDK uses only Node.js built-ins.

Quick Links

Testing Your Agent — start here if you're doing offline evals (end-to-end guide with EvalRunner)
Observability — Production Agent Tracing — client.trace(), spans, generations, analytics
VM Integration — browser/desktop agents with VM stream logging
Installation
Quick Start
Authentication
API Reference — EvalRunner, Agent, comparators, RunBuilder, client methods, observability methods
SDK Notes — platform advisories delivered to your SDK
Error Handling
Examples — testing platform examples + observability examples

Requirements

Node.js 18 or higher
TypeScript 5.4+ (recommended)

Installation

npm install ashr-labs

Quick Example

Any agent with respond() and reset() methods works out of the box:

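import { AshrLabsClient, EvalRunner } from "ashr-labs";

// myAgent is your own agent object; it must expose respond() and reset().
const datasetId = 322; // ID of an existing dataset
const client = new AshrLabsClient("tp_your_api_key_here");
const runner = await EvalRunner.fromDataset(client, datasetId);
await runner.runAndDeploy(myAgent, client, datasetId);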
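
The snippet above assumes a myAgent object already exists. As a rough sketch of what such an object could look like, the class below exposes respond() and reset(). The AgentLike interface, the EchoAgent class, the turnCount getter, and the parameter and return types are all assumptions made for illustration; the exact contract the SDK expects is described in the API Reference, not here.

```typescript
// Hypothetical shape: this page only says an agent needs respond() and reset().
// The argument and return types below are assumptions, not the SDK's typings.
interface AgentLike {
  respond(message: string): Promise<{ text: string }>;
  reset(): void;
}

// A toy agent that echoes its input and tracks conversation history.
class EchoAgent implements AgentLike {
  private history: string[] = [];

  // Record the incoming message and produce a reply.
  async respond(message: string): Promise<{ text: string }> {
    this.history.push(message);
    return { text: `You said: ${message}` };
  }

  // Clear per-conversation state so each scenario starts fresh.
  reset(): void {
    this.history = [];
  }

  // Number of messages seen since the last reset (illustrative helper).
  get turnCount(): number {
    return this.history.length;
  }
}

const myAgent = new EchoAgent();
```

Keeping all conversation state inside the object and clearing it in reset() means one scenario cannot leak context into the next, which is the usual reason an eval harness asks for a reset hook.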

Or with more control:

import { AshrLabsClient, EvalRunner } from "ashr-labs";

const client = new AshrLabsClient("tp_your_api_key_here");

// Generate a dataset
const [datasetId, source] = await client.generateDataset(
  "My Agent Eval",
  { /* Your agent config */ },
);

// Run the eval with progress logging
const runner = new EvalRunner(source);
const run = await runner.run(myAgent, {
  onScenario: (sid, s) => console.log(`Running: ${s.title}`),
});

// Inspect metrics before submitting
const metrics = run.build().aggregate_metrics as Record<string, unknown>;
console.log(`Passed: ${metrics.tests_passed}/${metrics.total_tests}`);
console.log(`Avg similarity: ${metrics.average_similarity_score}`);

// Submit
await run.deploy(client, datasetId);
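
Inspecting metrics before submitting also makes it possible to gate the submission, for example in CI. The helper below is a minimal sketch under stated assumptions: passRate and meetsThreshold are hypothetical names that are not part of the SDK, and the only fields read are tests_passed and total_tests, as used in the example above. Because the metrics are typed as Record<string, unknown>, the values are checked at runtime before any arithmetic.

```typescript
// Hypothetical helpers, not part of the SDK.
// Returns the fraction of passed tests, or null if the fields are missing or invalid.
function passRate(metrics: Record<string, unknown>): number | null {
  const passed = metrics.tests_passed;
  const total = metrics.total_tests;
  if (typeof passed !== "number" || typeof total !== "number") return null;
  if (!Number.isFinite(passed) || !Number.isFinite(total) || total <= 0) return null;
  return passed / total;
}

// True only when a pass rate can be computed and it reaches the minimum.
function meetsThreshold(
  metrics: Record<string, unknown>,
  minPassRate: number,
): boolean {
  const rate = passRate(metrics);
  return rate !== null && rate >= minPassRate;
}
```

With these in place, the final step could be wrapped as if (meetsThreshold(metrics, 0.9)) await run.deploy(client, datasetId); where the 0.9 threshold is an arbitrary illustrative value.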

Support

For issues and feature requests, please visit our GitHub repository.
