Ashr Labs TypeScript SDK
A TypeScript SDK for the Ashr Labs platform. Two independent products share this SDK and one API key:
- Testing Platform: generate datasets, run your agent against test scenarios offline, compare expected vs. actual behavior, and submit graded results. This is the EvalRunner / RunBuilder / Agent / comparators surface.
- Observability (separate product): trace your agent's production behavior (LLM calls, tool invocations, retrieval steps, latency, errors). This is the client.trace() / span / generation surface. Traces are stored in Postgres and rendered in the Observability panel of the Ashr Labs dashboard. Requires the observability feature flag for your tenant.
No external dependencies: the SDK uses only Node.js built-ins.
Quick Links
- Testing Your Agent — start here if you're doing offline evals (end-to-end guide with EvalRunner)
- Observability: Production Agent Tracing (client.trace(), spans, generations, analytics)
- VM Integration: browser/desktop agents with VM stream logging
- Installation
- Quick Start
- Authentication
- API Reference — EvalRunner, Agent, comparators, RunBuilder, client methods, observability methods
- SDK Notes — platform advisories delivered to your SDK
- Error Handling
- Examples — testing platform examples + observability examples
Requirements
- Node.js 18 or higher
- TypeScript 5.4+ (recommended)
Installation
npm install ashr-labs
Quick Example
Any agent with respond() and reset() methods works out of the box:
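import { AshrLabsClient, EvalRunner } from "ashr-labs";
const client = new AshrLabsClient("tp_your_api_key_here");
// Load dataset 322, run the agent against it, and submit the results.
// myAgent is your own agent object exposing respond() and reset().
const runner = await EvalRunner.fromDataset(client, 322);
await runner.runAndDeploy(myAgent, client, 322);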
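For illustration, here is a minimal sketch of an object with respond() and reset() methods. The MinimalAgent interface, the EchoAgent class, and the method signatures (a string in, a promised string out) are assumptions made for this sketch; the SDK's actual Agent type may expect a different shape, so check the API Reference before relying on it.

```typescript
// Hypothetical shape for illustration only; not the SDK's own Agent type.
interface MinimalAgent {
  respond(message: string): Promise<string>;
  reset(): void;
}

// A toy agent that echoes its input and tracks conversation turns.
class EchoAgent implements MinimalAgent {
  private history: string[] = [];

  // Record the incoming message and return a reply that includes the turn number.
  async respond(message: string): Promise<string> {
    this.history.push(message);
    return `Echo (turn ${this.history.length}): ${message}`;
  }

  // Clear per-conversation state so each scenario starts fresh.
  reset(): void {
    this.history = [];
  }

  // Number of messages seen since the last reset.
  get turnCount(): number {
    return this.history.length;
  }
}

const myAgent = new EchoAgent();
```

An instance like this could stand in for myAgent in the example above, assuming the SDK accepts this shape. A real agent would call your model or tool stack inside respond().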
Or with more control:
import { AshrLabsClient, EvalRunner } from "ashr-labs";
const client = new AshrLabsClient("tp_your_api_key_here");
// Generate a dataset
const [datasetId, source] = await client.generateDataset(
  "My Agent Eval",
  { /* Your agent config */ },
);
// Run the eval with progress logging
const runner = new EvalRunner(source);
const run = await runner.run(myAgent, {
  onScenario: (sid, s) => console.log(`Running: ${s.title}`),
});
// Inspect metrics before submitting
const metrics = run.build().aggregate_metrics as Record<string, unknown>;
console.log(`Passed: ${metrics.tests_passed}/${metrics.total_tests}`);
console.log(`Avg similarity: ${metrics.average_similarity_score}`);
// Submit
await run.deploy(client, datasetId);
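Because the metrics are available before submission, one option is to gate the deploy step on a pass rate. The sketch below is not part of the SDK: passRate, meetsThreshold, and the 0.8 default threshold are hypothetical. Only the tests_passed and total_tests field names are taken from the example above.

```typescript
// Hypothetical helpers; only the field names tests_passed and total_tests
// come from the example above.
function passRate(metrics: Record<string, unknown>): number {
  const passed = Number(metrics.tests_passed);
  const total = Number(metrics.total_tests);
  // Treat missing, non-numeric, or empty totals as a 0% pass rate.
  if (!Number.isFinite(passed) || !Number.isFinite(total) || total <= 0) {
    return 0;
  }
  return passed / total;
}

// Returns true when the pass rate is at or above the chosen threshold.
function meetsThreshold(
  metrics: Record<string, unknown>,
  minPassRate = 0.8, // arbitrary default for this sketch
): boolean {
  return passRate(metrics) >= minPassRate;
}

// Usage with a hand-written metrics object:
const sampleMetrics = { tests_passed: 9, total_tests: 10 };
if (meetsThreshold(sampleMetrics)) {
  console.log("Pass rate OK, safe to submit");
} else {
  console.log("Pass rate below threshold, skipping submission");
}
```

In the flow above, the same check could wrap the run.deploy(client, datasetId) call so that weak runs are reviewed locally before they are submitted.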
Support
For issues and feature requests, please visit our GitHub repository.