humanhours sits between your agents and your CFO. You send one HTTP call per task; we compute "how long would a human have taken" and "what did that cost", and serve the numbers back as a dashboard, a weekly digest, and a Reports API.
What gets stored per event
{
agent_id: string, // your stable id, e.g. "support-classifier"
task_type: string, // controlled vocabulary, e.g. "email_classification"
outcome: "success" | "failure" | "needs_review",
agent_duration_seconds?: number, // how long the agent took
agent_cost?: number, // what the agent run cost in your currency
human_baseline_minutes?: number, // override the default baseline for this event only
metadata?: object, // free-form, e.g. { client_id, channel, model }
audit_sample?: {
input_excerpt?: string,
output_excerpt?: string,
model?: string,
tokens_in?: number,
tokens_out?: number,
},
occurred_at?: string, // ISO 8601 timestamp; defaults to server time on receipt
}
agent_id, task_type, and outcome are required. Everything else is optional; defaults make the lazy path work.
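As an illustration, the field list above can be transcribed into TypeScript types. This is a sketch, not the official SDK: the names `HumanhoursEvent` and `missingRequiredFields` are hypothetical, the example values are made up, and the ingestion URL is deliberately left out because it is not given here.

```typescript
// Event shape transcribed from the field list above (type names are hypothetical).
type Outcome = "success" | "failure" | "needs_review";

interface AuditSample {
  input_excerpt?: string;
  output_excerpt?: string;
  model?: string;
  tokens_in?: number;
  tokens_out?: number;
}

interface HumanhoursEvent {
  agent_id: string;                   // required
  task_type: string;                  // required
  outcome: Outcome;                   // required
  agent_duration_seconds?: number;
  agent_cost?: number;
  human_baseline_minutes?: number;
  metadata?: Record<string, unknown>;
  audit_sample?: AuditSample;
  occurred_at?: string;               // ISO 8601
}

// Hypothetical client-side check: names of required fields that are missing or empty.
function missingRequiredFields(event: Partial<HumanhoursEvent>): string[] {
  const required = ["agent_id", "task_type", "outcome"] as const;
  return required.filter((key) => event[key] === undefined || event[key] === "");
}

// The lazy path: only the three required fields.
const minimalEvent: HumanhoursEvent = {
  agent_id: "support-classifier",
  task_type: "email_classification",
  outcome: "success",
};

// A fuller event with some optional fields (illustrative values).
const detailedEvent: HumanhoursEvent = {
  ...minimalEvent,
  agent_duration_seconds: 4.2,
  metadata: { channel: "email" },
  occurred_at: "2025-01-15T09:30:00Z",
};
```

Checking required fields before sending is optional; it only saves a round trip when a payload is obviously incomplete.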
Task types and baselines
Every task type ships with a default baseline in minutes, sourced from public benchmarks (McKinsey 2024, Worklytics 2025, Anthropic Economic Index 2024) or from Triad pilots. Fifty task types are seeded out of the box across eight categories: support, ops, sales, legal, content, research, data, and dev.
You can do three things to a baseline:
- Use as-is — start here. The baseline is publicly cited.
- Override per workspace — change the value for your org while keeping the original on record.
- Create a custom task type — when nothing in the catalogue fits.
Overrides and custom task types both require Pro. The lazy default is "use as-is".
What we compute
hours_saved = max(0, baseline_minutes - agent_duration_seconds / 60) / 60
cost_saved = hours_saved * hourly_rate
agent_cost = tokens_in * input_price + tokens_out * output_price (when model + tokens given and no agent_cost)
net_saved = cost_saved - (agent_cost || 0)
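The formulas above can be written out as a small function to make the units explicit. This is a sketch under assumptions: `computeSavings` is a hypothetical name, the 6-minute baseline and the €0.02 agent cost are made-up inputs, and the function requires a duration because the behaviour when agent_duration_seconds is omitted is not described here.

```typescript
interface Savings {
  hours_saved: number;
  cost_saved: number;
  net_saved: number;
}

// Direct transcription of the three savings formulas above.
function computeSavings(
  baselineMinutes: number,
  agentDurationSeconds: number,
  hourlyRate: number,
  agentCost?: number,
): Savings {
  // Minutes saved are clamped at zero, then converted to hours.
  const hours_saved = Math.max(0, baselineMinutes - agentDurationSeconds / 60) / 60;
  const cost_saved = hours_saved * hourlyRate;
  // A missing agent cost counts as zero.
  const net_saved = cost_saved - (agentCost || 0);
  return { hours_saved, cost_saved, net_saved };
}

// Hypothetical inputs: 6-minute baseline, 30-second agent run, €45/h, €0.02 agent cost.
// hours_saved ≈ 0.0917, cost_saved ≈ 4.125, net_saved ≈ 4.105
const typicalRun = computeSavings(6, 30, 45, 0.02);

// An agent slower than the baseline saves nothing, so net_saved goes negative.
const slowRun = computeSavings(1, 120, 45, 0.02);
```

Note that the clamp applies to hours_saved only: an agent that is slower than the human baseline still has its cost subtracted.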
hourly_rate resolves in this order: per-agent override → org default → €45/h fallback.
baseline_minutes resolves in this order: request override → org override → builtin → reject.
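Both resolution orders are "first defined value wins" chains. The sketch below illustrates that idea only; the function names are hypothetical, and the thrown error stands in for "reject", whose actual status code and error body are not specified here.

```typescript
const FALLBACK_HOURLY_RATE = 45; // €/h, the documented fallback

// Returns the first candidate that is not undefined, or undefined if none is set.
function firstDefined<T>(...candidates: (T | undefined)[]): T | undefined {
  for (const candidate of candidates) {
    if (candidate !== undefined) return candidate;
  }
  return undefined;
}

// per-agent override → org default → fallback
function resolveHourlyRate(agentOverride?: number, orgDefault?: number): number {
  return firstDefined(agentOverride, orgDefault) ?? FALLBACK_HOURLY_RATE;
}

// request override → org override → builtin → reject
function resolveBaselineMinutes(
  requestOverride?: number,
  orgOverride?: number,
  builtin?: number,
): number {
  const resolved = firstDefined(requestOverride, orgOverride, builtin);
  if (resolved === undefined) {
    // Assumption: rejection is modelled as an exception in this sketch.
    throw new Error("no baseline available for this task type");
  }
  return resolved;
}
```

For example, `resolveHourlyRate(undefined, 60)` yields the org default of 60, and `resolveBaselineMinutes(2, 5, 6)` yields the per-request value of 2.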
agent_cost resolves in this order: a value you pass (provided) → derived from model + tokens (computed) → none. The model price book syncs daily from public per-token pricing, so new models are priced automatically; matching is exact on the model id. The computed cost is frozen on the event, together with the prices used, so historical ROI is never re-priced. resolved_cost_source on the response tells you which path supplied the cost. To see ROI per model, add ?group_by=model to a report request.
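The cost resolution described above might look roughly like this. Everything here is an assumption made for illustration: the function and type names, the "example-model" id and its per-token prices, and the choice to require both token counts before computing. The sketch takes the model and token counts as plain arguments and does not assume where in the payload they are read from.

```typescript
type CostSource = "provided" | "computed" | "none";

interface ModelPrice {
  input_price: number;  // price per input token
  output_price: number; // price per output token
}

interface CostInput {
  agent_cost?: number;
  model?: string;
  tokens_in?: number;
  tokens_out?: number;
}

interface ResolvedCost {
  agent_cost?: number;
  resolved_cost_source: CostSource;
  prices_used?: ModelPrice; // snapshot kept so later price changes do not alter this event
}

function resolveAgentCost(input: CostInput, priceBook: Map<string, ModelPrice>): ResolvedCost {
  // 1. A value the caller passed always wins, including an explicit 0.
  if (input.agent_cost !== undefined) {
    return { agent_cost: input.agent_cost, resolved_cost_source: "provided" };
  }
  // 2. Otherwise derive from model + tokens; Map.get is an exact match on the model id.
  const price = input.model !== undefined ? priceBook.get(input.model) : undefined;
  if (price !== undefined && input.tokens_in !== undefined && input.tokens_out !== undefined) {
    const agent_cost = input.tokens_in * price.input_price + input.tokens_out * price.output_price;
    return { agent_cost, resolved_cost_source: "computed", prices_used: { ...price } };
  }
  // 3. No cost can be determined.
  return { resolved_cost_source: "none" };
}

// Hypothetical price book entry; real prices come from the synced price book.
const priceBook = new Map<string, ModelPrice>([
  ["example-model", { input_price: 0.000003, output_price: 0.000015 }],
]);

const computedCost = resolveAgentCost({ model: "example-model", tokens_in: 1000, tokens_out: 200 }, priceBook);
const unmatchedCost = resolveAgentCost({ model: "Example-Model", tokens_in: 1000, tokens_out: 200 }, priceBook);
```

Because matching is exact, the differently cased "Example-Model" falls through to the none path, which is worth checking if per-model reports show unexpected gaps.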
Audit trail (the CFO-proof layer)
When you send an audit_sample (input excerpt, output excerpt, model, token counts), we store it next to the event. The CFO link and the dashboard then show a sample of "what the agent actually did" so finance can spot-check claims. This is Pro-only; on Free, the field is silently dropped.
Quotas and overage
| Plan | Included events / month | Above the cap |
|---|---|---|
| Free | 1 000 | Hard 429 quota_exceeded |
| Pro | 100 000 | Metered overage: €0.0004 → €0.00025 → €0.00015 per event |
| Enterprise | Unlimited | Negotiated |
The dashboard's /reports/usage tab shows your real-time month-to-date count and the projected overage cost.
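The per-plan behaviour in the table can be summarised as a decision function. This is a sketch with hypothetical names; it assumes that the event arriving once the included count has already been reached is the first one "above the cap", and it does not price the overage because the tier boundaries are not listed here.

```typescript
type Plan = "free" | "pro" | "enterprise";
type QuotaDecision = "accept" | "accept_metered" | "reject_429_quota_exceeded";

// Included events per month, from the table above.
const INCLUDED_EVENTS: Record<Plan, number> = {
  free: 1000,
  pro: 100000,
  enterprise: Infinity,
};

// Decides what happens to the next event, given the month-to-date count before it.
function decideQuota(plan: Plan, eventsSoFarThisMonth: number): QuotaDecision {
  if (eventsSoFarThisMonth < INCLUDED_EVENTS[plan]) return "accept";
  // Above the cap: Free is rejected, Pro is accepted and billed as overage.
  return plan === "free" ? "reject_429_quota_exceeded" : "accept_metered";
}
```

A client on Free can use the same comparison to stop sending before it hits the 429, using the month-to-date count from the usage tab.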
Open spec, hosted layer
The wire format and error envelope are versioned (/v1/...) and stable. When /v2 lands, /v1 keeps working for at least 12 months. The reference implementation, the SDKs, and the protocol document are open source; the hosted service at humanhours.dev is the revenue layer, following the same pattern as PostHog, Segment, and OpenTelemetry.