Skip to main content

Definition

The session cost test monitors LLM cost aggregated to the session level. For each session, Openlayer sums the per-trace cost column across all turns in the session, then exposes window-level aggregates you can threshold against. No LLM evaluator is involved — it’s a deterministic aggregation.

Taxonomy

  • Task types: LLM.
  • Availability: and .
  • Evaluation level: session.
  • Computation: deterministic aggregation.

Why it matters

  • Cost per trace is useful, but cost per session is what maps to a user’s real experience — users don’t see individual LLM calls, they see conversations.
  • Mean/median session cost tracks baseline spend; totalCost catches the bill for the full window.
  • Pair with Session record count to distinguish “long session, predictable cost” from “short session, expensive turn”.

Available measurements

MeasurementWhat it means
totalCostSum of cost across all traces in the window (global, not per-session)
meanCostPerSessionMean of per-session cost sums across sessions in the window
medianCostPerSessionMedian of per-session cost sums across sessions in the window

Required columns

  • Session ID: Groups turns belonging to the same conversation.
  • Cost: Per-trace cost (usually populated automatically by the Openlayer client or via OpenTelemetry).

Test configuration examples

[
  {
    "name": "Mean session cost below $0.50",
    "description": "Alert when average session cost exceeds $0.50",
    "type": "performance",
    "subtype": "sessionCost",
    "thresholds": [
      {
        "insightName": "sessionCost",
        "measurement": "meanCostPerSession",
        "operator": "<=",
        "value": 0.5
      }
    ],
    "subpopulationFilters": null,
    "mode": "monitoring",
    "usesProductionData": true,
    "evaluationWindow": 3600,
    "delayWindow": 0
  }
]