Session goal achievement

Definition

The session goal achievement test evaluates whether the user’s goal was met by the end of a conversation. An LLM-as-a-judge infers the user’s goal from their messages and scores the full session against four criteria:

The user’s intent is correctly identified
The goal is fully resolved (not just partially addressed)
The session reaches a satisfying conclusion
User-side signals indicate satisfaction (e.g., acknowledgment, no repeat asks)
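
The four criteria above can be pictured as a small rubric. The source does not say how the judge combines them into a single 0 to 1 score, so the following is only a sketch under a stated assumption: each criterion is scored in [0, 1] and the session score is their unweighted mean. The class and field names are hypothetical and are not part of Openlayer's API.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class CriterionScores:
    """Hypothetical per-criterion scores a judge might emit, each in [0, 1]."""

    intent_identified: float
    fully_resolved: float
    satisfying_conclusion: float
    user_satisfied: float

    def __post_init__(self) -> None:
        # Reject values outside the 0..1 range the metric is defined on.
        for value in astuple(self):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"criterion score out of range: {value}")

    def overall(self) -> float:
        # ASSUMPTION: equal weighting. The real aggregation is not documented here.
        values = astuple(self)
        return sum(values) / len(values)


example = CriterionScores(
    intent_identified=1.0,
    fully_resolved=0.5,
    satisfying_conclusion=1.0,
    user_satisfied=0.5,
)
session_score = example.overall()
```

With the values shown, a session whose intent was understood but whose goal was only partially resolved lands between 0 and 1 rather than at either extreme.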

Taxonomy

Task types: LLM.
Availability: and .
Evaluation level: session.
Polarity: higher score = better. 0 = goal not achieved at all, 1 = goal fully achieved.

Why it matters

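Goal achievement is the most direct product-quality signal for agentic assistants: it measures whether the user got what they came for.
Tracking it at the session level captures outcomes that per-turn evaluations miss, such as a conversation in which every individual reply is reasonable but the request is never resolved.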

Required columns

Input: The user’s message in each turn.
Output: The assistant’s response in each turn.
Session ID: Groups turns belonging to the same conversation.
Timestamp: Used to reconstruct turn order within a session.
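
To illustrate how these four columns fit together, here is a minimal sketch that groups turns by session ID and orders them by timestamp to rebuild each conversation before it is handed to a judge. The row layout, the field names (`session_id`, `timestamp`, `input`, `output`), and the helper functions are assumptions made for illustration. They are not Openlayer's schema or API.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows, deliberately out of order. Field names are illustrative.
rows = [
    {"session_id": "s1", "timestamp": "2024-05-01T10:00:05",
     "input": "Thanks, that worked.", "output": "Glad to help!"},
    {"session_id": "s2", "timestamp": "2024-05-01T11:00:00",
     "input": "Cancel my order.", "output": "Done."},
    {"session_id": "s1", "timestamp": "2024-05-01T10:00:00",
     "input": "How do I reset my password?", "output": "Use the reset link."},
]


def build_sessions(rows):
    """Group turns by session ID, then sort each group by timestamp."""
    sessions = defaultdict(list)
    for row in rows:
        sessions[row["session_id"]].append(row)
    for turns in sessions.values():
        turns.sort(key=lambda r: datetime.fromisoformat(r["timestamp"]))
    return dict(sessions)


def format_transcript(turns):
    """Render ordered turns as a plain-text transcript for a judge prompt."""
    lines = []
    for turn in turns:
        lines.append(f"User: {turn['input']}")
        lines.append(f"Assistant: {turn['output']}")
    return "\n".join(lines)


sessions = build_sessions(rows)
transcript = format_transcript(sessions["s1"])
```

Without the timestamp, the closing "Thanks, that worked." turn in this example would appear before the question it answers, and a judge would read the session in the wrong order.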

This metric relies on an LLM evaluator. On Openlayer you can configure the underlying LLM used to compute it. Check out the OpenAI or Anthropic integration guides for details.

Test configuration examples

[
  {
    "name": "Session goal achievement above 0.7",
    "description": "Ensure sessions meet the user's inferred goal",
    "type": "performance",
    "subtype": "sessionGoalAchievement",
    "thresholds": [
      {
        "insightName": "sessionGoalAchievement",
        "measurement": "meanScore",
        "operator": ">=",
        "value": 0.7
      }
    ],
    "subpopulationFilters": null,
    "mode": "monitoring",
    "usesProductionData": true,
    "evaluationWindow": 3600,
    "delayWindow": 0
  }
]
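
The threshold in this configuration reads as "the mean session score must be at least 0.7". A minimal sketch of that comparison follows, assuming `meanScore` is the arithmetic mean of per-session scores. The function name and the supported operator set are hypothetical, and the platform performs this check itself.

```python
import operator
from statistics import mean

# Comparison operators a threshold might use (illustrative set).
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def passes_threshold(session_scores, op=">=", value=0.7):
    """Return True if the mean of session_scores satisfies `op value`."""
    if not session_scores:
        raise ValueError("no session scores in the evaluation window")
    return OPERATORS[op](mean(session_scores), value)


passing = passes_threshold([1.0, 0.5, 0.75])   # mean is 0.75
failing = passes_threshold([0.5, 0.5, 1.0])    # mean is about 0.667
```

Because the test compares a mean, a few fully failed sessions can be offset by many successful ones. A stricter check would need a different measurement than the one shown in the configuration.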

Related

Session task progression: measures whether the session made steady progress along the way.
Session conversation completeness: measures whether the dialogue reached a clean end.
