Definition

The context recall test measures the ability of the retriever to retrieve all of the context necessary to answer the question. It is based on the Ragas context recall metric: the score is the fraction of statements in the ground-truth answer that can be attributed to the retrieved context, ranging from 0 (nothing recalled) to 1 (everything recalled).
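
For intuition, the same score can be computed directly with the open-source ragas package. The snippet below is a minimal sketch, not Openlayer's implementation; exact column names (for example, ground_truth vs. ground_truths) vary between ragas versions.

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import context_recall

# Each row needs the question, the retrieved contexts, and a
# ground-truth (reference) answer. Column names vary by ragas version.
dataset = Dataset.from_dict({
    "question": ["When was the Eiffel Tower built?"],
    "contexts": [["The Eiffel Tower was constructed between 1887 and 1889."]],
    "ground_truth": ["It was built between 1887 and 1889."],
})

# ragas uses an LLM judge; by default it calls OpenAI, so set
# OPENAI_API_KEY in your environment before running.
result = evaluate(dataset, metrics=[context_recall])
print(result)  # e.g. {'context_recall': 1.0}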

Taxonomy

  • Task types: LLM.
  • Availability: development and monitoring.

Why it matters

  • Context recall ensures that your retrieval system captures all the relevant information needed to answer a question properly.
  • This metric helps identify when your retrieval mechanism is missing important context that should be available to the LLM.
  • It’s crucial for RAG (Retrieval-Augmented Generation) systems where the quality of retrieved context directly impacts answer quality.

Required columns

To compute this metric, your dataset must contain the following columns:
  • Ground truth: The reference/correct answer
  • Context: The retrieved context or background information

This metric relies on an LLM evaluator judging your submission. On Openlayer, you can configure the underlying LLM used to compute it. Check out the OpenAI or Anthropic integration guides for details.
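
For illustration, a toy validation dataset with these columns could be assembled as below. The column names (question, context, ground_truth) are placeholders; map them to the schema your Openlayer project expects.

import pandas as pd

# Toy dataset containing the two columns this metric requires,
# plus the question each row refers to. Column names are illustrative.
df = pd.DataFrame({
    "question": [
        "When was the Eiffel Tower built?",
        "Who wrote 'Pride and Prejudice'?",
    ],
    "context": [
        "The Eiffel Tower was constructed between 1887 and 1889.",
        "Jane Austen published 'Pride and Prejudice' in 1813.",
    ],
    "ground_truth": [
        "It was built between 1887 and 1889.",
        "Jane Austen wrote 'Pride and Prejudice'.",
    ],
})
df.to_csv("validation_dataset.csv", index=False)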

Test configuration examples

If you are writing a tests.json, here is a valid configuration for the context recall test:
[
  {
    "name": "Context recall above 0.8",
    "description": "Ensure that the retrieval system captures all necessary context with a score above 0.8",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "contextRecall",
        "operator": ">",
        "value": 0.8
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true,
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
  }
]
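
As configured, this test passes only when the aggregate contextRecall score on the validation dataset exceeds 0.8. If you prefer to generate the file programmatically, the sketch below writes an equivalent tests.json; the uuid-generated syncId is an assumption, standing in for whatever unique identifier your project uses.

import json
import uuid

# Build the same context recall test configuration in Python and
# write it to tests.json. Field values mirror the example above.
test = {
    "name": "Context recall above 0.8",
    "description": (
        "Ensure that the retrieval system captures all necessary "
        "context with a score above 0.8"
    ),
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
        {
            "insightName": "metrics",
            "insightParameters": None,
            "measurement": "contextRecall",
            "operator": ">",
            "value": 0.8,
        }
    ],
    "subpopulationFilters": None,
    "mode": "development",
    "usesValidationDataset": True,
    "usesTrainingDataset": False,
    "usesMlModel": False,
    "syncId": str(uuid.uuid4()),  # assumed: any unique identifier works
}

with open("tests.json", "w") as f:
    json.dump([test], f, indent=2)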