Definition

The context relevancy test measures how relevant the retrieved context is to the question asked. This metric is based on the Ragas context precision metric.
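The Ragas context precision idea can be sketched as a rank-weighted precision over the retrieved chunks: chunks that are relevant and appear earlier in the retrieval order contribute more. Below is a minimal illustration, assuming binary per-chunk relevance judgments (which, in practice, the LLM evaluator produces); it is a sketch of the scoring shape, not Openlayer's exact implementation:

```python
def context_precision(relevance: list[int]) -> float:
    """Score retrieved chunks given binary relevance judgments
    (1 = relevant to the question, 0 = not), in retrieval order.

    Averages precision@k over the ranks k where a relevant chunk
    appears, so early relevant chunks weigh more than late ones.
    """
    if not any(relevance):
        return 0.0
    score = 0.0
    hits = 0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / k  # precision@k, counted only at relevant ranks
    return score / sum(relevance)


# Three retrieved chunks judged [relevant, irrelevant, relevant]:
# (1/1 + 2/3) / 2 ≈ 0.833 — the irrelevant middle chunk drags the score down.
print(context_precision([1, 0, 1]))
```

A perfectly relevant retrieval ([1, 1, 1]) scores 1.0, while a retrieval with no relevant chunks scores 0.0, which matches the intuition of the 0-to-1 threshold used in the test configuration below.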

Taxonomy

  • Task types: LLM.
  • Availability: development and monitoring.

Why it matters

  • Context relevancy ensures that your retrieval system provides information that is directly related to the user’s question.
  • This metric helps identify when your retrieval mechanism is returning irrelevant or off-topic context that could confuse the LLM.
  • It’s essential for RAG (Retrieval-Augmented Generation) systems to maintain high precision in retrieved information.

Required columns

To compute this metric, your dataset must contain the following columns:
  • Input: The question or prompt given to the LLM
  • Ground truth: The reference/correct answer
  • Context: The retrieved context or background information

This metric relies on an LLM evaluator judging your submission. On Openlayer, you can configure the underlying LLM used to compute it. Check out the OpenAI or Anthropic integration guides for details.
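For illustration, here is what rows carrying the three required columns might look like. The column names used below ("input", "ground_truth", "context") are placeholders for this sketch — map them to whatever names your dataset configuration actually declares:

```python
# Hypothetical dataset rows for a context relevancy evaluation.
# Each row pairs a question with the reference answer and the
# context chunks the retriever returned for that question.
rows = [
    {
        "input": "What is the capital of France?",          # question given to the LLM
        "ground_truth": "Paris is the capital of France.",  # reference answer
        "context": [                                        # retrieved context chunks
            "Paris has been the capital of France since the 10th century.",
            "France is a country in Western Europe.",
        ],
    },
]
```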

Test configuration examples

If you are writing a tests.json, here is a valid configuration for the context relevancy test:
[
  {
    "name": "Context relevancy above 0.8",
    "description": "Ensure that retrieved context is highly relevant to the question with a score above 0.8",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "contextRelevancy",
        "operator": ">",
        "value": 0.8
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true,
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
  }
]
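Before committing a tests.json, it can be handy to sanity-check its shape. The sketch below is illustrative only — the key names are taken from the example above, and the authoritative schema is Openlayer's own:

```python
import json

# Keys the example configuration above always carries; adjust to taste.
REQUIRED_KEYS = {"name", "type", "subtype", "thresholds"}
KNOWN_OPERATORS = {">", ">=", "<", "<=", "==", "!="}


def validate_test_config(raw: str) -> list[dict]:
    """Light sanity check for a tests.json payload: required keys
    present, thresholds well-formed, operators recognizable."""
    tests = json.loads(raw)
    for test in tests:
        missing = REQUIRED_KEYS - test.keys()
        if missing:
            raise ValueError(f"{test.get('name', '?')}: missing {sorted(missing)}")
        for threshold in test["thresholds"]:
            if threshold["operator"] not in KNOWN_OPERATORS:
                raise ValueError(f"unknown operator {threshold['operator']!r}")
    return tests
```

Running this over the example above would pass, while a config with a typo such as "operator": "=>" would raise immediately instead of failing later on the platform.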