# Faithfulness

> Learn how to use the faithfulness test

## Definition

The faithfulness test measures the factual consistency of the generated answer against the given context. This metric is based on the Ragas [faithfulness](https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/faithfulness/) metric. Scores range from 0 to 1, and a higher score means more of the answer is supported by the context.
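As a rough illustration of how such a score is formed: the Ragas metric breaks the answer into individual claims, checks each claim against the context, and reports the fraction that the context supports. The sketch below covers only that final ratio. It assumes the per-claim verdicts are already available (in practice an LLM evaluator produces them), and `faithfulness_score` is a hypothetical helper, not part of Openlayer or Ragas.

```python
from typing import Sequence


def faithfulness_score(claim_supported: Sequence[bool]) -> float:
    """Fraction of answer claims that the context supports (0.0 to 1.0).

    Hypothetical helper: the verdicts are supplied by hand here, whereas
    the real metric obtains them from an LLM evaluator.
    """
    if not claim_supported:
        raise ValueError("at least one claim is required")
    return sum(claim_supported) / len(claim_supported)


# Hypothetical verdicts for an answer that makes four claims,
# three of which can be inferred from the context.
verdicts = [True, True, False, True]
print(faithfulness_score(verdicts))  # 0.75
```

An answer whose claims are all supported scores 1.0, and one with no supported claims scores 0.0.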

## Taxonomy

* **Task types**: LLM.
* **Availability**: <Tooltip tip="Continuously evaluate your models and datasets as you iterate on their versions.">development</Tooltip>
  and <Tooltip tip="Monitor a model in production, measure its health, check for drifts and set up alerts.">monitoring</Tooltip>.

## Why it matters

* Faithfulness checks that your LLM's responses are consistent with the provided context and do not contain hallucinated information.
* This metric helps you identify when your model makes up facts or contradicts the given context.
* It is essential for RAG (Retrieval-Augmented Generation) systems, where the model should stay grounded in the retrieved information.

## Required columns

To compute this metric, your dataset must contain the following columns:

* **Outputs**: The answer generated by your LLM.
* **Context**: The context or background information provided to the LLM.

<Note>
  This metric relies on an LLM evaluator judging your submission. On Openlayer,
  you can configure the underlying LLM used to compute it. Check out the
  [OpenAI](/integrations/openai#openai-llm-evaluator) or
  [Anthropic](/integrations/anthropic#anthropic-llm-evaluator) integration
  guides for details.
</Note>
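Before running the test, it can help to confirm that every row actually carries both fields. The following is a minimal sketch, assuming rows are plain dictionaries and that the columns are named `output` and `context`. Those names are placeholders chosen for illustration, so map them to whatever your dataset uses.

```python
from typing import Dict, List

# Hypothetical column names; replace them with the ones in your dataset.
REQUIRED_COLUMNS = ("output", "context")


def missing_columns(rows: List[Dict[str, object]]) -> List[str]:
    """Return the required columns that are absent or empty in any row."""
    missing = []
    for column in REQUIRED_COLUMNS:
        # `not row.get(column)` is true for a missing key, None, "" or [].
        if any(not row.get(column) for row in rows):
            missing.append(column)
    return missing


rows = [
    {
        "output": "Paris is the capital of France.",
        "context": ["France's capital city is Paris."],
    },
    # This row has an empty context, so faithfulness cannot be judged for it.
    {"output": "The Seine flows through Paris.", "context": []},
]
print(missing_columns(rows))  # ['context']
```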

## Test configuration examples

If you are writing a `tests.json`, here are a few valid configurations for the faithfulness test:

<CodeGroup>
  ```json Development theme={null}
  [
    {
      "name": "Faithfulness above 0.9",
      "description": "Ensure that generated responses are faithful to the provided context with a score above 0.9",
      "type": "performance",
      "subtype": "metricThreshold",
      "thresholds": [
        {
          "insightName": "metrics",
          "insightParameters": null,
          "measurement": "faithfulness",
          "operator": ">",
          "value": 0.9
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true,
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
    }
  ]
  ```

  ```json Monitoring theme={null}
  [
    {
      "name": "Faithfulness above 0.9",
      "description": "Ensure that generated responses are faithful to the provided context with a score above 0.9",
      "type": "performance",
      "subtype": "metricThreshold",
      "thresholds": [
        {
          "insightName": "metrics",
          "insightParameters": null,
          "measurement": "faithfulness",
          "operator": ">",
          "value": 0.9
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600,
      "delayWindow": 0,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
    }
  ]
  ```
</CodeGroup>
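To make the threshold semantics concrete, the sketch below evaluates a threshold entry shaped like the ones above against a computed metric value. It is a local illustration only, not Openlayer's implementation. `threshold_passes` is a hypothetical helper, and the operators other than `>` are assumptions added for completeness.

```python
import operator
from typing import Any, Dict

# Only ">" appears in the configurations above; the other operators are
# assumptions included for illustration.
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def threshold_passes(threshold: Dict[str, Any], metrics: Dict[str, float]) -> bool:
    """Compare the named measurement against the threshold value."""
    compare = _OPERATORS[threshold["operator"]]
    return compare(metrics[threshold["measurement"]], threshold["value"])


threshold = {
    "insightName": "metrics",
    "insightParameters": None,
    "measurement": "faithfulness",
    "operator": ">",
    "value": 0.9,
}

print(threshold_passes(threshold, {"faithfulness": 0.93}))  # True
print(threshold_passes(threshold, {"faithfulness": 0.9}))   # False
```

Because the operator is a strict `>`, a score of exactly 0.9 does not satisfy the "above 0.9" test.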

## Related

* [Ragas integration](/integrations/ragas) - Learn more about Ragas metrics.
* [Context utilization test](/tests/catalog/context-utilization) - Evaluate how well context is used.
* [Answer correctness test](/tests/catalog/answer-correctness) - Measure factual accuracy against ground truth.
* [Correctness test](/tests/catalog/correctness) - Measure overall correctness of answers.
* [Aggregate metrics](/tests/performance/aggregate-metrics) - Overview of all available metrics.
