Definition

The groundedness test evaluates whether every factual statement in the AI assistant’s response is grounded in the provided context. This LLM-as-a-judge evaluation checks that the model does not hallucinate information and only makes claims that are supported by the given context.

Taxonomy

  • Task types: LLM.
  • Availability: and .

Why it matters

  • Groundedness is crucial for RAG (Retrieval-Augmented Generation) systems where responses must be based on retrieved information.
  • This metric helps prevent hallucination by ensuring that all factual claims are supported by the provided context.
  • It’s essential for applications where accuracy and trustworthiness are paramount, such as customer support, medical information, or legal assistance.
  • It helps maintain user trust by ensuring the AI doesn’t present plausible-sounding but unsupported information as fact.

Required columns

To compute this metric, your dataset must contain the following columns:
  • Outputs: The generated response from your LLM
  • Context: The provided context or retrieved information that should ground the response
To use this test, you must select the underlying LLM used as the evaluator and provide the required API credentials. You can check the OpenAI and Anthropic integration guides for details.
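As a minimal sketch of the dataset shape this test expects, the rows below carry the two required columns listed above. The data values are invented for illustration; only the column names (`Outputs`, `Context`) come from the documentation.

```python
# Illustrative rows with the two required columns.
# Column names follow the "Required columns" list; the data is made up.
dataset = [
    {
        # Grounded: the claim restates the context.
        "Outputs": "The plan costs $20 per month.",
        "Context": "Our basic plan is priced at $20/month.",
    },
    {
        # Not grounded: the launch date and city never appear in the context.
        "Outputs": "The service launched in 2015 in Berlin.",
        "Context": "Our basic plan is priced at $20/month.",
    },
]

# Every row must provide both required columns.
for row in dataset:
    assert "Outputs" in row and "Context" in row
```

How the evaluator LLM is invoked on these rows depends on your integration (OpenAI or Anthropic) and is not shown here.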

Evaluation criteria

The LLM evaluator assesses responses based on:
  1. Factual Statement Verification: Does every factual statement have a clear basis in the provided context?
  2. Information Source Alignment: Are all specific details, numbers, dates, names, and facts directly supported by the retrieved information?
  3. Hallucination Detection: Does the response contain information that appears to be made up or not present in the context?
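The three criteria above can be folded into a single judge prompt. The wording below is an illustrative sketch, not the test’s actual prompt, and the function name is hypothetical.

```python
def build_judge_prompt(output: str, context: str) -> str:
    """Assemble an LLM-as-a-judge prompt covering the three criteria above.

    The phrasing is illustrative only; the real test may word its
    instructions differently.
    """
    return (
        "You are evaluating whether a response is grounded in the provided context.\n"
        "1. Does every factual statement have a clear basis in the context?\n"
        "2. Are all specific details (numbers, dates, names, facts) directly "
        "supported by the context?\n"
        "3. Does the response contain information that is made up or absent "
        "from the context?\n\n"
        f"Context:\n{context}\n\n"
        f"Response:\n{output}\n\n"
        "Answer with a single digit: 1 if grounded, 0 if not grounded."
    )
```

The prompt asks for a single digit so the reply maps directly onto the binary score described in the next section.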

Scoring guidelines

  • Score 1 (Grounded): All factual statements are clearly supported by the provided context
  • Score 0 (Not Grounded): Contains factual statements that lack clear support in the provided context
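Mapping the judge’s free-text reply onto this binary scale might look like the sketch below. The parsing rules and the conservative default are assumptions, not part of the test’s specification.

```python
def parse_groundedness_score(judge_reply: str) -> int:
    """Map a judge reply to the binary score above: 1 grounded, 0 not grounded.

    The fallback of 0 for unclear verdicts is a conservative assumption,
    not something the test itself prescribes.
    """
    reply = judge_reply.strip().lower()
    # Check the negative verdict first so "not grounded" is never
    # misread as "grounded".
    if reply.startswith("0") or "not grounded" in reply:
        return 0
    if reply.startswith("1") or "grounded" in reply:
        return 1
    return 0  # unclear verdict: treat as not grounded
```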

Examples of violations

  • Making specific claims about dates, numbers, or facts not mentioned in the context
  • Stating opinions as facts without contextual support
  • Providing specific details about people, places, or events not referenced in the context

Examples of acceptable responses

  • “Based on the provided information, [specific fact from context]”
  • “The context shows that [directly supported claim]”
  • “According to the retrieved information, [factual statement from context]”