Integrity
Column contains string
Definition
Let A be a column in a dataset containing strings. Let B be a column in a dataset containing lists of strings.
The column contains string test asserts that the list of strings in B contains the string in A on a per-row basis.
For example:
A | B | Result |
---|---|---|
”a” | [“a”, “b”, “c”] | ✓ Passed |
”b” | [“a”, “b”, “c”] | ✓ Passed |
”c” | [“a”, “b”, “c”] | ✓ Passed |
”d” | [“a”, “b”, “c”] | x Failed |
Since “d” is not in the list [“a”, “b”, “c”], the test fails.
Taxonomy
- Category: Integrity.
- Task types: LLM, tabular classification, tabular regression, text classification.
- Availability: and .
Why it matters
- In particular for RAG LLM projects, the context retriever will return a list of the top K contexts. The column contains string test can be used to ensure that the context retriever returns at least one of the correct contexts.
Test configuration examples
If you are writing a tests.json
, here are a few valid configurations for the character length test:
Related
Was this page helpful?