Consistency
Training-validation leakage
Definition
The training-validation lekage test allows you to detect training rows that are also present in the validation dataset.
Taxonomy
- Category: Consistency.
- Task types: LLM, tabular classification, tabular regression, text classification.
- Availability: .
Why it matters
- The training and validation datasets must be completely disjoint. Otherwise, all evaluation insights extracted from the validation set are unreliable and overly optimistic.
Test configuration examples
If you are writing a tests.json
, here are a few valid configurations for the character length test:
Was this page helpful?