Definition

The training-validation lekage test allows you to detect training rows that are also present in the validation dataset.

Taxonomy

  • Category: Consistency.
  • Task types: LLM, tabular classification, tabular regression, text classification.
  • Availability: .

Why it matters

  • The training and validation datasets must be completely disjoint. Otherwise, all evaluation insights extracted from the validation set are unreliable and overly optimistic.

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the character length test: