Definition

The feature drift test allows you to set a threshold on the number of features that have drifted. To compute drift for each feature, Openlayer automatically selects the best drift detection method among the supported ones based on the feature type and range of values.

If you want full flexibility on the drift detection method and the threshold, you can use the Column drift test instead.

Drift is measured by comparing a reference dataset with a current dataset.

  • In development projects, the training set is used as the reference and the validation set as the current dataset.
  • In monitoring projects, the reference dataset is uploaded by the user and the production data is the current dataset.

Taxonomy

  • Category: Consistency.
  • Task types: Tabular classification, tabular regression.
  • Availability: and .

Why it matters

  • Measuring drift is crucial to maintain the relevance of your models. In development, it allows you to ensure that the data you use to validate your model is similar to the data you used to train it. In monitoring, it allows you to detect when the data your model is receiving is different from the data considered as reference.
  • Over time, changes in the underlying data distribution can degrade the performance of your model. Measuring drift helps in identifying these changes early, enabling timely updates or retraining of the model to maintain its performance.

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the character length test: