Uploading a reference dataset
A reference dataset is usually a representative sample of the training data used by the model. It is required to monitor data drift, because its distribution serves as the baseline against which the distribution of your published data is compared.
How to upload a reference dataset
You can upload a reference dataset to your inference pipeline on Openlayer with the Python SDK.
See full Python example
Load your dataset as a pandas DataFrame
Let’s say that your reference dataset looks like the one below. For simplicity, we show a single row.
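As a minimal sketch of this step, the snippet below builds a one-row pandas DataFrame. The dataset is hypothetical (a churn classifier), and every column name and value is an assumption chosen for illustration, not taken from a real Openlayer project.

```python
import pandas as pd

# Hypothetical single-row reference dataset for a churn classifier.
# Column names and values are illustrative assumptions only.
reference_df = pd.DataFrame(
    [
        {
            "CreditScore": 618,
            "Geography": "France",
            "Gender": "Male",
            "Age": 39,
            "Balance": 75000.0,
            "Exited": 0,  # label column: 0 = retained, 1 = exited
        }
    ]
)

print(reference_df)
```

In practice you would more likely load the data from a file, for example with `pd.read_csv`, rather than build it by hand.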
Prepare the dataset configuration
The dataset config is a dictionary containing information that helps Openlayer understand your data.
For example, the dataset above is from a tabular classification task, so our dataset config will include information such as the feature names and class names:
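A config for a tabular classification dataset might look like the sketch below. It assumes a hypothetical churn dataset with the columns `CreditScore`, `Geography`, `Gender`, `Age`, `Balance`, and `Exited`. The key names (`classNames`, `featureNames`, `categoricalFeatureNames`, `labelColumnName`) are also assumptions and may differ between SDK versions, so check them against the Openlayer reference for the version you have installed.

```python
# Hypothetical dataset config for a tabular classification task.
# Key names are assumptions; confirm them in the Openlayer SDK reference.
dataset_config = {
    # Human-readable class names, indexed by the integer label value.
    "classNames": ["Retained", "Exited"],
    # Columns the model uses as input features.
    "featureNames": ["CreditScore", "Geography", "Gender", "Age", "Balance"],
    # Subset of the features that are categorical rather than numeric.
    "categoricalFeatureNames": ["Geography", "Gender"],
    # Column that holds the ground-truth label.
    "labelColumnName": "Exited",
}
```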
Upload to Openlayer
Now, you can upload your reference dataset alongside its config to Openlayer:
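The upload call itself comes from the Openlayer Python SDK and is not reproduced here. Before making it, you may want to confirm that the config and the DataFrame agree with each other. The helper below is a hypothetical pre-upload check, not part of the SDK, and it assumes the config uses the keys `featureNames`, `categoricalFeatureNames`, and `labelColumnName`.

```python
import pandas as pd


def check_config_matches_dataframe(df, config):
    """Return a list of problems; an empty list means the two are consistent.

    Hypothetical helper, not part of the Openlayer SDK.
    """
    problems = []
    columns = set(df.columns)
    feature_names = config.get("featureNames", [])

    # Every declared feature must exist as a DataFrame column.
    for name in feature_names:
        if name not in columns:
            problems.append(f"feature column missing from DataFrame: {name}")

    # The label column, if declared, must exist as well.
    label = config.get("labelColumnName")
    if label is not None and label not in columns:
        problems.append(f"label column missing from DataFrame: {label}")

    # Categorical features must be a subset of the declared features.
    for name in config.get("categoricalFeatureNames", []):
        if name not in feature_names:
            problems.append(f"categorical feature not in featureNames: {name}")

    return problems


# Illustrative data and config (assumed names, not from a real project).
example_df = pd.DataFrame([{"Age": 39, "Gender": "Male", "Exited": 0}])
example_config = {
    "featureNames": ["Age", "Gender"],
    "categoricalFeatureNames": ["Gender"],
    "labelColumnName": "Exited",
}

issues = check_config_matches_dataframe(example_df, example_config)
print(issues)
```

Returning a list of problems instead of raising on the first one lets you fix every mismatch in a single pass before attempting the upload.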