- provide a way for Openlayer to run your model on your datasets, or
- before pushing, generate the model outputs yourself and push them alongside your artifacts.
Providing a way for Openlayer to run your model on your datasets
Openlayer uses the information provided in the openlayer.json to run your model on your datasets. To do so, it goes through the following steps:1
Runtime setup
Set up the runtime environment specified in the
runtime
field from your
openlayer.json
. Then, it runs the installCommand
from your
openlayer.json
, to install your dependencies.2
Run the model
Run the
batchCommand
from your openlayer.json
.batchCommand
iterates through your datasets, runs your models in each of them, and
creates the directory specified in outputDirectory
that has the following structure:

{dataset[i].name}
is the name of the i-th dataset specified in the datasets
array in the openlayer.json
,
dataset.json
is the corresponding dataset with an extra column with the model outputs, and config.json
is a config file for the dataset.
If you are leveraging one of Openlayer’s SDKs, you don’t need to worry about the output directory structure or the
configs.
You can browse a template from our Template gallery
that feels closest to your use case and see what the
openlayer.json
and the
run script look like using Openlayer’s SDKs.batchCommand
should call a script you wrote and append it with
- parses command line arguments
--dataset-path
and--output-dir
so it knows which dataset to generate batch outputs on, and where to write the generated outputs. - loads the dataset specified in
--dataset-path
into memory and calls your code that generates outputs for a single row. - writes the generated outputs along with additional fields and the input data
to a
dataset.json
(or CSV) file to a directory that adheres to the output directory structure presented above.
How Openlayer checks if it should compute outputs
Regardless of the method you choose, right after you push artifacts to the Openlayer platform, it checks if the directory specified as theoutputDirectory
in the model
section of your openlayer.json
exists and if it contains the output files Openlayer expects.
If both conditions are satisfied, Openlayer interprets this as signaling that you already
ran your model on your datasets before pushing. Therefore, Openlayer will not try to compute
the model predictions again.
However, if one of the conditions above is not satisfied, Openlayer will try to compute your model
outputs for your datasets.