Learn how to use the LLM evaluation test
The LLM evaluation test allows you to create tests using an LLM as a judge. You can write descriptive evaluations like “Make sure the outputs are in Portuguese,” and Openlayer will use an LLM to grade your agent or model given this criterion. Besides producing a score, the LLM will also explain its evaluation.
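To make the LLM-as-a-judge idea concrete, here is a minimal sketch of the general pattern: a criterion and an output go into a grading prompt, and the judge's reply is parsed into a score and an explanation. This is not Openlayer's implementation or API. Every name in it (`Judgment`, `build_judge_prompt`, `evaluate`, `fake_llm`) is hypothetical, and `fake_llm` is a hard-coded stand-in for a real model call so the example runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Judgment:
    """Hypothetical container for one graded output."""
    score: float
    explanation: str


def build_judge_prompt(criterion: str, output: str) -> str:
    """Combine the plain-language criterion and the output into a grading prompt."""
    return (
        "You are grading a model output against a criterion.\n"
        f"Criterion: {criterion}\n"
        f"Output: {output}\n"
        'Reply with JSON: {"score": <0 or 1>, "explanation": "<why>"}'
    )


def parse_judgment(raw: str) -> Judgment:
    """Parse the judge's JSON reply into a score and an explanation."""
    data = json.loads(raw)
    return Judgment(score=float(data["score"]), explanation=str(data["explanation"]))


def evaluate(
    criterion: str, outputs: List[str], call_llm: Callable[[str], str]
) -> List[Judgment]:
    """Grade each output with the supplied judge function."""
    return [parse_judgment(call_llm(build_judge_prompt(criterion, o))) for o in outputs]


def fake_llm(prompt: str) -> str:
    """Stand-in judge (assumption): passes only if the prompt contains 'obrigado'."""
    passed = "obrigado" in prompt.lower()
    return json.dumps({
        "score": 1 if passed else 0,
        "explanation": "Looks Portuguese." if passed else "Does not look Portuguese.",
    })


results = evaluate(
    "Make sure the outputs are in Portuguese",
    ["Obrigado pela ajuda!", "Thanks for the help!"],
    fake_llm,
)
for r in results:
    print(r.score, r.explanation)
```

In a real setup, `fake_llm` would be replaced by a call to an actual model, and the score would be compared against a pass threshold.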
If you are writing a tests.json, here are a few valid configurations for the LLM evaluation test: