Google Gemini

If you are building an AI system with Google Gemini models and want to evaluate it, you can use the Openlayer SDKs to make Openlayer part of your workflow. This integration guide shows how.

Building multi-agent systems with Google Agent Development Kit? Check out the Google ADK integration page for comprehensive tracing of agent conversations, handoffs, and tool usage.

Evaluating Google Gemini models

You can set up Openlayer tests to evaluate your Google Gemini models in both monitoring and development modes.

Monitoring

In monitoring mode, you instrument your code so that the requests your AI system receives are published to the Openlayer platform. To set it up, follow the steps in the code snippet below:

Python

# 1. Install required packages
# !pip install google-generativeai openlayer

# 2. Set the environment variables
import os
import google.generativeai as genai

os.environ["GOOGLE_AI_API_KEY"] = "YOUR_GOOGLE_AI_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"

# 3. Configure the Gemini API
genai.configure(api_key=os.environ["GOOGLE_AI_API_KEY"])

# 4. Import the `trace_gemini` function and wrap the Gemini model
from openlayer.lib import trace_gemini

model = genai.GenerativeModel("gemini-2.5-flash")
traced_model = trace_gemini(model)

# 5. From now on, every generation call with
# the `traced_model` is traced by Openlayer. E.g.,
response = traced_model.generate_content("How are you doing today?")

See full Python example

Once the code is instrumented, all your Google Gemini model calls are automatically published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.

If the Google Gemini model call is just one of the steps of your AI system, you can use the code snippets above together with tracing. In this case, your Gemini calls get added as a step of a larger trace. Refer to the Tracing guide for details.

Once your AI system's requests are continuously published to and logged by Openlayer, you can create tests that run on them at a regular cadence. Refer to the Monitoring overview for details on Openlayer’s monitoring mode, to the Publishing data guide for more information on setting it up, and to the Tracing guide to understand how to trace more complex systems.

Development

In development mode, Openlayer becomes a step in your CI/CD pipeline, and your tests are evaluated automatically whenever certain events trigger them. Openlayer tests often rely on your AI system’s outputs on a validation dataset. As discussed in the Configuring output generation guide, you have two options:

either provide a way for Openlayer to run your AI system on your datasets, or
before pushing, generate the model outputs yourself and push them alongside your artifacts.
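
As a rough illustration of the second option, the sketch below fills in an output column for every row of a validation set before anything is pushed. The `add_outputs` and `to_csv` helpers, the `question` and `output` column names, and the stand-in `fake_generate` function are all assumptions made for this example and are not part of the Openlayer SDK. In practice, the callable would wrap your Gemini call, and the column names would match your own dataset configuration.

```python
import csv
import io
from typing import Callable, Dict, List


def add_outputs(
    rows: List[Dict[str, str]],
    generate: Callable[[str], str],
    input_column: str = "question",  # assumed column name
    output_column: str = "output",   # assumed column name
) -> List[Dict[str, str]]:
    """Return copies of `rows` with the model output added to each one."""
    return [{**row, output_column: generate(row[input_column])} for row in rows]


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the rows to CSV text so they can be saved with your artifacts."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_generate(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return prompt.upper()


validation_rows = [{"question": "hello"}, {"question": "how are you?"}]
rows_with_outputs = add_outputs(validation_rows, fake_generate)
csv_text = to_csv(rows_with_outputs)
```

Passing the model call in as a plain callable keeps the dataset handling independent of any particular SDK, so the same helper can be exercised locally with a stub.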

For AI systems built with Google Gemini models, if you are not computing your system’s outputs yourself, you must provide your API credentials. To do so, navigate to “Workspace settings” -> “Environment variables,” and click on “Add secret” to add your GOOGLE_AI_API_KEY. If you don’t add the required Google AI API key, you’ll encounter a “Missing API key” error when Openlayer tries to run your AI system to get its outputs.

Make sure to configure the Gemini API with the API key from the environment in the script you provide as the batchCommand in the openlayer.json:

import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_AI_API_KEY"])
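
Because a missing key only surfaces once the script actually runs, it can help to fail fast with a clearer message. The following is a minimal sketch under that assumption; `require_env` is a hypothetical helper written for this example and is not part of the Openlayer or Google SDKs.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of `name`, or raise a descriptive error if it is unset or empty."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Add it as a secret in your workspace settings or export it locally."
        )
    return value
```

The returned value could then be passed to the configuration call, for example `genai.configure(api_key=require_env("GOOGLE_AI_API_KEY"))`.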
