Openlayer integrates with Oracle Cloud Infrastructure (OCI) Generative AI to provide observability for models hosted in Oracle’s cloud environment. If you are building an AI system with Oracle OCI Generative AI models and want to evaluate it, you can use the SDKs to make Openlayer part of your workflow and add comprehensive observability to your Oracle-hosted models. This guide shows you how to set up monitoring for your Oracle OCI Generative AI models.

Evaluating Oracle OCI Generative AI Models

You can set up Openlayer tests to evaluate your Oracle OCI Generative AI models both in development and in monitoring.

Development

In development mode, Openlayer becomes a step in your CI/CD pipeline, and your tests are evaluated automatically after being triggered by certain events. Openlayer tests often rely on your AI system’s outputs on a validation dataset. As discussed in the Configuring output generation guide, you have two options:
  1. either provide a way for Openlayer to run your AI system on your datasets, or
  2. before pushing, generate the model outputs yourself and push them alongside your artifacts.
For AI systems built with Oracle OCI Generative AI models, if you are not computing your system’s outputs yourself, you must provide your OCI credentials. To do so, navigate to “Settings” > “Workspace secrets” and add the required OCI configuration secrets, such as OCI_USER_ID, OCI_FINGERPRINT, OCI_TENANCY_ID, OCI_REGION, and OCI_KEY_FILE, or configure your OCI config file appropriately.
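If you go the config file route, you can sanity-check it locally with the OCI SDK before relying on it. Below is a minimal sketch, assuming the default config file at ~/.oci/config:
Python
import oci

# Load the profile (expects user, fingerprint, tenancy, region, and
# key_file entries under the selected profile)
config = oci.config.from_file("~/.oci/config", "DEFAULT")

# Raises an exception if a required field is missing or malformed
oci.config.validate_config(config)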

Monitoring

To use the monitoring mode, you must set up a way to publish the requests your AI system receives to the Openlayer platform. This process is streamlined for Oracle OCI Generative AI models: follow the steps in the code snippet below.
Python
# Install required packages (uncomment if needed)
# !pip install oci openlayer

# Set up Openlayer environment variables
import os
import oci
from oci.generative_ai_inference import GenerativeAiInferenceClient
from oci.generative_ai_inference.models import (
    ChatDetails,
    GenericChatRequest,
    Message,
    OnDemandServingMode,
    TextContent,
)

# Configure Openlayer API credentials
os.environ["OPENLAYER_API_KEY"] = "your-openlayer-api-key-here"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "your-inference-pipeline-id-here"

# Import the Openlayer tracer
from openlayer.lib.integrations import trace_oci_genai

# Configure your OCI settings
COMPARTMENT_ID = "your-compartment-ocid-here"  # Replace with your compartment OCID
ENDPOINT = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"  # Replace with your region's endpoint

# Load OCI configuration
config = oci.config.from_file()  # Uses default config file location
# Alternatively, you can specify a custom config file:
# config = oci.config.from_file("~/.oci/config", "DEFAULT")

# Create the OCI Generative AI client
client = GenerativeAiInferenceClient(config=config, service_endpoint=ENDPOINT)

# Apply Openlayer tracing to the OCI client
# With token estimation enabled (default)
traced_client = trace_oci_genai(client, estimate_tokens=True)

# Alternative: Disable token estimation to get None values when tokens are not available
# traced_client = trace_oci_genai(client, estimate_tokens=False)

# Create a chat request (GenericChatRequest uses the GENERIC API format,
# e.g. Meta Llama models; Cohere models use CohereChatRequest instead)
chat_request = GenericChatRequest(
    messages=[
        Message(
            role="USER",
            content=[TextContent(text="Hello! Can you explain what Oracle Cloud Infrastructure is?")],
        )
    ],
    max_tokens=200,
    temperature=0.7,
    is_stream=False,  # Non-streaming
)

chat_details = ChatDetails(
    compartment_id=COMPARTMENT_ID,
    serving_mode=OnDemandServingMode(model_id="meta.llama-3.1-70b-instruct"),  # Replace with a GENERIC-format model available in your region
    chat_request=chat_request,
)

# Make the request - the tracer will automatically capture it
response = traced_client.chat(chat_details)
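
# Inspect the model's reply. This assumes the GENERIC response shape
# (chat_response -> choices -> message -> content); other API formats differ.
print(response.data.chat_response.choices[0].message.content[0].text)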

See full Python example

Once the code is set up, all your Oracle OCI Generative AI calls are automatically published to Openlayer, along with metadata such as latency, number of tokens, cost estimates, and more.
Token Estimation: Some Oracle OCI Generative AI models do not include usage details, such as total tokens processed, in their responses. When this happens, Openlayer can estimate token counts using a rule of thumb (string length divided by 3/4). The trace_oci_genai() function accepts an optional estimate_tokens parameter:
  • estimate_tokens=True (default): Estimates token counts when they are not provided by the OCI response
  • estimate_tokens=False: Returns None for token fields when they are not available in the response
This ensures you always have token metrics for cost tracking and performance monitoring, even when the underlying model doesn’t provide them directly.
If you navigate to the “Requests” page of your Openlayer inference pipeline, you can see the traces for each request.
If the Oracle OCI Generative AI call is just one of the steps of your AI system, you can use the code snippets above together with tracing. In that case, your Oracle OCI calls are added as steps of a larger trace. Refer to the Tracing guide for details.
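For example, here is a minimal sketch of a traced step, assuming the @trace() decorator from openlayer.lib and the traced_client, COMPARTMENT_ID, and model imports from the snippet above:
Python
from openlayer.lib import trace

@trace()  # records this function as a step in the trace
def answer_question(question: str) -> str:
    # The OCI chat call below is captured as a nested step of this trace
    chat_request = GenericChatRequest(
        messages=[Message(role="USER", content=[TextContent(text=question)])],
        max_tokens=200,
        is_stream=False,
    )
    chat_details = ChatDetails(
        compartment_id=COMPARTMENT_ID,
        serving_mode=OnDemandServingMode(model_id="meta.llama-3.1-70b-instruct"),
        chat_request=chat_request,
    )
    response = traced_client.chat(chat_details)
    return response.data.chat_response.choices[0].message.content[0].text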
After your AI system’s requests are continuously published and logged by Openlayer, you can create tests that run at a regular cadence on top of them. Refer to the Monitoring overview for details on Openlayer’s monitoring mode, the Publishing data guide for more information on setting it up, or the Tracing guide to understand how to trace more complex systems.

Benefits of Oracle OCI Integration

By integrating Openlayer with Oracle OCI Generative AI, you get:
  • Comprehensive Observability: Monitor your Oracle-hosted models with detailed metrics and traces
  • Cost Tracking: Track usage and costs across your Oracle OCI Generative AI deployments
  • Performance Monitoring: Monitor latency, token usage, and model performance
  • Quality Assurance: Run automated tests to ensure your models maintain quality standards
  • Easy Setup: Simple integration with just a few lines of code

Supported Oracle OCI Models

The integration supports all Oracle OCI Generative AI models, including:
  • Cohere Command models (command-r-plus, command-r, etc.)
  • Meta Llama models
  • Other models available through Oracle OCI Generative AI service
Make sure your OCI configuration is properly set up with the necessary permissions to access the Generative AI service in your compartment.
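For example, an OCI IAM policy statement along the lines of allow group <your-group> to use generative-ai-family in compartment <your-compartment> grants the required access; adjust the group and compartment names to your tenancy.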