Openlayer integrates with Oracle Cloud Infrastructure (OCI) Generative AI to provide observability for models hosted in Oracle’s cloud environment. If you are building an AI system with Oracle OCI Generative AI models and want to evaluate it, you can use the SDKs to make Openlayer part of your workflow and add comprehensive observability to your Oracle-hosted models. This guide shows you how to set up monitoring for your Oracle OCI Generative AI models.

Evaluating Oracle OCI Generative AI Models

You can set up Openlayer tests to evaluate your Oracle OCI Generative AI models both in development and in monitoring.

Development

In development mode, Openlayer becomes a step in your CI/CD pipeline, and your tests are evaluated automatically after being triggered by certain events. Openlayer tests often rely on your AI system’s outputs on a validation dataset. As discussed in the Configuring output generation guide, you have two options:
  1. either provide a way for Openlayer to run your AI system on your datasets, or
  2. before pushing, generate the model outputs yourself and push them alongside your artifacts.
For AI systems built with Oracle OCI Generative AI models, if you are not computing your system’s outputs yourself, you must provide your OCI credentials. To do so, navigate to “Settings” > “Workspace secrets” and add the required OCI configuration secrets, such as OCI_USER_ID, OCI_FINGERPRINT, OCI_TENANCY_ID, OCI_REGION, and OCI_KEY_FILE, or configure your OCI config file appropriately.
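If you go the config file route, you can sanity-check it locally with the OCI SDK before relying on it. Below is a minimal sketch, assuming the default config file at ~/.oci/config:
Python
import oci

# Load the profile (expects user, fingerprint, tenancy, region, and
# key_file entries under the selected profile)
config = oci.config.from_file("~/.oci/config", "DEFAULT")

# Raises an exception if a required field is missing or malformed
oci.config.validate_config(config)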

Monitoring

To use the monitoring mode, you must set up a way to publish the requests your AI system receives to the Openlayer platform. This process is streamlined for Oracle OCI Generative AI models: follow the steps in the code snippet below.
Python
# Install required packages (uncomment if needed)
# !pip install oci openlayer

# Set up Openlayer environment variables
import os
import oci
from oci.generative_ai_inference import GenerativeAiInferenceClient
from oci.generative_ai_inference.models import (
    ChatDetails,
    GenericChatRequest,
    Message,
    OnDemandServingMode,
    TextContent,
)

# Configure Openlayer API credentials
os.environ["OPENLAYER_API_KEY"] = "your-openlayer-api-key-here"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "your-inference-pipeline-id-here"

# Import the Openlayer tracer
from openlayer.lib.integrations import trace_oci_genai

# Configure your OCI settings
COMPARTMENT_ID = "your-compartment-ocid-here"  # Replace with your compartment OCID
ENDPOINT = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"  # Replace with your region's endpoint

# Load OCI configuration
config = oci.config.from_file()  # Uses default config file location
# Alternatively, you can specify a custom config file:
# config = oci.config.from_file("~/.oci/config", "DEFAULT")

# Create the OCI Generative AI client
client = GenerativeAiInferenceClient(config=config, service_endpoint=ENDPOINT)

# Apply Openlayer tracing to the OCI client
# With token estimation enabled (default)
traced_client = trace_oci_genai(client, estimate_tokens=True)

# Alternative: Disable token estimation to get None values when tokens are not available
# traced_client = trace_oci_genai(client, estimate_tokens=False)

# Create a chat request (GenericChatRequest uses the GENERIC API format,
# e.g. Meta Llama models; Cohere models use CohereChatRequest instead)
chat_request = GenericChatRequest(
    messages=[
        Message(
            role="USER",
            content=[TextContent(text="Hello! Can you explain what Oracle Cloud Infrastructure is?")],
        )
    ],
    max_tokens=200,
    temperature=0.7,
    is_stream=False,  # Non-streaming
)

chat_details = ChatDetails(
    compartment_id=COMPARTMENT_ID,
    serving_mode=OnDemandServingMode(model_id="meta.llama-3.1-70b-instruct"),  # Replace with a GENERIC-format model available in your region
    chat_request=chat_request,
)

# Make the request - the tracer will automatically capture it
response = traced_client.chat(chat_details)
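
# Inspect the model's reply. This assumes the GENERIC response shape
# (chat_response -> choices -> message -> content); other API formats differ.
print(response.data.chat_response.choices[0].message.content[0].text)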

See full Python example

Once the code is set up, all your Oracle OCI Generative AI calls are automatically published to Openlayer, along with metadata such as latency, number of tokens, cost estimates, and more.
Token Estimation: Some Oracle OCI Generative AI models do not include usage details, such as total tokens processed, in their responses. When this happens, Openlayer can estimate token counts using a rule of thumb (string length divided by 3/4). The trace_oci_genai() function accepts an optional estimate_tokens parameter:
  • estimate_tokens=True (default): Estimates token counts when they are not provided by the OCI response
  • estimate_tokens=False: Returns None for token fields when they are not available in the response
This ensures you always have token metrics for cost tracking and performance monitoring, even when the underlying model doesn’t provide them directly.
If you navigate to the “Requests” page of your Openlayer inference pipeline, you can see the traces for each request.
If the Oracle OCI Generative AI call is just one of the steps of your AI system, you can use the code snippets above together with tracing. In that case, your Oracle OCI calls are added as steps of a larger trace. Refer to the Tracing guide for details.
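For example, here is a minimal sketch of a traced step, assuming the @trace() decorator from openlayer.lib and the traced_client, COMPARTMENT_ID, and model imports from the snippet above:
Python
from openlayer.lib import trace

@trace()  # records this function as a step in the trace
def answer_question(question: str) -> str:
    # The OCI chat call below is captured as a nested step of this trace
    chat_request = GenericChatRequest(
        messages=[Message(role="USER", content=[TextContent(text=question)])],
        max_tokens=200,
        is_stream=False,
    )
    chat_details = ChatDetails(
        compartment_id=COMPARTMENT_ID,
        serving_mode=OnDemandServingMode(model_id="meta.llama-3.1-70b-instruct"),
        chat_request=chat_request,
    )
    response = traced_client.chat(chat_details)
    return response.data.chat_response.choices[0].message.content[0].text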
After your AI system’s requests are continuously published and logged by Openlayer, you can create tests that run at a regular cadence on top of them. Refer to the Monitoring overview for details on Openlayer’s monitoring mode, the Publishing data guide for more information on setting it up, or the Tracing guide to understand how to trace more complex systems.

Benefits of Oracle OCI Integration

By integrating Openlayer with Oracle OCI Generative AI, you get:
  • Comprehensive Observability: Monitor your Oracle-hosted models with detailed metrics and traces
  • Cost Tracking: Track usage and costs across your Oracle OCI Generative AI deployments
  • Performance Monitoring: Monitor latency, token usage, and model performance
  • Quality Assurance: Run automated tests to ensure your models maintain quality standards
  • Easy Setup: Simple integration with just a few lines of code

Supported Oracle OCI Models

The integration supports all Oracle OCI Generative AI models, including:
  • Cohere Command models (command-r-plus, command-r, etc.)
  • Meta Llama models
  • Other models available through Oracle OCI Generative AI service
Make sure your OCI configuration is properly set up with the necessary permissions to access the Generative AI service in your compartment.
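For example, an OCI IAM policy statement along the lines of allow group <your-group> to use generative-ai-family in compartment <your-compartment> grants the required access; adjust the group and compartment names to your tenancy.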