Tracing shows you what happens inside your system — every step, input, output, latency, cost, and more. It is especially useful for multi-step pipelines like RAG, LLM chains, or agents. In the monitoring mode of an Openlayer project, traces are captured for each live request your AI system receives. This allows you to observe workflows step by step, measure performance, and identify bottlenecks. This guide shows how you can set up tracing with Openlayer’s SDKs to achieve a result similar to the one below.
If you use a framework that supports OpenTelemetry (OTel), you can export traces directly to Openlayer without needing to use Openlayer’s SDKs. See the OpenTelemetry integration for more details.
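For illustration, here is a minimal sketch of what an OTLP export setup could look like using the standard OpenTelemetry Python SDK. The endpoint and header values below are placeholders, not Openlayer-specific values — use the ones given in the OpenTelemetry integration guide.
Python
from opentelemetry import trace as otel_trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Placeholders: replace with the endpoint and headers from the
# OpenTelemetry integration guide
exporter = OTLPSpanExporter(
    endpoint="<OPENLAYER_OTLP_ENDPOINT>",
    headers={"Authorization": "Bearer <YOUR_OPENLAYER_API_KEY>"},
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
otel_trace.set_tracer_provider(provider)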
[Figure: Requests and traces]

How to set up tracing

You must use one of Openlayer’s SDKs to trace your system. After installing the SDK in your language of choice, follow the steps:
If you prefer, you can follow along with a notebook example. Our templates gallery also has complete sample projects that show how tracing works for development and monitoring.
1. Set environment variables

Before sending traces, tell Openlayer where to upload them by setting two environment variables:
Shell
OPENLAYER_API_KEY=YOUR_OPENLAYER_API_KEY
OPENLAYER_INFERENCE_PIPELINE_ID=YOUR_OPENLAYER_INFERENCE_PIPELINE_ID
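If you prefer to configure this from code (for example, in a notebook), a minimal sketch that sets the same variables with os.environ before the Openlayer SDK is used:
Python
import os

# Same configuration, set programmatically before the Openlayer SDK is used
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID"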
2. Instrument the code you want to trace

Decorate all the functions you want to trace with Openlayer's SDK, and wrap your LLM client so its calls are captured as part of the trace.
import openai
from openlayer.lib import trace, trace_openai

# Wrap the OpenAI client with Openlayer's `trace_openai`
openai_client = trace_openai(openai.OpenAI(api_key="sk-..."))

# Decorate all the functions you want to trace
@trace()
def main(user_query: str) -> str:
    context = retrieve_context(user_query)
    answer = generate_answer(user_query, context)
    return answer

@trace()
def retrieve_context(user_query: str) -> str:
    return "Some context"

@trace()
def generate_answer(user_query: str, context: str) -> str:
    result = openai_client.chat.completions.create(
        messages=[{"role": "user", "content": user_query + " " + context}],
        model="gpt-4o"
    )
    return result.choices[0].message.content
The traced generate_answer function in the example above uses an OpenAI LLM, but tracing also works with other LLM providers (see the sketch below). If you set up any of the streamlined approaches described in the Publishing data guide, those calls are added to the trace as well.
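For instance, here is what the same generate_answer step could look like with Anthropic. This is a minimal sketch that assumes your SDK version exposes a trace_anthropic wrapper analogous to trace_openai — check the Publishing data guide for the wrappers actually available.
Python
import anthropic
from openlayer.lib import trace, trace_anthropic  # trace_anthropic assumed available

# Wrap the Anthropic client the same way the OpenAI client is wrapped above
anthropic_client = trace_anthropic(anthropic.Anthropic(api_key="sk-ant-..."))

@trace()
def generate_answer(user_query: str, context: str) -> str:
    result = anthropic_client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=1024,
        messages=[{"role": "user", "content": user_query + " " + context}],
    )
    return result.content[0].text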
3. Use the instrumented code

All data that goes through the instrumented code is automatically streamed to the Openlayer platform, where your tests and alerts are defined. In the example above, if we call main:
main("what is the meaning of life?")
the resulting trace would look like the following:

[Figure: Trace]

The main function has two nested steps: retrieve_context and generate_answer. The generate_answer step contains a chat completion call. The cost, number of tokens, latency, and other metadata are all computed automatically behind the scenes.
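In monitoring mode, the instrumented entrypoint is typically called from your serving layer, so each live request produces one trace. As a purely hypothetical example, a FastAPI handler wrapping the main function above might look like this:
Python
from fastapi import FastAPI

app = FastAPI()

@app.get("/answer")
def answer(user_query: str) -> dict:
    # Each request that reaches this handler results in one trace in Openlayer
    return {"answer": main(user_query)}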