Publishing data
To use Openlayer’s monitoring mode, you must set up a way to publish the requests your AI system receives to the Openlayer platform.
This guide shows how to use Openlayer’s SDKs to publish data to the Openlayer platform. Alternatively, you can use Openlayer’s REST API, as discussed in the Monitoring overview.
Data publishing methods
The data publishing methods are categorized as streamlined approaches and the manual approach.
The streamlined approaches exist for common AI patterns and frameworks. To use them, you wrap or decorate your code in a prescribed way, and Openlayer automatically captures relevant data and metadata, such as the number of tokens, cost, and latency. This data is then published to the Openlayer platform.
The manual approach is system-agnostic. It is equivalent to calling the relevant endpoint of Openlayer’s REST API, but through Openlayer’s SDKs.
Streamlined approaches
There is a streamlined approach for each of the frameworks below:
To monitor chat completions and completion calls to OpenAI LLMs, you need to:
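The steps mirror the Azure OpenAI example below. A minimal sketch, assuming the same `trace_openai` wrapper and an illustrative model name, is:

# 1. Set the environment variables
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"

# 2. Import the `trace_openai` function and wrap the OpenAI client
import openai
from openlayer.lib import trace_openai

openai_client = trace_openai(openai.OpenAI())

# 3. From now on, every chat completion/completion call with
# the `openai_client` is traced by Openlayer. E.g.,
completion = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative model name
    messages=[
        {"role": "user", "content": "How are you doing today?"},
    ],
)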
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.
Refer to the OpenAI integration page for more details.
To monitor chat completions and completion calls to Azure OpenAI LLMs, you need to:
# 1. Set the environment variables
import os
os.environ["AZURE_OPENAI_ENDPOINT"] = "YOUR_AZURE_OPENAI_ENDPOINT_HERE"
os.environ["AZURE_OPENAI_API_KEY"] = "YOUR_AZURE_OPENAI_API_KEY_HERE"
os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"] = "YOUR_AZURE_OPENAI_DEPLOYMENT_NAME_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"
# 2. Import the `trace_openai` function
from openai import AzureOpenAI
from openlayer.lib import trace_openai
azure_client = trace_openai(
    AzureOpenAI(
        api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
        api_version="2024-02-01",
        azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    )
)
# 3. From now on, every chat completion/completion call with
# the `azure_client` is traced by Openlayer. E.g.,
completion = azure_client.chat.completions.create(
    model=os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME"),
    messages=[
        {"role": "user", "content": "How are you doing today?"},
    ]
)
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, and more.
Refer to the Azure OpenAI integration page for more details.
See full Python example
To monitor chat models and chains built with LangChain, you need to:
# 1. Set the environment variables
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"
# 2. Instantiate the `OpenlayerHandler`
from openlayer.lib.integrations import langchain_callback
openlayer_handler = langchain_callback.OpenlayerHandler()
# 3. Pass the handler to your LLM/chain invocations
from langchain_openai import ChatOpenAI
chat = ChatOpenAI(max_tokens=25, callbacks=[openlayer_handler])
chat.invoke("What's the meaning of life?")
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.
The code snippet above uses LangChain’s ChatOpenAI. However, the Openlayer Callback Handler works for all LangChain chat models and LLMs.
Refer to the LangChain integration page for more details.
See full Python example
To trace a multi-step LLM system (such as a RAG system or an LLM chain), decorate every function you want included in the trace with Openlayer’s `@trace()` decorator. For example:
import os
from openlayer.lib import trace
# Set the environment variables
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"
# Decorate all the functions you want to trace
@trace()
def main(user_query: str) -> str:
    context = retrieve_context(user_query)
    answer = generate_answer(user_query, context)
    return answer

@trace()
def retrieve_context(user_query: str) -> str:
    return "Some context"

@trace()
def generate_answer(user_query: str, context: str) -> str:
    return "Some answer"

# Every time the main function is called, the data is automatically
# streamed to your Openlayer project. E.g.:
main("What is the meaning of life?")
You can use the decorator together with the other streamlined methods. For example, if your generate_answer function uses a wrapped version of the OpenAI client, the chat completion calls will get added to the trace under the generate_answer function step.
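For instance, a minimal sketch of that composition (assuming the `trace_openai` wrapper from the OpenAI example and an illustrative model name) could look like:

import openai
from openlayer.lib import trace, trace_openai

openai_client = trace_openai(openai.OpenAI())

@trace()
def generate_answer(user_query: str, context: str) -> str:
    # This chat completion call is nested under the `generate_answer`
    # step of the trace.
    completion = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=[
            {"role": "system", "content": f"Answer using this context: {context}"},
            {"role": "user", "content": user_query},
        ],
    )
    return completion.choices[0].message.content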
See full Python example
To monitor Anthropic LLMs, you need to:
# 1. Set the environment variables
import anthropic
import os
os.environ["ANTHROPIC_API_KEY"] = "YOUR_ANTHROPIC_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"
# 2. Import the `trace_anthropic` function
from openlayer.lib import trace_anthropic
anthropic_client = trace_anthropic(anthropic.Anthropic())
# 3. From now on, every message creation call with
# the `anthropic_client` is traced by Openlayer. E.g.,
completion = anthropic_client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "How are you doing today?"}
    ],
)
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.
Refer to the Anthropic integration page for more details.
See full Python example
To monitor Mistral AI LLMs, you need to:
# 1. Set the environment variables
import os
os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"
# 2. Import the `trace_mistral` function and wrap the Mistral client
from mistralai import Mistral
from openlayer.lib import trace_mistral
mistral_client = trace_mistral(Mistral(api_key=os.environ["MISTRAL_API_KEY"]))
# 3. From now on, every chat completion or streaming call with
# the `mistral_client` is traced by Openlayer. E.g.,
completion = mistral_client.chat.complete(
    model="mistral-large-latest",
    messages=[
        {"role": "user", "content": "What is the best French cheese?"},
    ]
)
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.
Refer to the Mistral AI integration page for more details.
See full Python example
To monitor Groq LLMs, you need to:
# 1. Set the environment variables
import os
os.environ["GROQ_API_KEY"] = "YOUR_GROQ_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"
# 2. Import the `trace_groq` function and wrap the Groq client
import groq
from openlayer.lib import trace_groq
groq_client = trace_groq(groq.Groq())
# 3. From now on, every chat completion call with
# the `groq_client` is traced by Openlayer. E.g.,
completion = groq_client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],
    model="llama3-8b-8192",
)
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.
Refer to the Groq integration page for more details.
See full Python example
To monitor runs from OpenAI Assistants, you need to:
- Set the environment variables
- Instantiate the OpenAI client
- Create an assistant and a thread, and run it (see the combined sketch below)
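A minimal sketch of these steps is shown below. It assumes the `trace_openai_assistant_thread_run` helper exposed by openlayer.lib and an illustrative model name; check the OpenAI Assistants integration page for the exact helper and usage.

# NOTE: illustrative sketch; `trace_openai_assistant_thread_run` is assumed here
# and should be verified against the OpenAI Assistants integration page.
import os
import time
import openai
from openlayer.lib import trace_openai_assistant_thread_run

# 1. Set the environment variables
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"

# 2. Instantiate the OpenAI client
client = openai.OpenAI()

# 3. Create an assistant, a thread, and run it
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor.",
    model="gpt-4o",  # illustrative model name
)
thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "Solve the equation `3x + 11 = 14`."}]
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

# 4. Poll the run until it finishes, then publish it to Openlayer
while run.status in ("queued", "in_progress"):
    time.sleep(5)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
trace_openai_assistant_thread_run(client, run)  # assumed Openlayer helper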
That’s it! Now, your calls are being published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more.
Manual approach
To manually stream data to Openlayer, you can use the `stream` method, which hits the `/data-stream` endpoint of the Openlayer REST API.
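For example, a minimal sketch using the Python SDK could look like the snippet below; the row keys and the config fields that map them to Openlayer’s expected columns are illustrative and depend on your data schema.

import os
from openlayer import Openlayer

os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"

client = Openlayer()  # reads OPENLAYER_API_KEY from the environment

client.inference_pipelines.data.stream(
    inference_pipeline_id="YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE",
    rows=[
        {
            "user_query": "What is the meaning of life?",
            "output": "42",
            "tokens": 7,
            "cost": 0.00002,
            "timestamp": 1620000000,
        }
    ],
    config={
        "input_variable_names": ["user_query"],
        "output_column_name": "output",
        "num_of_token_column_name": "tokens",
        "cost_column_name": "cost",
        "timestamp_column_name": "timestamp",
    },
)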