You can update data previously streamed to the Openlayer platform. This is typically needed when:
  • The ground truths for the streamed data were not available at model inference time but became available later.
  • You want to add human feedback associated with a request, and that feedback was not available at model inference time.
This guide shows how to use Openlayer SDKs to update previously published data.

How to update data

Every row streamed to Openlayer has an inference_id, a unique identifier for that row. You can provide the inference_id at stream time; if you don't, Openlayer assigns unique IDs to your rows automatically.
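For instance, one simple way to guarantee stable IDs you can reference later is to generate them yourself before streaming. The sketch below uses UUIDs; the row contents are hypothetical and the naming scheme is just one option:

```python
import uuid

# Generate a stable, unique inference ID for each row before streaming,
# so the same rows can be referenced later when updating them.
rows = [{"input": "Will I churn?"}, {"input": "Cancel my plan"}]
for row in rows:
    row["inference_id"] = str(uuid.uuid4())

# Keep the IDs on your side (e.g., in your database) for future updates.
ids = [row["inference_id"] for row in rows]
```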
Enhanced tracing with metadata: when using the @trace decorator, you can dynamically add metadata and set custom inference IDs:
  • Custom inference IDs: use update_current_trace(inferenceId="your_id") for request correlation and future data updates.
  • Trace metadata: add context with update_current_trace(user_id="123", session="abc").
  • Step metadata: enrich individual steps with update_current_step(model="gpt-4", tokens=150).
Custom inference IDs enable you to easily add user feedback, ground truth labels, and other signals after the initial request. See the Tracing guide for comprehensive examples.
You must use the inference_id to specify the rows you want to update. Let’s say that you want to add a column called label with ground truths. If you have your data in a pandas DataFrame similar to:
Python
>>> df
            inference_id  label
0             d56d2b2c      0
1             3b0b2521      1
2             8c294a3a      0
First, you need to retrieve the inference pipeline object with:
Python
import openlayer

client = openlayer.OpenlayerClient("YOUR_API_KEY_HERE")

project = client.load_project(name="Churn prediction")

inference_pipeline = project.load_inference_pipeline(
    name="production",
)
Then, you can update the data specified by the inference IDs with:
Python
inference_pipeline.update_data(
    df=df,
    inference_id_column_name='inference_id',
    ground_truth_column_name='label',
)

Using Custom Inference IDs for User Feedback and Signals

When you use custom inference IDs with the @trace decorator, you can easily collect and update data with user feedback, ratings, and other signals after the initial request. This creates powerful feedback loops for improving your AI system.

Complete Workflow Example

Here’s an example showing how to set up custom inference IDs for future updates:
Python
import time
from openlayer.lib import trace, update_current_trace

@trace()
def process_chat_message(user_id: str, message: str, conversation_id: str):
    # Create a meaningful custom inference ID
    inference_id = f"chat_{conversation_id}_{user_id}_{int(time.time())}"

    update_current_trace(
        inferenceId=inference_id,
        user_id=user_id,
        conversation_id=conversation_id,
        message_type="user_query"
    )

    # Your AI processing logic here
    response = generate_ai_response(message)

    # Store the inference ID for later feedback collection
    store_for_feedback(inference_id, user_id, conversation_id, message, response)

    return response, inference_id

def store_for_feedback(inference_id, user_id, conversation_id, message, response):
    # Save to your database for future feedback collection
    feedback_db.insert({
        "inference_id": inference_id,
        "user_id": user_id,
        "conversation_id": conversation_id,
        "user_message": message,
        "ai_response": response,
        "created_at": time.time(),
        "feedback_collected": False,
    })
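Once feedback comes in, you can attach it to the stored record and collect the finished rows for an update. A sketch with an in-memory list standing in for the feedback database above (records, record_user_rating, and the rating values are all hypothetical):

```python
import time
import pandas as pd

# In-memory stand-in for the feedback database in the example above
records = [
    {"inference_id": "chat_conv123_user456_1725123456",
     "feedback_collected": False},
]

def record_user_rating(inference_id, rating):
    """Attach a user rating (e.g., thumbs up/down) to a stored request."""
    for rec in records:
        if rec["inference_id"] == inference_id:
            rec["user_rating"] = rating
            rec["feedback_collected"] = True
            rec["feedback_at"] = time.time()

record_user_rating("chat_conv123_user456_1725123456", "thumbs_up")

# Rows with collected feedback are ready to be sent to Openlayer
# via inference_pipeline.update_data(...)
df_feedback = pd.DataFrame(
    [r for r in records if r["feedback_collected"]]
)
```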

Advanced Use Cases

You can also update data with more sophisticated signals:
Python
import pandas as pd

# Example: Adding business metrics and conversion signals
business_signals = [
    {
        "inference_id": "chat_conv123_user456_1725123456",
        "user_converted": True,
        "conversion_value": 99.99,
        "time_to_conversion": 1800,  # 30 minutes
        "conversion_type": "subscription"
    },
    {
        "inference_id": "chat_conv124_user457_1725123789",
        "user_converted": False,
        "session_duration": 120,  # 2 minutes
        "pages_viewed": 3
    }
]

df_signals = pd.DataFrame(business_signals)

client.inference_pipelines.data.update(
    id="YOUR_INFERENCE_PIPELINE_ID",
    df=df_signals,
    inference_id_column_name="inference_id"
)

Benefits of This Approach

  • Seamless Integration: Custom inference IDs created during tracing are automatically available for updates
  • Rich Feedback Loops: Collect user ratings, business signals, and ground truth labels
  • Better Model Evaluation: Use real user feedback to assess model performance
  • Continuous Improvement: Identify patterns in user satisfaction to improve your AI system
  • A/B Testing: Track different model versions and their user satisfaction rates
Schedule regular syncing of feedback data to Openlayer (e.g., hourly or daily) to keep your monitoring dashboard up-to-date with real user sentiment and business metrics.
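Such a sync can be sketched as a small batch job that pushes any not-yet-synced feedback rows and marks them as done. The helper name sync_pending_feedback and the shape of feedback_rows are assumptions for illustration; the update_data call mirrors the one shown earlier:

```python
import pandas as pd

def sync_pending_feedback(feedback_rows, inference_pipeline):
    """Push feedback rows not yet synced to Openlayer.

    feedback_rows: list of dicts, each with at least an "inference_id" key
    inference_pipeline: an Openlayer inference pipeline object
    """
    pending = [r for r in feedback_rows if not r.get("synced")]
    if not pending:
        return 0
    # Drop the bookkeeping column before sending the update
    df = pd.DataFrame(pending).drop(columns=["synced"], errors="ignore")
    inference_pipeline.update_data(
        df=df,
        inference_id_column_name="inference_id",
    )
    for r in pending:
        r["synced"] = True
    return len(pending)
```

Run this on a schedule (e.g., an hourly cron job) so new ratings and business signals reach your Openlayer dashboard shortly after users provide them.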