Observability
Monitor and track performance of your OpenResponses deployment
OpenResponses delivers enterprise-level observability out of the box, with zero configuration required to start collecting telemetry data across multiple dimensions.
Production-Grade Observability
- Launch and Monitor: Start tracking critical GenAI metrics the moment your service goes live
- No Instrumentation Burden: Skip weeks of custom instrumentation work with pre-built telemetry
- Immediate Insights: Gain instant visibility into model performance, token usage, and system health
- Scale with Confidence: Production-ready monitoring that grows with your deployment
Overview
OpenResponses uses OpenTelemetry standards to instrument:
- Model API calls across all providers
- Built-in tool executions
- Message content (user, system, assistant, and choices)
- Token usage metrics
- Performance metrics for various operations
Key Components
The primary component responsible for telemetry is the TelemetryService.
This service provides methods to:
- Create and manage observations (spans)
- Record metrics
- Emit GenAI-specific events
- Track token usage
Metrics
You can view the collected metrics via the Spring Boot Actuator endpoint: http://localhost:8080/actuator/metrics
The system produces the following key metrics:
| Metric Name | Description |
| --- | --- |
| `builtin.tool.execute` | Measures tool execution performance |
| `gen_ai.client.operation.duration` | Tracks the duration of model API calls |
| `gen_ai.client.token.usage` | Counts input and output token usage |
| `open.responses.create` | Measures non-streaming response generation time |
| `open.responses.createStream` | Measures streaming response generation time |
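As an illustration, a metric such as `gen_ai.client.token.usage` can be read from `/actuator/metrics/gen_ai.client.token.usage` and parsed programmatically. The sketch below assumes the standard Spring Boot Actuator JSON shape and uses a hard-coded sample payload rather than a live call; the tag values shown follow the OpenTelemetry GenAI conventions but are illustrative numbers:

```python
import json

# Sample payload in the standard Spring Boot Actuator shape; a live
# deployment would return something like this from
# GET /actuator/metrics/gen_ai.client.token.usage
sample_response = """
{
  "name": "gen_ai.client.token.usage",
  "measurements": [
    {"statistic": "COUNT", "value": 12345.0}
  ],
  "availableTags": [
    {"tag": "gen_ai.token.type", "values": ["input", "output"]}
  ]
}
"""

metric = json.loads(sample_response)

# Pull the COUNT measurement out of the measurements array
count = next(m["value"] for m in metric["measurements"]
             if m["statistic"] == "COUNT")

print(f"{metric['name']}: {count:.0f} tokens")
```

In a running deployment you would fetch this JSON with any HTTP client (for example `curl http://localhost:8080/actuator/metrics/gen_ai.client.token.usage`) and can narrow a metric to a single tag value with Actuator's `?tag=KEY:VALUE` query parameter.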
Traces
The system creates traces for:
- HTTP POST requests to `/v1/responses`
- Model response generation via `open.responses.create`
- Built-in tool execution via `builtin.tool.execute`
- Streaming response generation via `open.responses.createStream`
How to Export Telemetry Data
OpenResponses is designed to work with the OpenTelemetry ecosystem for exporting telemetry data to various backends.
Setting Up the OpenTelemetry Collector
- To enable the OpenTelemetry collector integration, start the service with collector export enabled.
- The collector receives telemetry from the service over OTLP (OpenTelemetry Protocol) via gRPC or HTTP.
- The collector itself is configured through its own config file, typically located in the deployment environment.
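A minimal collector configuration along these lines would accept OTLP from the service and fan the data out to backends. The pipeline layout, ports, and exporter endpoints below are illustrative assumptions, not values shipped with OpenResponses:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # OTLP over gRPC (default port)
      http:
        endpoint: 0.0.0.0:4318   # OTLP over HTTP (default port)

exporters:
  # Expose metrics for a Prometheus server to scrape (port is an example)
  prometheus:
    endpoint: 0.0.0.0:8889
  # Forward traces to a Jaeger-compatible OTLP endpoint (address is an example)
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]
```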
Exporting to Monitoring Tools
The collected data can be shipped to various monitoring tools:
- For Metrics: Prometheus and Grafana
- For Traces: Jaeger, Zipkin, or other tracing backends
- For Logs: Various log aggregation systems
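For the metrics path, a Prometheus scrape job pointed at the collector might look like the following; the job name and the collector's Prometheus-exporter port (8889) are assumptions for illustration:

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "openresponses-otel-collector"
    scrape_interval: 15s
    static_configs:
      - targets: ["otel-collector:8889"]
```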
Observability in Action
Below are some examples of the observability insights available in OpenResponses.
Distributed Tracing with Conversation Logs
Distributed tracing of a Brave search agent with streaming, including the complete conversation logs.
GenAI Performance Metrics
A dashboard displaying token usage and model performance metrics.
System Health Monitoring
The overall health and performance of the OpenResponses service.
Standard Metrics
In addition to GenAI-specific observability, OpenResponses emits standard Spring Boot metrics, including:
- JVM statistics (memory usage, garbage collection)
- HTTP request metrics (response times, error rates)
- System metrics (CPU usage, disk I/O)
- Logging metrics
- Thread pool statistics
These metrics provide a holistic view of the application’s performance beyond just the AI model interactions.
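The full index of available metric names, GenAI and standard alike, is exposed at `/actuator/metrics`, which returns a JSON object with a `names` array. The sketch below groups a sample of that index by prefix; the sample names are illustrative, though the `names` array shape is standard Actuator behavior:

```python
import json
from collections import defaultdict

# Sample of the index returned by GET /actuator/metrics
# (the "names" array is the standard Spring Boot Actuator shape)
sample_index = """
{
  "names": [
    "jvm.memory.used",
    "jvm.gc.pause",
    "http.server.requests",
    "system.cpu.usage",
    "gen_ai.client.token.usage",
    "gen_ai.client.operation.duration"
  ]
}
"""

# Group metric names by their first dotted segment
groups = defaultdict(list)
for name in json.loads(sample_index)["names"]:
    groups[name.split(".")[0]].append(name)

for prefix, names in sorted(groups.items()):
    print(f"{prefix}: {names}")
```

Each individual name can then be queried at `/actuator/metrics/{name}` for its measurements and tags.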
OpenTelemetry Compatibility
The built-in observability system in OpenResponses is highly flexible and compatible with any OpenTelemetry-compliant tool. This allows you to:
- Use SigNoz, Jaeger, or Dynatrace for distributed tracing
- Implement Prometheus and Grafana for metrics and dashboards
- Integrate with OpenTelemetry-compatible GenAI evaluation stacks like LangFuse
This flexibility ensures that OpenResponses can fit into your existing observability infrastructure without requiring proprietary monitoring solutions.
OpenTelemetry Compliance
The implementation follows OpenTelemetry specifications for:
- Spans: GenAI Agent Spans
- Metrics: GenAI Metrics
- Events: GenAI Events
This ensures compatibility with standard observability tools and dashboards that support OpenTelemetry.