This guide will help you get up and running with the OpenResponses API quickly. We’ll cover several deployment options and provide example API calls to get you started.
OpenResponses comes with built-in tools that enhance the capabilities of your AI applications. The pre-packaged mcp-servers-config.json file includes configurations for Brave Web Search and GitHub tools.
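The exact contents of mcp-servers-config.json depend on the release you pull, but MCP server configurations generally follow the shape below. The package names and environment variable names here are illustrative (taken from the reference Brave Search and GitHub MCP servers), not guaranteed to match the shipped file:

```json
{
  "mcpServers": {
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"],
      "env": { "BRAVE_API_KEY": "YOUR_BRAVE_API_KEY" }
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN" }
    }
  }
}
```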
For example, you can call the built-in think tool:

```shell
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ANTHROPIC_API_KEY' \
--header 'x-model-provider: claude' \
--data '{
  "model": "claude-3-7-sonnet-20250219",
  "stream": false,
  "tools": [
    {"type": "think"}
  ],
  "input": [
    {
      "role": "system",
      "content": "You are an experienced system design architect. Use the think tool to cross-confirm thoughts before preparing the final answer."
    },
    {
      "role": "user",
      "content": "Give me the guidelines on designing a multi-agent distributed system with the following constraints in mind: 1. compute costs minimal, 2. the system should be horizontally scalable, 3. the behavior should be deterministic."
    }
  ]
}'
```
Before running any example, ensure you are in the openai-agents-python folder.
Run the service with MCP tools:
```shell
docker-compose --profile mcp up open-responses-mcp
```
On Windows:

```shell
docker-compose --profile mcp up open-responses-mcp-windows
```
If you require SDK traces, ensure that the OPENAI_API_KEY environment variable is set. Otherwise, you may see the warning “OPENAI_API_KEY is not set, skipping trace export”.
To disable tracing explicitly, add the statement:

```python
from agents import set_tracing_disabled

set_tracing_disabled(disabled=True)
```
Set up your environment variables:
- GROQ_API_KEY
- OPEN_RESPONSES_URL
- CLAUDE_API_KEY (for specific examples)
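For example, in a POSIX shell the variables can be exported like this. The values below are placeholders, and the URL assumes the local deployment used throughout this guide:

```shell
# Placeholder values; substitute your real keys.
export GROQ_API_KEY="gsk-your-groq-key"
export OPEN_RESPONSES_URL="http://localhost:8080/v1"
export CLAUDE_API_KEY="sk-ant-your-claude-key"
```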
Run the example scripts:
```shell
# Run agent_hands_off.py
python -m examples.open_responses.agent_hands_off
```

```shell
# Run brave_search_agent_with_groq.py
python -m examples.open_responses.brave_search_agent_with_groq
```

```shell
# Run brave_search_agent_with_groq_stream (requires CLAUDE_API_KEY)
python -m examples.open_responses.brave_search_agent_with_groq_stream
```

```shell
# Run think_tool_agent_with_claude.py (requires CLAUDE_API_KEY)
python -m examples.open_responses.think_tool_agent_with_claude
```
When you make a request to store a conversation, you’ll receive a response that includes a unique ID that can be used to continue the conversation.
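As a rough sketch, the stored response looks something like the following. The field names here follow the OpenAI-style Responses API shape that OpenResponses mirrors; your deployment’s exact payload may differ:

```json
{
  "id": "resp_abc123",
  "object": "response",
  "status": "completed",
  "model": "gpt-4o",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{ "type": "output_text", "text": "Hello! How can I help?" }]
    }
  ]
}
```

In that API shape, the `id` value is what you pass as `previous_response_id` on a subsequent request to continue the conversation.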
OpenResponses can be run with a comprehensive observability stack that includes OpenTelemetry, Jaeger, Prometheus, and Grafana. This setup provides detailed monitoring, tracing, and metrics visualization capabilities.
For detailed instructions on setting up and running OpenResponses with the observability stack, please refer to our Running with Observability Stack guide.
RAG (Retrieval-Augmented Generation) enables you to enhance AI responses with information retrieved from your documents. OpenResponses provides a complete RAG solution with vector search capabilities powered by Qdrant.
You can directly search for content in your vector store:
```shell
curl --location 'http://localhost:8080/v1/vector_stores/YOUR_VECTOR_STORE_ID/search' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "query": "What does the document say about configuration?",
  "max_num_results": 5,
  "filters": {
    "type": "eq",
    "key": "language",
    "value": "en"
  }
}'
```
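If you prefer to call the search endpoint from Python, the same request can be sketched with the standard library alone. The endpoint path and payload mirror the curl example above; `build_search_request` is a hypothetical helper, not part of any SDK:

```python
import json
import urllib.request

def build_search_request(base_url, vector_store_id, api_key, query,
                         max_num_results=5, filters=None):
    """Build a POST request mirroring the vector store search curl call."""
    payload = {"query": query, "max_num_results": max_num_results}
    if filters is not None:
        payload["filters"] = filters
    return urllib.request.Request(
        url=f"{base_url}/v1/vector_stores/{vector_store_id}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_search_request(
    "http://localhost:8080", "YOUR_VECTOR_STORE_ID", "YOUR_LLM_API_KEY",
    "What does the document say about configuration?",
    filters={"type": "eq", "key": "language", "value": "en"},
)
# Sending it is then: urllib.request.urlopen(req)
```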
Ask Questions Using RAG
Use the file_search tool to answer questions based on your documents:
```shell
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "model": "gpt-4o",
  "tools": [{
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 5,
    "filters": {
      "type": "and",
      "filters": [
        { "type": "eq", "key": "language", "value": "en" }
      ]
    }
  }],
  "input": "What does the document say about configuration?",
  "instructions": "Answer questions using information from the provided documents only. If the information is not in the documents, say so."
}'
```
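Hand-writing nested filter JSON gets error-prone as conditions grow. A small hypothetical helper can build the same tool definition; the `eq` leaf and `and` compound shapes are inferred from the two curl examples in this guide:

```python
def eq(key, value):
    """Equality comparison filter, as used in the file_search examples."""
    return {"type": "eq", "key": key, "value": value}

def and_(*filters):
    """Compound filter combining sub-filters with AND."""
    return {"type": "and", "filters": list(filters)}

# Reproduces the tool object from the curl example above.
file_search_tool = {
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 5,
    "filters": and_(eq("language", "en")),
}
```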
If no x-model-provider header is specified, Groq is used as the default provider.
```shell
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
  "model": "gpt-4o",
  "tools": [{
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 3
  }],
  "input": "Summarize what the document says about API usage",
  "instructions": "Use only information from the documents to answer."
}'
```
```shell
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ANTHROPIC_API_KEY' \
--header 'x-model-provider: claude' \
--data '{
  "model": "claude-3-5-sonnet-20241022",
  "tools": [{
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 3
  }],
  "input": "Summarize what the document says about API usage",
  "instructions": "Use only information from the documents to answer."
}'
```
```shell
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer GROQ_API_KEY' \
--data '{
  "model": "llama-3.2-3b-preview",
  "tools": [{
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 3
  }],
  "input": "Summarize what the document says about API usage",
  "instructions": "Use only information from the documents to answer."
}'
```
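When scripting against several providers, the header logic shown in these examples can be centralized. A minimal hypothetical sketch, relying on the documented behavior that omitting the x-model-provider header falls back to the service default (Groq):

```python
def provider_headers(api_key, provider=None):
    """Build OpenResponses request headers.

    `provider` maps to the x-model-provider header; when omitted, no
    header is sent and the service uses its default provider (Groq).
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if provider is not None:
        headers["x-model-provider"] = provider
    return headers
```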