Quick Start Guide

This guide will help you get up and running with the OpenResponses API quickly. We’ll cover several deployment options and provide example API calls to get you started.

Setup Options

Basic Setup

Prerequisites

  • Ensure port 8080 is available
  • Docker daemon must be running on your local machine

Run with Docker

The simplest way to get started is using Docker:

docker run -p 8080:8080 masaicai/open-responses:latest
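
If port 8080 is already in use on your host, you can map the service to a different local port (8081 below is just an example) and point the API calls in this guide at that port instead:

docker run -p 8081:8080 masaicai/open-responses:latest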

Clone the Repository

If you prefer using Docker Compose or want to explore the codebase, begin by cloning the repository:

git clone https://github.com/masaic-ai-platform/open-responses.git
cd open-responses

Run with Docker Compose

# Start with Docker Compose
docker-compose up open-responses
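
If you prefer to keep your terminal free, standard Docker Compose flags let you start the service in the background and follow its logs:

# Run detached and tail the logs
docker-compose up -d open-responses
docker-compose logs -f open-responses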

Example API Calls

Once OpenResponses is running, you can start making API calls. Here are some examples:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "stream": false,
    "input": [
        {
            "role": "user",
            "content": "Tell me a joke"
        }
    ]
}'

Get your OpenAI API key from the OpenAI platform dashboard.
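
The same request can be streamed by setting stream to true; curl’s -N flag disables output buffering so the events print as they arrive:

curl -N --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "stream": true,
    "input": [
        {
            "role": "user",
            "content": "Tell me a joke"
        }
    ]
}'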

Tool-enabled Setup

OpenResponses comes with built-in tools that enhance the capabilities of your AI applications. The pre-packaged mcp-servers-config.json file includes configurations for Brave Web Search and GitHub tools.

Prerequisites

Before using the built-in tools, you’ll need:

  • A GitHub personal access token (GITHUB_TOKEN)
  • A Brave Search API key (BRAVE_API_KEY)

Configuration

Create a .env file with your API keys:

GITHUB_TOKEN=your_github_token
BRAVE_API_KEY=your_brave_key_value

Run with Built-in Tools

docker-compose --profile mcp up open-responses-mcp

Stop any previously running docker-compose services before running this command.

Example Tool-enabled API Calls
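
As a rough sketch of what a tool-enabled call can look like (the tool type string below is an assumption; use the tool names actually defined in your mcp-servers-config.json):

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "brave_web_search"
        }
    ],
    "input": [
        {
            "role": "user",
            "content": "Search the web for recent news about open-source LLMs"
        }
    ]
}'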

Custom Tool Configuration

If you want to add your own custom tools, you’ll need to create or modify the mcp-servers-config.json file with your own tool definitions.
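
As a hedged illustration of the file’s shape (this assumes the common MCP servers-config layout; my-custom-tool and /path/to/my-tool-server.js are placeholders to replace with your own server):

{
  "mcpServers": {
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"],
      "env": {
        "BRAVE_API_KEY": "your_brave_key_value"
      }
    },
    "my-custom-tool": {
      "command": "node",
      "args": ["/path/to/my-tool-server.js"]
    }
  }
}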

Update the .env File

Add or update the following property in your .env file:

MCP_CONFIG_FILE_PATH=path_to_mcp_config_file

Run the Service with Custom MCP Configuration

docker-compose --profile mcp up open-responses-custom-mcp

Stop any previously running docker-compose services before running this command.

OpenAI Agent SDK Integration

You can run the examples provided by the openai-agents-python SDK against your locally deployed open-responses API.

Running Example Scripts with OpenAI Agent SDK

  1. Start the service using:
docker-compose up open-responses-with-openai

Stop any previously running docker-compose services before running this command.

  2. Clone the forked repository:
git clone https://github.com/masaic-ai-platform/openai-agents-python.git
cd openai-agents-python
  3. Configure the SDK in your Python script:

    • Define the environment variable OPENAI_API_KEY with your OpenAI API key
    • Define the environment variable OPEN_RESPONSES_URL to specify the URL for your local open-responses API
    • If OPEN_RESPONSES_URL is not set, it defaults to http://localhost:8080/v1 (see the export example after this list)
  4. Run the examples:

    • Follow the SDK’s Get started instructions to install it
    • All examples should work as expected except for the research_bot example, which uses OpenAI’s proprietary WebSearchTool
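
A minimal shell setup for those two variables (substitute your own key) looks like:

export OPENAI_API_KEY=your_openai_api_key
export OPEN_RESPONSES_URL=http://localhost:8080/v1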

Running Agent Examples with Built-In Tools

Before running any example, ensure you are in the openai-agents-python folder.

  1. Run the service with MCP tools:
docker-compose --profile mcp up open-responses-mcp

If you require SDK traces, ensure that you set the environment variable OPENAI_API_KEY. Otherwise, you may see a warning “OPENAI_API_KEY is not set, skipping trace export”. To disable tracing explicitly, add the statement:

set_tracing_disabled(disabled=True)

  2. Set up your environment variables:

    • GROQ_API_KEY
    • OPEN_RESPONSES_URL
    • CLAUDE_API_KEY (for specific examples)
  3. Run the example scripts:

# Run agent_hands_off.py
python -m examples.open_responses.agent_hands_off
# Run brave_search_agent_with_groq.py
python -m examples.open_responses.brave_search_agent_with_groq
# Run brave_search_agent_with_groq_stream (requires CLAUDE_API_KEY)
python -m examples.open_responses.brave_search_agent_with_groq_stream
# Run think_tool_agent_with_claude.py (requires CLAUDE_API_KEY)
python -m examples.open_responses.think_tool_agent_with_claude

You can review these scripts under examples/open_responses in the forked openai-agents-python repository.

Setup with Persistent Storage

For production environments, you can enable persistent storage with MongoDB:

Prerequisites

  • Ensure port 8080 is available for the Open Responses service
  • Ensure port 27017 is available for MongoDB
  • Docker daemon must be running on your local machine

Run with MongoDB

# Start MongoDB and OpenResponses
docker-compose --profile mongodb up

This will launch both the MongoDB database and the Open Responses service configured to use MongoDB as its storage backend.
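
Before testing, you can confirm that both containers are running:

docker-compose ps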

Testing Persistent Storage

To test that responses are being stored, make an API call with the store parameter set to true:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "store": true,
    "input": [
        {
            "role": "user",
            "content": "Write a short poem about persistence"
        }
    ]
}'

Sample Response

When you make a request to store a conversation, you’ll receive a response that includes a unique ID which can be used for continuing the conversation:

{
    "id": "chatcmpl-BLB88IDIyyHjupNiAqOZQvAfT2T3K",
    "created_at": 1744387372,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "gpt-4o-2024-08-06",
    "object": "response",
    "output": [
        {
            "id": "589f0a06-ad6e-4a65-ae9f-1d2189c07813",
            "content": [
                {
                    "annotations": [],
                    "text": "In the tranquil heart of an ancient forest...",
                    "type": "output_text"
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": null,
    "temperature": 1.0,
    "tool_choice": "none",
    "top_p": null,
    "max_output_tokens": null,
    "previous_response_id": null,
    "reasoning": null,
    "status": "completed",
    "usage": {
        "input_tokens": 18,
        "output_tokens": 84,
        "output_tokens_details": {
            "reasoning_tokens": 0
        },
        "total_tokens": 102
    }
}

The response will include an id that you can use to reference this conversation in future requests.

Following Up on a Stored Conversation

To continue a stored conversation, pass the id from your earlier response as previous_response_id:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "store": true,
    "previous_response_id": "chatcmpl-BLB88IDIyyHjupNiAqOZQvAfT2T3K",
    "input": [
        {
            "role": "user",
            "content": "Make it longer and add a rhyme scheme"
        }
    ]
}'
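
If your deployment also exposes the Responses API retrieve-by-id endpoint (an assumption here; the Response Store Configuration documentation is the authoritative reference), a stored response could be fetched directly by its id:

curl --location 'http://localhost:8080/v1/responses/chatcmpl-BLB88IDIyyHjupNiAqOZQvAfT2T3K' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai'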

For more details on response storage options and configuration, refer to the Response Store Configuration documentation.

Setup with Observability

OpenResponses can be run with a comprehensive observability stack that includes OpenTelemetry, Jaeger, Prometheus, and Grafana. This setup provides detailed monitoring, tracing, and metrics visualization capabilities.

For detailed instructions on setting up and running OpenResponses with the observability stack, please refer to our Running with Observability Stack guide.

Setup with RAG

RAG (Retrieval-Augmented Generation) enables you to enhance AI responses with information retrieved from your documents. OpenResponses provides a complete RAG solution with vector search capabilities powered by Qdrant.

Prerequisites

  • Ensure ports 8080 (Open Responses), 6333 (Qdrant REST API), and 6334 (Qdrant gRPC API) are available
  • Docker daemon must be running on your local machine
  • An OpenAI API key for embedding generation and AI responses (or a key for another supported provider)

Option 1: Basic RAG Setup with Qdrant

This setup includes just the vector database for document storage and retrieval:

# Start Qdrant and OpenResponses with RAG capabilities
docker-compose --profile qdrant up

Option 2: Complete RAG Setup with MongoDB Storage

For production environments, this setup combines vector search with persistent storage:

# Start MongoDB, Qdrant, and OpenResponses with full RAG capabilities
docker-compose --profile qdrant-mongodb up

Using the RAG System

Step 1: Upload a File

First, upload a file to be processed and stored:

curl --location 'http://localhost:8080/v1/files' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--form 'file=@"./path/to/your/document.pdf"' \
--form 'purpose="vector_search"'

You’ll receive a file_id in the response.

Step 2: Create a Vector Store

Create a vector store to hold your document embeddings:

curl --location 'http://localhost:8080/v1/vector_stores' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "name": "my-documents"
}'

This returns a vector_store_id.

Step 3: Add File to Vector Store

Add your uploaded file to the vector store:

curl --location 'http://localhost:8080/v1/vector_stores/YOUR_VECTOR_STORE_ID/files' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "file_id": "YOUR_FILE_ID",
  "chunking_strategy": {
    "type": "static",
    "static": {
      "max_chunk_size_tokens": 1000,
      "chunk_overlap_tokens": 200
    }
  },
  "attributes": {
    "category": "documentation",
    "language": "en"
  }
}'

Step 4: Search the Vector Store

You can directly search for content in your vector store:

curl --location 'http://localhost:8080/v1/vector_stores/YOUR_VECTOR_STORE_ID/search' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "query": "What does the document say about configuration?",
  "max_num_results": 5,
  "filters": {
    "type": "eq",
    "key": "language",
    "value": "en"
  }
}'

Step 5: Ask Questions Using RAG

Use the file_search tool to answer questions based on your documents:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "model": "gpt-4o",
  "tools": [{
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 5,
    "filters": {
      "type": "and",
      "filters": [
        {
          "type": "eq",
          "key": "language",
          "value": "en"
        }
      ]
    }
  }],
  "input": "What does the document say about configuration?",
  "instructions": "Answer questions using information from the provided documents only. If the information is not in the documents, say so."
}'

If no x-model-provider header is specified, GROQ is used as the default provider.

Using RAG with Different AI Models

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
      "max_num_results": 3
    }],
    "input": "Summarize what the document says about API usage",
    "instructions": "Use only information from the documents to answer."
}'
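
The same pattern works with other providers by switching the header; for example, targeting Groq (the model name here is only an illustration, use any model your Groq account exposes):

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer GROQ_API_KEY' \
--header 'x-model-provider: groq' \
--data '{
    "model": "llama-3.3-70b-versatile",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
      "max_num_results": 3
    }],
    "input": "Summarize what the document says about API usage",
    "instructions": "Use only information from the documents to answer."
}'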

Best Practices for RAG

  • Break large documents into logical sections before uploading
  • Use custom attributes to organize and filter your documents
  • Set a reasonable chunk size (800-1200 tokens) with some overlap
  • Provide clear and specific queries to get better results
  • Use filters to narrow down search results when you have many documents

For more details on using the RAG system, refer to the RAG Documentation.

Next Steps