Quick Start Guide

This guide will help you get up and running with the OpenResponses API quickly. We’ll cover several deployment options and provide example API calls to get you started.

Setup Options

Basic Setup

Prerequisites

  • Ensure port 8080 is available
  • Docker daemon must be running on your local machine

Run with Docker

The simplest way to get started is using Docker:

docker run -p 8080:8080 masaicai/open-responses:latest
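
If port 8080 is already in use on your host, you can map the service to a different local port (8081 below is just an example) and point the API calls in this guide at that port instead:

docker run -p 8081:8080 masaicai/open-responses:latest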

Clone the Repository

If you prefer using Docker Compose or want to explore the codebase, begin by cloning the repository:

git clone https://github.com/masaic-ai-platform/open-responses.git
cd open-responses

Run with Docker Compose

# Start with Docker Compose
docker-compose up open-responses
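
If you prefer to keep your terminal free, standard Docker Compose flags let you start the service in the background and follow its logs:

# Run detached and tail the logs
docker-compose up -d open-responses
docker-compose logs -f open-responses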

Example API Calls

Once OpenResponses is running, you can start making API calls. Here are some examples:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "stream": false,
    "input": [
        {
            "role": "user",
            "content": "Tell me a joke"
        }
    ]
}'

Get your OpenAI API key from the OpenAI platform dashboard.
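
The same request can be streamed by setting stream to true; curl’s -N flag disables output buffering so the events print as they arrive:

curl -N --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "stream": true,
    "input": [
        {
            "role": "user",
            "content": "Tell me a joke"
        }
    ]
}'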

Tool-enabled Setup

OpenResponses comes with built-in tools that enhance the capabilities of your AI applications. The pre-packaged mcp-servers-config.json file includes configurations for Brave Web Search and GitHub tools.

Prerequisites

Before using the built-in tools, you’ll need:

  • A GitHub personal access token (GITHUB_TOKEN)
  • A Brave Search API key (BRAVE_API_KEY)

Configuration

Create a .env file with your API keys:

GITHUB_TOKEN=your_github_token
BRAVE_API_KEY=your_brave_key_value

Run with Built-in Tools

docker-compose --profile mcp up open-responses-mcp

Stop any previously running docker-compose services before running this command.

Example Tool-enabled API Calls
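
As a rough sketch of what a tool-enabled call can look like (the tool type string below is an assumption; use the tool names actually defined in your mcp-servers-config.json):

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "brave_web_search"
        }
    ],
    "input": [
        {
            "role": "user",
            "content": "Search the web for recent news about open-source LLMs"
        }
    ]
}'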

Custom Tool Configuration

If you want to add your own custom tools, you’ll need to create or modify the mcp-servers-config.json file with your own tool definitions.
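
As a hedged illustration of the file’s shape (this assumes the common MCP servers-config layout; my-custom-tool and /path/to/my-tool-server.js are placeholders to replace with your own server):

{
  "mcpServers": {
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"],
      "env": {
        "BRAVE_API_KEY": "your_brave_key_value"
      }
    },
    "my-custom-tool": {
      "command": "node",
      "args": ["/path/to/my-tool-server.js"]
    }
  }
}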

Update the .env File

Add or update the following property in your .env file:

MCP_CONFIG_FILE_PATH=path_to_mcp_config_file

Run the Service with Custom MCP Configuration

docker-compose --profile mcp up open-responses-custom-mcp

Stop any previously running docker-compose services before running this command.

OpenAI Agent SDK Integration

You can run the examples provided by the openai-agents-python SDK against your locally deployed open-responses API.

Running Example Scripts with OpenAI Agent SDK

  1. Start the service using:
docker-compose up open-responses-with-openai

Stop any previously running docker-compose services before running this command.

  2. Clone the forked repository:
git clone https://github.com/masaic-ai-platform/openai-agents-python.git
cd openai-agents-python
  3. Configure the SDK in your Python script:

    • Define the environment variable OPENAI_API_KEY with your OpenAI API key
    • Define the environment variable OPEN_RESPONSES_URL to specify the URL for your local open-responses API
    • If OPEN_RESPONSES_URL is not set, it defaults to http://localhost:8080/v1 (see the export example after this list)
  4. Run the examples:

    • Follow the SDK’s Get started instructions to install it
    • All examples should work as expected except for the research_bot example, which uses OpenAI’s proprietary WebSearchTool
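
A minimal shell setup for those two variables (substitute your own key) looks like:

export OPENAI_API_KEY=your_openai_api_key
export OPEN_RESPONSES_URL=http://localhost:8080/v1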

Running Agent Examples with Built-In Tools

Before running any example, ensure you are in the openai-agents-python folder.

  1. Run the service with MCP tools:
docker-compose --profile mcp up open-responses-mcp

If you require SDK traces, ensure that you set the environment variable OPENAI_API_KEY. Otherwise, you may see a warning “OPENAI_API_KEY is not set, skipping trace export”. To disable tracing explicitly, add the statement:

set_tracing_disabled(disabled=True)

  2. Set up your environment variables:

    • GROQ_API_KEY
    • OPEN_RESPONSES_URL
    • CLAUDE_API_KEY (for specific examples)
  3. Run the example scripts:

# Run agent_hands_off.py
python -m examples.open_responses.agent_hands_off
# Run brave_search_agent_with_groq.py
python -m examples.open_responses.brave_search_agent_with_groq
# Run brave_search_agent_with_groq_stream (requires CLAUDE_API_KEY)
python -m examples.open_responses.brave_search_agent_with_groq_stream
# Run think_tool_agent_with_claude.py (requires CLAUDE_API_KEY)
python -m examples.open_responses.think_tool_agent_with_claude

You can review these scripts under examples/open_responses in the forked openai-agents-python repository.

Setup with Persistent Storage

For production environments, you can enable persistent storage with MongoDB:

Prerequisites

  • Ensure port 8080 is available for the Open Responses service
  • Ensure port 27017 is available for MongoDB
  • Docker daemon must be running on your local machine

Run with MongoDB

# Start MongoDB and OpenResponses
docker-compose --profile mongodb up

This will launch both the MongoDB database and the Open Responses service configured to use MongoDB as its storage backend.
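
Before testing, you can confirm that both containers are running:

docker-compose ps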

Testing Persistent Storage

To test that responses are being stored, make an API call with the store parameter set to true:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "store": true,
    "input": [
        {
            "role": "user",
            "content": "Write a short poem about persistence"
        }
    ]
}'

Sample Response

When you make a request to store a conversation, you’ll receive a response that includes a unique ID which can be used for continuing the conversation:

{
    "id": "chatcmpl-BLB88IDIyyHjupNiAqOZQvAfT2T3K",
    "created_at": 1744387372,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "gpt-4o-2024-08-06",
    "object": "response",
    "output": [
        {
            "id": "589f0a06-ad6e-4a65-ae9f-1d2189c07813",
            "content": [
                {
                    "annotations": [],
                    "text": "In the tranquil heart of an ancient forest...",
                    "type": "output_text"
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": null,
    "temperature": 1.0,
    "tool_choice": "none",
    "top_p": null,
    "max_output_tokens": null,
    "previous_response_id": null,
    "reasoning": null,
    "status": "completed",
    "usage": {
        "input_tokens": 18,
        "output_tokens": 84,
        "output_tokens_details": {
            "reasoning_tokens": 0
        },
        "total_tokens": 102
    }
}

The response will include an id that you can use to reference this conversation in future requests.

Following Up on a Stored Conversation

To continue a stored conversation, pass the id from your earlier response as previous_response_id:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "store": true,
    "previous_response_id": "chatcmpl-BLB88IDIyyHjupNiAqOZQvAfT2T3K",
    "input": [
        {
            "role": "user",
            "content": "Make it longer and add a rhyme scheme"
        }
    ]
}'
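
If your deployment also exposes the Responses API retrieve-by-id endpoint (an assumption here; the Response Store Configuration documentation is the authoritative reference), a stored response could be fetched directly by its id:

curl --location 'http://localhost:8080/v1/responses/chatcmpl-BLB88IDIyyHjupNiAqOZQvAfT2T3K' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai'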

For more details on response storage options and configuration, refer to the Response Store Configuration documentation.

Setup with Observability

OpenResponses can be run with a comprehensive observability stack that includes OpenTelemetry, Jaeger, Prometheus, and Grafana. This setup provides detailed monitoring, tracing, and metrics visualization capabilities.

For detailed instructions on setting up and running OpenResponses with the observability stack, please refer to our Running with Observability Stack guide.

Setup with RAG

RAG (Retrieval-Augmented Generation) enables you to enhance AI responses with information retrieved from your documents. OpenResponses provides a complete RAG solution with vector search capabilities powered by Qdrant.

Prerequisites

  • Ensure ports 8080 (Open Responses), 6333 (Qdrant REST API), and 6334 (Qdrant gRPC API) are available
  • Docker daemon must be running on your local machine
  • An OpenAI API key for embedding generation and AI responses (or a key for another supported provider)

Option 1: Basic RAG Setup with Qdrant

This setup includes just the vector database for document storage and retrieval:

# Start Qdrant and OpenResponses with RAG capabilities
docker-compose --profile qdrant up

Option 2: Complete RAG Setup with MongoDB Storage

For production environments, this setup combines vector search with persistent storage:

# Start MongoDB, Qdrant, and OpenResponses with full RAG capabilities
docker-compose --profile qdrant-mongodb up

Using the RAG System

Step 1: Upload a File

First, upload a file to be processed and stored:

curl --location 'http://localhost:8080/v1/files' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--form 'file=@"./path/to/your/document.pdf"' \
--form 'purpose="vector_search"'

You’ll receive a file_id in the response.

Step 2: Create a Vector Store

Create a vector store to hold your document embeddings:

curl --location 'http://localhost:8080/v1/vector_stores' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "name": "my-documents"
}'

This returns a vector_store_id.

Step 3: Add File to Vector Store

Add your uploaded file to the vector store:

curl --location 'http://localhost:8080/v1/vector_stores/YOUR_VECTOR_STORE_ID/files' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "file_id": "YOUR_FILE_ID",
  "chunking_strategy": {
    "type": "static",
    "static": {
      "max_chunk_size_tokens": 1000,
      "chunk_overlap_tokens": 200
    }
  },
  "attributes": {
    "category": "documentation",
    "language": "en"
  }
}'

Step 4: Search the Vector Store

You can directly search for content in your vector store:

curl --location 'http://localhost:8080/v1/vector_stores/YOUR_VECTOR_STORE_ID/search' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "query": "What does the document say about configuration?",
  "max_num_results": 5,
  "filters": {
    "type": "eq",
    "key": "language",
    "value": "en"
  }
}'

Step 5: Ask Questions Using RAG

Use the file_search tool to answer questions based on your documents:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_LLM_API_KEY' \
--data '{
  "model": "gpt-4o",
  "tools": [{
    "type": "file_search",
    "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
    "max_num_results": 5,
    "filters": {
      "type": "and",
      "filters": [
        {
          "type": "eq",
          "key": "language",
          "value": "en"
        }
      ]
    }
  }],
  "input": "What does the document say about configuration?",
  "instructions": "Answer questions using information from the provided documents only. If the information is not in the documents, say so."
}'

If no x-model-provider header is specified, GROQ is used as the default provider.

Using RAG with Different AI Models

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
      "max_num_results": 3
    }],
    "input": "Summarize what the document says about API usage",
    "instructions": "Use only information from the documents to answer."
}'
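
The same pattern works with other providers by switching the header; for example, targeting Groq (the model name here is only an illustration, use any model your Groq account exposes):

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer GROQ_API_KEY' \
--header 'x-model-provider: groq' \
--data '{
    "model": "llama-3.3-70b-versatile",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["YOUR_VECTOR_STORE_ID"],
      "max_num_results": 3
    }],
    "input": "Summarize what the document says about API usage",
    "instructions": "Use only information from the documents to answer."
}'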

Best Practices for RAG

  • Break large documents into logical sections before uploading
  • Use custom attributes to organize and filter your documents
  • Set a reasonable chunk size (800-1200 tokens) with some overlap
  • Provide clear and specific queries to get better results
  • Use filters to narrow down search results when you have many documents

For more details on using the RAG system, refer to the RAG Documentation.

Next Steps