The File Search tool provides direct access to your vector stores, enabling semantic search to find relevant content based on the meaning of your query rather than just keyword matching.

What It Is

The tool runs a single semantic search over the document collections in your vector stores and returns the best-matching content. It is designed for immediate results on simple, well-defined queries.

Query: “What is the maximum connection timeout value?”
Results:
From: config/database.md
The maximum connection timeout value is 300 seconds (5 minutes). This can be configured in the database settings with the DB_CONNECTION_TIMEOUT environment variable. For high-latency networks, consider increasing this value.

File Search Tool in Action

Configuration

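A basic request declares the file_search tool and the vector stores to search. In the request below, the question itself is supplied through the top-level input field: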

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $YOUR_API_KEY" \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "file_search",
        "vector_store_ids": ["vs_product_docs"]
      }
    ],
    "input": "What is the recommended database connection timeout?",
    "instructions": "Search the product documentation to find specific recommendations."
}'

Configuration Parameters

| Parameter | Description | Default | Required |
| --- | --- | --- | --- |
| query | The search query | - | Yes |
| vector_store_ids | List of vector store IDs to search | - | Yes |
| filters | Optional filters to narrow search results | null | No |
| max_num_results | Maximum number of results to return | 20 | No |
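
As a rough illustration of how the parameters above fit together, the sketch below assembles a file_search tool entry and applies the documented defaults. The helper name buildFileSearchTool and its camelCase option names are hypothetical, not part of any SDK. The table marks query as required, but the basic request above leaves it out of the tool object, so this sketch assumes it can be omitted; check that against your deployment.

```javascript
// Hypothetical helper (not part of any SDK): builds a file_search tool entry
// from the parameters listed in the table above.
function buildFileSearchTool({ vectorStoreIds, query, filters = null, maxNumResults = 20 }) {
  if (!Array.isArray(vectorStoreIds) || vectorStoreIds.length === 0) {
    throw new Error("vector_store_ids must be a non-empty array");
  }
  if (!Number.isInteger(maxNumResults) || maxNumResults < 1) {
    throw new Error("max_num_results must be a positive integer");
  }

  const tool = {
    type: "file_search",
    vector_store_ids: [...vectorStoreIds],
    max_num_results: maxNumResults,
  };

  // Assumption: query and filters are only sent when the caller provides them.
  if (query !== undefined) tool.query = query;
  if (filters !== null) tool.filters = filters;
  return tool;
}
```

Calling buildFileSearchTool({ vectorStoreIds: ["vs_product_docs"] }) yields an entry that can be placed in the tools array of a request.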

Response Format

The response includes relevant content chunks with metadata:

{
  "data": [
    {
      "file_id": "file_abc123",
      "filename": "config/database.md",
      "score": 0.92,
      "content": "The maximum connection timeout value is 300 seconds (5 minutes). This can be configured in the database settings with the `DB_CONNECTION_TIMEOUT` environment variable. For high-latency networks, consider increasing this value.",
      "annotations": [
        {
          "type": "file_citation",
          "index": 0,
          "file_id": "file_abc123",
          "filename": "config/database.md"
        }
      ]
    },
    {
      "file_id": "file_def456",
      "filename": "deployment/networking.md",
      "score": 0.87,
      "content": "Connection timeout settings are crucial for system stability. Default values vary by service: database (300s), API services (60s), worker processes (120s).",
      "annotations": [
        {
          "type": "file_citation",
          "index": 0,
          "file_id": "file_def456",
          "filename": "deployment/networking.md"
        }
      ]
    }
  ]
}
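
A response in this shape can be reduced to a short, ranked list of sources. The following is a minimal sketch that assumes the data, score, filename, and content fields shown above; the function name topResults and the minScore threshold are illustrative choices, not part of the API.

```javascript
// Hypothetical helper: keeps results at or above a score threshold,
// sorted from most to least relevant, with only the fields a caller needs.
function topResults(response, minScore = 0) {
  const items = Array.isArray(response?.data) ? response.data : [];
  return items
    .filter((item) => typeof item.score === "number" && item.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((item) => ({
      source: item.filename,
      score: item.score,
      text: item.content,
    }));
}
```

With the sample response above, a threshold of 0.9 would keep only the chunk from config/database.md.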

When to Use It

The File Search tool excels in specific scenarios where direct, immediate access to information is required.

Recommended for:
  • Simple, factual queries
  • Finding specific documentation
  • Known-item searches
  • Code examples for specific tasks
  • Looking up configurations or settings

Consider alternatives for:
  • Complex research questions
  • Multi-faceted analysis
  • Questions requiring synthesis of multiple sources
  • Exploratory research without clear goals

Use Cases for File Search

For complex questions requiring deep research and multiple search iterations, consider using the Agentic Search Tool instead.

Example Queries

Simple Factual Queries

{
  "query": "What is the maximum file upload size?"
}

Finding Code Examples

{
  "query": "Show me examples of error handling in async functions"
}

Looking Up Configuration

{
  "query": "Default Redis cache expiration settings"
}

Advanced Usage

Filtering by Metadata

The filters parameter allows you to narrow search results based on document metadata:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["vs_security_docs"],
      "max_num_results": 5,
      "query": "Authentication best practices",
      "filters": {
        "type": "and",
        "filters": [
          {
            "type": "eq",
            "key": "compliance",
            "value": "GDPR"
          },
          {
            "type": "gt",
            "key": "updated_date",
            "value": "2023-01-01"
          }
        ]
      }
    }],
    "input": "What are the best practices for authentication that comply with GDPR?",
    "instructions": "Answer questions using information from the provided documents."
}'

Filters can improve search relevance by narrowing the search space. Use them whenever you know specific attributes of the documents you’re looking for.
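
Nested filter objects like the one in the request above are easy to mistype by hand. One option is to compose them from small builder functions. The names eq, gt, and and below are hypothetical helpers; the objects they return follow the type, key, and value layout used in the request.

```javascript
// Hypothetical builders for the filter objects used in the request above.
const eq = (key, value) => ({ type: "eq", key, value });
const gt = (key, value) => ({ type: "gt", key, value });
const and = (...filters) => ({ type: "and", filters });

// Reproduces the filter from the example: GDPR documents updated after 2023-01-01.
const gdprSince2023 = and(
  eq("compliance", "GDPR"),
  gt("updated_date", "2023-01-01")
);
```

The resulting gdprSince2023 object can be passed as the filters value of a file_search tool entry.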

Multi-Store Searching

Searching across multiple vector stores allows you to find information across different document collections:

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["vs_technical_docs", "vs_code_examples", "vs_architecture"],
      "max_num_results": 5,
      "query": "API rate limiting implementation"
    }],
    "input": "Explain how to implement API rate limiting",
    "instructions": "Answer questions using information from the provided documents."
}'
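
The same multi-store request can be prepared from JavaScript. The sketch below only builds the URL and fetch options, so it can be inspected before anything is sent. The function name buildMultiStoreRequest is hypothetical, the endpoint and headers are copied from the curl example, and the tool-level query field is left out for brevity. Removing repeated store IDs is a convenience of this sketch, not documented behavior.

```javascript
// Hypothetical helper: prepares a fetch() call equivalent to the curl request above.
function buildMultiStoreRequest(apiKey, input, vectorStoreIds, maxNumResults = 5) {
  // Drop repeated store IDs so each collection is listed once.
  const uniqueStoreIds = [...new Set(vectorStoreIds)];

  return {
    url: "http://localhost:8080/v1/responses",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${apiKey}`,
        "x-model-provider": "openai",
      },
      body: JSON.stringify({
        model: "gpt-4o",
        tools: [{
          type: "file_search",
          vector_store_ids: uniqueStoreIds,
          max_num_results: maxNumResults,
        }],
        input,
        instructions: "Answer questions using information from the provided documents.",
      }),
    },
  };
}
```

To send the request, pass the two parts to fetch, for example fetch(url, options), and parse the JSON body of the reply.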

Best Practices

Integration Examples

Tool Usage with OpenAI Models

curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "tools": [{
      "type": "file_search",
      "vector_store_ids": ["vs_technical_docs"],
      "max_num_results": 5,
      "filters": {
        "type": "and",
        "filters": [
          {
            "type": "eq",
            "key": "category",
            "value": "documentation"
          },
          {
            "type": "eq",
            "key": "updated_after",
            "value": "2023-01-01"
          }
        ]
      }
    }],
    "input": "How do I implement rate limiting?",
    "instructions": "Answer questions using information from the provided documents."
}'

The file_search tool automatically performs the search and provides the results to the AI model in the same request. This creates a seamless retrieval-augmented generation (RAG) experience where the model can access and use the information without additional API calls.

GitHub Example Implementation

For a complete working example of the File Search tool, check out this Python example on GitHub. This example demonstrates:

  • Uploading files and creating vector stores
  • Setting up and configuring the FileSearchTool
  • Performing direct searches with different parameters
  • Comparing results with the Agentic Search approach
  • Integration with the OpenAI Agents SDK

Using with Custom Search Logic

For applications that need more control over the search process, you can implement custom search logic:

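import OpenAI from "openai";

// Client setup is an assumption: it points the OpenAI SDK at the same local
// endpoint and provider header used in the curl examples.
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "http://localhost:8080/v1",
  defaultHeaders: { "x-model-provider": "openai" }
});

async function enhancedResponseWithSearch(query, topic) {
  // Restrict the search to documents tagged with the requested topic
  const searchFilters = {
    type: "and",
    filters: [
      { type: "eq", key: "topic", value: topic }
    ]
  };

  // A single call runs the filtered search and passes the results to the model
  const response = await openai.responses.create({
    model: "gpt-4o",
    tools: [{
      type: "file_search",
      vector_store_ids: ["vs_docs", "vs_code_examples"],
      max_num_results: 3,
      filters: searchFilters
    }],
    input: query,
    instructions: "Provide a detailed answer based on the documentation."
  });

  return response;
}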

Comparing with Other Tools

| Feature | File Search | Agentic Search |
| --- | --- | --- |
| Query complexity | Simple | Complex |
| Search iterations | Single | Multiple |
| Response time | Fast | Slower |
| Result quality | Direct matches | Comprehensive |
| Best for | Known-item searches | Research questions |

File Search vs. Agentic Search