> ## Documentation Index
> Fetch the complete documentation index at: https://docs.octen.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Chat Completions

> Creates a chat completion. Compatible with the OpenAI Chat Completions protocol, with an optional built-in `octen_search` Web Search tool.


## OpenAPI

````yaml /api-reference/openapi.json post /v1/chat/completions
openapi: 3.1.0
info:
  title: Octen API
  description: >-
    Octen API provides Broad Search, Web Search, Image Search, Video Search,
    Extract, Embeddings, VL Embeddings, Answer, and Deep Research services. The
    Web Search API searches ranked web results with optional filters,
    highlights, and full content. The Image Search API searches for images from
    a text query, an image, or both, with an optional design mode that returns a
    structured summary and a reusable HTML snippet for each result. The Video
    Search API searches for videos from a text query. The Broad Search API
    decomposes a query into multiple sub-queries, searches them in parallel, and
    returns results grouped by sub-query. The Extract API extracts clean content
    from URLs, with optional query-focused highlights, page classification, and
    multimedia resources. The Embeddings API converts text into vector
    representations. The VL Embeddings API converts multimodal inputs into
    vector representations. The Answer API decomposes queries into multiple
    sub-queries for comprehensive search and synthesis. The Deep Research API
    runs a multi-round adaptive research pipeline that produces a structured
    research plan, executes iterative searches, and streams a final long-form
    report.
  version: 1.0.0
servers:
  - url: https://api.octen.ai
security:
  - apiKeyAuth: []
paths:
  /v1/chat/completions:
    post:
      summary: Chat Completions
      description: >-
        Creates a chat completion. Compatible with the OpenAI Chat Completions
        protocol, with an optional built-in `octen_search` Web Search tool.
      operationId: chat-completions
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatCompletionRequest'
            examples:
              pureModel:
                summary: Plain model call
                value:
                  model: anthropic/claude-opus-4.8
                  messages:
                    - role: user
                      content: Explain attention in one sentence.
                  stream: false
                  max_tokens: 2048
                  temperature: 1
              withSearch:
                summary: Built-in search tool
                value:
                  model: anthropic/claude-sonnet-4.6
                  messages:
                    - role: system
                      content: You are a helpful assistant.
                    - role: user
                      content: What's the weather in Beijing today?
                  tools:
                    - type: octen_search
                      parameters:
                        max_searches: 3
                        count: 10
                        highlight:
                          enable: true
                          max_tokens: 300
                        format: markdown
                  tool_choice: auto
                  stream: false
              functionAndSearch:
                summary: Custom function + built-in search
                value:
                  model: openai/gpt-5.5
                  messages:
                    - role: user
                      content: >-
                        Check tomorrow's weather in Shanghai, then create a
                        travel reminder.
                  tools:
                    - type: octen_search
                    - type: function
                      function:
                        name: create_reminder
                        description: Create a reminder
                        parameters:
                          type: object
                          properties:
                            content:
                              type: string
                              description: Reminder content
                            time:
                              type: string
                              description: Reminder time, ISO 8601
                          required:
                            - content
                            - time
                  tool_choice: auto
              promptCaching:
                summary: Prompt caching
                value:
                  model: anthropic/claude-sonnet-4.6
                  messages:
                    - role: system
                      content:
                        - type: text
                          text: Product manual (long stable prefix)...
                          cache_control:
                            type: ephemeral
                            ttl: 5m
                    - role: user
                      content: What is the return policy?
                  max_tokens: 1024
      responses:
        '200':
          description: >-
            Successful chat completion. When `stream=false`, returns a single
            `chat.completion` object. When `stream=true`, returns a stream of
            `chat.completion.chunk` objects (`search_done`, `content`, `finish`,
            `usage`), followed by `data: [DONE]`.
          content:
            application/json:
              schema:
                oneOf:
                  - $ref: '#/components/schemas/ChatCompletionResponse'
                  - $ref: '#/components/schemas/ChatCompletionChunk'
                discriminator:
                  propertyName: object
                  mapping:
                    chat.completion:
                      $ref: '#/components/schemas/ChatCompletionResponse'
                    chat.completion.chunk:
                      $ref: '#/components/schemas/ChatCompletionChunk'
              examples:
                nonStreaming:
                  summary: Non-streaming, plain model call
                  value:
                    id: gen-1749812456-xyz7890
                    object: chat.completion
                    created: 1749812456
                    model: anthropic/claude-opus-4.8
                    choices:
                      - index: 0
                        finish_reason: stop
                        message:
                          role: assistant
                          content: >-
                            Attention lets a model dynamically weight its inputs
                            and focus on the most relevant information.
                          refusal: null
                          reasoning: null
                    usage:
                      prompt_tokens: 18
                      completion_tokens: 32
                      total_tokens: 50
                promptCache:
                  summary: Non-streaming, prompt cache hit
                  value:
                    id: gen-1749812999-cache01
                    object: chat.completion
                    created: 1749812999
                    model: anthropic/claude-sonnet-4.6
                    choices:
                      - index: 0
                        finish_reason: stop
                        message:
                          role: assistant
                          content: >-
                            Per the manual, 7-day no-reason returns are
                            supported.
                          refusal: null
                          reasoning: null
                    usage:
                      prompt_tokens: 2106
                      completion_tokens: 18
                      total_tokens: 2124
                      prompt_tokens_details:
                        cached_tokens: 2048
                withSearch:
                  summary: Non-streaming, with built-in search
                  value:
                    id: gen-1749812345-abcd1234
                    object: chat.completion
                    created: 1749812345
                    model: anthropic/claude-sonnet-4.6
                    choices:
                      - index: 0
                        finish_reason: stop
                        message:
                          role: assistant
                          content: >-
                            OpenRouter is a unified platform aggregating
                            multiple LLM providers behind one API.
                          refusal: null
                          reasoning: null
                          annotations:
                            - type: url_citation
                              url_citation:
                                url: https://openrouter.ai/docs
                                title: OpenRouter Documentation
                                start_index: 0
                                end_index: 31
                    search_results:
                      - query: What is OpenRouter AI
                        results:
                          - title: OpenRouter Documentation
                            url: https://openrouter.ai/docs
                            highlights: >-
                              OpenRouter provides a unified API for multiple LLM
                              providers
                            authors: OpenRouter
                            time_published: '2024-11-02T00:00:00Z'
                            time_last_crawled: '2025-01-10T12:30:00Z'
                    usage:
                      num_search_queries: 1
                      prompt_tokens: 1058
                      completion_tokens: 132
                      total_tokens: 1190
                toolCall:
                  summary: Non-streaming, custom function call
                  value:
                    id: gen-1749812800-tool0001
                    object: chat.completion
                    created: 1749812800
                    model: openai/gpt-5.5
                    choices:
                      - index: 0
                        finish_reason: tool_calls
                        message:
                          role: assistant
                          content: null
                          tool_calls:
                            - id: call_abc123
                              type: function
                              function:
                                name: create_reminder
                                arguments: >-
                                  {"content": "Rain in Shanghai tomorrow, bring
                                  an umbrella", "time":
                                  "2026-06-11T08:00:00+08:00"}
                          refusal: null
                          reasoning: null
                    usage:
                      prompt_tokens: 230
                      completion_tokens: 45
                      total_tokens: 275
                streamSearchDone:
                  summary: Streaming, search_done chunk
                  value:
                    type: search_done
                    id: 20260318143837845RJQ9P28ZEC
                    object: chat.completion.chunk
                    created: 1773844717
                    model: anthropic/claude-opus-4.6
                    search_results:
                      - query: weather Beijing today
                        results:
                          - title: Beijing Weather
                            url: https://weather.com/beijing
                            highlights: Partly cloudy, high of 32C
                            authors: weather.com
                            time_published: '2026-06-10T00:00:00Z'
                            time_last_crawled: '2026-06-10T06:00:00Z'
                streamContent:
                  summary: Streaming, content chunk
                  value:
                    type: content
                    id: gen-1749812600-stream001
                    object: chat.completion.chunk
                    created: 1749812600
                    model: openai/gpt-5.4
                    choices:
                      - index: 0
                        delta:
                          role: assistant
                          content: 'OpenRouter is '
                        finish_reason: null
                streamFinish:
                  summary: Streaming, finish chunk
                  value:
                    type: finish
                    id: gen-1749812600-stream001
                    object: chat.completion.chunk
                    created: 1749812600
                    model: openai/gpt-5.4
                    choices:
                      - index: 0
                        delta: {}
                        finish_reason: stop
                streamUsage:
                  summary: Streaming, usage chunk
                  value:
                    type: usage
                    id: 20260318143837845RJQ9P28ZEC
                    object: chat.completion.chunk
                    created: 1773844727
                    model: anthropic/claude-opus-4.6
                    usage:
                      num_search_queries: 2
                      full_content_tokens: 18
                      prompt_tokens: 1841
                      completion_tokens: 638
                      total_tokens: 2479
                      completion_tokens_details:
                        reasoning_tokens: 30
                refusal:
                  summary: Safety refusal
                  value:
                    id: gen-1749812700-refusal
                    object: chat.completion
                    created: 1749812700
                    model: openai/gpt-5.4
                    choices:
                      - index: 0
                        finish_reason: stop
                        message:
                          role: assistant
                          content: null
                          refusal: I can't help with that request.
                          reasoning: null
        '400':
          description: Missing or invalid parameter
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenAIErrorResponse'
              example:
                error:
                  message: Missing or invalid parameter
                  type: invalid_request_error
                  param: null
                  code: null
        '401':
          description: Invalid API Key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenAIErrorResponse'
              example:
                error:
                  message: Invalid API Key
                  type: authentication_error
                  param: null
                  code: null
        '403':
          description: Insufficient balance in account
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenAIErrorResponse'
              example:
                error:
                  message: Insufficient balance in account
                  type: permission_error
                  param: null
                  code: null
        '404':
          description: Model or resource not found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenAIErrorResponse'
              example:
                error:
                  message: Model or resource not found
                  type: not_found_error
                  param: null
                  code: null
        '429':
          description: Exceeding the rate limit
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenAIErrorResponse'
              example:
                error:
                  message: Exceeding the rate limit
                  type: rate_limit_error
                  param: null
                  code: null
        '500':
          description: Internal error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenAIErrorResponse'
              example:
                error:
                  message: Internal error
                  type: api_error
                  param: null
                  code: null
      security:
        - apiKeyAuth: []
        - bearerAuth: []
components:
  schemas:
    ChatCompletionRequest:
      type: object
      required:
        - model
        - messages
      description: >-
        Request body for the Chat Completions API. Some parameters apply only to
        certain models; unsupported parameters are ignored for the selected
        model.
      properties:
        model:
          type: string
          enum:
            - anthropic/claude-opus-4.8
            - anthropic/claude-opus-4.6
            - anthropic/claude-sonnet-4.6
            - anthropic/claude-haiku-4.5
            - google/gemini-3.5-flash
            - google/gemini-3.1-pro-preview
            - google/gemini-3.1-flash-lite
            - google/gemini-3-flash-preview
            - openai/gpt-5.5-pro
            - openai/gpt-5.5
            - openai/gpt-5.4
            - moonshotai/kimi-k2.6
            - moonshotai/kimi-k2.5
            - minimax/minimax-m2.5
            - qwen/qwen3.6-plus
          description: The model to use for chat completion.
        messages:
          type: array
          items:
            $ref: '#/components/schemas/ChatMessage'
          description: >-
            The conversation so far. System prompt plus user and assistant
            messages in chronological order.
        tools:
          type: array
          items:
            $ref: '#/components/schemas/ChatToolDefinition'
          description: >-
            Tool definitions. Supports custom `function` tools and the built-in
            `octen_search` server tool.
        tool_choice:
          description: >-
            Controls tool invocation. `none`: never call tools; `auto`: model
            decides (default); `required`: must call a tool. Can also be an
            object to force a specific tool. Only valid when `tools` is set.
          oneOf:
            - type: string
              enum:
                - none
                - auto
                - required
            - $ref: '#/components/schemas/ChatToolChoiceObject'
        parallel_tool_calls:
          type: boolean
          default: true
          description: >-
            Whether the model may issue multiple tool calls in one reply. When
            `false`, at most one tool call per turn.
        stream:
          type: boolean
          default: false
          description: >-
            Whether to enable streaming output. When `true`, returns
            `chat.completion.chunk` objects incrementally.
        max_tokens:
          type: integer
          minimum: 1
          description: >-
            Maximum number of tokens the model can output. If not set, the
            model's internal default limit is used.
        max_completion_tokens:
          type: integer
          minimum: 1
          description: >-
            Maximum completion tokens, including reasoning and visible output
            tokens. If not set, the model's internal default limit is used.
        temperature:
          type: number
          minimum: 0
          maximum: 2
          default: 1
          description: >-
            Controls randomness in generation. Higher values produce more
            diverse output; lower values produce more deterministic output.
        top_p:
          type: number
          exclusiveMinimum: 0
          maximum: 1
          default: 1
          description: >-
            Nucleus sampling. Only tokens with cumulative probability up to
            `top_p` are considered.
        top_k:
          type: integer
          minimum: 0
          default: 0
          description: Sample only from the top K most probable tokens. `0` disables it.
        min_p:
          type: number
          minimum: 0
          maximum: 1
          default: 0
          description: >-
            Minimum probability threshold relative to the most probable token.
            Tokens below it are filtered out. `0` disables it.
        top_a:
          type: number
          minimum: 0
          maximum: 1
          default: 0
          description: >-
            Dynamic filtering threshold based on the most probable token. `0`
            disables it.
        repetition_penalty:
          type: number
          exclusiveMinimum: 0
          maximum: 2
          default: 1
          description: >-
            Penalizes tokens already present in the input. Above 1 suppresses
            repetition; below 1 encourages it.
        frequency_penalty:
          type: number
          minimum: -2
          maximum: 2
          default: 0
          description: >-
            Penalizes tokens by their frequency in the output so far. Positive
            values reduce repetition.
        presence_penalty:
          type: number
          minimum: -2
          maximum: 2
          default: 0
          description: >-
            Penalizes tokens that have already appeared. Positive values
            encourage new topics.
        response_format:
          $ref: '#/components/schemas/ChatResponseFormat'
        stop:
          type: array
          items:
            type: string
          description: >-
            Stop sequences. Generation stops when any of these strings is
            encountered.
        seed:
          type: integer
          description: >-
            Seed for reproducibility. With the same parameters and model
            version, output should be as consistent as possible.
        reasoning:
          $ref: '#/components/schemas/ChatReasoningOptions'
        verbosity:
          type: string
          enum:
            - low
            - medium
            - high
          default: medium
          description: Controls how verbose the reply is.
        logit_bias:
          type: object
          additionalProperties:
            type: number
            minimum: -100
            maximum: 100
          description: >-
            A JSON object mapping token IDs to bias values (-100 to 100), added
            to the logits before sampling.
        logprobs:
          type: boolean
          default: false
          description: Whether to return the log probabilities of the output tokens.
        top_logprobs:
          type: integer
          minimum: 0
          maximum: 20
          description: >-
            Number of most likely tokens to return at each position. Requires
            `logprobs` to be `true`.
        user:
          type: string
          description: >-
            A unique identifier for the end user. Use hashed or pseudonymous
            identifiers to avoid passing personally identifiable information.
        previous_response_id:
          type: string
          description: >-
            The `id` of a previous response, used to chain state across turns.
            Only effective for Responses-API models; the `gen-`-prefixed ids of
            regular completions cannot be used.
    ChatCompletionResponse:
      type: object
      required:
        - id
        - object
        - created
        - model
        - choices
      description: A non-streaming chat completion response. Returned when `stream=false`.
      properties:
        id:
          type: string
          description: The unique identifier for this request.
        object:
          type: string
          enum:
            - chat.completion
          description: >-
            The object type, always `chat.completion` for non-streaming
            responses.
        created:
          type: integer
          description: Unix timestamp (in seconds) of when the completion was created.
        model:
          type: string
          description: The model used for this completion.
        choices:
          type: array
          items:
            $ref: '#/components/schemas/ChatCompletionChoice'
          description: A list of completion choices.
        search_results:
          type: array
          items:
            $ref: '#/components/schemas/ChatSearchResultGroup'
          description: >-
            Search results. Present only when `octen_search` was actually
            triggered.
        usage:
          $ref: '#/components/schemas/ChatCompletionUsage'
        warning:
          type: string
          description: Warning message, if any.
    ChatCompletionChunk:
      type: object
      required:
        - object
        - created
        - model
      description: >-
        A streaming chunk. Returned when `stream=true`. The `type` field
        indicates the chunk kind: `search_done` (search results), `content`
        (incremental content), `finish` (generation complete), `usage` (token
        usage). After the last chunk, `data: [DONE]` is sent.
      properties:
        type:
          type: string
          enum:
            - search_done
            - content
            - finish
            - usage
          description: The type of this streaming chunk.
        id:
          type: string
          description: The unique identifier for this request.
        object:
          type: string
          enum:
            - chat.completion.chunk
          description: >-
            The object type, always `chat.completion.chunk` for streaming
            responses.
        created:
          type: integer
          description: Unix timestamp (in seconds) of when the chunk was created.
        model:
          type: string
          description: The model used for this completion.
        choices:
          type: array
          items:
            $ref: '#/components/schemas/ChatCompletionChunkChoice'
          description: Incremental choices. Present in `content` and `finish` chunks.
        search_results:
          type: array
          items:
            $ref: '#/components/schemas/ChatSearchResultGroup'
          description: Search results. Present in `search_done` chunks.
        usage:
          $ref: '#/components/schemas/ChatCompletionUsage'
    OpenAIErrorResponse:
      type: object
      description: Error body in the OpenAI protocol format.
      required:
        - error
      properties:
        error:
          type: object
          properties:
            message:
              type: string
              description: A human-readable description of the error.
            type:
              type: string
              description: The error category, e.g. `invalid_request_error`.
            param:
              type: string
              nullable: true
              description: The parameter related to the error, if any.
            code:
              type: string
              nullable: true
              description: A machine-readable error code, if any.
          required:
            - message
            - type
    ChatMessage:
      oneOf:
        - $ref: '#/components/schemas/SystemMessage'
        - $ref: '#/components/schemas/DeveloperMessage'
        - $ref: '#/components/schemas/UserMessage'
        - $ref: '#/components/schemas/AssistantMessage'
        - $ref: '#/components/schemas/ToolMessage'
      discriminator:
        propertyName: role
        mapping:
          system:
            $ref: '#/components/schemas/SystemMessage'
          developer:
            $ref: '#/components/schemas/DeveloperMessage'
          user:
            $ref: '#/components/schemas/UserMessage'
          assistant:
            $ref: '#/components/schemas/AssistantMessage'
          tool:
            $ref: '#/components/schemas/ToolMessage'
    ChatToolDefinition:
      type: object
      required:
        - type
      description: >-
        A tool definition. Two types are supported: `function` (a custom tool
        executed by the caller) and `octen_search` (the built-in Web Search
        server tool, executed by Octen).
      properties:
        type:
          type: string
          enum:
            - function
            - octen_search
          description: The type of tool.
        function:
          allOf:
            - $ref: '#/components/schemas/ChatFunctionDefinition'
          description: The function definition. Required when `type` is `function`.
        parameters:
          allOf:
            - $ref: '#/components/schemas/OctenSearchToolParameters'
          description: >-
            Search behavior configuration. Optional, used when `type` is
            `octen_search`.
    ChatToolChoiceObject:
      type: object
      required:
        - type
      description: Forces a specific tool. The named tool must be declared in `tools`.
      properties:
        type:
          type: string
          enum:
            - function
            - octen_search
          description: The type of tool to force.
        function:
          type: object
          description: The function to call. Required when `type` is `function`.
          required:
            - name
          properties:
            name:
              type: string
              description: The name of the function to call.
    ChatResponseFormat:
      type: object
      description: >-
        Controls the output format. Some models may not support structured
        output and will automatically fall back to `text`.
      properties:
        type:
          type: string
          enum:
            - text
            - json_object
            - json_schema
          default: text
          description: The output format type.
        json_schema:
          $ref: '#/components/schemas/ChatJsonSchemaSpec'
    ChatReasoningOptions:
      type: object
      description: Options for reasoning models. Sets the thinking effort and budget.
      properties:
        effort:
          type: string
          enum:
            - xhigh
            - high
            - medium
            - low
            - minimal
            - none
          description: The reasoning effort level.
        max_tokens:
          type: integer
          minimum: 1024
          description: Thinking token budget.
    ChatCompletionChoice:
      type: object
      description: A single completion choice in a non-streaming response.
      properties:
        index:
          type: integer
          description: The index of this choice. Usually 0.
        finish_reason:
          type: string
          enum:
            - stop
            - length
            - tool_calls
            - content_filter
            - error
          description: >-
            The reason the model stopped generating. `stop` indicates normal
            completion.
        message:
          $ref: '#/components/schemas/ChatCompletionMessage'
        logprobs:
          $ref: '#/components/schemas/ChatLogprobs'
    ChatSearchResultGroup:
      type: object
      description: >-
        Search results for a single auto-generated query. One element per
        search.
      properties:
        query:
          type: string
          description: >-
            The query the model actually searched for. One message may produce
            multiple queries.
        results:
          type: array
          items:
            $ref: '#/components/schemas/ChatSearchResultItem'
          description: The search results for this query.
    ChatCompletionUsage:
      type: object
      required:
        - prompt_tokens
        - completion_tokens
        - total_tokens
      description: >-
        Token usage information. When `stream=true`, returned only in the final
        `usage` chunk.
      properties:
        num_search_queries:
          type: integer
          description: Number of searches executed. Returned only when search was used.
        full_content_tokens:
          type: integer
          description: >-
            Full content tokens returned by search, used for search billing.
            Returned only when `full_content` is enabled.
        prompt_tokens:
          type: integer
          description: >-
            Number of input tokens. Cache read and write tokens are included
            here, not listed separately.
        completion_tokens:
          type: integer
          description: Number of output tokens (reasoning tokens are included).
        total_tokens:
          type: integer
          description: Total tokens (prompt_tokens + completion_tokens).
        prompt_tokens_details:
          type: object
          description: >-
            Breakdown of input tokens. Returned only when the prompt cache is
            hit.
          properties:
            cached_tokens:
              type: integer
              description: >-
                Portion of `prompt_tokens` served from cache (a subset of
                `prompt_tokens`).
            cache_write_tokens:
              type: integer
              description: >-
                Tokens written to cache (a subset of `prompt_tokens`). Returned
                only when `cache_control` with a `ttl` was used and the model
                supports cache writes.
        completion_tokens_details:
          type: object
          description: Breakdown of output tokens. Returned only for reasoning models.
          properties:
            reasoning_tokens:
              type: integer
              description: >-
                Number of reasoning tokens (already included in
                `completion_tokens`).
    ChatCompletionChunkChoice:
      type: object
      description: A single choice in a streaming chunk.
      properties:
        index:
          type: integer
          description: The index of this choice. Usually 0.
        delta:
          $ref: '#/components/schemas/ChatCompletionDelta'
        finish_reason:
          type: string
          nullable: true
          enum:
            - stop
            - length
            - tool_calls
            - content_filter
            - error
            - null
          description: >-
            The reason generation stopped. `null` for intermediate chunks; set
            in the `finish` chunk.
        logprobs:
          $ref: '#/components/schemas/ChatLogprobs'
    SystemMessage:
      type: object
      required:
        - role
        - content
      description: A system prompt message that sets the behavior or context for the model.
      properties:
        role:
          type: string
          enum:
            - system
          description: The role of the message author. Always `system`.
        content:
          description: >-
            The system prompt content. A plain string or an array of content
            blocks.
          oneOf:
            - type: string
            - type: array
              items:
                $ref: '#/components/schemas/ChatContentBlock'
    DeveloperMessage:
      type: object
      required:
        - role
        - content
      description: >-
        A developer message. The OpenAI-protocol equivalent of `system` (sent by
        newer OpenAI SDKs); handled as `system`.
      properties:
        role:
          type: string
          enum:
            - developer
          description: The role of the message author. Always `developer`.
        content:
          description: The content. A plain string or an array of content blocks.
          oneOf:
            - type: string
            - type: array
              items:
                $ref: '#/components/schemas/ChatContentBlock'
    UserMessage:
      type: object
      required:
        - role
        - content
      description: A message from the user.
      properties:
        role:
          type: string
          enum:
            - user
          description: The role of the message author. Always `user`.
        content:
          description: >-
            The content of the message. A plain string or an array of content
            blocks (text and/or image).
          oneOf:
            - type: string
            - type: array
              items:
                $ref: '#/components/schemas/ChatContentBlock'
    AssistantMessage:
      type: object
      required:
        - role
      description: >-
        A message from the assistant. When replaying a multi-turn conversation
        with tool use, include the assistant's `tool_calls` so they can be
        matched with the subsequent `tool` messages.
      properties:
        role:
          type: string
          enum:
            - assistant
          description: The role of the message author. Always `assistant`.
        content:
          type: string
          nullable: true
          description: >-
            The assistant's text content. May be `null` or omitted when the
            assistant only produces tool calls.
        tool_calls:
          type: array
          items:
            $ref: '#/components/schemas/ChatToolCall'
          description: >-
            Tool calls generated by the model in a previous turn, replayed
            verbatim. Only valid when `role` is `assistant`.
    ToolMessage:
      type: object
      required:
        - role
        - tool_call_id
        - content
      description: >-
        A tool result message, returning the output of a custom function call.
        The `tool_call_id` must match the `id` of a preceding assistant
        `tool_calls` entry.
      properties:
        role:
          type: string
          enum:
            - tool
          description: The role of the message author. Always `tool`.
        tool_call_id:
          type: string
          description: The ID of the tool call this message responds to.
        content:
          type: string
          description: The tool output, typically a JSON string with the function result.
    ChatFunctionDefinition:
      type: object
      required:
        - name
      description: A custom function definition.
      properties:
        name:
          type: string
          description: The name of the function.
        description:
          type: string
          description: A description of what the function does.
        parameters:
          type: object
          description: The function's parameter definition in JSON Schema format.
        strict:
          type: boolean
          default: false
          description: >-
            Whether to enable strict mode. When enabled, generated arguments
            strictly conform to `parameters`.
    OctenSearchToolParameters:
      type: object
      description: >-
        Behavior configuration for the built-in `octen_search` tool. All
        parameters are optional and share the same semantics and defaults as the
        Web Search API. The query is generated automatically by the model; a
        single request may trigger multiple searches.
      properties:
        max_searches:
          type: integer
          default: 5
          description: Maximum number of searches allowed in a request.
        count:
          type: integer
          minimum: 1
          maximum: 100
          description: Number of results to return per search.
        include_domains:
          type: array
          items:
            type: string
          description: Domains to include in search results.
        exclude_domains:
          type: array
          items:
            type: string
          description: Domains to exclude from search results.
        include_text:
          type: array
          items:
            type: string
          maxItems: 5
          description: Strings that must appear in the result page text.
        exclude_text:
          type: array
          items:
            type: string
          maxItems: 5
          description: Strings that must not appear in the result page text.
        time_basis:
          type: string
          enum:
            - auto
            - published
            - crawled
          description: Determines which time field is used for time filtering.
        start_time:
          type: string
          format: date-time
          description: Start time for filtering results. ISO 8601 format.
        end_time:
          type: string
          format: date-time
          description: End time for filtering results. ISO 8601 format.
        highlight:
          $ref: '#/components/schemas/HighlightOptions'
        format:
          type: string
          enum:
            - markdown
            - text
          description: Controls the formatting of highlight outputs.
        safesearch:
          type: string
          enum:
            - 'off'
            - strict
          description: Controls filtering of explicit/adult content.
        full_content:
          $ref: '#/components/schemas/FullContentOptions'
    ChatJsonSchemaSpec:
      type: object
      required:
        - name
      description: >-
        JSON Schema specification for structured output. Required when
        `response_format.type` is `json_schema`.
      properties:
        name:
          type: string
          description: A user-defined name for the schema.
        strict:
          type: boolean
          default: true
          description: Whether the model output must strictly conform to the schema.
        description:
          type: string
          description: A description of the schema.
        schema:
          type: object
          description: >-
            The JSON Schema definition object. May contain `type`, `properties`,
            `required`, `additionalProperties`, etc.
    ChatCompletionMessage:
      type: object
      description: The assistant's response message (non-streaming).
      properties:
        role:
          type: string
          enum:
            - assistant
          description: The role of the message author. Always `assistant`.
        content:
          type: string
          nullable: true
          description: >-
            The assistant's text content. May be `null` when the model only
            produces tool calls or refuses.
        tool_calls:
          type: array
          items:
            $ref: '#/components/schemas/ChatToolCall'
          description: >-
            Custom function calls generated by the model. Present only when
            `finish_reason` is `tool_calls`.
        annotations:
          type: array
          items:
            $ref: '#/components/schemas/ChatAnnotation'
          description: Citation annotations. Present only when `octen_search` was used.
        refusal:
          type: string
          nullable: true
          description: A model-generated refusal message, or `null`.
        reasoning:
          type: string
          nullable: true
          description: Reasoning text. Present only for reasoning models, otherwise `null`.
        reasoning_details:
          type: array
          items:
            $ref: '#/components/schemas/ChatReasoningDetail'
          description: Detailed reasoning information. Present only for reasoning models.
    ChatLogprobs:
      type: object
      nullable: true
      description: >-
        Log probability information for the output tokens. Present only when
        `logprobs` is `true`, otherwise `null`.
      properties:
        content:
          type: array
          nullable: true
          items:
            type: object
            properties:
              token:
                type: string
                description: The token text.
              logprob:
                type: number
                description: The log probability of this token.
              bytes:
                type: array
                nullable: true
                items:
                  type: integer
                description: UTF-8 byte representation.
              top_logprobs:
                type: array
                description: The top candidate tokens at this position, same structure.
                items:
                  type: object
    ChatSearchResultItem:
      type: object
      description: A single search result.
      properties:
        title:
          type: string
          description: The title of the result page.
        url:
          type: string
          description: The URL of the result page.
        highlights:
          type: string
          description: >-
            Query-relevant highlight snippets. Returned only if
            `highlight.enable` is true.
        full_content:
          type: string
          description: >-
            Full raw page content. Returned only if `full_content.enable` is
            true.
        authors:
          type: string
          description: Website name or author.
        time_published:
          type: string
          format: date-time
          description: Publish time in ISO 8601.
        time_last_crawled:
          type: string
          format: date-time
          description: Last crawl time in ISO 8601.
    ChatCompletionDelta:
      type: object
      description: >-
        Incremental content in a streaming chunk. Contains only the new fields
        for this chunk.
      properties:
        role:
          type: string
          enum:
            - assistant
          description: The role. Typically included in the first chunk.
        content:
          type: string
          description: Incremental text content.
        tool_calls:
          type: array
          items:
            $ref: '#/components/schemas/ChatToolCallDelta'
          description: Incremental tool call data.
    ChatContentBlock:
      type: object
      description: A content block within a message.
      required:
        - type
      properties:
        type:
          type: string
          enum:
            - text
            - image
            - image_url
          description: The type of content block.
        text:
          type: string
          description: The text content. Required when `type` is `text`.
        image:
          type: string
          description: Base64-encoded image. Required when `type` is `image`.
        image_url:
          type: object
          description: Image URL object. Required when `type` is `image_url`.
          required:
            - url
          properties:
            url:
              type: string
              description: The URL of the image.
        cache_control:
          $ref: '#/components/schemas/CacheControl'
    ChatToolCall:
      type: object
      required:
        - id
        - type
        - function
      description: A custom function call generated by the model.
      properties:
        index:
          type: integer
          description: The index of this tool call in the array.
        id:
          type: string
          description: >-
            A unique identifier for this tool call. Referenced by the
            corresponding `tool` message's `tool_call_id`.
        type:
          type: string
          enum:
            - function
          description: The type of tool call. Always `function`.
        function:
          $ref: '#/components/schemas/ChatToolCallFunction'
    HighlightOptions:
      type: object
      description: Controls highlight extraction from result pages.
      properties:
        enable:
          type: boolean
          default: true
          description: If true, returns query-relevant highlight in each result.
        max_tokens:
          type: integer
          default: 512
          minimum: 100
          maximum: 20000
          description: Max tokens returned per highlight.
    FullContentOptions:
      type: object
      description: Controls whether to return the full raw content of each result page.
      properties:
        enable:
          type: boolean
          default: false
          description: If true, returns full_content for each result.
        max_tokens:
          type: integer
          default: 2048
          minimum: 100
          maximum: 100000
          description: Maximum tokens of full content included per result.
    ChatAnnotation:
      type: object
      description: >-
        A citation annotation. Returned only when `octen_search` was used,
        currently only for `anthropic/` models. For other models, results are
        still available in `search_results`.
      properties:
        type:
          type: string
          enum:
            - url_citation
          description: The annotation type.
        url_citation:
          type: object
          properties:
            url:
              type: string
              description: The URL of the cited source.
            title:
              type: string
              description: The title of the cited source.
            start_index:
              type: integer
              description: Start position of the cited span in `content`.
            end_index:
              type: integer
              description: End position of the cited span in `content`.
    ChatReasoningDetail:
      type: object
      description: Detailed reasoning information. Returned only for reasoning models.
      properties:
        format:
          type: string
          enum:
            - unknown
            - openai-responses-v1
            - azure-openai-responses-v1
            - xai-responses-v1
            - anthropic-claude-v1
            - google-gemini-v1
          description: The format of the reasoning information.
        index:
          type: integer
          description: The position of this entry in the output. Usually 0.
        type:
          type: string
          enum:
            - reasoning.text
            - reasoning.summary
            - reasoning.encrypted
          description: The kind of reasoning information.
        text:
          type: string
          description: Reasoning text. Present only when `type` is `reasoning.text`.
        summary:
          type: string
          description: Reasoning summary. Present only when `type` is `reasoning.summary`.
        data:
          type: string
          description: >-
            Encrypted reasoning. Present only when `type` is
            `reasoning.encrypted`.
    ChatToolCallDelta:
      type: object
      description: >-
        Incremental tool call data in a streaming chunk. The first chunk for a
        tool call includes `id`, `type`, and `function.name`. Subsequent chunks
        append to `function.arguments`.
      properties:
        index:
          type: number
          description: The index of this tool call in the tool_calls array.
        id:
          type: string
          description: >-
            The tool call ID. Only present in the first chunk for this tool
            call.
        type:
          type: string
          enum:
            - function
          description: The type of tool call.
        function:
          type: object
          description: Incremental function call data.
          properties:
            name:
              type: string
              description: >-
                The function name. Only present in the first chunk for this tool
                call.
            arguments:
              type: string
              description: >-
                Incremental JSON string of the function arguments. Concatenate
                across chunks to build the complete arguments.
    CacheControl:
      type: object
      description: >-
        Prompt caching marker. Sets a cache breakpoint so the stable prefix up
        to this block can be reused.
      required:
        - type
      properties:
        type:
          type: string
          enum:
            - ephemeral
          description: The cache control type. Always `ephemeral`.
        ttl:
          type: string
          enum:
            - 5m
            - 1h
          default: 5m
          description: Cache lifetime.
    ChatToolCallFunction:
      type: object
      required:
        - name
        - arguments
      description: The function invocation details within a tool call.
      properties:
        name:
          type: string
          description: The name of the function to call.
        arguments:
          type: string
          description: >-
            The arguments to the function, as a JSON string generated by the
            model.
  securitySchemes:
    apiKeyAuth:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        API key used for request authentication. Obtain an API key before using
        the API. Note: A payment method is required to use the API.
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        Bearer token authentication. Compatible with OpenAI protocol. Pass the
        API key as `Authorization: Bearer <your-api-key>`.

````