Octen | The Search infrastructure for AI

Authorizations

x-api-key

string

header

required

API key used for request authentication. Obtain an API key before using the API. Note: A payment method is required to use the API.

Body

application/json

Request body for the Chat Completions API. Some parameters apply only to certain models; unsupported parameters are ignored for the selected model.

model

enum<string>

required

The model to use for chat completion.

Available options:

anthropic/claude-opus-4.8,

anthropic/claude-opus-4.6,

anthropic/claude-sonnet-4.6,

anthropic/claude-haiku-4.5,

google/gemini-3.5-flash,

google/gemini-3.1-pro-preview,

google/gemini-3.1-flash-lite,

google/gemini-3-flash-preview,

openai/gpt-5.5-pro,

openai/gpt-5.5,

openai/gpt-5.4,

moonshotai/kimi-k2.6,

moonshotai/kimi-k2.5,

minimax/minimax-m2.5,

qwen/qwen3.6-plus

messages

object[]

required

The conversation so far. System prompt plus user and assistant messages in chronological order.

A system prompt message that sets the behavior or context for the model.

Option 1
Option 2
Option 3
Option 4
Option 5

Show child attributes

tools

object[]

Tool definitions. Supports custom function tools and the built-in octen_search server tool.

Show child attributes

tool_choice

Controls tool invocation. none: never call tools; auto: model decides (default); required: must call a tool. Can also be an object to force a specific tool. Only valid when tools is set.

Available options:

none,

auto,

required

parallel_tool_calls

boolean

default:true

Whether the model may issue multiple tool calls in one reply. When false, at most one tool call per turn.

stream

boolean

default:false

Whether to enable streaming output. When true, returns chat.completion.chunk objects incrementally.

max_tokens

integer

Maximum number of tokens the model can output. If not set, the model's internal default limit is used.

Required range: x >= 1

max_completion_tokens

integer

Maximum completion tokens, including reasoning and visible output tokens. If not set, the model's internal default limit is used.

Required range: x >= 1

temperature

number

default:1

Controls randomness in generation. Higher values produce more diverse output; lower values produce more deterministic output.

Required range: 0 <= x <= 2

top_p

number

default:1

Nucleus sampling. Only tokens with cumulative probability up to top_p are considered.

Required range: x <= 1

top_k

integer

default:0

Sample only from the top K most probable tokens. 0 disables it.

Required range: x >= 0

min_p

number

default:0

Minimum probability threshold relative to the most probable token. Tokens below it are filtered out. 0 disables it.

Required range: 0 <= x <= 1

top_a

number

default:0

Dynamic filtering threshold based on the most probable token. 0 disables it.

Required range: 0 <= x <= 1

repetition_penalty

number

default:1

Penalizes tokens already present in the input. Above 1 suppresses repetition; below 1 encourages it.

Required range: x <= 2

frequency_penalty

number

default:0

Penalizes tokens by their frequency in the output so far. Positive values reduce repetition.

Required range: -2 <= x <= 2

presence_penalty

number

default:0

Penalizes tokens that have already appeared. Positive values encourage new topics.

Required range: -2 <= x <= 2

response_format

object

Controls the output format. Some models may not support structured output and will automatically fall back to text.

Show child attributes

stop

string[]

Stop sequences. Generation stops when any of these strings is encountered.

seed

integer

Seed for reproducibility. With the same parameters and model version, output should be as consistent as possible.

reasoning

object

Options for reasoning models. Sets the thinking effort and budget.

Show child attributes

verbosity

enum<string>

default:medium

Controls how verbose the reply is.

Available options:

low,

medium,

high

logit_bias

object

A JSON object mapping token IDs to bias values (-100 to 100), added to the logits before sampling.

Show child attributes

logprobs

boolean

default:false

Whether to return the log probabilities of the output tokens.

top_logprobs

integer

Number of most likely tokens to return at each position. Requires logprobs to be true.

Required range: 0 <= x <= 20

user

string

A unique identifier for the end user. Use hashed or pseudonymous identifiers to avoid passing personally identifiable information.

previous_response_id

string

The id of a previous response, used to chain state across turns. Only effective for Responses-API models; the gen--prefixed ids of regular completions cannot be used.

Response

Successful chat completion. When stream=false, returns a single chat.completion object. When stream=true, returns a stream of chat.completion.chunk objects (search_done, content, finish, usage), followed by data: [DONE].

Option 1
Option 2

A non-streaming chat completion response. Returned when stream=false.

string

required

The unique identifier for this request.

object

enum<string>

required

The object type, always chat.completion for non-streaming responses.

Available options:

chat.completion

created

integer

required

Unix timestamp (in seconds) of when the completion was created.

model

string

required

The model used for this completion.

choices

object[]

required

A list of completion choices.

Show child attributes

search_results

object[]

Search results. Present only when octen_search was actually triggered.

Show child attributes

usage

object

Token usage information. When stream=true, returned only in the final usage chunk.

Show child attributes

warning

string

Warning message, if any.