Creates a chat completion with web search augmentation. Supports multiple LLM providers, streaming, tool calling, structured output, and reasoning models.
API key used for request authentication. Obtain an API key before using the API.
Request body for the Web Chat API.
The model to use for chat completion. Supported values: anthropic/claude-sonnet-4.6, anthropic/claude-opus-4.6, anthropic/claude-haiku-4.5, google/gemini-3-flash-preview, google/gemini-3.1-pro-preview, google/gemini-3.1-flash-lite-preview, openai/gpt-5.4, openai/gpt-oss-120b, moonshotai/kimi-k2.5, minimax/minimax-m2.5.
A list of messages comprising the conversation so far: the system prompt followed by user and assistant messages in chronological order.
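A minimal sketch of the request body described above, built as a plain dict. The field name for the web-search switch (`web_search`) is an assumption for illustration; the model list and message ordering come from this reference.

```python
# Minimal Web Chat API request body as a plain dict.
# "web_search" is an assumed field name for the on/off search switch.
payload = {
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [
        # System prompt first, then user/assistant turns in chronological order.
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What changed in the latest release?"},
    ],
    "web_search": "on",  # "on" (default) or "off"
}
```

This dict would be serialized as the JSON request body; authentication uses the API key described above.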
A single message in the conversation. Discriminated by role.
Whether to enable web search augmentation. Allowed values: on, off. on enables web search (default); off disables it and queries the model directly.
Search-related options for the Web Chat API. All parameters are optional and share the same semantics and defaults as the Web Search API. The query is generated automatically from the messages and does not need to be provided.
Whether to enable streaming output. When true, returns chat.completion.chunk objects incrementally.
Maximum number of tokens the model can output (x >= 1). If not set, the model's internal default limit is used.
Maximum completion tokens, including reasoning tokens and visible output tokens (x >= 1). If not set, the model's internal default limit is used.
Controls randomness in generation (0 <= x <= 2). Higher values produce more diverse output; lower values produce more deterministic output.
Nucleus sampling parameter (x <= 1). Only tokens with cumulative probability up to top_p are considered. Smaller values produce more conservative output.
Penalizes tokens based on their frequency in the output so far (-2 <= x <= 2). Positive values reduce repetition; negative values encourage consistency.
Penalizes tokens that have already appeared in the output (-2 <= x <= 2). Positive values encourage topic diversity; negative values encourage focus.
Controls the output format. Some models may not support structured output and will automatically fall back to text.
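A sketch of a structured-output request combining the sampling parameters above with response_format. The JSON-schema shape shown here follows the OpenAI convention; treat the exact layout as an assumption, since (as noted above) some models fall back to plain text regardless.

```python
# Hypothetical structured-output request. The response_format layout
# follows the OpenAI-style JSON-schema convention (an assumption).
payload = {
    "model": "openai/gpt-5.4",
    "messages": [
        {"role": "user", "content": "Extract the city and country from: 'I flew to Paris, France.'"},
    ],
    "temperature": 0.2,  # 0 <= x <= 2; low value for deterministic extraction
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
}
```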
Stop sequences. Generation stops when any of these strings is encountered. May not be supported by all models.
Seed for reproducibility. With the same parameters and model version, output should be as consistent as possible.
Options for reasoning models.
A JSON object mapping token IDs to bias values (-100 to 100). The bias is added to the model's logits before sampling.
Whether to return log probabilities of output tokens.
Number of most likely tokens to return at each position with their log probabilities (0 <= x <= 20). Requires logprobs to be true.
Tool definitions for function calling. Follows the OpenAI tool-calling format; definitions are converted automatically for non-OpenAI models.
Controls tool invocation behavior. Allowed values: none (never call tools), auto (model decides), required (must call a tool). Can also be an object specifying a particular function: {"type": "function", "function": {"name": "my_function"}}.
A unique identifier for the end user. Use hashed or pseudonymous identifiers to avoid passing personally identifiable information.
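The tool-calling fields above can be sketched as follows. The get_weather function is a hypothetical example; only the outer {"type": "function", ...} shapes for tools and tool_choice come from this reference.

```python
# Tool definition in the OpenAI tool-calling format described above.
# get_weather is a hypothetical function used for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
    # Object form of tool_choice: force a call to one named function
    # instead of the none/auto/required string values.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

Passing "auto" (or omitting tool_choice) instead would let the model decide whether to call the tool.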
Successful chat completion response. When stream=false, returns a single chat.completion object. When stream=true, returns a stream of chat.completion.chunk objects with types: search_done (search results), content (incremental content), finish (completion signal), and usage (token usage).
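The four streaming chunk types listed above (search_done, content, finish, usage) can be dispatched as in this sketch. The chunk dicts are illustrative stand-ins for parsed chat.completion.chunk objects; their exact field layout is an assumption, while the type names come from this reference.

```python
# Accumulate a streamed response by dispatching on the chunk's type.
# Field names inside each chunk (delta, search_results, usage) are
# assumptions for illustration.
def handle_chunk(chunk, state):
    kind = chunk["type"]
    if kind == "search_done":
        # Search results arrive once, before the model's content.
        state["search_results"] = chunk.get("search_results", [])
    elif kind == "content":
        # Incremental text; append to the running output.
        state["text"] += chunk.get("delta", "")
    elif kind == "finish":
        state["done"] = True
    elif kind == "usage":
        state["usage"] = chunk.get("usage")
    return state

state = {"text": "", "done": False}
for chunk in [
    {"type": "search_done", "search_results": [{"url": "https://example.com"}]},
    {"type": "content", "delta": "Hello"},
    {"type": "content", "delta": " world"},
    {"type": "finish"},
    {"type": "usage", "usage": {"total_tokens": 12}},
]:
    state = handle_chunk(chunk, state)
# state["text"] is now "Hello world"
```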
A non-streaming chat completion response. Returned when stream=false.
The unique identifier for this request.
The object type, always chat.completion for non-streaming responses.
Unix timestamp (in seconds) of when the completion was created.
The model used for this completion.
A list of completion choices.
Search results used to augment the response. Only present when web search was used.
Metadata for the chat completion response.
Warning message, if any.