Automatically decomposes user messages into multiple sub-queries, performs searches, and synthesizes results using an LLM.
API key used for request authentication. Obtain an API key before using the API. Note: A payment method is required to use the API.
Request body for the Broad Search API.
A list of messages comprising the conversation so far. User and assistant messages in chronological order for multi-turn conversations.
A single message in the conversation. Discriminated by role.
The model to use for query decomposition and response synthesis.
anthropic/claude-sonnet-4.6, anthropic/claude-opus-4.6, anthropic/claude-haiku-4.5, google/gemini-3-flash-preview, google/gemini-3.1-pro-preview, google/gemini-3.1-flash-lite-preview, openai/gpt-5.4, openai/gpt-oss-120b, moonshotai/kimi-k2.5, minimax/minimax-m2.5 Controls the execution depth. queries_only: only decompose the message into sub-queries without performing searches; queries_and_search: decompose into sub-queries and return search results without LLM synthesis; full: decompose, search, and synthesize a final response using the LLM.
queries_only, queries_and_search, full Maximum number of sub-queries to generate.
1 <= x <= 30Search-related options. Shares the same parameters and defaults as the Search API, except highlight.max_tokens defaults to 256 (instead of 512). Queries are automatically generated from the messages.
Whether to enable streaming output. When true, returns chat.completion.chunk objects incrementally with types: queries, search_done, content, finish, and usage.
Successful broad search response. When stream=false, returns a single chat.completion object with queries and search_results at the top level. When stream=true, returns a stream of chat.completion.chunk objects with types: queries (generated sub-queries), search_done (search results), content (incremental content), finish (completion signal), and usage (token usage).
A non-streaming Broad Search response. Returned when stream=false.
The unique identifier for this request.
The object type, always chat.completion for non-streaming responses.
chat.completion Unix timestamp (in seconds) of when the completion was created.
The model used for this completion.
A list of completion choices containing the synthesized response.
The list of sub-queries automatically generated from the user message by the system.
Search results grouped by query. Each auto-generated sub-query has a corresponding result group.
Metadata for the Broad Search response.
Warning message, if any.