Search Optimization

This guide explains how to use Web Search parameters to optimize performance, relevance, and token usage in LLM-integrated applications.

Key Benefits

Semantic Intelligence: Hybrid keyword and semantic search ensures high relevance from natural language queries without complex engineering.
Context Control: Granular filters (domain, time, content) reduce noise and align results with specific use cases.
Cost & Latency Optimization: Retrieving highlights instead of full content minimizes token consumption, which is essential for agentic workflows that issue many searches per task.

Search Configuration

Safety (safesearch):
- "strict" (Default): Excludes adult content. Standard for public-facing chatbots, where it keeps inappropriate source material out of the model's context.
- "off": Unfiltered results. Required only for specialized research agents (e.g., medical or biological studies) where standard safety filters might incorrectly flag necessary anatomical or scientific content.

Precision Filtering

Domain Filtering (include_domains / exclude_domains): include_domains acts as an allowlist and exclude_domains as a blocklist.
- Trusted Knowledge Base: Using include_domains to restrict retrieval to high-authority sources (e.g., official documentation, .gov, or .edu sites) creates a “walled garden” that can reduce hallucination risk in professional contexts.
- Noise Reduction: Using exclude_domains to filter out user-generated content platforms or content farms prevents the LLM from ingesting colloquial or unverified information.
Content Constraints (include_text / exclude_text): Enforces or forbids specific keywords within the page content. For example, requiring “quarterly earnings” to appear when searching for financial reports, or excluding “rumor” to filter out speculative content.
Time Sensitivity:
- Breaking News Mode: Combining time_basis: "published" with a strict start_time (e.g., past 24 hours) restricts results to recently published pages, so older SEO-optimized evergreen content is excluded. This strategy is essential for news summarization or market analysis agents.
Result Count (count): Defaults to 5. For direct Q&A tasks, retrieving 3-5 results typically offers the best balance between context availability and latency. Higher counts (10+) are recommended for broad topic aggregation tasks.
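
As a rough sketch, the filters above could be combined into a single request payload. The parameter names (include_domains, exclude_text, time_basis, start_time, count) come from this guide; the helper name build_breaking_news_query, the "query" key, the list-valued exclude_text, and the ISO 8601 format for start_time are assumptions made for illustration and should be checked against the API reference.

```python
from datetime import datetime, timedelta, timezone


def build_breaking_news_query(query, trusted_domains, now=None, hours=24, count=5):
    """Build a search payload restricted to recently published pages.

    Hypothetical helper: the "query" key and the ISO 8601 format of
    start_time are assumptions, not confirmed by this guide.
    """
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours)
    return {
        "query": query,
        "include_domains": list(trusted_domains),  # allowlist of sources
        "exclude_text": ["rumor"],                 # drop speculative pages
        "time_basis": "published",                 # filter on publication time
        "start_time": start.isoformat(),           # assumed timestamp format
        "count": count,                            # 3-5 suits direct Q&A
    }


# A fixed "now" keeps the example reproducible.
payload = build_breaking_news_query(
    "central bank rate decision",
    ["example.com", "example.org"],
    now=datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc),
)
```

Keeping the payload construction in one function makes it easy to swap the time window or domain list per agent without touching the request code.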

Response Content & Format

Highlights vs. Full Content:
- Highlights (Default): Returns relevant, concise snippets. This is the most token-efficient format for Fact-Checking and Q&A, where the answer is likely contained in a single paragraph.
- Full Content: Returns parsed page text. Necessary for “Reading Assistant” agents that need to summarize entire articles, analyze writing style, or extract scattered data points from a long report.
- Hybrid Strategy (Highlight-First): A cost-effective pattern involves requesting highlights first to assess relevance, and then triggering a second request for full_content only on the specific high-value URLs.
Output Format (format): Controls how the highlight snippets are formatted.
- "text" (Default): Returns plain text.
- "markdown": Returns snippets with basic formatting (e.g., bolding of matching terms) where supported.
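
The highlight-first pattern described above might look like the following sketch. The search callable is injected by the caller, and its signature (a content argument and a urls argument for the follow-up request) is hypothetical; this guide does not specify how the second request is scoped to particular URLs. fake_search is a stand-in used only to make the example runnable.

```python
def highlight_first(search, query, is_relevant, max_full=2):
    """Two-stage retrieval: cheap highlights first, full content only
    for URLs whose snippet passes the relevance check.

    `search` is a caller-supplied function with an assumed signature:
        search(query, content, urls=None) -> list of {"url": ..., "text": ...}
    """
    snippets = search(query, content="highlights")
    selected = [r["url"] for r in snippets if is_relevant(r["text"])][:max_full]
    if not selected:
        return snippets  # nothing worth a deeper read; keep the cheap results
    return search(query, content="full_content", urls=selected)


def fake_search(query, content, urls=None):
    """Stand-in for a real client so the sketch can run offline."""
    pages = {
        "https://example.com/a": "Quarterly earnings rose 12 percent.",
        "https://example.com/b": "Unrelated gardening tips.",
    }
    if content == "highlights":
        return [{"url": u, "text": t} for u, t in pages.items()]
    return [{"url": u, "text": pages[u] + " [full page text]"} for u in urls]


results = highlight_first(
    fake_search,
    "quarterly earnings",
    is_relevant=lambda text: "earnings" in text.lower(),
)
```

The max_full cap bounds the worst-case token cost of the second stage, whatever the relevance check returns.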

Performance and Usage Considerations

Token Economy: The meta.usage field in the response reports consumption. To minimize operational costs, applications should default to highlights and request full_content only when user intent explicitly requires deep reading.
Metadata Utilization: Fields like time_published can be used for secondary ranking on the client side (e.g., prioritizing the newest article among the top 5).
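
A minimal sketch of this client-side re-ranking follows. It assumes each result is a dict and that time_published is an ISO 8601 string with consistent timezone information; neither detail is confirmed by this guide, and newest_first is a hypothetical helper name.

```python
from datetime import datetime


def newest_first(results, top_n=5):
    """Re-order the first top_n results by time_published, newest first.

    Assumes time_published is an ISO 8601 string. Results without a
    parseable timestamp keep their relative order after the dated ones,
    and anything beyond top_n is left untouched.
    """
    head, tail = results[:top_n], results[top_n:]

    def parse(result):
        try:
            return datetime.fromisoformat(result["time_published"])
        except (KeyError, TypeError, ValueError):
            return None

    dated = [(parse(r), r) for r in head]
    with_ts = sorted(
        (pair for pair in dated if pair[0] is not None),
        key=lambda pair: pair[0],
        reverse=True,
    )
    without_ts = [r for ts, r in dated if ts is None]
    return [r for _, r in with_ts] + without_ts + tail


results = [
    {"url": "a", "time_published": "2024-01-01T08:00:00+00:00"},
    {"url": "b", "time_published": "2024-01-03T08:00:00+00:00"},
    {"url": "c"},  # no timestamp available
    {"url": "d", "time_published": "2024-01-02T08:00:00+00:00"},
]
ordered = [r["url"] for r in newest_first(results)]
```

Only the top results are re-ordered, so the engine's relevance ranking still decides which pages are considered at all.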
