# Search Optimization

This guide explains how to use Web Search parameters to improve relevance, latency, and token usage in LLM-integrated applications.

## Key Benefits

* **Semantic Intelligence:** Hybrid keyword and semantic search returns relevant results for natural language queries without hand-tuned query engineering.
* **Context Control:** Granular filters (domain, time, content) reduce noise and align results with specific use cases.
* **Cost & Latency Optimization:** Retrieving highlights instead of full content minimizes token consumption, which matters most in agentic workflows that issue many searches per task.

## Search Configuration

* **Safety (`safesearch`):**
  * **`"strict"` (Default):** Excludes adult content. This is the usual choice for public-facing chatbots, since it keeps adult material out of the context the model generates from.
  * **`"off"`:** Unfiltered results. Required only for specialized research agents (e.g., medical or biological studies) where standard safety filters might incorrectly flag necessary anatomical or scientific content.

## Precision Filtering

* **Domain Filtering (`include_domains` / `exclude_domains`):**
  `include_domains` acts as an allowlist and `exclude_domains` as a denylist.
  * **Trusted Knowledge Base:** Using `include_domains` to restrict retrieval to high-authority sources (e.g., official documentation, `.gov`, or `.edu` sites) creates a "walled garden" that reduces hallucination risk in professional contexts.
  * **Noise Reduction:** Using `exclude_domains` to filter out user-generated content platforms or content farms keeps colloquial or unverified information out of the LLM's context.

* **Content Constraints (`include_text` / `exclude_text`):**
  Enforces or forbids specific keywords within the page content. For example, requiring **"quarterly earnings"** to appear when searching for financial reports, or excluding **"rumor"** to filter out speculative content.

* **Time Sensitivity (`time_basis` / `start_time`):**
  * **Breaking News Mode:** Combining `time_basis: "published"` with a strict `start_time` (e.g., the past 24 hours) excludes older, SEO-optimized evergreen content that would otherwise rank well. This strategy is essential for **news summarization** or **market analysis** agents.

* **Result Count (`count`):**
  Defaults to 5. For direct Q&A tasks, retrieving 3 to 5 results typically offers the best balance between context availability and latency. Higher counts (10+) are recommended for broad topic aggregation tasks.
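
The filters in this section can be combined in a single request body. The following is a minimal sketch rather than an official client: the `query` field name, the list-of-strings shape for the domain and text filters, the ISO 8601 format for `start_time`, and the placeholder domain are all assumptions that should be checked against the API reference.

```python
from datetime import datetime, timedelta, timezone


def breaking_news_payload(query, now=None, hours=24):
    """Build a request body for a 'Breaking News Mode' search.

    Only the parameter names come from this guide; value shapes are assumed.
    """
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours)
    return {
        "query": query,                            # assumed field name
        "count": 10,                               # broad aggregation, see Result Count
        "time_basis": "published",
        "start_time": start.isoformat(),           # assumed ISO 8601 timestamp
        "exclude_domains": ["forum.example.com"],  # placeholder domain
        "exclude_text": ["rumor"],                 # assumed to accept a list
    }
```

Passing `now` explicitly keeps the time window reproducible in tests; in production code the default of the current UTC time is usually what is wanted.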

## Response Content & Format

* **Highlights vs. Full Content:**
  * **Highlights (Default):** Returns relevant, concise snippets. This is the most token-efficient format for **Fact-Checking** and **Q&A**, where the answer is likely contained in a single paragraph.
  * **Full Content:** Returns parsed page text. Necessary for **"Reading Assistant"** agents that need to summarize entire articles, analyze writing style, or extract scattered data points from a long report.
  * **Hybrid Strategy (Highlight-First):** A cost-effective pattern is to request highlights first, assess relevance from the snippets, and then issue a second request for `full_content` only for the few URLs that prove high-value.

* **Output Format (`format`):**
  Controls the output format of the **highlight snippets**.
  * **`"text"` (Default):** Returns plain text.
  * **`"markdown"`:** Returns snippets with basic formatting (e.g., bolding of matching terms) where supported.
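
The highlight-first pattern needs a client-side step that decides which URLs deserve a second, full-content request. The sketch below shows one simple way to make that decision. It assumes each result is a dictionary with a `url` string and a `highlights` list of strings; that response shape and the helper name `select_for_deep_read` are hypothetical, and the term-matching rule is only an illustrative relevance check.

```python
def select_for_deep_read(results, required_terms, max_urls=2):
    """Return up to max_urls URLs whose highlights mention every required term.

    The result shape ('url', 'highlights') is an assumption, not a documented schema.
    """
    selected = []
    for item in results:
        # Join all snippets so a term may appear in any of them.
        text = " ".join(item.get("highlights", [])).lower()
        if all(term.lower() in text for term in required_terms):
            selected.append(item["url"])
    return selected[:max_urls]
```

Only the returned URLs would then be sent in the follow-up request for full content, so the cost of parsed page text is paid for at most `max_urls` pages per query.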

## Performance and Usage Considerations

* **Token Economy:** The `meta.usage` field reports consumption for each request. To minimize operational costs, applications should default to `highlights` and only request `full_content` when user intent explicitly requires deep reading.
* **Metadata Utilization:** Fields like `time_published` can be used for secondary ranking on the client side (e.g., prioritizing the most recent article among the top 5).
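
Client-side secondary ranking by `time_published` can be sketched as follows. This assumes results are dictionaries and that `time_published` is an ISO 8601 string with an explicit UTC offset such as `+00:00`; the actual format is not specified here, so the parsing step may need adjusting. The helper name `newest_first` is hypothetical.

```python
from datetime import datetime


def newest_first(results):
    """Sort results by 'time_published', newest first; undated items go last."""

    def published_key(item):
        raw = item.get("time_published")
        if not raw:
            return (0, 0.0)  # no date: sorts after every dated item
        # Assumed ISO 8601 input, e.g. "2024-05-01T08:00:00+00:00".
        return (1, datetime.fromisoformat(raw).timestamp())

    # sorted() is stable, so items with equal keys keep the engine's order.
    return sorted(results, key=published_key, reverse=True)
```

Because the sort is stable, results that lack a date keep the relative order the search engine gave them, so relevance ranking is preserved wherever recency cannot be compared.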
