Web Search Best Practice

This guide outlines best practices for using the API’s features to get fast, relevant results.

Key Benefits for LLM-Integrated Applications

Natural Language & Semantic Understanding: The API handles natural language queries using a combination of keyword and semantic search. This means you can search in plain language and still get highly relevant results without manually tweaking the query.
Flexible Result Filtering: Built-in filters (by domain, content keywords, date range, and country) let you narrow results so you get precisely the kind of content you need.
Token-Efficient Outputs: You can retrieve highlights instead of full content to drastically reduce token usage and cost. This makes it ideal for LLM applications where context window size and efficiency are important.

Leveraging Search Modes and Domain Optimization

Octen Web Search provides parameters to adjust how the search is performed and to target specific content domains:

Search Type (search_type): This controls the retrieval strategy. Keep the default, "auto", in most cases; it automatically chooses the optimal approach based on the query. You can force "keyword" for strict literal matching (e.g. when searching for exact terms or troubleshooting why a term isn’t appearing), or "semantic" to prioritize concept-based matching.
SafeSearch (safesearch): This setting filters out explicit or adult content. The default is "strict" (which excludes adult content). It’s recommended to keep strict for general use, especially in user-facing applications. You can set safesearch: "off" to allow unfiltered results if your use case requires it. Always ensure this setting aligns with your application’s content policy.
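
The two parameters above can be sketched as a small request-body builder. This is a minimal illustration, not the official client: the helper name build_search_body is hypothetical, the flat JSON body shape is an assumption, and the allowed-value sets contain only the values named in this guide (the API may accept others).

```python
# Values taken from this guide; the API may accept more (assumption).
ALLOWED_SEARCH_TYPES = {"auto", "keyword", "semantic"}
ALLOWED_SAFESEARCH = {"strict", "off"}


def build_search_body(query, search_type="auto", safesearch="strict"):
    """Return a request body dict (hypothetical helper, assumed flat shape)."""
    if not query or not query.strip():
        raise ValueError("query is required")
    if search_type not in ALLOWED_SEARCH_TYPES:
        raise ValueError(f"unsupported search_type: {search_type!r}")
    if safesearch not in ALLOWED_SAFESEARCH:
        raise ValueError(f"unsupported safesearch: {safesearch!r}")
    return {
        "query": query.strip(),
        "search_type": search_type,
        "safesearch": safesearch,
    }


# Defaults follow the guide: "auto" retrieval and "strict" SafeSearch.
body = build_search_body("how do mRNA vaccines work")
# Force literal matching when debugging why an exact term is missing.
debug_body = build_search_body('"ERR_CONNECTION_RESET"', search_type="keyword")
```

Validating on the client side keeps typos such as "semantics" from turning into a 400 response later.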

Filtering Results for Precision

To tailor the search results to your needs, take advantage of the API’s filtering parameters. These help narrow down results by source, content, time, and region:

Domain Filters (include_domains / exclude_domains): Use include_domains to whitelist specific websites or domains. This ensures results only come from those sites. For example, to search only Wikipedia and the official WHO site: "include_domains": ["wikipedia.org", "who.int"]. Conversely, use exclude_domains to omit certain domains from the results. Domain filtering is useful for focusing on high-quality or relevant sources and eliminating noise.
Content Filters (include_text / exclude_text): These parameters let you require or reject certain keywords within the text of the page results:
- include_text: An array of words/phrases that must appear in the page content. If all your results absolutely need to contain a specific term or phrase, list it here to ensure only pages containing that text are returned.
- exclude_text: An array of words/phrases that should not appear in the page content. Use this to filter out pages that contain irrelevant or undesired terms.
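
As a rough sketch of how the domain and content filters might be attached to a request body: the four parameter names come from this guide, while the helper with_source_filters and the flat body shape are assumptions made for illustration.

```python
def with_source_filters(body, include_domains=None, exclude_domains=None,
                        include_text=None, exclude_text=None):
    """Return a copy of body with any non-empty filters added (hypothetical helper)."""
    clash = set(include_domains or []) & set(exclude_domains or [])
    if clash:
        # A domain cannot be both required and rejected.
        raise ValueError(f"domains both included and excluded: {sorted(clash)}")
    out = dict(body)  # do not mutate the caller's dict
    filters = {
        "include_domains": include_domains,
        "exclude_domains": exclude_domains,
        "include_text": include_text,
        "exclude_text": exclude_text,
    }
    for key, values in filters.items():
        if values:  # skip None and empty lists
            out[key] = list(values)
    return out


base = {"query": "measles vaccination schedule"}
filtered = with_source_filters(
    base,
    include_domains=["wikipedia.org", "who.int"],
    exclude_text=["sponsored"],
)
```

Omitting empty filters keeps the request small and avoids relying on how the API treats an empty array, which this guide does not specify.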
Time Filters (time_basis, start_time, end_time): When freshness matters, filter by publish or crawl date:
- Use time_basis to choose how the time filter is applied – "published" uses the content’s publication date, whereas "crawled" uses the search index’s last crawl time. "auto" will choose automatically.
- Set start_time and/or end_time to define a date range in ISO 8601 format. For example, to get results from the past month, you could set time_basis: "published", start_time to one month ago, and end_time to now. This ensures only content published in that range appears. If you only set start_time, it will fetch results published after that date; if only end_time, then results published up to that date.
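
The "past month" example above could be computed as follows. This is a sketch under assumptions: the helper name last_n_days_filter is hypothetical, "one month" is approximated as 30 days, and the timestamps are emitted as ISO 8601 with a UTC offset (the guide only says ISO 8601, so check which variants the API accepts).

```python
from datetime import datetime, timedelta, timezone


def last_n_days_filter(days, now=None, time_basis="published"):
    """Build time-filter fields covering the last `days` days (hypothetical helper)."""
    if days <= 0:
        raise ValueError("days must be positive")
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days)
    return {
        "time_basis": time_basis,
        # ISO 8601 with an explicit offset, e.g. 2024-03-01T12:00:00+00:00
        "start_time": start.isoformat(timespec="seconds"),
        "end_time": now.isoformat(timespec="seconds"),
    }


# Merge the time window into a request body.
body = {"query": "central bank rate decision", **last_n_days_filter(30)}
```

Passing `now` explicitly makes the window reproducible in tests; in production the default uses the current UTC time.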
Result Count (count): Control how many results you retrieve per query (range 1 to 100, default is 5). For most questions, the top few results are sufficient and keep the response small. However, if you are building an app that does its own analysis or summary, you might request 10 or more results for broader coverage. Best Practice: Start with a moderate number (e.g. 5) and adjust based on your needs. Keep in mind that more results use more tokens and may yield diminishing returns in relevance.

Optimizing Response Content and Format

Octen Web Search gives you flexibility in what kind of content to retrieve. Choosing the appropriate format can make your application more efficient:

Highlights vs. Full Content: By default, the API can return highlights – short snippets of each page that are most relevant to your query. These are highly token-efficient because they include only the pertinent sentences. In contrast, requesting full_content gives you the entire page text.
- Highlights: For many use cases like question-answering, chatbots, or multi-turn RAG workflows, highlights are ideal. For example, a single highlight snippet might contain the exact fact or passage you need from a webpage, saving you from handling thousands of irrelevant tokens. You can limit the size of each highlight with highlight.max_tokens.
- Full Content: If your application needs to perform comprehensive analysis of the source material – for instance, reading an entire article or doing extraction of multiple details – enable full_content. You can limit the size of full text with full_content.max_tokens to avoid very large payloads. Full content is best when you’re unsure what part of the page is relevant and need the whole thing for context.
- Combine Strategically: You can request both highlights and full content together in one call if needed. One strategy is to primarily rely on highlights for quick insight and only fall back to the full text if something is unclear or if deeper reading is required. This way, your response includes the efficient snippets and also the full text when you truly need it, giving maximum flexibility.
Format (Markdown vs Plain Text): The API can return text in either markdown or plain text (format: "markdown" or "text"). By default, highlights are returned as text unless you specify otherwise. Choose the format that best fits your post-processing.
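
The content options above might be assembled like this. Treat it as a sketch resting on several assumptions that this guide does not confirm: that highlight and full_content are nested objects (inferred from the dotted names highlight.max_tokens and full_content.max_tokens), that including the object is what enables that content type, and that format is a top-level field. The helper name with_content_options is hypothetical.

```python
def _positive_int(value, name):
    """Validate that a token limit is a positive integer."""
    if not isinstance(value, int) or value <= 0:
        raise ValueError(f"{name} must be a positive integer")
    return value


def with_content_options(body, highlight_max_tokens=None,
                         full_content_max_tokens=None, text_format=None):
    """Return a copy of body with content options set (assumed request shape)."""
    out = dict(body)
    if highlight_max_tokens is not None:
        # Assumption: nested object, per the dotted name highlight.max_tokens.
        out["highlight"] = {
            "max_tokens": _positive_int(highlight_max_tokens, "highlight.max_tokens")
        }
    if full_content_max_tokens is not None:
        # Assumption: nested object, per the dotted name full_content.max_tokens.
        out["full_content"] = {
            "max_tokens": _positive_int(full_content_max_tokens, "full_content.max_tokens")
        }
    if text_format is not None:
        if text_format not in ("markdown", "text"):
            raise ValueError('format must be "markdown" or "text"')
        out["format"] = text_format  # assumption: top-level placement
    return out


# Highlights only: the token-efficient choice for question answering.
qa_body = with_content_options({"query": "boiling point of ethanol"},
                               highlight_max_tokens=200)
# Highlights plus capped full text, for deeper reading when needed.
deep_body = with_content_options({"query": "EU AI Act summary"},
                                 highlight_max_tokens=200,
                                 full_content_max_tokens=4000,
                                 text_format="markdown")
```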

Performance and Usage Considerations

Efficient use of the API ensures faster responses and helps you stay within any rate limits or token budgets:

Minimize Unnecessary Data: Only request what you need. If highlights are enough, avoid full content. Likewise, don’t set count higher than required.
Control Token Usage: Full content increases the token count; check meta.usage.full_content_tokens in the response to see how many tokens were returned. Highlights, by contrast, carry no token charge, which can lower costs. Keeping payloads small also matters when you feed the data into an LLM with a limited context window.
Inspect Result Metadata: The API response includes metadata fields like authors, time_published, and time_last_crawled. These can be useful for additional filtering. For example, you can prefer results with a recent time_published if freshness is critical.
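
A client-side freshness filter over the response metadata could look like the sketch below. The field names time_published and meta.usage.full_content_tokens come from this guide; the top-level results list, the ISO 8601 encoding of time_published, and the helper names are assumptions.

```python
from datetime import datetime, timezone


def parse_iso(ts):
    """Parse an ISO 8601 timestamp; treat naive values as UTC (assumption)."""
    # Older Python versions do not accept a trailing "Z", so normalize it.
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def fresh_results(response, cutoff):
    """Keep results whose time_published is at or after cutoff."""
    kept = []
    for item in response.get("results", []):  # "results" key is assumed
        ts = item.get("time_published")
        if ts and parse_iso(ts) >= cutoff:
            kept.append(item)
    return kept


def full_content_tokens(response):
    """Read meta.usage.full_content_tokens, defaulting to 0 when absent."""
    return response.get("meta", {}).get("usage", {}).get("full_content_tokens", 0)


sample = {
    "results": [
        {"url": "https://example.com/a", "time_published": "2024-05-10T08:00:00Z"},
        {"url": "https://example.com/b", "time_published": "2023-01-01T00:00:00+00:00"},
        {"url": "https://example.com/c"},  # no publication date available
    ],
    "meta": {"usage": {"full_content_tokens": 1234}},
}
recent = fresh_results(sample, datetime(2024, 5, 1, tzinfo=timezone.utc))
```

Results without a time_published value are dropped here; an application that prefers recall over freshness might keep them instead.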
Error Handling: Integrate proper error handling for network issues or API errors. Common error codes include 400 for bad requests (e.g. missing required fields like query), 401 for invalid API keys, 403 for insufficient balance, 429 for rate limiting, and 500 for internal errors.
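
One way to act on these status codes is to retry only the transient ones (429 and 500) with exponential backoff and fail fast on the rest. The sketch below is transport-agnostic: send is a caller-supplied function returning a (status, body) pair, so no endpoint or HTTP library is assumed. The names SearchError and call_with_retry, the retry counts, and treating 200 as success are illustrative choices, not part of the API.

```python
import time

RETRYABLE = {429, 500}  # rate limiting and internal errors
HINTS = {
    400: "bad request (e.g. missing required field such as query)",
    401: "invalid API key",
    403: "insufficient balance",
    429: "rate limited",
    500: "internal error",
}


class SearchError(Exception):
    """Raised when a request fails and should not (or can no longer) be retried."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


def call_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds, retrying transient errors with backoff."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status in RETRYABLE and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            continue
        raise SearchError(status, HINTS.get(status, "unexpected error"))
```

Errors such as 400, 401, and 403 will not succeed on retry, so surfacing them immediately avoids wasted calls and makes the cause visible to the caller.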
Stay Updated for Evolving LLM Integration Features: We are continuously evolving Octen to support advanced LLM and agent use cases. Watch for updates to parameters and output schemas. Best practices may evolve (for example, new parameters might be introduced for better relevancy or new filtering options).
