POST /extract

curl --request POST \
  --url https://api.octen.ai/extract \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "urls": [
    "https://docs.octen.ai/api-reference/search",
    "https://docs.octen.ai/api-reference/extract"
  ]
}
'

{
  "code": 0,
  "msg": "success",
  "request_id": "req_abc123def456",
  "data": {
    "results": [
      {
        "url": "https://docs.octen.ai/api-reference/search",
        "status": "success",
        "title": "Search - Octen",
        "full_content": "Octen Search API enables ranked web results...",
        "highlights": null,
        "time_published": "2026-01-15T00:00:00Z",
        "time_last_crawled": "2026-04-21T08:30:05Z",
        "page_structure": {
          "primary": "Content Page",
          "secondary": "Article"
        },
        "category": {
          "primary": "Computers, Electronics & Technology",
          "secondary": "Artificial Intelligence"
        }
      },
      {
        "url": "https://www.who.int/news-room/fact-sheets/detail/influenza-(seasonal)",
        "status": "success",
        "title": "Influenza (Seasonal) - World Health Organization (WHO)",
        "full_content": "Seasonal influenza is an acute respiratory infection caused by influenza viruses...",
        "highlights": null,
        "time_published": "2024-10-15T00:00:00Z",
        "time_last_crawled": "2026-04-21T08:30:05Z",
        "page_structure": {
          "primary": "Content Page",
          "secondary": "Article"
        },
        "category": {
          "primary": "Health",
          "secondary": "Infectious Disease"
        }
      }
    ]
  },
  "meta": {
    "usage": {
      "total_urls": 2,
      "successful_urls": 2
    },
    "latency": 1832,
    "warning": null
  }
}

Authorizations

x-api-key
string
header
required

API key used to authenticate requests, sent in the x-api-key header. Obtain an API key before calling the API. Note: a payment method is required to use the API.
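
As a rough illustration of the authentication header, the sketch below builds the same request as the curl example using only the Python standard library. The helper name `build_extract_request` is hypothetical and not part of any Octen SDK; the endpoint URL and header names are taken from the curl example. The request is built but not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
EXTRACT_URL = "https://api.octen.ai/extract"


def build_extract_request(api_key: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST /extract request.

    Hypothetical helper for illustration only.
    """
    body = json.dumps({"urls": urls}).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # authentication header described above
        },
        method="POST",
    )


# Placeholder key; substitute a real one before sending.
request = build_extract_request("<api-key>", ["https://example.com/article-1"])

# Sending performs a network call, so it is left commented out:
# with urllib.request.urlopen(request, timeout=60) as resp:
#     payload = json.load(resp)
```
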

Body

application/json

Request body for the Extract API.

urls
string[]
required

List of URLs to extract content from. A request accepts at most 20 URLs, and each URL may be at most 2048 characters long. Failed URLs are not billed.

Example:
[
  "https://example.com/article-1",
  "https://example.com/article-2"
]
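
Because a single request is limited to 20 URLs, longer lists have to be split on the client side. The sketch below shows one way to do that. The function name `batch_urls` is hypothetical, and measuring the 2048 limit in characters is an assumption, since the limit is given without a unit.

```python
MAX_URLS_PER_REQUEST = 20  # documented per-request limit
MAX_URL_LENGTH = 2048      # documented per-URL limit (assumed to be characters)


def batch_urls(urls: list[str]) -> list[list[str]]:
    """Split urls into lists of at most MAX_URLS_PER_REQUEST items.

    Raises ValueError if any URL is longer than MAX_URL_LENGTH.
    """
    for url in urls:
        if len(url) > MAX_URL_LENGTH:
            raise ValueError(
                f"URL exceeds {MAX_URL_LENGTH} characters: {url[:60]}..."
            )
    return [
        urls[start:start + MAX_URLS_PER_REQUEST]
        for start in range(0, len(urls), MAX_URLS_PER_REQUEST)
    ]
```

Each returned batch can then be sent as the urls field of its own request.
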
query
string

Intent-focused keywords. When provided, returns query-relevant highlights per URL; otherwise returns the complete page content.

Maximum string length: 500
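
A small sketch of how a client might handle both modes. It assumes that a result carries a non-null highlights field when query was sent and a null one otherwise, as in the sample response; the type of highlights is not documented on this page, so the value is passed through unchanged. Both function names are hypothetical.

```python
from typing import Any, Optional

MAX_QUERY_LENGTH = 500  # documented maximum string length


def check_query(query: Optional[str]) -> Optional[str]:
    """Return query unchanged, or raise ValueError if it is too long."""
    if query is not None and len(query) > MAX_QUERY_LENGTH:
        raise ValueError(f"query exceeds {MAX_QUERY_LENGTH} characters")
    return query


def pick_content(result: dict[str, Any]) -> Any:
    """Prefer highlights when present, otherwise fall back to full_content."""
    highlights = result.get("highlights")
    if highlights is not None:
        return highlights
    return result.get("full_content")
```
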
max_age_seconds
integer
default:86400

Maximum age (in seconds) of cached content. URLs whose cached version exceeds this threshold will be re-fetched. Values outside the allowed range are adjusted to the nearest bound.

Required range: x >= 300
format
enum<string>
default:markdown

Format of the returned content.

Available options: markdown, text
timeout
integer
default:30

Per-URL extraction timeout in seconds. Values outside the allowed range are adjusted to the nearest bound.

Required range: 1 <= x <= 60
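
Both timeout and max_age_seconds are described as being adjusted to the nearest bound rather than rejected. The sketch below mirrors that rule on the client, which can help predict the value that will actually apply. It is an illustration of the documented ranges only; the helper names are hypothetical and the server's exact behavior is not shown on this page.

```python
from typing import Optional


def clamp(value: int, lower: int, upper: Optional[int] = None) -> int:
    """Move value to the nearest bound of [lower, upper]; upper=None means unbounded."""
    if value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value


def effective_timeout(timeout: int = 30) -> int:
    """Documented range for timeout: 1 <= x <= 60 (seconds)."""
    return clamp(timeout, 1, 60)


def effective_max_age(max_age_seconds: int = 86400) -> int:
    """Documented range for max_age_seconds: x >= 300."""
    return clamp(max_age_seconds, 300)
```
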
include_images
boolean
default:false

Whether to return image URLs detected on the page.

include_videos
boolean
default:false

Whether to return video URLs detected on the page.

include_audio
boolean
default:false

Whether to return audio URLs detected on the page.

include_favicon
boolean
default:false

Whether to return the page's favicon URL.
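
To tie the body parameters together, here is a minimal sketch that models them as a dataclass using the documented defaults and serializes only the fields that differ from those defaults. The names `ExtractOptions` and `to_body` are hypothetical. Omitting default-valued fields is a design choice of this sketch, not a requirement stated by the API.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional

ALLOWED_FORMATS = ("markdown", "text")  # documented enum values


@dataclass
class ExtractOptions:
    """Optional body parameters with their documented defaults."""
    query: Optional[str] = None
    max_age_seconds: int = 86400
    format: str = "markdown"
    timeout: int = 30
    include_images: bool = False
    include_videos: bool = False
    include_audio: bool = False
    include_favicon: bool = False


def to_body(urls: list[str], options: ExtractOptions) -> dict[str, Any]:
    """Build the JSON body, leaving out options that equal their defaults."""
    if options.format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
    defaults = asdict(ExtractOptions())
    body: dict[str, Any] = {"urls": list(urls)}
    for name, value in asdict(options).items():
        if value != defaults[name]:
            body[name] = value
    return body
```

For example, `to_body(urls, ExtractOptions(format="text", include_images=True))` produces a body containing only urls, format, and include_images.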

Response

Successful extraction response

code
integer

Business status code. 0 indicates success.

msg
string

A human-readable message describing the result.

request_id
string

The unique identifier for this request.

data
object

The main extract response payload.

meta
object

Additional metadata for the extract request.
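
The envelope fields above can be checked before touching the payload. The sketch below raises when code is non-zero and otherwise separates results by their status field. It assumes the layout of the sample response (data.results, with status on each entry) and treats any status other than "success" as a failure, which is an assumption since other status values are not listed here. `ExtractError` and `split_results` are hypothetical names.

```python
from typing import Any


class ExtractError(Exception):
    """Raised when the response envelope reports a non-zero code."""


def split_results(
    response: dict[str, Any],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Return (succeeded, failed) result lists from an extract response."""
    if response.get("code") != 0:
        raise ExtractError(
            f"{response.get('msg')} (request_id={response.get('request_id')})"
        )
    results = (response.get("data") or {}).get("results") or []
    succeeded = [r for r in results if r.get("status") == "success"]
    failed = [r for r in results if r.get("status") != "success"]
    return succeeded, failed
```

Including request_id in the error message keeps the identifier at hand when reporting a failed call.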