Converts multimodal input (text, images, and videos) into vector representations. Supports a single fused vector across modalities, independent per-element vectors, configurable output dimensions, video frame sampling control, and a custom task instruction.
API key used for request authentication. Obtain an API key before calling the API; note that a payment method is required.
The multimodal embedding model used for this request. Available options: octen-vl-embedding, octen-vl-embedding-large
The multimodal content to be vectorized. Supports text, images, videos, and combinations of these. Limits: at most 20 total elements, 5 images, and 1 video per request.
Whether to generate a fused embedding. When true, all elements in contents are fused into a single vector; when false, each element produces an independent vector.
The dimensionality of the output embedding vectors. Defaults to the model's max dimension (octen-vl-embedding: 2048, octen-vl-embedding-large: 4096). Any positive integer ≤ the model's max dimension is allowed.
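The parameters above can be validated client-side before sending a request. The sketch below assembles a request body under stated assumptions: the field names (`model`, `contents`, `fused`, `dimension`) and the payload shape are hypothetical, not confirmed by this page; only the limits (20 elements, 5 images, 1 video, per-model max dimensions) come from the documentation.

```python
# Hypothetical request-body builder for the VL embedding API.
# Field names are assumptions; the numeric limits are from the docs.

MAX_DIMENSION = {"octen-vl-embedding": 2048, "octen-vl-embedding-large": 4096}

def build_payload(model, contents, fused=False, dimension=None):
    """Validate documented limits and assemble a request body dict."""
    if model not in MAX_DIMENSION:
        raise ValueError(f"unknown model: {model}")
    if len(contents) > 20:
        raise ValueError("at most 20 total elements per request")
    if sum(1 for c in contents if "image" in c) > 5:
        raise ValueError("at most 5 images per request")
    if sum(1 for c in contents if "video" in c) > 1:
        raise ValueError("at most 1 video per request")
    # Default to the model's maximum output dimension.
    dim = dimension if dimension is not None else MAX_DIMENSION[model]
    if not (0 < dim <= MAX_DIMENSION[model]):
        raise ValueError("dimension must be a positive integer <= model max")
    return {"model": model, "contents": contents, "fused": fused, "dimension": dim}

payload = build_payload(
    "octen-vl-embedding",
    [{"text": "a red bicycle"}, {"image": "https://example.com/bike.jpg"}],
    fused=True,
)
```

With `fused=True`, the two elements above would be embedded into a single 2048-dimensional vector; with `fused=False`, the response would carry one vector per element.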
Controls the frame sampling density for video inputs. Smaller values reduce the number of extracted frames and lower video token consumption. Range: 0 <= x <= 1
Custom task description used to guide the model in understanding the query intent. Its length counts toward input_tokens and shares the 32,000-token total context limit with contents.
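The two optional controls above can be sketched as client-side checks. Assumptions: the field names (`frame_sampling`, `instruct`) are hypothetical, and `count_tokens` is a stand-in for the model's real tokenizer; the [0, 1] range and the 32,000-token shared context limit are from the documentation.

```python
# Sketch of client-side checks for video frame sampling and a custom
# task instruction. Field names are assumptions; limits are documented.

CONTEXT_LIMIT = 32_000  # tokens shared between instruct and contents

def with_video_options(payload, frame_sampling=None, instruct=None,
                       count_tokens=len):
    """Attach optional fields, enforcing the documented constraints.

    `count_tokens` defaults to character count as a crude stand-in;
    a real client would use the model's own tokenizer.
    """
    if frame_sampling is not None:
        if not 0 <= frame_sampling <= 1:
            raise ValueError("frame_sampling must be within [0, 1]")
        payload["frame_sampling"] = frame_sampling
    if instruct is not None:
        content_tokens = sum(
            count_tokens(c.get("text", "")) for c in payload["contents"]
        )
        if content_tokens + count_tokens(instruct) > CONTEXT_LIMIT:
            raise ValueError("instruct + contents exceed the 32,000-token limit")
        payload["instruct"] = instruct
    return payload

opts = with_video_options(
    {"contents": [{"text": "find clips of cyclists"}]},
    frame_sampling=0.5,
    instruct="Retrieve video segments matching the described activity.",
)
```

Lowering `frame_sampling` trades retrieval granularity for token cost, which matters because video frames and the instruction draw from the same 32,000-token budget.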
Successful VL embedding response