Octen | The Search infrastructure for AI

Authorizations

x-api-key

string

header

required

API key used for request authentication. Obtain an API key before using the API. Note: A payment method is required to use the API.
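
As a minimal sketch of the authentication described above, the snippet below builds (but does not send) a request that carries the key in the x-api-key header. The endpoint URL, the POST method, and the JSON content type are assumptions for illustration; this page does not state them.

```python
import json
import urllib.request

# Hypothetical placeholder: the real endpoint URL is not given on this page.
API_URL = "https://api.example.com/v1/images/generations"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (without sending) a request authenticated via the x-api-key header."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,  # header name taken from this page
            "Content-Type": "application/json",  # assumed content type
        },
        method="POST",  # assumed HTTP method
    )


req = build_request(
    "YOUR_API_KEY",
    {"model": "openai/gpt-image-2", "prompt": "a red fox in the snow"},
)
```

Sending the request (for example with urllib.request.urlopen) is left out on purpose, since the actual URL must come from the official documentation.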

Body

Request body for the Image Generation API. The API performs text-to-image generation when no image is provided and image editing when one is. Parameters marked (GPT only) or (Gemini only) apply only to those model families; unsupported parameters are ignored.
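
Because unsupported parameters are ignored rather than rejected, a client may want to detect them before sending. The sketch below is one possible client-side check, not part of the API; the helper name ignored_params is hypothetical, and the two sets list only the named parameters that this page marks as family-specific.

```python
# Named parameters this page marks "(GPT only)" or "(Gemini only)".
GPT_ONLY = {
    "mask", "quality", "background", "output_compression",
    "moderation", "stream", "partial_images",
}
GEMINI_ONLY = {"thinking_level", "response_modalities", "media_resolution"}


def ignored_params(body: dict) -> set:
    """Return the keys in body that the chosen model family would ignore."""
    model = body.get("model", "")
    if model.startswith("openai/"):
        return GEMINI_ONLY.intersection(body)
    # The nano-banana aliases resolve to Gemini models.
    if model.startswith("google/") or model.startswith("nano-banana"):
        return GPT_ONLY.intersection(body)
    return set()


body = {
    "model": "google/gemini-3-pro-image",
    "prompt": "a lighthouse at dusk",
    "quality": "high",         # GPT only, so ignored for Gemini
    "thinking_level": "high",  # Gemini only, so honored
}
dropped = ignored_params(body)
```

Logging the returned set before each call makes silently ignored options visible during development.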

model

enum<string>

required

The model to use. Aliases nano-banana-pro (= google/gemini-3-pro-image) and nano-banana-2 (= google/gemini-3.1-flash-image) are also accepted.

Available options:

openai/gpt-image-2,

openai/gpt-image-1-mini,

google/gemini-3-pro-image,

google/gemini-3.1-flash-image
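
The alias mapping above can be expressed as a small lookup. This is an illustrative client-side helper (the name canonical_model is hypothetical); the API itself accepts the aliases directly.

```python
# Alias-to-model mapping as stated in the model description.
MODEL_ALIASES = {
    "nano-banana-pro": "google/gemini-3-pro-image",
    "nano-banana-2": "google/gemini-3.1-flash-image",
}

# Options listed for the model parameter.
CANONICAL_MODELS = {
    "openai/gpt-image-2",
    "openai/gpt-image-1-mini",
    "google/gemini-3-pro-image",
    "google/gemini-3.1-flash-image",
}


def canonical_model(name: str) -> str:
    """Resolve an alias to its canonical model name, rejecting unknown names."""
    resolved = MODEL_ALIASES.get(name, name)
    if resolved not in CANONICAL_MODELS:
        raise ValueError(f"unknown model: {name!r}")
    return resolved
```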

prompt

string

required

The text description for the model. The length limit is set by the model.

image

Reference/source image(s) for editing, base64-encoded (a data URL prefix or a raw base64 string). Providing an image enters edit mode. An array passes multiple reference images. Per-model limits apply.
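
A sketch of preparing the image value in the two accepted encodings, raw base64 or a data URL. The helper name encode_image is hypothetical, and the MIME type passed in is the caller's responsibility.

```python
import base64
from typing import Optional


def encode_image(raw: bytes, mime_type: Optional[str] = None) -> str:
    """Return raw base64, or a data URL when a MIME type is supplied."""
    b64 = base64.b64encode(raw).decode("ascii")
    if mime_type is None:
        return b64
    return f"data:{mime_type};base64,{b64}"


# Stand-in bytes; a real call would read them from an image file.
payload = encode_image(b"abc", "image/png")
```

To pass several reference images, the same helper can be applied to each file and the results collected into a list.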

mask

string

(GPT only) Mask image, base64-encoded PNG matching the input size. Transparent areas are repainted. Effective only when image is provided.

integer

default:1

(GPT only) Number of images to generate. For Gemini, request multiple images in the prompt instead.

Required range: 1 <= x <= 10

size

string

default:auto

Output size as width x height (e.g. 1024x1024) or auto. GPT accepts arbitrary sizes; Gemini maps to the nearest supported aspect ratio.
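
The size format can be validated client-side before a request is sent. The sketch below parses the width x height form and reduces it to an aspect ratio; both helper names are hypothetical, and the set of aspect ratios Gemini actually supports is not listed on this page, so no nearest-ratio mapping is attempted.

```python
import math
import re
from typing import Optional, Tuple


def parse_size(size: str) -> Optional[Tuple[int, int]]:
    """Parse 'WIDTHxHEIGHT' into integers; return None for 'auto'."""
    if size == "auto":
        return None
    match = re.fullmatch(r"(\d+)x(\d+)", size)
    if match is None:
        raise ValueError(f"invalid size: {size!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        raise ValueError("width and height must be positive")
    return width, height


def reduced_ratio(width: int, height: int) -> Tuple[int, int]:
    """Reduce width:height to lowest terms."""
    divisor = math.gcd(width, height)
    return width // divisor, height // divisor
```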

quality

enum<string>

default:auto

(GPT only) Generation quality. Higher is more detailed, slower, and more expensive.

Available options:

low,

medium,

high,

auto

background

enum<string>

default:auto

(GPT only) Background setting. gpt-image-2 does not support transparent backgrounds and returns 400 for transparent.

Available options:

transparent,

opaque,

auto

output_format

enum<string>

default:png

Output image format. GPT supports png/jpeg/webp; Gemini supports png/jpeg (webp falls back to png).

Available options:

png,

jpeg,

webp

output_compression

integer

default:100

(GPT only) Compression level (percent). Effective only when output_format is jpeg or webp.

Required range: 0 <= x <= 100

moderation

enum<string>

default:auto

(GPT only) Content moderation strength. low relaxes limits; auto is the default.

Available options:

low,

auto

thinking_level

enum<string>

default:minimal

(Gemini only) Thinking (reasoning) effort before generation.

Available options:

minimal,

high

response_modalities

enum<string>[]

(Gemini only) Controls returned modalities. ["text","image"] returns the image plus accompanying text; ["image"] returns only the image.

Available options:

text,

image

media_resolution

enum<string>

(Gemini only) Processing resolution for input reference images.

Available options:

low,

medium,

high

response_format

enum<string>

default:b64_json

Return format. Only base64 is supported.

Available options:

b64_json

stream

boolean

default:false

(GPT only) Whether to stream the response (SSE).

partial_images

integer

default:0

(GPT only) Number of intermediate preview images to stream. Effective only when stream=true. Each preview costs an extra 100 image output tokens.

Required range: 0 <= x <= 3
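
The preview surcharge is simple arithmetic: 100 image output tokens per preview, and only when streaming is enabled. A small sketch, with a hypothetical helper name:

```python
TOKENS_PER_PREVIEW = 100  # per-preview cost stated in the partial_images description


def preview_token_overhead(partial_images: int, stream: bool) -> int:
    """Extra image output tokens incurred by streamed preview images."""
    if not 0 <= partial_images <= 3:
        raise ValueError("partial_images must be between 0 and 3")
    # partial_images has no effect unless stream is true.
    return TOKENS_PER_PREVIEW * partial_images if stream else 0
```

For example, requesting the maximum of 3 previews with stream=true adds 300 image output tokens to the request.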

user

string

A unique identifier for the end user. Use hashed or pseudonymous identifiers to avoid passing personally identifiable information.
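
One way to derive a pseudonymous identifier is a keyed hash of the internal user ID. This is a sketch of a common approach rather than a requirement of the API; the helper name and the choice of HMAC-SHA256 are assumptions.

```python
import hashlib
import hmac


def pseudonymous_user_id(raw_id: str, secret: bytes) -> str:
    """Derive a stable, non-reversible identifier from an internal user ID."""
    return hmac.new(secret, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The secret should stay server-side; this value is only a placeholder.
user_value = pseudonymous_user_id("alice@example.com", b"replace-with-a-secret")
```

Using a keyed hash rather than a plain hash means the identifier cannot be recomputed by someone who only knows the original email address or account ID.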

Response

Successful image response. When stream=false, returns a single object. When stream=true (GPT only), returns an SSE stream of image_generation.partial_image / image_edit.partial_image preview events followed by an image_generation.completed / image_edit.completed event.
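
A minimal sketch of consuming such a stream, assuming the event name arrives in the standard SSE event: field and the payload in data: lines. This page does not show the wire format or the payload schema, so the sample below uses empty JSON objects as placeholders.

```python
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
    if data:
        yield event, "\n".join(data)


# Placeholder stream; real payloads are not documented on this page.
sample = [
    "event: image_generation.partial_image\n",
    "data: {}\n",
    "\n",
    "event: image_generation.completed\n",
    "data: {}\n",
    "\n",
]
events = list(iter_sse_events(sample))
completed = [name for name, _ in events if name.endswith(".completed")]
```

A client would typically render each partial_image event as a preview and treat the completed event as the final result.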

A non-streaming image response. Images are always returned as base64; no URL is returned.

string

required

The unique identifier for this request.

created

integer

required

Unix timestamp (in seconds) of when the request was created.

data

object[]

required

The generated images, one element per image.
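
Since images are returned only as base64, a client has to decode each element before saving it. The sketch below assumes each element of data carries the image under a b64_json key, inferred from the response_format value; the child attributes are not shown on this page, so treat the field name as hypothetical.

```python
import base64
from typing import List


def decode_images(response: dict, field: str = "b64_json") -> List[bytes]:
    """Decode every image in a non-streaming response into raw bytes."""
    # `field` is an assumed key name; adjust it to the documented child attribute.
    return [base64.b64decode(item[field]) for item in response["data"]]


# Stand-in response; "YWJj" is base64 for the bytes b"abc".
images = decode_images({"data": [{"b64_json": "YWJj"}]})
```

Each decoded item can then be written to disk with a file extension matching the output_format reported in the response.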


usage

object

required

Token usage information.


background

enum<string>

(GPT only) The background setting actually applied.

Available options:

transparent,

opaque

output_format

enum<string>

The output format actually applied; for Gemini, derived from the mimeType.

Available options:

png,

jpeg,

webp

quality

enum<string>

(GPT only) The quality tier actually applied.

Available options:

low,

medium,

high

size

string

(GPT only) The output size actually applied; useful when size was auto.

text

string

(Gemini only) Plain-text summary of text the model produced alongside the image. For exact interleaving, read parts.

parts

object[]

(Gemini only) Content parts in the model's original order, preserving text/image interleaving.
