Generate ControlNet SD1.5

POST /generate/controlnet-sd15

{
  "images": [
    {
      "image_b64": "<string>",
      "removed_for_safety": true,
      "seed": 123,
      "safety_score": 123
    }
  ],
  "prediction_time_ms": 123
}
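A request for this endpoint could be assembled as in the minimal sketch below. The base URL and the bearer-token header are assumptions (this page states neither), and the path is put together from the endpoint segments shown at the top of the page. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical placeholder: the real host is not given on this page.
BASE_URL = "https://api.example.invalid"
# Path assembled from the endpoint segments shown at the top of the page.
ENDPOINT_PATH = "/generate/controlnet-sd15"


def build_request(payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + ENDPOINT_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check the service's authentication docs.
            "Authorization": f"Bearer {token}",
        },
    )


payload = {
    "prompt": "a watercolor lighthouse at dusk",  # the only required field
    "controlnet_image": "<base64 string>",  # placeholder, not real image data
    "steps": 30,
    "cfg_scale": 12,
}
req = build_request(payload, token="YOUR_TOKEN")
# Sending would look like: urllib.request.urlopen(req)
```

On success the response body is JSON shaped like the example above.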

Body

application/json

prompt

string

required

Text describing the image content to generate.

prompt_2

string | null

Text with a high-level description of the image to generate. Used only by SD XL.

negative_prompt

string | null

Text describing image traits to avoid during generation.

negative_prompt_2

string | null

Text with a high-level description of things to avoid during generation. Used only by SD XL.

checkpoint

string | null

Custom checkpoint to be used during image generation.

controlnet

string | null

ControlNet to be used during image generation.

vae

string | null

Custom VAE to be used during image generation.

textual_inversions

object | null

A dictionary of textual inversions to be used during image generation. Keys are textual inversions; values are their trigger words.

loras

object | null

A dictionary of LoRAs to apply. Keys are LoRAs; values are their weights (float).
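As a rough illustration of the two dictionary shapes, the sketch below builds a payload with one textual inversion and two LoRAs. Every asset name and trigger word here is a hypothetical placeholder; how assets are actually named or referenced is not specified on this page.

```python
import json

# All asset names and trigger words below are hypothetical placeholders.
payload = {
    "prompt": "a portrait in the style of <my-style>",
    # textual inversion -> trigger word
    "textual_inversions": {"my-style-embedding": "<my-style>"},
    # LoRA -> weight (float)
    "loras": {"my-detail-lora": 0.8, "my-lighting-lora": 0.5},
}

# Client-side sanity check: every LoRA weight should be a number.
for name, weight in payload["loras"].items():
    if isinstance(weight, bool) or not isinstance(weight, (int, float)):
        raise TypeError(f"LoRA weight for {name!r} must be a number")

body = json.dumps(payload)
```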

sampler

enum<string>

Sampler name (also known as 'scheduler') to use during image generation.

Available options:

PNDM,

LMS,

KLMS,

DDIM,

DDPM,

HEUN,

K_HEUN,

K_EULER,

K_EULER_ANCESTRAL,

DPM_SOLVER_MULTISTEP,

DPM_PLUS_PLUS_2M_KARRAS,

DPM_SINGLE,

DPM_2,

DPM_2_ANCESTRAL,

DPM_PLUS_PLUS_SDE_KARRAS,

UNI_PC,

LCM

height

integer | null

Integer representing the height of the image to generate. If null, defaults to 512 for SD 1.5 and 1024 for SD XL and SSD. Supported resolutions (w,h):

SDXL={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}

SD1.5={(768, 576), (1024, 576), (640, 512), (384, 704), (640, 768), (640, 640), (1024, 768), (1536, 1024), (768, 1024), (576, 448), (1024, 1024), (896, 896), (704, 1216), (512, 512), (448, 576), (832, 512), (512, 704), (576, 768), (1216, 704), (512, 768), (512, 832), (1024, 1536), (576, 1024), (704, 384), (768, 512)}

SSD={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}

width

integer | null

Integer representing the width of the image to generate. If null, defaults to 512 for SD 1.5 and 1024 for SD XL and SSD. Supported resolutions (w,h):

SDXL={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}

SD1.5={(768, 576), (1024, 576), (640, 512), (384, 704), (640, 768), (640, 640), (1024, 768), (1536, 1024), (768, 1024), (576, 448), (1024, 1024), (896, 896), (704, 1216), (512, 512), (448, 576), (832, 512), (512, 704), (576, 768), (1216, 704), (512, 768), (512, 832), (1024, 1536), (576, 1024), (704, 384), (768, 512)}

SSD={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
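Because width and height must form one of the supported (w,h) pairs, a client may want to check the pair before sending. The sketch below copies the SD 1.5 list from the descriptions above; the helper name is hypothetical, and how the service itself reacts to an unsupported pair is not stated on this page.

```python
# SD 1.5 resolutions as (width, height), copied from the field descriptions.
SD15_RESOLUTIONS = {
    (768, 576), (1024, 576), (640, 512), (384, 704), (640, 768),
    (640, 640), (1024, 768), (1536, 1024), (768, 1024), (576, 448),
    (1024, 1024), (896, 896), (704, 1216), (512, 512), (448, 576),
    (832, 512), (512, 704), (576, 768), (1216, 704), (512, 768),
    (512, 832), (1024, 1536), (576, 1024), (704, 384), (768, 512),
}


def check_sd15_resolution(width=None, height=None):
    """Apply the documented 512 defaults, then check the (w, h) pair.

    Returns the resolved (width, height) or raises ValueError.
    """
    w = 512 if width is None else width
    h = 512 if height is None else height
    if (w, h) not in SD15_RESOLUTIONS:
        raise ValueError(f"unsupported SD 1.5 resolution: {(w, h)}")
    return w, h


resolved = check_sd15_resolution(width=768)  # height falls back to 512
```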

cfg_scale

number

default: 12

Floating-point number representing how closely to adhere to the prompt description. Must be a positive number no greater than 50.0.

steps

integer

default: 30

Integer representing how many diffusion steps to run. Must be greater than 0 and less than or equal to 200.

num_images

integer

default: 1

Integer representing how many output images to generate with a single prompt/configuration.

seed

Integer or list of integers used to seed the random generators. Fixing the random seed is useful when attempting to reproduce a specific image. Each seed must be greater than 0 and less than 2^32.
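The numeric limits stated for cfg_scale, steps, and seed can be checked on the client before a request is made. This is a minimal sketch with a hypothetical helper name; treating an omitted seed as valid is an assumption, since this page does not say whether seed is optional.

```python
def validate_sampling_params(cfg_scale=12, steps=30, seed=None):
    """Return a list of error strings; an empty list means all checks passed."""
    errors = []

    # cfg_scale: positive and no greater than 50.0
    if not (0 < cfg_scale <= 50.0):
        errors.append("cfg_scale must be > 0 and <= 50.0")

    # steps: integer, greater than 0 and at most 200
    if not (isinstance(steps, int) and 0 < steps <= 200):
        errors.append("steps must be an integer > 0 and <= 200")

    # seed: a single integer or a list of integers, each in (0, 2^32)
    # Assumption: None means "no seed supplied" and is accepted here.
    if seed is None:
        seeds = []
    elif isinstance(seed, list):
        seeds = seed
    else:
        seeds = [seed]
    for s in seeds:
        if not (isinstance(s, int) and 0 < s < 2**32):
            errors.append(f"seed {s!r} must be an integer > 0 and < 2^32")

    return errors


problems = validate_sampling_params(cfg_scale=7.5, steps=40, seed=[1, 2, 3])
```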

controlnet_image

string | null

ControlNet image, encoded as a base64 string, used to guide image generation. Required for ControlNet engines.
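Image fields such as controlnet_image expect a base64 string. One way to produce it with the Python standard library is sketched below; the helper names are hypothetical, and the sketch assumes the plain base64 text of the file bytes is what the service wants (no data-URI prefix), which this page does not spell out.

```python
import base64
from pathlib import Path


def encode_image_bytes(data: bytes) -> str:
    """Base64-encode raw image bytes into an ASCII string."""
    return base64.b64encode(data).decode("ascii")


def encode_image_file(path) -> str:
    """Read an image file from disk and base64-encode its contents."""
    return encode_image_bytes(Path(path).read_bytes())


# Example with in-memory bytes (the first four bytes of the PNG signature).
encoded = encode_image_bytes(b"\x89PNG")
```

The resulting string can be placed directly in the controlnet_image field of the request body.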

init_image

string | null

Starting-point image, encoded as a base64 string, for Image to Image generation mode.

mask_image

string | null

Base64-encoded mask image for inpainting. White areas indicate where to paint.

strength

number

default: 0.8

Floating-point number indicating how creative the Image to Image generation mode should be. Must be greater than 0 and less than or equal to 1.0.

style_preset

enum<string> | null

Pre-defined styles used to guide the output image towards a particular style. Pre-defined styles are only supported by SDXL.

Available options:

base,

3d-model,

analog-film,

anime,

cinematic,

comic-book,

Craft Clay,

modeling-compound,

digital-art,

enhance,

fantasy-art,

isometric,

line-art,

low-poly,

neon-punk,

origami,

photographic,

pixel-art,

tile-texture,

Advertising,

Food Photography,

Real Estate,

Abstract,

Cubist,

Graffiti,

Hyperrealism,

Impressionist,

Pointillism,

Pop Art,

Psychedelic,

Renaissance,

Steampunk,

Surrealist,

Typography,

Watercolor,

Fighting Game,

GTA,

Super Mario,

Minecraft,

Pokémon,

Retro Arcade,

Retro Game,

RPG Fantasy Game,

Strategy Game,

Street Fighter,

Legend of Zelda,

Architectural,

Disco,

Dreamscape,

Dystopian,

Fairy Tale,

Gothic,

Grunge,

Horror,

Minimalist,

Monochrome,

Nautical,

Space,

Stained Glass,

Techwear Fashion,

Tribal,

Zentangle,

Collage,

Flat Papercut,

Kirigami,

Paper Mache,

Paper Quilling,

Papercut Collage,

Papercut Shadow Box,

Stacked Papercut,

Thick Layered Papercut,

Alien,

Film Noir,

HDR,

Long Exposure,

Neon Noir,

Silhouette,

Tilt-Shift

use_refiner

boolean

default: true

Whether to enable and apply the SDXL refiner model to the image generation.

high_noise_frac

number

default: 0.8

Floating-point number that defines the fraction of steps to perform with the base model. Used only by SD XL. Must be greater than or equal to 0.0 and less than or equal to 1.0.

controlnet_conditioning_scale

number

default: 1

How strongly the ControlNet should influence the generated image.

controlnet_early_stop

number | null

If provided, indicates the fraction of steps at which to stop applying the ControlNet. This can sometimes produce better outputs.
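To get a feel for what a given fraction means, the sketch below converts it into a step count. The helper is hypothetical, and both the 0.0 to 1.0 range and the rounding (truncation) are assumptions; the service may compute the cut-off differently.

```python
from typing import Optional


def controlnet_active_steps(steps: int, early_stop: Optional[float]) -> int:
    """Estimate how many of `steps` would have the ControlNet applied.

    Assumptions: early_stop is a fraction in [0.0, 1.0] and the cut-off
    is truncated to a whole step. Neither is confirmed by this page.
    """
    if early_stop is None:
        return steps  # no early stop: ControlNet applied on every step
    if not 0.0 <= early_stop <= 1.0:
        raise ValueError("early_stop is assumed to be within [0.0, 1.0]")
    return int(steps * early_stop)


active = controlnet_active_steps(steps=30, early_stop=0.5)
```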

controlnet_preprocess

boolean

default: true

Whether to apply automatic ControlNet preprocessing.

clip_skip

integer | null

Optionally skip later layers of the text encoder. Higher values lead to more abstract interpretations of the prompt.

outpainting

boolean

default: false

Whether the request requires outpainting or not. If so, special preprocessing is applied for better results.

image_encoding

enum<string>

Encoding applied to the generated image(s) before they are returned.

Available options:

jpeg,

png

transfer_images

object | null

A dictionary containing a mapping of trigger words to a list of sample images which demonstrate the desired object or style to transfer.

Response

200 - application/json

images

object[]

required

List of ImageGeneration(s) generated by the request.

prediction_time_ms

number

required

Total runtime of the image generation(s), in milliseconds.
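A response can be unpacked along the lines of the sketch below, which uses only the field names visible in the example response (image_b64, removed_for_safety, seed). The helper name is hypothetical, and skipping entries flagged removed_for_safety is a client-side choice; this page does not say what image_b64 contains for such entries.

```python
import base64


def decode_images(response: dict) -> list:
    """Return [(seed, image_bytes), ...] for images not removed for safety."""
    results = []
    for item in response["images"]:
        if item.get("removed_for_safety"):
            continue  # skip flagged images (client-side choice)
        image_bytes = base64.b64decode(item["image_b64"])
        results.append((item.get("seed"), image_bytes))
    return results


# Hand-made sample shaped like the example response, not real API output.
sample = {
    "images": [
        {"image_b64": "aGVsbG8=", "removed_for_safety": False, "seed": 7},
        {"image_b64": "", "removed_for_safety": True, "seed": 8},
    ],
    "prediction_time_ms": 123,
}
decoded = decode_images(sample)
```

Each returned byte string can be written to a file whose extension matches the requested image_encoding.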
