Our video generation model is accessible through a REST API. The examples below use cURL, the Python SDK, and the TypeScript SDK to call the video generation endpoints, and every parameter is explained.

The endpoint URL for video generation is https://image.octoai.run/generate/svd.

This endpoint performs image-to-video conversion. A text-to-video workflow is also supported: first generate an image with the text-to-image API (Image Gen API), then pass that image to the image-to-video API.

Request payload

Parameters:

  • image (base64-encoded string; required) - The starting image, encoded as a base64 string.

  • height (int; optional) - Height of the video/animation to generate. If not provided, the output height is inferred from the input image, and the closest supported resolution is chosen.

  • width (int; optional) - Width of the video/animation to generate. If not provided, the output width is inferred from the input image, and the closest supported resolution is chosen.

    Supported resolutions are (w,h): (576, 1024), (1024, 576), (768, 768)

  • cfg_scale (float; optional) - How closely the generated video should adhere to the input image. Must be a positive number no greater than 10.0.

  • fps (int; optional) - Playback speed of the generated frames, in frames per second.

  • steps (int; optional) - Number of diffusion steps to run. Must be greater than 0 and no greater than 50.

  • motion_scale (float; optional) - A floating point number between 0 and 1 indicating how much motion should be in the generated animation.

  • noise_aug_strength (float; optional) - How much noise to add to the initial image. Higher values encourage creativity.

  • num_videos (int; optional) - Number of output videos/animations to generate from a single image and configuration. You can generate up to 16 videos in a single API request. The videos are generated in sequence with the same configuration but different seed values.

  • seed (int or list of ints; optional) - Integer or list of integers used to seed the random generators. Fixing the seed is useful when you want to reproduce a specific video (or set of videos).
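The parameter list above can be turned into a small request-body builder. The following is a minimal sketch, not part of any official SDK: the helper name build_svd_payload is hypothetical, the client-side range checks simply mirror the limits stated above (the server remains the authority), and treating the motion_scale bounds as inclusive is an assumption.

```python
import base64

# (width, height) pairs copied from the supported resolutions listed above.
SVD_RESOLUTIONS = {(576, 1024), (1024, 576), (768, 768)}


def build_svd_payload(image_bytes: bytes, **params) -> dict:
    """Return a JSON-serializable request body (hypothetical helper)."""
    # The API expects the starting image as a base64 string.
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    # Optional parameters are only sent when the caller supplies them.
    payload.update({k: v for k, v in params.items() if v is not None})

    width, height = payload.get("width"), payload.get("height")
    if width is not None and height is not None:
        if (width, height) not in SVD_RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {(width, height)}")
    if "cfg_scale" in payload and not 0 < payload["cfg_scale"] <= 10.0:
        raise ValueError("cfg_scale must be positive and no greater than 10.0")
    if "steps" in payload and not 0 < payload["steps"] <= 50:
        raise ValueError("steps must be greater than 0 and no greater than 50")
    # Assumption: the documented 0 to 1 range includes both endpoints.
    if "motion_scale" in payload and not 0 <= payload["motion_scale"] <= 1:
        raise ValueError("motion_scale must be between 0 and 1")
    if "num_videos" in payload and not 1 <= payload["num_videos"] <= 16:
        raise ValueError("num_videos must be between 1 and 16")
    return payload
```

For example, build_svd_payload(open("start.png", "rb").read(), steps=40, cfg_scale=3) would return a dictionary ready to be serialized as the JSON request body; start.png is a placeholder file name.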

Response

  • videos (list) - List of videos generated by the request. Each entry's video field holds the base64-encoded video.
  • prediction_time_ms (float) - Total runtime of the video generation, in milliseconds.
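
A response with several videos can be unpacked with a short loop. This sketch assumes each entry in videos carries its base64 data in a video field, which is what the cURL example in this document reads via `.videos[0].video`; the helper name save_videos and the result_N.mp4 file naming are hypothetical.

```python
import base64
from pathlib import Path


def save_videos(response: dict, out_dir: str = ".") -> list:
    """Decode every base64 video in a parsed API response and write it to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, item in enumerate(response["videos"]):
        # Assumption: the base64 payload lives under the "video" key.
        path = out / f"result_{i}.mp4"
        path.write_bytes(base64.b64decode(item["video"]))
        paths.append(path)
    return paths
```

The function returns the written paths in the same order as the videos list, so with num_videos set to 3 it would produce result_0.mp4 through result_2.mp4.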

Example

# Generate one video and decode the base64 result into result.mp4
curl -X POST "https://image.octoai.run/generate/svd" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OCTOAI_TOKEN" \
    --data-raw '{
        "image": "<BASE_64_STRING>",
        "steps": 40,
        "cfg_scale": 3,
        "fps": 4,
        "motion_scale": 0.2,
        "noise_aug_strength": 0.55,
        "num_videos": 1,
        "seed": 2138732363
    }' | jq -r ".videos[0].video" | base64 -d > result.mp4
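
The same call can be sketched in Python using only the standard library rather than an SDK. The endpoint URL and headers are taken from the cURL command above; the function names make_svd_request and generate_video and the 300-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

SVD_URL = "https://image.octoai.run/generate/svd"


def make_svd_request(payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the video generation endpoint."""
    return urllib.request.Request(
        SVD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def generate_video(payload: dict, token: str, timeout: float = 300.0) -> dict:
    """Send the request and return the parsed JSON response.

    The timeout default is an assumption; generation time depends on
    parameters such as steps and num_videos.
    """
    request = make_svd_request(payload, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A call such as generate_video({"image": "<BASE_64_STRING>", "steps": 40}, os.environ["OCTOAI_TOKEN"]) would then return a dictionary with the videos and prediction_time_ms fields described under Response. Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without touching the network.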