OctoML and AWS: Better Together
OctoAI is a compute service, built on top of AWS, to run, tune, and scale your generative AI. It lets developers take generative AI applications to production on AWS quickly and cost-effectively.

Benefits of OctoAI, powered by AWS
OctoAI complements the core AWS infrastructure offerings so that each model runs on a hardware configuration optimized for both the model and the application. OctoAI's model acceleration reduces latency and cost for popular foundation models, including Stable Diffusion, Whisper, LLaMA, and Falcon, as well as for custom models built or trained by customers.

Ease of Use
Ready-to-use deployment templates for popular OSS models
Customize OSS models
Easily integrate with app dev and model dev workflows
Auto-selection of hardware

Efficiency
Fastest foundation models for generative AI made possible through our model acceleration technology
Accelerate and run your custom models
Flexibility to make price-performance tradeoffs

Accessibility
Customers can select and run accelerated OSS foundation models, fine-tune models, upgrade to new models as they emerge, or bring their own custom models
No lock-in to a particular model or service

OctoML Model Acceleration on AWS

Nightcafe Studio, a pioneer in AI art creation, recently worked with the OctoML team to launch a new personalized image generation application for its customers. Within two weeks, the new application's usage grew 4x, from an initial 100K images per day to over 400K, with no drop in latency or image creation success rates for end users.


Running generative models in the cloud requires reliable infrastructure that can load and run these large models efficiently, and scale up and down quickly to handle bursts of traffic. Some OctoAI customers today generate over a million Stable Diffusion images a day. OctoAI is built to support this kind of scale and burstiness in a predictable manner.




OpenAI deserves its flowers for creating the “iPhone opportunity” for the AI industry with ChatGPT. Thanks to the magnanimity of Meta, the incredible power of world-class LLMs will be accessible to everyone. By releasing Llama 2, Meta has catalyzed an infinite opportunity with its own “Android moment.” Sharing a high-quality, commercially viable, open-source Large Language Model (LLM) allows every company and developer to shine, not just one.

Start building with ease in minutes using OctoAI
Our mission is to empower developers to build AI applications that delight users by leveraging fast models running on the most efficient hardware. Sign up and start building in minutes.
