Mission & Vision
At OctoML, we believe in the power of AI to improve lives. But to get there, AI must become more sustainable and accessible. Our goal is to empower more creators to harness the transformative power of ML to build intelligent applications.
OctoML makes AI more sustainable through efficient model execution and automation that scales services and reduces engineering burden. We make AI more accessible by enabling models to run on a broad set of devices and by making them easier to deploy without specialized skills.

Our Story
OctoML was spun out of the University of Washington by the creators of Apache TVM, an open source stack for ML portability and performance. TVM enables ML models to run efficiently on any hardware backend, and has quickly become a key part of the architecture of popular consumer devices like Amazon Alexa.
Recognizing the potential for TVM and technologies like it to transform the full scope of the ML lifecycle, OctoML was born.

What's in a Name?
Thinking about the type of company we wanted to build, we took inspiration from the playful, clever, curious octopus. These unconventional thinkers have a unique, distributed intelligence that spans their entire body.
They are adaptive enough to camouflage at a moment’s notice, and creative enough to complete puzzles, build gardens, and use tools. Plus, like any good engineer, they love to take things apart.

Your data and intellectual property (IP) are paramount
The OctoML Compute Service is designed to address enterprise-grade data privacy and security needs. We continually invest in security capabilities and practices across our platform and processes. We recently received SOC 2 Type 1 certification, with a Type 2 audit underway. Learn more about our measures to keep your information safe.
We’re also working towards a version of the service that can meet advanced residency and compliance requirements. If you have questions about using OctoML and meeting your specific compliance needs, let’s set up a time to talk.

We are searching for bright, talented, curious Octonauts to join us!
Our Investors

Read about our work

OctoML drives down production AI inference costs at Microsoft through new integration with ONNX Runtime ecosystem
Today, we’re excited to announce the results from the second phase of our partnership with Microsoft to continue driving down AI inference compute costs by reducing inference latencies. Over the past year, OctoML engineers worked closely with Watch For to design and implement the TVM Execution Provider (EP) for ONNX Runtime, bringing the model optimization potential of Apache TVM to all ONNX Runtime users. This builds upon the collaboration we began in 2021 to bring the benefits of TVM’s code generation and flexible quantization support to production scale at Microsoft.

How 4X Speedup on Generative Video Model (FILM) Created Huge Cost Savings for WOMBO
Generative AI is the hottest workload on the planet, but it’s also the most compute intensive, and therefore the most expensive to run. This puts startups building generative AI businesses in a tricky position. Not only must they deliver killer product experiences that grab attention and market share; they also need to make the economics work. To lower compute costs, generative AI models need to run faster and more efficiently on a more diverse set of hardware.
Be the first to try OctoML’s compute service
Our mission at OctoML is to make AI sustainable and accessible so that developers are liberated to build the next generation of intelligent applications. We want you to join us on this journey by getting your hands on these capabilities first.