Blog

Welcome to the OctoAI Blog

Explore the latest OctoAI product updates, TVM updates, and overall GenAI news.

OctoStack, an industry-leading private GenAI production stack for the enterprise

OctoStack is a production-ready GenAI inference stack that allows enterprises to efficiently and reliably serve models inside their environment.

Jordan Janes & Rodney Shetler

Apr 2, 2024

Capitol AI increases speeds by 4x and reduces costs by 75% on OctoAI

Capitol AI and OctoAI worked together to achieve a 4x improvement in speed and 75% reduction in large language model (LLM) usage costs, through fine-tuned versions of Mistral models.

Tom Hallaran & Haleh Lewis

Mar 4, 2024

Low latency JSON mode, now available with all LLMs on OctoAI

OctoAI’s implementation of JSON mode is built into the OctoAI Systems Stack, making the feature immediately available for all LLMs in the platform with no disruption to the LLM or its quality.

Luis Vega & Ben Hamm

Mar 18, 2024

OctoAI BLOG

All Posts

Your choice of models on our SaaS or in your environment

Run any model or checkpoint on our efficient, reliable, and customizable API endpoints. Sign up and start building in minutes.