See latency and cost reductions across popular models
Dive into the hardware portability results
We’re excited to share the transformative results OctoML customers have seen when using the platform to explore migrating from Intel Cascade Lake to AWS Graviton3 CPUs.
By moving AI/ML workloads from GCP with Intel to AWS with Graviton3, customers can:
Save 73% on compute costs
Cut latency by up to 2.5x
Achieve those benefits in hours, not months, with OctoML's automation
Portability meets performance with OctoML
OctoML's platform automates the handling of complex dependencies between model and hardware, then optimizes the performance of your model. The result is a model with faster inference and higher throughput that you can port to any hardware target.