OctoML CLI for Ops
We also wanted to cover the full end-to-end production deployment scenario, since many operators struggle to scale their AI/ML operations the way they have scaled traditional applications using DevOps methodologies.
Here is the TransparentAI demo for Ops. It covers how to make a model more reliable in production, accelerate it, make it hardware independent so we can choose the right hardware for the service, and then scale it operationally using our existing Kubernetes environment.
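Once a model is packaged as a container, it scales like any other service. A minimal sketch of a Kubernetes Deployment and Service for a model-serving container follows; the image name, labels, and port here are illustrative placeholders, not the actual demo artifacts:

```yaml
# Hypothetical manifest for a packaged model-serving container.
# Image name, labels, and ports are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: transparent-ai
spec:
  replicas: 3                  # scale horizontally like any other workload
  selector:
    matchLabels:
      app: transparent-ai
  template:
    metadata:
      labels:
        app: transparent-ai
    spec:
      containers:
        - name: model-server
          image: registry.example.com/transparent-ai:latest  # placeholder image
          ports:
            - containerPort: 8080   # assumed serving port
          resources:
            limits:
              cpu: "2"
              memory: 4Gi
---
apiVersion: v1
kind: Service
metadata:
  name: transparent-ai
spec:
  selector:
    app: transparent-ai
  ports:
    - port: 80
      targetPort: 8080
```

Because the model is just another container, existing Kubernetes tooling (autoscaling, rollouts, monitoring) applies unchanged.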
For you fellow operators out there, ML deployment operations cannot scale like other software workloads due to the complexity we affectionately call the Tensor from Hell. The Tensor from Hell is a metaphor for the rigid set of dependencies between the ML training framework (e.g., PyTorch), the ML model/model type, and the hardware it needs to run on at various stages of its lifecycle. Taming the Tensor from Hell requires a platform that automatically produces customized code for the specific hardware parameters; selects the right libraries and compiler options; and guides configuration settings to deliver peak performance and meet any other SLA requirements for the hardware employed at every stage of the DevOps lifecycle.
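To see why this does not scale manually, consider a toy enumeration of framework, runtime, and hardware combinations. The lists below are illustrative only, not an actual support matrix:

```python
from itertools import product

# Illustrative lists only -- not an actual support matrix.
frameworks = ["PyTorch", "TensorFlow", "ONNX"]
runtimes = ["TVM", "ONNX Runtime", "TensorRT", "OpenVINO"]
hardware = ["NVIDIA GPU", "Intel CPU", "AMD CPU", "ARM CPU", "AWS Graviton"]

# Every (framework, runtime, hardware) triple is a distinct build/tune/test
# job: different generated code, libraries, compiler flags, and configs.
combos = list(product(frameworks, runtimes, hardware))
print(len(combos))  # 3 * 4 * 5 = 60 combinations to validate by hand
```

Each new model, framework version, or chip multiplies this matrix again, which is exactly the work the platform has to automate.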
That platform has to have insights across a comprehensive fleet of 80+ deployment targets – in the cloud (AWS, Azure and GCP) and at the edge, with accelerated computing including GPUs, CPUs and NPUs from NVIDIA, Intel, AMD, ARM and AWS Graviton – used for automated compatibility testing, performance analysis and optimization on actual hardware. And the platform has to have an expansive software catalog covering all major ML frameworks, acceleration engines like TVM, software stacks from the chip providers, and all other software dependencies required for deployment anywhere.
Performance and compatibility insights must be backed by real-world scenarios (not simulated) to accurately inform deployment decisions and ensure SLAs around performance, cost and user experience are met.
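Checking a latency SLA against measurements taken on real hardware can be sketched as follows. The model call is stubbed out and the SLA budget is hypothetical; a real harness would issue actual inference requests:

```python
import time

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

def benchmark(run_inference, iterations=50):
    """Time repeated calls to a model-serving function on real hardware."""
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        latencies_ms.append((time.perf_counter() - start) * 1000)
    return latencies_ms

def meets_sla(latencies_ms, p95_budget_ms):
    """True if the observed p95 latency fits the SLA budget."""
    return percentile(latencies_ms, 95) <= p95_budget_ms

# Stub standing in for a real inference request (~1 ms of work).
samples = benchmark(lambda: time.sleep(0.001))
print(meets_sla(samples, p95_budget_ms=50.0))
```

Measuring on the actual target, rather than estimating from specs, is what makes a go/no-go decision like this trustworthy.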