Jared Roesch

Chief Architect and Co-Founder

2 Articles

Jared Roesch

T

Dec 16, 2021

Write Python with blazing fast CUDA-level performance

Using TVMScript, TVM's embedded domain-specific language (DSL), OctoML engineers demonstrate a 20x speedup over a straightforward PyTorch implementation on CPU and a 1.3x speedup over a handwritten CUDA implementation on GPU for a real-world kernel.

Phil Mazenett
Jared Roesch

Dec 9, 2021

OctoML’s BERT Model Acceleration Proves Apple M1 Pro and Max Chips Make AI Accessible to Everyone

OctoML, the company behind the leading Machine Learning (ML) Deployment Platform, has recently spearheaded ML acceleration work with Hugging Face’s implementation of BERT to automatically optimize its performance for the “most powerful chips Apple has ever built.”
