Chris Hoge

Jan 13, 2022

TVMCon 2021 Wrapup

The Apache TVM Community and OctoML closed out 2021 with the fourth annual Apache TVM and Open Source ML Acceleration Conference. It was the TVM community’s largest event ever, with 700 attendees from 34 nations coming together for a virtual conference...

Jared Roesch

T

Dec 16, 2021

Write Python with blazing fast CUDA-level performance

Using TVMScript, TVM's embedded domain-specific language (DSL), OctoML engineers demonstrated a 20x speedup over a straightforward PyTorch implementation on CPU, and a 1.3x speedup over a handwritten CUDA implementation on GPU, for a real-world kernel.

Byungsoo Jeon
Sunghyun Park

Dec 15, 2021

Collage: Automated integration of various deep learning backends results in state of the art model performance

At TVMCon this week, we will present our latest research from Carnegie Mellon University and the University of Michigan on using Collage to generate the fastest possible executable for a given machine learning model.

Phil Mazenett
Jared Roesch

Dec 9, 2021

OctoML’s BERT Model Acceleration Proves Apple M1 Pro and Max Chips Make AI Accessible to Everyone

OctoML, the company behind the leading Machine Learning (ML) Deployment Platform, has recently spearheaded ML acceleration work with Hugging Face’s implementation of BERT to automatically optimize its performance for the “most powerful chips Apple has ever built.”

Gavin Uberti

Dec 8, 2021

Prototyping Machine Learning with Arduino: TVMCon Tutorial Dec 15

On December 15th at 1:00 PM PST, we’re holding a free 45-minute tutorial session on how Apache TVM can be used to prepare and run neural networks on Arduino.
