Leveraging block sparsity with Apache TVM to halve your cloud bill for NLP
The natural language processing (NLP) community has been transformed by the recent performance and versatility of transformer models from the deep learning research community...
WebGPU powered machine learning in the browser with Apache TVM
We added support for WASM and WebGPU backends to the Apache TVM deep learning compiler. Our initial experiments show that TVM’s WebGPU backend can get close to native GPU performance when deploying models in the browser.