Details

Event Description
Abstract: Etched is building Sohu, an AI inference chip that’s >10x faster than NVIDIA GPUs. To achieve this performance, Sohu is hyper-specialized for running transformers. In this talk, we’ll cover three applied math problems that we’ve faced on our journey.
- A scaling law for FLOPs vs Memory: where is the bottleneck?
- Matrix Sharding: algorithms for linear algebra at scale
- Approximation Theory: how to evaluate nonlinear functions better than PyTorch
Time permitting, we’ll discuss current and future challenges.
Bio: Henry Bosch is a Member of Technical Staff at Etched, working on architectural and algorithmic optimizations for fast AI inference. Previously, he was a PhD student at Stanford University, where he was advised by Otis Chodosh (Geometric Analysis) and studied with Andrea Montanari (Machine Learning Theory).
Lunch will be provided.
Sponsor
Organized by PLI