Date
May 2, 2025, 12:00 pm - 1:30 pm
Location
Bendheim House 103

Details

Event Description

Abstract: Etched is building Sohu, an AI inference chip that’s >10x faster than NVIDIA GPUs. To achieve this performance, Sohu is hyper-specialized for running transformers. In this talk, we’ll cover three applied math problems that we’ve faced on our journey.

  1. A scaling law for FLOPs vs Memory: where is the bottleneck?
  2. Matrix Sharding: algorithms for linear algebra at scale
  3. Approximation Theory: how to evaluate nonlinear functions better than PyTorch

Time permitting, we’ll discuss current and future challenges.


Bio: Henry Bosch is a Member of Technical Staff at Etched, working on architectural and algorithmic optimizations for fast AI inference. Previously, he was a PhD student at Stanford University, where he was advised by Otis Chodosh (Geometric Analysis) and studied with Andrea Montanari (Machine Learning Theory).

 

Lunch will be provided.

Sponsor
Organized by PLI