May 30, 2024, 12:00 pm – 1:30 pm


Event Description

Understanding Complex Processing in Temporal Models - From Implicit Biases to Mechanistic Interpretability

Recent temporal models such as transformers and RNN variants can effectively capture complex structure in text. However, it remains largely unknown how this is achieved. The talk will discuss our work on this problem. I will begin with work demonstrating implicit biases of RNNs, showing that they have a bias towards learning "simple" rules that correspond to dynamical systems with a low-dimensional state. I will then present several works on understanding how transformers solve complex problems. Next, I will discuss work that dissects how transformers extract information about the world (e.g., "Paris is the capital of France"), highlighting several information processing streams that underlie this process. Finally, I will discuss our analysis of how transformers achieve in-context learning, and the internal hypothesis space used in this process.

Event organized by PLI
Special Event