Speaker
Paul Christiano
Affiliation
Alignment Research Center
Details
Event Description
Abstract: I’ll discuss two possible paths by which AI systems could be so misaligned that they attempt to deceive and disempower their human operators. I’ll review the current state of evidence about these risks, what we might hope to learn over the next few years, and how we could become confident that the risk is adequately managed.
Bio: I run the Alignment Research Center. I previously ran the language model alignment team at OpenAI, and before that received my PhD from the theory group at UC Berkeley. You may be interested in my writing about alignment, my blog, my academic publications, or fun and games.
Special Event