Date
Mar 19, 2024, 2:00 pm – 3:00 pm

Speaker

Paul Christiano, Alignment Research Center

Event Description

Abstract: I’ll discuss two possible paths by which AI systems could be so misaligned that they attempt to deceive and disempower their human operators. I’ll review the current state of evidence about these risks, what we might hope to learn over the next few years, and how we could become confident that the risk is adequately managed.

Bio: I run the Alignment Research Center. I previously ran the language model alignment team at OpenAI, and before that received my PhD from the theory group at UC Berkeley. You may be interested in my writing about alignment, my blog, my academic publications, or fun and games.