Written by
Allison Gasparini, Princeton Language and Intelligence
March 26, 2024

On Tuesday, March 19, the Princeton Language and Intelligence (PLI) initiative hosted the first webinar for the new Princeton Alignment and Safety Seminar series (PASS).

During the YouTube livestream, members of the Princeton University campus community and beyond were invited to participate in a discussion on what scenarios might lead to the catastrophic misalignment of AI and how researchers can develop and deploy safe systems.

The scenario in which an AI model begins pursuing its own objectives, rather than the ones its developers intended, is referred to as “misalignment.” The rise of generative AI has brought a wave of concern over how to ensure that large language models are safe and aligned. PLI members came up with the idea during the fall semester: a digital forum where experts and students alike could discuss these pressing issues within the field.

“We actually initially started with a reading group,” said Yangsibo Huang, a PhD student in the Department of Electrical and Computer Engineering who helped conceive, organize, and host the seminar series. By circulating papers on the topic among Princeton students and faculty, the group kept its discussions guided and thought-provoking. However, “we also wanted to broaden the audience for these discussions beyond Princeton,” said Huang.

People often assume that access to state-of-the-art models is necessary to think about problems of safety and alignment “correctly,” Huang’s collaborator Sadhika Malladi said. This line of thinking places the tech industry at the center of the conversation while pushing academics to the side. “Our goal was to bring these topics of discussion to otherwise left out groups, both in terms of academics and also underrepresented groups,” said Malladi, a PhD student in the Computer Science Department. The goal of inclusion also motivated the organizers to hold the talks online. “People can view the recordings or watch it live and participate in a discussion they might not otherwise have a chance to have.”

Huang and Malladi said they hope to continue the seminar series into the fall semester. “I think by fall we’ll have new issues around safety and also potentially new techniques around alignment and they will definitely bring up new discussions,” said Huang.

Malladi said that though AI is a fast-moving field, she also sees longevity in the discussions around safety and alignment. “I think the questions and the problems definitely are a very long-term thing to think about,” she said.

To see the schedule for upcoming Princeton Alignment and Safety Seminars: https://pli.princeton.edu/events/princeton-ai-alignment-and-safety-seminar

