April 21, 2025

The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to advancing the branch of artificial intelligence called representation learning, generally referred to as deep learning. Taking place April 24-28 in Singapore, ICLR is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in artificial intelligence, statistics, and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.

Princeton Language and Intelligence is proud to feature the work of our students, post-docs, and faculty that will be showcased at the conference. 

Presented at ICLR 2025


Progressive distillation induces an implicit curriculum (Oral)

Authors: Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski, Surbhi Goel

Links: Paper 


Safety Alignment Should be Made More Than Just a Few Tokens Deep (Oral)

Authors: Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu, Xiao Ma, Subhrajit Roy, Ahmad Beirami, Prateek Mittal, Peter Henderson

Links: Paper 


Capturing the Temporal Dependence of Training Data Influence (Oral)

Authors: Jiachen (Tianhao) Wang, Dawn Song, James Y Zou, Prateek Mittal, Ruoxi Jia

Links: Paper 


Data Shapley in One Training Run (Oral)

Authors: Jiachen (Tianhao) Wang, Prateek Mittal, Dawn Song, Ruoxi Jia

Links: Paper 


Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning (Oral)

Authors: Chongyi Zheng, Jens Tuyls, Joanne Peng, Benjamin Eysenbach

Links: Paper 


BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval (Spotlight)

Authors: Hongjin SU, Howard Yen, Mengzhou Xia, Weijia Shi, Niklas Muennighoff, Han-yu Wang, Liu Haisu, Quan Shi, Zachary Siegel, Michael Tang, Ruoxi Sun, Jinsung Yoon, Sercan Arik, Danqi Chen, Tao Yu

Links: Paper 


Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization (Spotlight)

Authors: Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason Lee, Wen Sun, Akshay Krishnamurthy, Dylan Foster

Links: Paper 


Understanding Factual Recall in Transformers via Associative Memories (Spotlight)

Authors: Eshaan Nichani, Jason Lee, Alberto Bietti

Links: Paper 


Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought (Spotlight)

Authors: Jianhao Huang, Zixuan Wang, Jason Lee

Links: Paper 


Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research (Spotlight)

Authors: Michał Bortkiewicz, Władysław Pałucki, Vivek Myers, Tadeusz Dziarmaga, Tomasz Arczewski, Łukasz Kuciński, Benjamin Eysenbach

Links: Paper 


Horizon Generalization in Reinforcement Learning

Authors: Vivek Myers, Catherine Ji, Benjamin Eysenbach

Links: Paper 


Privacy Auditing of Large Language Models

Authors: Ashwinee Panda, Xinyu Tang, Christopher Choquette-Choo, Milad Nasr, Prateek Mittal

Links: Paper 


Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias

Authors: Rui Lu, Runzhe Wang, Kaifeng Lyu, Xitai Jiang, Gao Huang, Mengdi Wang

Links: Paper 


HELMET: How to Evaluate Long-context Models Effectively and Thoroughly

Authors: Howard Yen, Tianyu Gao, Minmin Hou, Ke Ding, Daniel Fleischer, Peter Izsak, Moshe Wasserblat, Danqi Chen

Links: Paper 


Specialized Foundation Models Struggle to Beat Supervised Baselines

Authors: Zongzhe Xu, Ritvik Gupta, Wenduo Cheng, Alexander Shen, Junhong Shen, Ameet Talwalkar, Mikhail Khodak

Links: Paper 


MUSE: Machine Unlearning Six-Way Evaluation for Language Models

Authors: Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah Smith, Chiyuan Zhang

Links: Paper 


Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Authors: Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora

Links: Paper 


Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Authors: Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li

Links: Paper 


Understanding Optimization in Deep Learning with Central Flows

Authors: Jeremy Cohen, Alex Damian, Ameet Talwalkar, J Kolter, Jason Lee

Links: Paper 


SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Authors: John Yang, Carlos E Jimenez, Alex Zhang, Kilian Lieret, Joyce Yang, Xindi Wu, Ori Press, Niklas Muennighoff, Gabriel Synnaeve, Karthik Narasimhan, Diyi Yang, Sida Wang, Ofir Press

Links: Paper 


On Evaluating the Durability of Safeguards for Open-Weight LLMs

Authors: Xiangyu Qi, Boyi Wei, Nicholas Carlini, Yangsibo Huang, Tinghao Xie, Luxi He, Matthew Jagielski, Milad Nasr, Prateek Mittal, Peter Henderson

Links: Paper 


Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Authors: Zhaolin Gao, Wenhao Zhan, Jonathan Chang, Gokul Swamy, Kianté Brantley, Jason Lee, Wen Sun

Links: Paper 


Diffusion Policy Policy Optimization
 

Authors: Allen Z. Ren, Justin Lidard, Lars Ankile, Anthony Simeonov, Pulkit Agrawal, Anirudha Majumdar, Benjamin Burchfiel, Hongkai Dai, Max Simchowitz

Links: Paper 


TAU-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Authors: Shunyu Yao, Noah Shinn, Pedram Razavi, Karthik Narasimhan

Links: Paper 


Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

Authors: Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Reddy Indurthi, Chong Xiang, Prateek Mittal, Wenxuan Zhou

Links: Paper 


Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
 

Authors: Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin

Links: Paper 


Fantastic Copyrighted Beasts and How (Not) to Generate Them
 

Authors: Luxi He, Yangsibo Huang, Weijia Shi, Tinghao Xie, Haotian Liu, Yue Wang, Luke Zettlemoyer, Chiyuan Zhang, Danqi Chen, Peter Henderson

Links: Paper 


A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
 

Authors: Grace Liu, Michael Tang, Benjamin Eysenbach

Links: Paper 


Neuron based Personality Trait Induction in Large Language Models

Authors: Jia Deng, Tianyi Tang, Yanbin Yin, Wenhao Yang, Xin Zhao, Ji-Rong Wen

Links: Paper 


Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank

Authors: Wenhao Zhan, Scott Fujimoto, Zheqing Zhu, Jason Lee, Daniel Jiang, Yonathan Efroni

Links: Paper 


Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

Authors: Souradip Chakraborty, Sujay Bhatt, Udari Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh

Links: Paper 


Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data

Authors: Hengyu Fu, Zehao Dou, Jiawei Guo, Mengdi Wang, Minshuo Chen

Links: Paper 


Benign Overfitting in Out-of-Distribution Generalization of Linear Models

Authors: Shange Tang, Jiayun Wu, Jianqing Fan, Chi Jin

Links: Paper 


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
 

Authors: Clemencia Siro, Guy Gur-Ari, Gaurav Mishra, Stuart Shieber, Jason Phang, Zijie Wang, Kory Mathewson, Giorgio Mariani, Allen Nie, James Y Zou, Behnam Neyshabur, Karl Krauth, Shixiang Gu, Pablo Antonio Moreno Casares, Maarten Sap, Mohit Tiwari, Bill Yuchen Lin, Aykut Erdem, Angelica Chen, Swaroop Mishra, Chenlin Meng, Ashish Sabharwal, James Simon, Louis-Philippe Morency, Kyle Richardson, Emanuele Rodolà, Adam Fisch, Simone Melzi, Kristen Chiafullo, Rif A. Saurous, Shubh Pachchigar, Siamak Shakeri, Aitor Lewkowycz, Yonatan Belinkov, Mihir Kale, Mantas Mazeika, Dar Gilboa, Hongming Zhang, Seung Jae Lee, Owain Evans, Ambrose Slone, David Dohan, Damien Sileo, Mor Geva, Cameron Diao, Christopher Potts, Jekaterina Novikova, Alicia Parrish, Debajyoti Datta, Chitta Baral, Maarten Bosma, Michael Strube, Jiacheng Xu, Trishala Neeraj, Colin Raffel, Leo Gao, Vishakh Padmakumar, Yu (Hope) Hou, Christopher Waites, Ellie Pavlick, Pouya Pezeshkpour, Nanyun (Violet) Peng, Gerard de Melo, Martin Potthast, Aarohi Srivastava, Abhinav Rastogi, Abu Awal Md Shoeb, Adam Brown, Adam Santoro, Aditya Gupta, Agnieszka Kluska, Diyi Yang, Akshat Agarwal, Alexander Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Aman Hussain, Amanda Askell, Amanda Dsouza, Ameet Rahane, Anantharaman S. Iyer, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew La, Ethan Dyer, Angela Jiang, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Austin Herrick, Avia Efrat, Ayla Karakaş, B. Roberts, Bao Loe, Bartłomiej Bojanowski, Benjamin Inden, Benno Stein, Batuhan Özyurt, Behnam Hedayatnia, Blake Howald, Bryan Orinion, Cameron Dour, Catherine Stinson, Cedrick Argueta, Cesar Ferri, Chandan Singh, Charles Rathkopf, Christian Voigt, Cindy Ramirez, Clara Rivera, Noah Fiedel, Courtney Ashcraft, Dan Garrette, Dan Kilman, C. Freeman, Daniel Levy, Daniel González, Danielle Perszyk, Danny Hernandez, David Jurgens, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Mátyás Schubert, Derek Tam, Dilyar Buzan, Shyam Upadhyay, Dimitri Coelho Mollo, Dylan Schrader, Ekaterina Shutova, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Emma Lam, Eric Tang, Ernie Chang, Ethan Chi, Ethan Jerzak, Ethan Kim, Eunice Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fernando Martínez-Plumed, Francesca Happé, Gloria X Wang, Gonzalo Jaimovitch-Lopez, Gregor Betz, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hayden Bogar, Henry Shevlin, Hiromu Yakura, Hugh Wong, Kumar Shridhar, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, James Zheng, Jan Kocon, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesujoba Alabi, Jillian Tang, Joan Waweru, John Burden, Dieuwke Hupkes, John Balis, Jonathan Batchelder, Jörg Frohberg, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua Rule, Joyce Chua, Kamil Kanclerz, Karthik Gopalakrishnan, Katerina Ignatyeva, Li Zhang, Liam Dugan, Katja Markert, Kaustubh Dhole, Lucas Lam, Kevin Omondi, Kyle McDonell, Laria Reynolds, Lianhui Qin, Lidia Contreras-Ochando, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros-Colón, Lütfi Kerem Senel, Maria Jose Ramirez-Quintana, Maartje Ter Hoeve, Mohit Bansal, Martha Lewis, Maheen Farooqi, Marco Baturan, Marco Marelli, Marco Maru, Marie Tolkiehn, Michael A. Yee, Mario Giulianelli, Michael Gu, Michael Ivanitskiy, Matthias Hagen, Medina Baitemirova, Mike Cain, Mimee Xu, Mitch Walker, Moin Aminnaseri, Mozhdeh Gheini, Nathan Chi, Michael Starritt, Michał Swędrowski, Michele Bevilacqua, Nayeon Lee, Neta Krakover, Nicholas Cameron, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niveditha Iyer, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Parth Doshi, Pascale Fung, Pegah Alipoormolabashi, Liao Peiyuan, Peter W Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Priti Oli, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Paul Pu Liang, Rowan Jacobs, Ryan Stovall, Rylan Yang, Saif Mohammad, Sajant Anand, Sam Dillavou, Sam Wiseman, Samuel Gruetter, Sanghyun Han, Mukund Varma T, Sanjeev Kwatra, Sarah Rous, Sarik Ghazarian, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sepideh Sadeghi, Shadi Hamdan, Sherry Shi, Shikhar Singh, Daphne Ippolito, Shima Asaadi, Shyamolima Debnath, Simon Thormeyer, Sneha Makini, Soo-Hwan Lee, Spencer Torene, Stanislas Dehaene, Stefan Divic, Hanna Hajishirzi, Stephanie Lin, Stephen Prasad, Andrew Dai, Steven Piantadosi, Summer Misherghi, Svetlana Kiritchenko, Tao Li, Tariq Ali, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Adrià Garriga-Alonso, Tiberius Nkinyili, Timofei Kornev, Titus Tunduny, Trenton Chang, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Victoria Nyamai, Vikas Raunak, Vinay Prabhu, William Saunders, William Zhang, Wout Vossen, Xiaoyu Tong, Xinyi Wu, Yair Lakretz, Yichi Yang, Sophie Hao, Yifu Chen, Yufang Hou, Yuntao Bai, Zachary Seid, Cristina Garbacea, Ziyi Wu, Genta Winata, Shubham Toshniwal, Abubakar Abid, John Miller, Karen Livescu, Tatsunori Hashimoto, Ekin Cubuk, Sayan Ghosh, Harsh Mehta, Jacob Hilton, Yadollah Yaghoobzadeh, Jiaming Song, Siva Reddy, Stefano Ermon, Shashank Srivastava, Percy Liang, Chiyu Wu, James Koppel, Rui Zhang, David Drakard, Germàn Kruszewski, Dong-Ho Lee, Fatemeh Siar, Luke Metz, Roman Sitelew, Dan Hendrycks, Paul Vicol, Alexander Ray, Tobias Gerstenberg, Chris Callison-Burch, Sriharsha Hatwar, Xinran Zhao, Zijian Wang, Luca Moschella, Sam Bowman, Jaime Fernández Fisac, Danqi Chen, Stella R Biderman, Nitish Shirish Keskar, Eric Chu, Manaal Faruqui, Ksenia Shkaruta, Xudong Shen, Ryan Teehan, Vinay Ramasesh, Andy Zou, Jaehoon Lee, Hinrich Schuetze, Jesse Engel, Tal Schuster, Berk Ekmekci, Yangqiu Song, Andrew Lampinen, Dan Roth, Yasaman Bahri, Jascha Sohl-Dickstein, Jason Yosinski, Sebastian Schuster, Melody Arnaud, Russ Salakhutdinov, Nicholas Roberts, William Fedus, Sam Shleifer, Vivek Srikumar, Ronan Le Bras, Jos Rozen, Kevin Gimpel, Melvin McElrath, Omer Levy, Tal Linzen, Diganta Misra, Frieda Rong, Xiang Ren, Abhishek Rao, Mirac Suzgun, Yejin Choi, Michihiro Yasunaga, Sharon Zhou, Joshua B Tenenbaum, Sahib Singh, Michael Cohen, Tao Yu, Samuel Schoenholz, Rosanne Liu, Ryan Chi, Giambattista Parascandolo, Zhuoye Zhao, Erkut Erdem, Matthew Leavitt, Francois Chollet, Anders Andreassen, Timo Schick, Vera Demberg, Qiaozhu Mei, Daniel Khashabi, Jonathan Berant, Noah Constant, Alex Warstadt, Zirui Wang, Alethea Power, Niklas Muennighoff, Barret Zoph, Jason Wei, Christopher Manning

Links: Paper 


Learning Hierarchical Polynomials of Multiple Nonlinear Features

Authors: Hengyu Fu, Zihao Wang, Eshaan Nichani, Jason Lee

Links: Paper 


To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
 

Authors: Zayne Sprague, Fangcong Yin, Juan Rodriguez, Dongwei Jiang, Manya Wadhwa, Prasann Singhal, Xinyu Zhao, Xi Ye, Kyle Mahowald, Greg Durrett

Links: Paper 


SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

Authors: Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal

Links: Paper 


The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
 

Authors: Yongwei Che, Benjamin Eysenbach

Links: Paper 


IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
 

Authors: Xinchen Zhang, Ling Yang, Guohao Li, YaQi Cai, Xie Jiake, Yong Tang, Yujiu Yang, Mengdi Wang, Bin CUI

Links: Paper 


Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow

Authors: Hongru Yang, Zhangyang Wang, Jason Lee, Yingbin Liang

Links: Paper 


OGBench: Benchmarking Offline Goal-Conditioned RL

Authors: Seohong Park, Kevin Frans, Benjamin Eysenbach, Sergey Levine

Links: Paper 


Provable unlearning in topic modeling and downstream tasks

Authors: Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal

Links: Paper 


Building Math Agents with Multi-Turn Iterative Preference Learning
 

Authors: Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu

Links: Paper 


Advancing LLM Reasoning Generalists with Preference Trees
 

Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Boji Shan, Zeyuan Liu, Jia Deng, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun

Links: Paper 


A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
 

Authors: Hui Yuan, Yifan Zeng, Yue Wu, Huazheng Wang, Mengdi Wang, Liu Leqi

Links: Paper 


Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
 

Authors: Yiding Jiang, Allan Zhou, Zhili Feng, Sadhika Malladi, J Kolter

Links: Paper