
The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to advancing representation learning, the branch of artificial intelligence generally referred to as deep learning. Taking place April 24-28 in Singapore, ICLR is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in artificial intelligence, statistics, and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.
Princeton Language and Intelligence is proud to feature the work of our students, post-docs, and faculty that will be showcased at the conference.
Presented at ICLR 2025
Progressive distillation induces an implicit curriculum (Oral)

Authors: Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski, Surbhi Goel
Links: Paper
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Oral)

Authors: Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu, Xiao Ma, Subhrajit Roy, Ahmad Beirami, Prateek Mittal, Peter Henderson
Links: Paper
Capturing the Temporal Dependence of Training Data Influence (Oral)

Authors: Jiachen (Tianhao) Wang, Dawn Song, James Y Zou, Prateek Mittal, Ruoxi Jia
Links: Paper
Data Shapley in One Training Run (Oral)

Authors: Jiachen (Tianhao) Wang, Prateek Mittal, Dawn Song, Ruoxi Jia
Links: Paper
Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning (Oral)

Authors: Chongyi Zheng, Jens Tuyls, Joanne Peng, Benjamin Eysenbach
Links: Paper
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval (Spotlight)

Authors: Hongjin Su, Howard Yen, Mengzhou Xia, Weijia Shi, Niklas Muennighoff, Han-yu Wang, Liu Haisu, Quan Shi, Zachary Siegel, Michael Tang, Ruoxi Sun, Jinsung Yoon, Sercan Arik, Danqi Chen, Tao Yu
Links: Paper
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization (Spotlight)

Authors: Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason Lee, Wen Sun, Akshay Krishnamurthy, Dylan Foster
Links: Paper
Understanding Factual Recall in Transformers via Associative Memories (Spotlight)

Authors: Eshaan Nichani, Jason Lee, Alberto Bietti
Links: Paper
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought (Spotlight)

Authors: Jianhao Huang, Zixuan Wang, Jason Lee
Links: Paper
Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research (Spotlight)

Authors: Michał Bortkiewicz, Władysław Pałucki, Vivek Myers, Tadeusz Dziarmaga, Tomasz Arczewski, Łukasz Kuciński, Benjamin Eysenbach
Links: Paper
Horizon Generalization in Reinforcement Learning

Authors: Vivek Myers, Catherine Ji, Benjamin Eysenbach
Links: Paper
Privacy Auditing of Large Language Models

Authors: Ashwinee Panda, Xinyu Tang, Christopher Choquette-Choo, Milad Nasr, Prateek Mittal
Links: Paper
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias

Authors: Rui Lu, Runzhe Wang, Kaifeng Lyu, Xitai Jiang, Gao Huang, Mengdi Wang
Links: Paper
HELMET: How to Evaluate Long-context Models Effectively and Thoroughly

Authors: Howard Yen, Tianyu Gao, Minmin Hou, Ke Ding, Daniel Fleischer, Peter Izsak, Moshe Wasserblat, Danqi Chen
Links: Paper
Specialized Foundation Models Struggle to Beat Supervised Baselines

Authors: Zongzhe Xu, Ritvik Gupta, Wenduo Cheng, Alexander Shen, Junhong Shen, Ameet Talwalkar, Mikhail Khodak
Links: Paper
MUSE: Machine Unlearning Six-Way Evaluation for Language Models

Authors: Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah Smith, Chiyuan Zhang
Links: Paper
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Authors: Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora
Links: Paper
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Authors: Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li
Links: Paper
Understanding Optimization in Deep Learning with Central Flows

Authors: Jeremy Cohen, Alex Damian, Ameet Talwalkar, J Kolter, Jason Lee
Links: Paper
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Authors: John Yang, Carlos E Jimenez, Alex Zhang, Kilian Lieret, Joyce Yang, Xindi Wu, Ori Press, Niklas Muennighoff, Gabriel Synnaeve, Karthik Narasimhan, Diyi Yang, Sida Wang, Ofir Press
Links: Paper
On Evaluating the Durability of Safeguards for Open-Weight LLMs

Authors: Xiangyu Qi, Boyi Wei, Nicholas Carlini, Yangsibo Huang, Tinghao Xie, Luxi He, Matthew Jagielski, Milad Nasr, Prateek Mittal, Peter Henderson
Links: Paper
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Authors: Zhaolin Gao, Wenhao Zhan, Jonathan Chang, Gokul Swamy, Kianté Brantley, Jason Lee, Wen Sun
Links: Paper
Diffusion Policy Policy Optimization

Authors: Allen Z. Ren, Justin Lidard, Lars Ankile, Anthony Simeonov, Pulkit Agrawal, Anirudha Majumdar, Benjamin Burchfiel, Hongkai Dai, Max Simchowitz
Links: Paper
TAU-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Authors: Shunyu Yao, Noah Shinn, Pedram Razavi, Karthik Narasimhan
Links: Paper
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

Authors: Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Reddy Indurthi, Chong Xiang, Prateek Mittal, Wenxuan Zhou
Links: Paper
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Authors: Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
Links: Paper
Fantastic Copyrighted Beasts and How (Not) to Generate Them

Authors: Luxi He, Yangsibo Huang, Weijia Shi, Tinghao Xie, Haotian Liu, Yue Wang, Luke Zettlemoyer, Chiyuan Zhang, Danqi Chen, Peter Henderson
Links: Paper
A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals

Authors: Grace Liu, Michael Tang, Benjamin Eysenbach
Links: Paper
Neuron based Personality Trait Induction in Large Language Models

Authors: Jia Deng, Tianyi Tang, Yanbin Yin, Wenhao Yang, Xin Zhao, Ji-Rong Wen
Links: Paper
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank

Authors: Wenhao Zhan, Scott Fujimoto, Zheqing Zhu, Jason Lee, Daniel Jiang, Yonathan Efroni
Links: Paper
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

Authors: Souradip Chakraborty, Sujay Bhatt, Udari Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh
Links: Paper
Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data

Authors: Hengyu Fu, Zehao Dou, Jiawei Guo, Mengdi Wang, Minshuo Chen
Links: Paper
Benign Overfitting in Out-of-Distribution Generalization of Linear Models

Authors: Shange Tang, Jiayun Wu, Jianqing Fan, Chi Jin
Links: Paper
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Authors: Clemencia Siro, Guy Gur-Ari, Gaurav Mishra, Stuart Shieber, Jason Phang, Zijie Wang, Kory Mathewson, Giorgio Mariani, Allen Nie, James Y Zou, Behnam Neyshabur, Karl Krauth, Shixiang Gu, Pablo Antonio Moreno Casares, Maarten Sap, Mohit Tiwari, Bill Yuchen Lin, Aykut Erdem, Angelica Chen, Swaroop Mishra, Chenlin Meng, Ashish Sabharwal, James Simon, Louis-Philippe Morency, Kyle Richardson, Emanuele Rodolà, Adam Fisch, Simone Melzi, Kristen Chiafullo, Rif A. Saurous, Shubh Pachchigar, Siamak Shakeri, Aitor Lewkowycz, Yonatan Belinkov, Mihir Kale, Mantas Mazeika, Dar Gilboa, Hongming Zhang, Seung Jae Lee, Owain Evans, Ambrose Slone, David Dohan, Damien Sileo, Mor Geva, Cameron Diao, Christopher Potts, Jekaterina Novikova, Alicia Parrish, Debajyoti Datta, Chitta Baral, Maarten Bosma, Michael Strube, Jiacheng Xu, Trishala Neeraj, Colin Raffel, Leo Gao, Vishakh Padmakumar, Yu (Hope) Hou, Christopher Waites, Ellie Pavlick, Pouya Pezeshkpour, Nanyun (Violet) Peng, Gerard de Melo, Martin Potthast, Aarohi Srivastava, Abhinav Rastogi, Abu Awal Md Shoeb, Adam Brown, Adam Santoro, Aditya Gupta, Agnieszka Kluska, Diyi Yang, Akshat Agarwal, Alexander Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Aman Hussain, Amanda Askell, Amanda Dsouza, Ameet Rahane, Anantharaman S. Iyer, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew La, Ethan Dyer, Angela Jiang, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Austin Herrick, Avia Efrat, Ayla Karakaş, B. Roberts, Bao Loe, Bartłomiej Bojanowski, Benjamin Inden, Benno Stein, Batuhan Özyurt, Behnam Hedayatnia, Blake Howald, Bryan Orinion, Cameron Dour, Catherine Stinson, Cedrick Argueta, Cesar Ferri, Chandan Singh, Charles Rathkopf, Christian Voigt, Cindy Ramirez, Clara Rivera, Noah Fiedel, Courtney Ashcraft, Dan Garrette, Dan Kilman, C. Freeman, Daniel Levy, Daniel González, Danielle Perszyk, Danny Hernandez, David Jurgens, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Mátyás Schubert, Derek Tam, Dilyar Buzan, Shyam Upadhyay, Dimitri Coelho Mollo, Dylan Schrader, Ekaterina Shutova, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Emma Lam, Eric Tang, Ernie Chang, Ethan Chi, Ethan Jerzak, Ethan Kim, Eunice Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fernando Martínez-Plumed, Francesca Happé, Gloria X Wang, Gonzalo Jaimovitch-Lopez, Gregor Betz, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hayden Bogar, Henry Shevlin, Hiromu Yakura, Hugh Wong, Kumar Shridhar, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, James Zheng, Jan Kocon, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesujoba Alabi, Jillian Tang, Joan Waweru, John Burden, Dieuwke Hupkes, John Balis, Jonathan Batchelder, Jörg Frohberg, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua Rule, Joyce Chua, Kamil Kanclerz, Karthik Gopalakrishnan, Katerina Ignatyeva, Li Zhang, Liam Dugan, Katja Markert, Kaustubh Dhole, Lucas Lam, Kevin Omondi, Kyle McDonell, Laria Reynolds, Lianhui Qin, Lidia Contreras-Ochando, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros-Colón, Lütfi Kerem Senel, Maria Jose Ramirez-Quintana, Maartje Ter Hoeve, Mohit Bansal, Martha Lewis, Maheen Farooqi, Marco Baturan, Marco Marelli, Marco Maru, Marie Tolkiehn, Michael A. Yee, Mario Giulianelli, Michael Gu, Michael Ivanitskiy, Matthias Hagen, Medina Baitemirova, Mike Cain, Mimee Xu, Mitch Walker, Moin Aminnaseri, Mozhdeh Gheini, Nathan Chi, Michael Starritt, Michał Swędrowski, Michele Bevilacqua, Nayeon Lee, Neta Krakover, Nicholas Cameron, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niveditha Iyer, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Parth Doshi, Pascale Fung, Pegah Alipoormolabashi, Liao Peiyuan, Peter W Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Priti Oli, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Paul Pu Liang, Rowan Jacobs, Ryan Stovall, Rylan Yang, Saif Mohammad, Sajant Anand, Sam Dillavou, Sam Wiseman, Samuel Gruetter, Sanghyun Han, Mukund Varma T, Sanjeev Kwatra, Sarah Rous, Sarik Ghazarian, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sepideh Sadeghi, Shadi Hamdan, Sherry Shi, Shikhar Singh, Daphne Ippolito, Shima Asaadi, Shyamolima Debnath, Simon Thormeyer, Sneha Makini, Soo-Hwan Lee, Spencer Torene, Stanislas Dehaene, Stefan Divic, Hanna Hajishirzi, Stephanie Lin, Stephen Prasad, Andrew Dai, Steven Piantadosi, Summer Misherghi, Svetlana Kiritchenko, Tao Li, Tariq Ali, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Adrià Garriga-Alonso, Tiberius Nkinyili, Timofei Kornev, Titus Tunduny, Trenton Chang, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Victoria Nyamai, Vikas Raunak, Vinay Prabhu, William Saunders, William Zhang, Wout Vossen, Xiaoyu Tong, Xinyi Wu, Yair Lakretz, Yichi Yang, Sophie Hao, Yifu Chen, Yufang Hou, Yuntao Bai, Zachary Seid, Cristina Garbacea, Ziyi Wu, Genta Winata, Shubham Toshniwal, Abubakar Abid, John Miller, Karen Livescu, Tatsunori Hashimoto, Ekin Cubuk, Sayan Ghosh, Harsh Mehta, Jacob Hilton, Yadollah Yaghoobzadeh, Jiaming Song, Siva Reddy, Stefano Ermon, Shashank Srivastava, Percy Liang, Chiyu Wu, James Koppel, Rui Zhang, David Drakard, Germàn Kruszewski, Dong-Ho Lee, Fatemeh Siar, Luke Metz, Roman Sitelew, Dan Hendrycks, Paul Vicol, Alexander Ray, Tobias Gerstenberg, Chris Callison-Burch, Sriharsha Hatwar, Xinran Zhao, Zijian Wang, Luca Moschella, Sam Bowman, Jaime Fernández Fisac, Danqi Chen, Stella R Biderman, Nitish Shirish Keskar, Eric Chu, Manaal Faruqui, Ksenia Shkaruta, Xudong Shen, Ryan Teehan, Vinay Ramasesh, Andy Zou, Jaehoon Lee, Hinrich Schuetze, Jesse Engel, Tal Schuster, Berk Ekmekci, Yangqiu Song, Andrew Lampinen, Dan Roth, Yasaman Bahri, Jascha Sohl-Dickstein, Jason Yosinski, Sebastian Schuster, Melody Arnaud, Russ Salakhutdinov, Nicholas Roberts, William Fedus, Sam Shleifer, Vivek Srikumar, Ronan Le Bras, Jos Rozen, Kevin Gimpel, Melvin McElrath, Omer Levy, Tal Linzen, Diganta Misra, Frieda Rong, Xiang Ren, Abhishek Rao, Mirac Suzgun, Yejin Choi, Michihiro Yasunaga, Sharon Zhou, Joshua B Tenenbaum, Sahib Singh, Michael Cohen, Tao Yu, Samuel Schoenholz, Rosanne Liu, Ryan Chi, Giambattista Parascandolo, Zhuoye Zhao, Erkut Erdem, Matthew Leavitt, Francois Chollet, Anders Andreassen, Timo Schick, Vera Demberg, Qiaozhu Mei, Daniel Khashabi, Jonathan Berant, Noah Constant, Alex Warstadt, Zirui Wang, Alethea Power, Niklas Muennighoff, Barret Zoph, Jason Wei, Christopher Manning
Links: Paper
Learning Hierarchical Polynomials of Multiple Nonlinear Features

Authors: Hengyu Fu, Zihao Wang, Eshaan Nichani, Jason Lee
Links: Paper
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Authors: Zayne Sprague, Fangcong Yin, Juan Rodriguez, Dongwei Jiang, Manya Wadhwa, Prasann Singhal, Xinyu Zhao, Xi Ye, Kyle Mahowald, Greg Durrett
Links: Paper
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

Authors: Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal
Links: Paper
The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities

Authors: Yongwei Che, Benjamin Eysenbach
Links: Paper
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Authors: Xinchen Zhang, Ling Yang, Guohao Li, Yaqi Cai, Xie Jiake, Yong Tang, Yujiu Yang, Mengdi Wang, Bin Cui
Links: Paper
Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow

Authors: Hongru Yang, Zhangyang Wang, Jason Lee, Yingbin Liang
Links: Paper
OGBench: Benchmarking Offline Goal-Conditioned RL

Authors: Seohong Park, Kevin Frans, Benjamin Eysenbach, Sergey Levine
Links: Paper
Provable unlearning in topic modeling and downstream tasks

Authors: Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal
Links: Paper
Building Math Agents with Multi-Turn Iterative Preference Learning

Authors: Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu
Links: Paper
Advancing LLM Reasoning Generalists with Preference Trees

Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Boji Shan, Zeyuan Liu, Jia Deng, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun
Links: Paper
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Authors: Hui Yuan, Yifan Zeng, Yue Wu, Huazheng Wang, Mengdi Wang, Liu Leqi
Links: Paper
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

Authors: Yiding Jiang, Allan Zhou, Zhili Feng, Sadhika Malladi, J Kolter
Links: Paper