NeurIPS, the top AI and ML conference, is being held in Vancouver from December 10 to December 15, 2024. Princeton Language and Intelligence is proud to feature the work that our students, post-docs, and faculty will showcase at the conference.
Presented at NeurIPS 2024
The Road Less Scheduled (Oral)
Authors: Aaron Defazio, Xingyu Yang, Ahmed Khaled, Konstantin Mishchenko, Harsh Mehta, Ashok Cutkosky
Links: Paper
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision (Spotlight)
Authors: Jay Shah, Ganesh Bikshandi, Ying Zhang, Vijay Thakkar, Pradeep Ramani, Tri Dao
Links: Paper
CryoBench: Datasets and Benchmarks for Heterogeneous Cryo-EM Reconstruction (Spotlight)
Authors: Minkyu Jeon, Rishwanth Raghu, Miro Astore, Geoffrey Woollard, Ryan Feathers, Alkin Kaz, Sonya Hanson, Pilar Cossio, Ellen Zhong
Links: Paper
RedPajama: an Open Dataset for Training Large Language Models (Spotlight)
Authors: Maurice Weber, Dan Fu, Quentin Anthony, Yonatan Oren, Shane Adams, Anton Alexandrov, Xiaozhong Lyu, Huu Nguyen, Xiaozhe Yao, Virginia Adams, Ben Athiwaratkun, Rahul Chalamala, Kezhen Chen, Max Ryabinin, Tri Dao, Percy Liang, Christopher Ré, Irina Rish, Ce Zhang
Links: Paper
GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration (Spotlight)
Authors: Jiachen (Tianhao) Wang, Tong Wu, Dawn Song, Prateek Mittal, Ruoxi Jia
Links: Paper
Finding Transformer Circuits With Edge Pruning (Spotlight)
Authors: Adithya Bhaskar, Alexander Wettig, Dan Friedman, Danqi Chen
Links: Paper
A Metalearned Neural Circuit for Nonparametric Bayesian Inference
Authors: Jake Snell, Gianluca Bencomo, Tom Griffiths
Links: Paper
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
Authors: Kaifeng Lyu, Haoyu Zhao, Xinran Gu, Dingli Yu, Anirudh Goyal, Sanjeev Arora
Links: Paper
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
Authors: John Yang, Carlos Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
Links: Paper
Offline Multitask Representation Learning for Reinforcement Learning
Authors: Haque Ishfaq, Thanh Nguyen-Tang, Songtao Feng, Raman Arora, Mengdi Wang, Ming Yin, Doina Precup
Links: Paper
Fast Best-of-N Decoding via Speculative Rejection
Authors: Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette
Links: Paper
Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks
Authors: Kaiqi Zhang, Zixuan Zhang, Minshuo Chen, Yuma Takeda, Mengdi Wang, Tuo Zhao, Yu-Xiang Wang
Links: Paper
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Authors: James Liu, Guangxuan Xiao, Kai Li, Jason Lee, Song Han, Tri Dao, Tianle Cai
Links: Paper
A Theoretical Perspective for Speculative Decoding Algorithm
Authors: Ming Yin, Minshuo Chen, Kaixuan Huang, Mengdi Wang
Links: Paper
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
Authors: Declan Campbell, Sunayana Rane, Tyler Giallanza, Camillo Nicolò De Sabbata, Kia Ghods, Amogh Joshi, Alexander Ku, Steven Frankland, Tom Griffiths, Jonathan D. Cohen, Taylor Webb
Links: Paper
Evaluating Copyright Takedown Methods for Language Models
Authors: Boyi Wei, Weijia Shi, Yangsibo Huang, Noah Smith, Chiyuan Zhang, Luke Zettlemoyer, Kai Li, Peter Henderson
Links: Paper
LoFiT: Localized Fine-tuning on LLM Representations
Authors: Fangcong Yin, Xi Ye, Greg Durrett
Links: Paper
Gradient Guidance for Diffusion Models: An Optimization Perspective
Authors: Yingqing Guo, Hui Yuan, Yukang Yang, Minshuo Chen, Mengdi Wang
Links: Paper
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
Authors: Benjamin Eysenbach, Vivek Myers, Ruslan Salakhutdinov, Sergey Levine
Links: Paper
SciCode: A Research Coding Benchmark Curated by Scientists
Authors: Minyang Tian, Luyu Gao, Shizhuo Zhang, Xinan Chen, Cunwei Fan, Xuefei Guo, Roland Haas, Pan Ji, Kittithat Krongchon, Yao Li, Shengyan Liu, Di Luo, Yutao Ma, Hao Tong, Kha Trinh, Chenyu Tian, Zihan Wang, Bohao Wu, Shengzhu Yin, Minhui Zhu, Kilian Lieret, Yanxin Lu, Genglin Liu, Yufeng Du, Tianhua Tao, Ofir Press, Jamie Callan, Eliu Huerta, Hao Peng
Links: Paper
Learning to Assist Humans without Inferring Rewards
Authors: Vivek Myers, Evan Ellis, Sergey Levine, Benjamin Eysenbach, Anca Dragan
Links: Paper
Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems
Authors: Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei A. Zaharia, James Zou
Links: Paper
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
Authors: Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora
Links: Paper
Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
Authors: Yunwei Ren, Zixuan Wang, Jason Lee
Links: Paper
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
Authors: Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang
Links: Paper
Transfer Q-star: Principled Decoding for LLM Alignment
Authors: Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang
Links: Paper
Can Models Learn Skill Composition from Examples?
Authors: Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal, Sanjeev Arora
Links: Paper
Preference Learning Algorithms Do Not Learn Preference Rankings
Authors: Angelica Chen, Sadhika Malladi, Lily Zhang, Xinyi Chen, Qiuyi (Richard) Zhang, Rajesh Ranganath, Kyunghyun Cho
Links: Paper
Learning Human-like Representations to Enable Learning Human Values
Authors: Andrea Wynn, Ilia Sucholutsky, Tom Griffiths
Links: Paper
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
Authors: Sukjun Hwang, Aakash Sunil Lahoti, Ratish Puduppully, Tri Dao, Albert Gu
Links: Paper
Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity
Authors: Qian Yu, Yining Wang, Baihe Huang, Qi Lei, Jason Lee
Links: Paper
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encoding
Authors: Rajesh Jayaram, Laxman Dhulipala, Majid Hadian, Jason Lee, Vahab Mirrokni
Links: Paper
Directional Smoothness and Gradient Methods: Convergence and Adaptivity
Authors: Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Gower
Links: Paper
ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Authors: Xindi Wu, Dingli Yu, Yangsibo Huang, Olga Russakovsky, Sanjeev Arora
Links: Paper
SimPO: Simple Preference Optimization with a Reference-Free Reward
Authors: Yu Meng, Mengzhou Xia, Danqi Chen
Links: Paper
CiteME: Can Language Models Accurately Cite Scientific Claims?
Authors: Ori Press, Andreas Hochlehnert, Ameya Prabhu, Vishaal Udandarao, Ofir Press, Matthias Bethge
Links: Paper
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Authors: Jason Lee, Kazusato Oko, Taiji Suzuki, Denny Wu
Links: Paper
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Authors: Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen
Links: Paper
Beyond Aesthetics: Cultural Competence in Text-to-Image Models
Authors: Nithish Kannen Senthilkumar, Arif Ahmad, Marco Andreetto, Vinodkumar Prabhakaran, Utsav Prabhu, Adji Bousso Dieng, Pushpak Bhattacharyya, Shachi Dave
Links: Paper
Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift
Authors: Jiawei Ge, Debarghya Mukherjee, Jianqing Fan
Links: Paper
Don't Compress Gradients in Random Reshuffling: Compress Gradient Differences
Authors: Abdurakhmon Sadiev, Grigory Malinovsky, Eduard Gorbunov, Igor Sokolov, Ahmed Khaled, Konstantin Burlachenko, Peter Richtarik
Links: Paper
REBEL: Reinforcement Learning via Regressing Relative Rewards
Authors: Zhaolin Gao, Jonathan Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, Drew Bagnell, Jason Lee, Wen Sun
Links: Paper
Online Control in Population Dynamics
Authors: Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun
Links: Paper
FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling
Authors: Zaixi Zhang, Mengdi Wang, Qi Liu
Links: Paper
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Authors: Licong Lin, Jingfeng Wu, Sham Kakade, Peter Bartlett, Jason Lee
Links: Paper
Global Convergence in Training Large-Scale Transformers
Authors: Cheng Gao, Yuan Cao, Zihao Li, Yihan He, Mengdi Wang, Han Liu, Jason Klusowski, Jianqing Fan
Links: Paper
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
Authors: Zihao Li, Yuan Cao, Cheng Gao, Yihan He, Han Liu, Jason Klusowski, Jianqing Fan, Mengdi Wang
Links: Paper
Mixture of neural fields for heterogeneous reconstruction in cryo-EM
Authors: Axel Levy, Rishwanth Raghu, David Shustin, Adele Peng, Huan Li, Oliver Clarke, Gordon Wetzstein, Ellen Zhong
Links: Paper
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Authors: Junxiong Wang, Daniele Paliotta, Avner May, Alexander Rush, Tri Dao
Links: Paper