Welcome to the PLI Blog! Our blog posts will cover topics such as new AI research, cross-disciplinary applications, societal implications, and more!
Large language models have transformed how we conduct research. In this session, we will talk about how one can:
(1) fine-tune a language model on their own data;
(2) evaluate models on benchmarks;
(3) use large-language-model-powered embedding tools for clustering, classification, and other tasks;
(4) access frontier large language…
NeurIPS, the top AI and ML conference, is being held in Vancouver from December 10 to December 15. Princeton Language and Intelligence is proud to feature the work of our students, post-docs, and faculty that will be showcased at the conference.
Presented at NeurIPS 2024
The Road Less…
Yu Meng1*, Mengzhou Xia2*, Danqi Chen2
*Equal Contribution
1Computer Science Department,…
Jun-Jie Zhu, Meiqi Yang, Jinyue Jiang, Yiming Bai, Zhiyong Jason Ren
Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University
Research Gap
Our discussion (Zhu et al.,…
A striking feature of large language models is their capacity for in-context learning. In-context learning (ICL) is the ability to predict the response to a query based on illustrative examples presented…
Chongyi Zheng, Benjamin Eysenbach
Unsupervised learning is really powerful. It lies at the heart of large language models (just predict the next token), generative image models (predict what…
Naman Agarwal, Daniel Suo, Xinyi Chen, Elad Hazan
One of the biggest challenges for the…
The theoretical framework of structured state space duality (SSD) (see Parts I and II of this blog post series) connects SSMs and (linear) attention through structured matrices. As mentioned in Part I, this connection allows us to derive new algorithms for selective SSMs that are faster than the parallel associative scan in Mamba-1 by…
The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence called representation learning, but generally referred to as deep learning. Taking place between May 7th…
Mengzhou Xia*
Sadhika Malladi*
This post is based on the following work:
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia*1, Sadhika Malladi*1, Suchin Gururangan2, Sanjeev Arora1, Danqi Chen1
*denotes equal…