Large Language Models
References
- PaLM: Scaling Language Modeling with Pathways - Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, et al. - 2022
- Gemini: A Family of Highly Capable Multimodal Models - Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, et al. - 2023
- Mistral 7B - Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, et al. - 2023
- Mixtral of Experts - Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, et al. - 2024
- Improving Language Understanding by Generative Pre-Training - Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever - 2018
- Attention Is All You Need - Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, et al. - 2017
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova - 2018 
- Physics of Language Models: Part 3.1, Knowledge Storage and Extraction - Zeyuan Allen-Zhu, Yuanzhi Li - 2023 
- Language Models are Unsupervised Multitask Learners - Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever - 2019 
- Language Models are Few-Shot Learners - Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, et al. - 2020
- https://commoncrawl.org/
- The Pile: An 800GB Dataset of Diverse Text for Language Modeling - Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, et al. - 2020
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces - Albert Gu, Tri Dao - 2023 
- Efficiently Modeling Long Sequences with Structured State Spaces - Albert Gu, Karan Goel, Christopher Ré - 2021 
- The Curious Case of Neural Text Degeneration - Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi - 2019 
- Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs - Minh Nguyen, Andrew Baker, Clement Neo, Allen Roush, Andreas Kirsch, Ravid Shwartz-Ziv - 2024 
- Training language models to follow instructions with human feedback - Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, et al. - 2022
- Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs - Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, et al. - 2024
- Simple statistical gradient-following algorithms for connectionist reinforcement learning - Ronald J. Williams - 1992 
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model - Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn - 2023 
- DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs - Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, Matt Gardner - 2019 
- PIQA: Reasoning about Physical Commonsense in Natural Language - Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Jianfeng Gao, Yejin Choi - 2019 
- Measuring Massive Multitask Language Understanding - Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt - 2020 
- Training Verifiers to Solve Math Word Problems - Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, et al. - 2021
- WinoGrande: An Adversarial Winograd Schema Challenge at Scale - Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi - 2019 
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models - Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, et al. - 2022
- AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models - Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, et al. - 2023
- Evaluating Large Language Models Trained on Code - Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, et al. - 2021
- Program Synthesis with Large Language Models - Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, et al. - 2021
- Ring Attention with Blockwise Transformers for Near-Infinite Context - Hao Liu, Matei Zaharia, Pieter Abbeel - 2023 
- Sequence Parallelism: Long Sequence Training from System Perspective - Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You - 2021 
- Reducing Activation Recomputation in Large Transformer Models - Vijay Korthikanti, Jared Casper, Sangkug Lym, Lawrence McAfee, Michael Andersch, et al. - 2022
- DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training - Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Xuezhe Ma, Ion Stoica, Joseph E. Gonzalez, Hao Zhang - 2023 
- Efficient Memory Management for Large Language Model Serving with PagedAttention - Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, et al. - 2023
- PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling - Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, et al. - 2024
- GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints - Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, Sumit Sanghai - 2023 
- Fast Inference from Transformers via Speculative Decoding - Yaniv Leviathan, Matan Kalman, Yossi Matias - 2022 
- Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads - Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao - 2024 
- https://github.com/pytorch/torchtune
- https://github.com/vllm-project/vllm
- https://huggingface.co/models
- https://lmsys.org/
- https://ollama.com/
- https://github.com/ggerganov/llama.cpp
- Toolformer: Language Models Can Teach Themselves to Use Tools - Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, et al. - 2023
- AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls - Yu Du, Fangyun Wei, Hongyang Zhang - 2024 
- The Llama 3 Herd of Models - Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, et al. - 2024
- Synchromesh: Reliable code generation from pre-trained language models - Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, et al. - 2022
- Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation - Luca Beurer-Kellner, Marc Fischer, Martin Vechev - 2024 
- Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search - Chris Hokamp, Qun Liu - 2017 
- Long Context Compression with Activation Beacon - Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou - 2024 
- RoFormer: Enhanced Transformer with Rotary Position Embedding - Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu - 2021 
- Extending Context Window of Large Language Models via Positional Interpolation - Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian - 2023 
- Reading Wikipedia to Answer Open-Domain Questions - Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes - 2017 
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks - Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, et al. - 2020
- REALM: Retrieval-Augmented Language Model Pre-Training - Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang - 2020 
- Improving language models by retrieving from trillions of tokens - Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, et al. - 2021
- In-Context Retrieval-Augmented Language Models - Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham - 2023 
- Vision Transformers Need Registers - Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski - 2023 
- Massive Activations in Large Language Models - Mingjie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu - 2024 
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models - Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, et al. - 2022
- Self-Consistency Improves Chain of Thought Reasoning in Language Models - Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, et al. - 2022
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models - Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan - 2023 
- ReAct: Synergizing Reasoning and Acting in Language Models - Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao - 2022 
- Reflexion: Language Agents with Verbal Reinforcement Learning - Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao - 2023 
- Generative Verifiers: Reward Modeling as Next-Token Prediction - Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal - 2024 
- ChatGPT is bullshit - Michael Townsen Hicks, James Humphries, Joe Slater - 2024 
- Large Language Models Cannot Self-Correct Reasoning Yet - Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou - 2023 
- Dissociating language and thought in large language models - Kyle Mahowald, Anna A. Ivanova, Idan A. Blank, Nancy Kanwisher, Joshua B. Tenenbaum, et al. - 2023
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures - Zeyuan Allen-Zhu, Yuanzhi Li - 2023 
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models - Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, et al. - 2024
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning - DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, et al. - 2025
- Reinforcement Learning for Long-Horizon Interactive LLM Agents - Kevin Chen, Marco Cusumano-Towner, Brody Huval, Aleksei Petrenko, Jackson Hamburger, et al. - 2025
- Buy 4 REINFORCE Samples, Get a Baseline for Free! - Wouter Kool, Herke van Hoof, Max Welling - 2019