RLHF
Slides | Video Lecture
References
Training language models to follow instructions with human feedback. Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, et al. 2022.
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs. Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, et al. 2024.
Simple statistical gradient-following algorithms for connectionist reinforcement learning. Ronald J. Williams. 1992.