RLHF
Slides · Video Lecture
References
- Training language models to follow instructions with human feedback - Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, et al. - 2022
- Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs - Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, et al. - 2024
- Simple statistical gradient-following algorithms for connectionist reinforcement learning - Ronald J. Williams - 1992
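For orientation, the Williams (1992) and Ahmadian et al. (2024) references both build on the REINFORCE policy-gradient estimator; the sketch below uses standard notation and is not taken from the lecture materials. The gradient of the expected reward under the policy $\pi_\theta$ equals the reward-weighted gradient of the log-probability of sampled outputs, with an optional baseline $b(x)$ subtracted to reduce variance:

$$
\nabla_\theta\, \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\!\left[ r(x, y) \right]
= \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\!\left[ \left( r(x, y) - b(x) \right) \nabla_\theta \log \pi_\theta(y \mid x) \right]
$$

In the RLHF setting, $x$ is a prompt, $y$ a sampled completion from the language-model policy, and $r$ a reward model trained on human preference data, as in Ouyang et al. (2022).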