RLHF

Slides
Video Lecture

References

  1. Training language models to follow instructions with human feedback. Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, et al. 2022.
  2. Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs. Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, et al. 2024.
  3. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Ronald J. Williams. 1992.