Quantization

Slides
Video Lecture

References

  1. 8-Bit Approximations for Parallelism in Deep Learning. Tim Dettmers. 2015.
  2. 8-bit Optimizers via Block-wise Quantization. Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer. 2021.
  3. The case for 4-bit precision: k-bit Inference Scaling Laws. Tim Dettmers, Luke Zettlemoyer. 2022.