Dall-E

Slides
Video Lecture

References

  1. Neural Discrete Representation Learning. Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu. 2017.
  2. Zero-Shot Text-to-Image Generation. Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, et al. 2021.
  3. Language Models are Unsupervised Multitask Learners. Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever. 2019.
  4. Simulating 500 million years of evolution with a language model. Thomas Hayes, Roshan Rao, Halil Akin, Nicholas J. Sofroniew, Deniz Oktay, et al. 2024.
  5. Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning. Piyush Sharma, Nan Ding, Sebastian Goodman, Radu Soricut. 2018.
  6. YFCC100M: The New Data in Multimedia Research. Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, et al. 2015.
  7. Generating Long Sequences with Sparse Transformers. Rewon Child, Scott Gray, Alec Radford, Ilya Sutskever. 2019.