Diffusion Models for Text-to-Image Generation


Recently, OpenAI showcased its latest text-to-image model, DALL-E 2. It generates photorealistic images from text, including unusual ones such as "astronaut riding a horse". Soon after, it was superseded by Google's Imagen as the state-of-the-art model. Both models have one thing in common: they use diffusion models as the core algorithm, which is the topic of our workshop.

In this workshop, we will first go through the evolution of image generation models, i.e. GANs and autoregressive transformers (used in DALL-E), before delving into diffusion models. What you'll learn in this workshop:
- The principle of training a diffusion model
- Image super-resolution
- Classifier-free guidance for text conditioning
- Architecture of DALL-E 2 and Imagen
- CLIP guidance and hands-on practice using Disco Diffusion

Background Knowledge:


Soon-Yau Cheong is the founder of Sooner.ai, an AI consulting and training company that specialises in image and video generation and manipulation. Past projects include face swapping, portrait cartoonisation and virtual shoe try-on, among others. He is well-versed in generative AI techniques, including GANs, autoregressive transformers and diffusion models. He authored the book “Hands-on Image Generation with TensorFlow”, which has been well received for its hands-on approach to making difficult mathematical theories easy to understand. Soon-Yau is also currently pursuing a PhD in AI digital media creation at the University of Surrey.

Open Data Science




One Broadway
Cambridge, MA 02142
