Building AI to Recreate Our Visual World


Humans are consumers of visual content. Every day, people watch videos, play digital games, and share photos on social media. But there is still an asymmetry: far fewer of us are creators. We aim to build machines capable of creating and manipulating photographs, and to use them as training wheels for visual content creation, with the goal of making people more visually literate. We propose to learn natural image statistics directly from large-scale data. We then define a class of image generation and editing operations and constrain their output to look realistic according to the learned image statistics.

I will discuss a few recent projects. First, we propose to directly model the natural image manifold via generative adversarial networks (GANs) and constrain the output of an image editing tool to lie on this manifold. Then, we present a general image-to-image translation framework, “pix2pix”, where a network is trained to map input images (such as user sketches) directly to natural-looking results. Finally, we introduce CycleGAN, which learns image-to-image translation models even in the absence of paired training data, and demonstrate its application to style transfer, object transfiguration, season transfer, and photo enhancement.
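
To give a rough sense of how translation can be learned without paired data, the sketch below illustrates the cycle-consistency idea behind CycleGAN: translating an image to the other domain and back should approximately return the original. It is a toy illustration only, not the authors' implementation. The names `l1_distance`, `cycle_consistency_loss`, `brighten`, `darken`, and `darken_too_much` are hypothetical, images are flat lists of floats rather than tensors, the two "translators" are fixed functions rather than trained networks, and the adversarial losses that the full method also uses are omitted.

```python
from typing import Callable, List

Image = List[float]                 # a flattened toy "image"
Mapping = Callable[[Image], Image]  # stand-in for a translation network


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(G: Mapping, F: Mapping,
                           xs: List[Image], ys: List[Image]) -> float:
    """Penalize F(G(x)) != x and G(F(y)) != y on UNPAIRED samples xs, ys."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Hypothetical toy translators between a "dark" and a "bright" domain.
def brighten(img: Image) -> Image:
    return [p + 0.5 for p in img]


def darken(img: Image) -> Image:
    return [p - 0.5 for p in img]


def darken_too_much(img: Image) -> Image:
    return [p - 0.75 for p in img]


dark_samples = [[0.0, 0.25]]    # samples from domain X
bright_samples = [[0.5, 1.0]]   # samples from domain Y (not paired with X)

consistent = cycle_consistency_loss(brighten, darken,
                                    dark_samples, bright_samples)
inconsistent = cycle_consistency_loss(brighten, darken_too_much,
                                      dark_samples, bright_samples)
```

In this toy setting, a pair of mappings that invert each other incurs no penalty, while a mismatched pair is penalized; during training, this term would be minimized jointly with the adversarial objectives.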


Jun-Yan Zhu is a Ph.D. student at the Berkeley AI Research (BAIR) Lab, working on computer vision, graphics, and machine learning with Professor Alexei A. Efros. He received his B.E. from Tsinghua University in 2012 and was a Ph.D. student at CMU from 2012 to 2013. His research goal is to build machines capable of recreating the visual world. Jun-Yan is currently supported by a Facebook Graduate Fellowship.
