Video Prediction and Generation

Video generation is the process of synthesizing new visual scenes from existing ones. Its applications include high frame-rate conversion, action forecasting, pose-invariant face recognition, and virtual view synthesis. High-quality video generation calls for state-of-the-art deep learning and computer vision techniques. In the figures above, Fig. 1(a) generates middle frames from adjacent frames, Fig. 1(b) generates future frames from past frames, and Fig. 2 generates intermediate views from adjacent views in a multi-camera system.

In recent years, the generative adversarial network (GAN) has been proposed and has become a leading research direction in deep learning. A GAN can generate photo-realistic images from random noise by training two competing networks, the generator and the discriminator. The discriminator aims to distinguish generated (fake) images from natural images, while the generator tries to produce natural-looking images that fool the discriminator. This adversarial game pushes the generator toward images that are consistent with human perception, a property that can be leveraged in video generation tasks. In this project, our goal is to build a GAN-based video generation framework, as shown in the following figure.
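
To make the competition between the two networks concrete, here is a minimal numeric sketch of the losses each one might minimize. It is an illustration only and not the project's implementation: the function names are hypothetical, the discriminator is assumed to output a probability that an image is natural, and the generator uses the common non-saturating loss, which is an assumption since the text does not specify a loss.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy that the discriminator minimizes.

    d_real: discriminator probabilities assigned to natural images.
    d_fake: discriminator probabilities assigned to generated images.
    eps guards against log(0); its value is an arbitrary choice.
    """
    # Natural images should be scored close to 1.
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    # Generated images should be scored close to 0.
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss (assumed variant).

    The loss is small when the discriminator scores generated
    images as natural, i.e. when the generator fools it.
    """
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # Hypothetical discriminator outputs for a batch of three images each.
    scores_on_real = [0.9, 0.8, 0.95]
    scores_on_fake = [0.2, 0.1, 0.3]
    print("discriminator loss:", discriminator_loss(scores_on_real, scores_on_fake))
    print("generator loss:", generator_loss(scores_on_fake))
```

In an actual training loop the two losses would be minimized in alternation, each with respect to its own network's parameters. The sketch only evaluates the losses for fixed discriminator outputs.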