Can learned frame prediction compete with block motion compensation for video coding?

Our paper, “Can learned frame prediction compete with block motion compensation for video coding?”, has been published in the Springer journal Signal, Image and Video Processing. The supplementary material is presented at the bottom of this page.

The paper is available on Springer Link and on arXiv. It can be cited as:

@article{sulun2021can,
    title={Can learned frame prediction compete with block motion compensation for video coding?},
    author={Sulun, Serkan and Tekalp, A Murat},
    journal={Signal, Image and Video Processing},
    publisher={Springer Science and Business Media LLC},
    volume={15},
    number={2},
    pages={401--410},
    year={2021}}

Abstract

Given recent advances in learned video prediction, we investigate whether a simple video codec using a pre-trained deep model for next frame prediction based on previously encoded/decoded frames without sending any motion side information can compete with standard video codecs based on block-motion compensation. Frame differences given learned frame predictions are encoded by a standard still-image (intra) codec. Experimental results show that the rate-distortion performance of the simple codec with symmetric complexity is on average better than that of x264 codec on 10 MPEG test videos, but does not yet reach the level of x265 codec. This result demonstrates the power of learned frame prediction (LFP), since unlike motion compensation, LFP does not use information from the current picture. The implications of training with L1, L2, or combined L2 and adversarial loss on prediction performance and compression efficiency are analyzed.
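To make the coding loop described in the abstract concrete, here is a minimal sketch, not the implementation used in the paper. It rests on two loud assumptions: `predict_next` is a hypothetical stand-in (simple linear extrapolation) for the pre-trained deep prediction model, and `intra_encode` / `intra_decode` are a plain uniform quantizer standing in for the standard still-image codec. What the sketch is meant to show is the structure: the prediction uses only previously decoded frames, so the decoder can repeat it exactly and no motion side information has to be transmitted.

```python
import numpy as np

def predict_next(decoded_frames):
    """Hypothetical stand-in for the learned predictor (assumption):
    linear extrapolation from the last two decoded frames."""
    if len(decoded_frames) < 2:
        return decoded_frames[-1].copy()
    extrapolated = 2.0 * decoded_frames[-1] - decoded_frames[-2]
    return np.clip(extrapolated, 0.0, 255.0)

def intra_encode(image, step):
    """Stand-in for a still-image codec (assumption): uniform quantization."""
    return np.round(image / step).astype(np.int32)

def intra_decode(symbols, step):
    """Inverse of intra_encode, up to quantization error."""
    return symbols.astype(np.float64) * step

def encode_sequence(frames, step=8.0):
    # The first frame has no past frames to predict from, so it is intra coded.
    stream = [intra_encode(frames[0], step)]
    decoded = [intra_decode(stream[0], step)]
    for frame in frames[1:]:
        # Predict from decoded frames only, never from the current frame.
        prediction = predict_next(decoded)
        # Only the quantized frame difference is transmitted.
        residual_symbols = intra_encode(frame - prediction, step)
        stream.append(residual_symbols)
        # Keep the encoder's reference frames identical to the decoder's.
        decoded.append(prediction + intra_decode(residual_symbols, step))
    return stream, decoded

def decode_sequence(stream, step=8.0):
    decoded = [intra_decode(stream[0], step)]
    for residual_symbols in stream[1:]:
        # Same prediction as the encoder, computed from the same decoded frames.
        prediction = predict_next(decoded)
        decoded.append(prediction + intra_decode(residual_symbols, step))
    return decoded
```

Because the encoder and the decoder run the same predictor on the same decoded frames, their reconstructions match and the error does not accumulate over time. In this toy version the per-pixel error is bounded by half the quantization step; a real still-image codec would not offer such a simple bound.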

 

SUPPLEMENTARY MATERIAL

Below are qualitative results for our learned video prediction models. We compare two models: one trained with L2 loss only (L2) and one trained with combined L2 and adversarial loss (GAN). All results show the 9th frame of each video. The slideshow makes it easy to compare the images and can be navigated with the arrow keys of the keyboard. Below the slideshow, the ground-truth and predicted images are shown at their original size.





City - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Coastguard - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Container - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Football - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Foreman - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Garden - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Hall monitor - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Harbour - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Mobile - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Tennis - Frame 9

Ground-truth



Prediction of L2 model



Prediction of GAN model



Written on May 27, 2020