[FFmpeg-devel] [GSoC] Motion Interpolation

Wed Aug 10 21:40:16 EEST 2016

On Mon, Jul 25, 2016 at 9:35 AM Davinder Singh <ds.mudhar at gmail.com> wrote:

> https://github.com/dsmudhar/FFmpeg/commits/dev
>
> The Paper 2 algorithm is complete, and it looks good. Comparing Paper 2
> (which uses bilateral motion estimation) vs. the motion vectors exported
> by the mEstimate filter:
>
> $ tiny_psnr 60_source_2.yuv 60_mest-esa+obmc.yuv
> stddev:    1.43 PSNR: 45.02 MAXDIFF:  174 bytes:476928000/474163200
>
> $ tiny_psnr 60_source_2.yuv 60_paper2_aobmc+cls.yuv
> stddev:    1.25 PSNR: 46.18 MAXDIFF:  187 bytes:476928000/474163200
>
> Frame comparison: http://www.mediafire.com/?qe7sc4o0s4hgug5
>
> Compared to simple OBMC, which over-smooths edges, object clustering and
> adaptive OBMC make the edges crisp but also introduce blocking artifacts
> where MVs are bad (with default search window = 7). But I think it’s ESA’s
> fault. The paper doesn’t specify which motion estimation method they used;
> I have been using ESA. I think quality can be further improved with EPZS,
> which I'm going to implement.
>
> I also tried tweaking VS-BMC (variable-size block motion compensation),
> which reduced the blocking artifacts in the VS-BMC area. It took a lot of
> experiments, and more remain to be done.
>
> mEstimate filter (ESA) + Simple OBMC:
> http://www.mediafire.com/?3b8j1zj1lsuw979
> Paper 2 (full): http://www.mediafire.com/?npbw1iv6tmxwvyu
>
>
> Regards,
> DSM_
>

I have implemented the other modern fast ME (motion estimation) algorithms:
https://github.com/dsmudhar/FFmpeg/blob/dev/libavfilter/vf_mestimate.c

Quality improves further with UMH, which uses prediction [1]:
$ ../../../tiny_psnr 60_source_2.yuv 60_wtf.yuv
stddev: 1.05 PSNR: 47.65 MAXDIFF: 178 bytes:476928000/474163200
(search window = 18)
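
For readers unfamiliar with the metric, here is a minimal sketch of how
PSNR is computed for 8-bit samples. It is not tiny_psnr's actual code
(tiny_psnr also reports stddev and MAXDIFF); the function name psnr_8bit
is only an illustration.

```python
import math


def psnr_8bit(ref: bytes, test: bytes) -> float:
    """PSNR in dB between two equal-length buffers of 8-bit samples."""
    if len(ref) != len(test) or not ref:
        raise ValueError("buffers must be non-empty and of equal length")
    # Sum of squared differences over all samples.
    sse = sum((a - b) ** 2 for a, b in zip(ref, test))
    if sse == 0:
        return math.inf  # identical buffers
    mse = sse / len(ref)
    # Peak value for 8-bit samples is 255.
    return 10.0 * math.log10(255.0 ** 2 / mse)
```

A higher value means the interpolated frames are closer to the source.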

The only problem is fast motion: in some movie scenes (e.g. distant
background objects while the camera is rotating) the motion is bigger than
the search window, and that produces artifacts.

The good thing about predictive UMH search (compared to ESA) is that we can
use a bigger search window; with P around 20, it removed all the artifacts
that were caused by the search window being too small.
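
To illustrate the search window issue, here is a rough sketch of ESA
(exhaustive search) block matching with SAD as the cost. This is not the
code from vf_mestimate.c; the names sad and esa_search and the frame layout
(lists of rows of 8-bit values) are assumptions for the example. ESA tests
up to (2P+1)^2 candidates per block, and a true motion larger than P is
simply outside the set it can find.

```python
def sad(cur, ref, bx, by, dx, dy, bs):
    """Sum of absolute differences between the block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(bs):
        for x in range(bs):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def esa_search(cur, ref, bx, by, bs, p):
    """Exhaustive search over all displacements in [-p, p] x [-p, p].

    Returns (best_mv, best_cost, candidates_tested).
    """
    h, w = len(ref), len(ref[0])
    best_mv, best_cost, tested = (0, 0), None, 0
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + bs > w or y0 + bs > h:
                continue
            tested += 1
            cost = sad(cur, ref, bx, by, dx, dy, bs)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, tested
```

With P = 7 this is up to 225 candidates per block, with P = 20 up to 1681,
which is why a larger window is much more affordable with a fast predictive
search than with ESA.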

But making the search window too big reduces the quality.

Here's another idea: dynamic block size selection for MC-FRUC.
Since this isn't video encoding, a 16x16 block with a fixed search window
may not work equally well for videos of every resolution. What if we
automatically resize the block depending on resolution? E.g. if 16x16 with
P=20 works fine for 1280x720 video, we could scale by width: for 1920x1080,
which is 1.5x as wide, we would use a 24x24 block and scale P accordingly.
I haven't tested it yet, though.
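
A minimal sketch of the scaling idea, assuming 1280 wide / 16x16 / P=20 as
the reference point from the example above. The function name
scale_me_params and the round-to-nearest rule are my assumptions; nothing
like this exists in the filter yet.

```python
def scale_me_params(width, ref_width=1280, ref_block=16, ref_p=20):
    """Scale block size and search parameter P linearly with frame width.

    Returns (block_size, p), each rounded to the nearest integer and
    clamped to at least 1.
    """
    scale = width / ref_width
    block = max(1, int(ref_block * scale + 0.5))
    p = max(1, int(ref_p * scale + 0.5))
    return block, p
```

For a 1920-wide frame this gives a 24x24 block and P=30. In practice the
block size might need to be restricted to sizes the ME code handles well;
that is left open here.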

[1]: JVT-F017.pdf by Z Chen <http://akuvian.org/src/x264/JVT-F017.pdf.gz>

DSM_