[FFmpeg-devel] [PATCH 1/4] ssim: refactor a weird double loop.

Paul B Mahol onemda at gmail.com
Sun Jul 12 16:29:04 CEST 2015


On 12 Jul 2015 at 14:18, "Ronald S. Bultje" <rsbultje at gmail.com> wrote:
>
> Hi,
>
> On Sun, Jul 12, 2015 at 6:48 AM, Paul B Mahol <onemda at gmail.com> wrote:
>
> > On 12 Jul 2015 at 01:56, "Ronald S. Bultje" <rsbultje at gmail.com> wrote:
> > >
> > > ---
> > >  libavfilter/vf_ssim.c | 5 ++---
> > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/libavfilter/vf_ssim.c b/libavfilter/vf_ssim.c
> > > index 0721ddd..3ef122f 100644
> > > --- a/libavfilter/vf_ssim.c
> > > +++ b/libavfilter/vf_ssim.c
> > > @@ -134,7 +134,7 @@ static float ssim_end1(int s1, int s2, int ss, int s12)
> > >           / ((float)(fs1 * fs1 + fs2 * fs2 + ssim_c1) * (float)(vars + ssim_c2));
> > >  }
> > >
> > > -static float ssim_end4(int sum0[5][4], int sum1[5][4], int width)
> > > +static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
> > >  {
> > >      float ssim = 0.0;
> > >      int i;
> > > @@ -169,8 +169,7 @@ static float ssim_plane(uint8_t *main, int main_stride,
> > >                                  &sum0[x]);
> > >          }
> > >
> > > -        for (x = 0; x < width - 1; x += 4)
> > > -            ssim += ssim_end4(sum0 + x, sum1 + x, FFMIN(4, width - x - 1));
> > > +        ssim += ssim_endn(sum0, sum1, width - 1);
> > >      }
> > >
> > >      return ssim / ((height - 1) * (width - 1));
> > > --
> > > 2.1.2
> > >
> > >
> >
> > Why? There was a reason behind this code, I guess.
> >
>
> I think it's for SIMD code simplification. I'm guessing the code you
> took from libvpx had an extra condition to do only 4-sized chunks
> through a function pointer, and then the odd tail in C code. If you do
> this, the SIMD code has a fixed size (always 4), which makes the
> implementation much simpler: 4 16-byte loads, add, transpose4x4d, and
> then ssim_end1 to get 4 results, which you horizontal-add and return.
>

I took this from tiny_ssim.c, as pengvado said it's OK to relicense it to LGPL.

> The disadvantage is overhead. First, call overhead, since each
> 4-element chunk requires a function call. Second, overhead for
> function initialization (anything outside the main loop, either before
> or after); this includes the horizontal-add, which is relatively
> expensive. Third, it limits us to 16-byte vectors: no AVX(2). A
> variable-size function makes the SIMD slightly more complex, but is
> more future-proof (AVX/AVX2) and theoretically faster.
>
> > Does this change results?
>
>
> No.
>
> Ronald
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

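For readers following the thread, here is a minimal C sketch of the two
call shapes being compared: the caller walking a row in chunks of at most
4 positions, versus one variable-width call per row. It is not the
vf_ssim.c code. The names end1, endn, row_chunked and row_single are
hypothetical, end1 is an arbitrary stand-in for ssim_end1, and only one
sum array is used instead of sum0/sum1 to keep it short.

```c
#include <stddef.h>

/* Hypothetical per-position score; a stand-in for ssim_end1(), not its formula. */
static float end1(const int s[4])
{
    return (float)(s[0] + s[1]) / (float)(s[2] + s[3] + 1);
}

/* Variable-width helper: scores `width` consecutive positions in one call. */
static float endn(int (*sum)[4], int width)
{
    float acc = 0.0f;
    for (int i = 0; i < width; i++)
        acc += end1(sum[i]);
    return acc;
}

/* Old shape: the caller loops over the row, handing over at most 4 positions per call. */
static float row_chunked(int (*sum)[4], int width)
{
    float acc = 0.0f;
    for (int x = 0; x < width; x += 4) {
        int n = width - x < 4 ? width - x : 4; /* same role as FFMIN(4, ...) */
        acc += endn(sum + x, n);
    }
    return acc;
}

/* New shape: a single call covers the whole row. */
static float row_single(int (*sum)[4], int width)
{
    return endn(sum, width);
}
```

Both shapes visit the same positions once each, so they should agree up to
floating-point rounding. The partial sums are grouped differently, so
bit-exact equality is not guaranteed in general.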
