[FFmpeg-devel] libavcodec/proresdec : add qmat dsp with SSE2, AVX2 simd
Ronald S. Bultje
rsbultje at gmail.com
Sat Oct 7 19:16:08 EEST 2017
Hi Martin,
On Sat, Oct 7, 2017 at 11:49 AM, Martin Vignali <martin.vignali at gmail.com>
wrote:
> 2017-10-07 17:30 GMT+02:00 Ronald S. Bultje <rsbultje at gmail.com>:
> > On Sat, Oct 7, 2017 at 10:22 AM, Martin Vignali <martin.vignali at gmail.com>
> > wrote:
> > > The attached patch adds a new dsp
> > > for manipulating the qmat.
> > >
> > > For now, I move this code into it:
> > >
> > > for (i = 0; i < 64; i++) {
> > >     qmat_luma_scaled  [i] = ctx->qmat_luma  [i] * qscale;
> > >     qmat_chroma_scaled[i] = ctx->qmat_chroma[i] * qscale;
> > > }
> > >
> > > I added a special case for qscale == 1,
> > > plus SSE2 and AVX2 optimizations.
> >
> > This loop only executes once per slice. We typically do not SIMD-optimize
> > at that level, because it won't give significant speed gains...
>
> OK, I didn't know that.
> I mostly followed what has already been done, like in blockdsp.clear_block.
>
Right, so consider that blockdsp is called per block (16x16 pixels), not per
slice.

You could remove this entirely from the slice processing code by simply
pre-calculating the values once for the whole stream in the init function.
There are only 224 qscale values, so it's 224*64*2 = 28672 multiplications,
which is (in the context of prores) virtually negligible.
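
A minimal sketch of that precalculation, not the actual proresdec code. The
QmatTables struct and the qmat_tables_init / qmat_luma_for / qmat_chroma_for
helpers are hypothetical names made up for illustration. It assumes qscale
runs from 1 to 224, that the base matrices hold 8-bit entries, and that they
are already known when the init function runs. uint16_t is used here so that
the largest product (255 * 224) fits.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_QSCALE 224 /* assumption: valid qscale values are 1..224 */

/* Hypothetical container; the real decoder context is laid out differently. */
typedef struct QmatTables {
    uint8_t  qmat_luma[64];
    uint8_t  qmat_chroma[64];
    /* One pre-scaled 64-entry matrix per qscale value. */
    uint16_t qmat_luma_scaled[MAX_QSCALE][64];
    uint16_t qmat_chroma_scaled[MAX_QSCALE][64];
} QmatTables;

/* Run once at init: 224 * 64 * 2 multiplications in total. */
static void qmat_tables_init(QmatTables *t)
{
    for (int q = 1; q <= MAX_QSCALE; q++) {
        for (int i = 0; i < 64; i++) {
            t->qmat_luma_scaled[q - 1][i]   = (uint16_t)(t->qmat_luma[i]   * q);
            t->qmat_chroma_scaled[q - 1][i] = (uint16_t)(t->qmat_chroma[i] * q);
        }
    }
}

/* Per-slice work then reduces to a table lookup. */
static const uint16_t *qmat_luma_for(const QmatTables *t, int qscale)
{
    assert(qscale >= 1 && qscale <= MAX_QSCALE);
    return t->qmat_luma_scaled[qscale - 1];
}

static const uint16_t *qmat_chroma_for(const QmatTables *t, int qscale)
{
    assert(qscale >= 1 && qscale <= MAX_QSCALE);
    return t->qmat_chroma_scaled[qscale - 1];
}
```

The slice decoder would then fetch a pointer with qmat_luma_for(t, qscale)
instead of running the 64-iteration loop. The cost is roughly 56 KiB of
tables with these types.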
Ronald