[FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD optimizations

Song, Ruiling ruiling.song at intel.com
Thu May 30 10:29:25 EEST 2019



> -----Original Message-----
> From: Paul B Mahol [mailto:onemda at gmail.com]
> Sent: Thursday, May 30, 2019 3:24 PM
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Cc: Song, Ruiling <ruiling.song at intel.com>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD optimizations
> 
> On 5/30/19, Ruiling Song <ruiling.song at intel.com> wrote:
> > For implementation details, please refer to the comments in the
> > assembly code. The patch improves single-threaded horizontal-pass
> > performance by about 100%.
> >
> > Overall performance was tested with these commands (AVX2 enabled):
> > ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
> > ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
> > Single-threaded, throughput improves from 43 to 60 fps, about 40%.
> > Multi-threaded, throughput improves from 110 to 130 fps, about 20%.
> >
> > Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> > ---
> >  libavfilter/gblur.h             |  54 ++++++++++
> >  libavfilter/vf_gblur.c          |  66 +++++-------
> >  libavfilter/x86/Makefile        |   2 +
> > >  libavfilter/x86/vf_gblur.asm    | 182 ++++++++++++++++++++++++++++++++
> >  libavfilter/x86/vf_gblur_init.c |  36 +++++++
> >  5 files changed, 302 insertions(+), 38 deletions(-)
> >  create mode 100644 libavfilter/gblur.h
> >  create mode 100644 libavfilter/x86/vf_gblur.asm
> >  create mode 100644 libavfilter/x86/vf_gblur_init.c

[...]
> > diff --git a/libavfilter/vf_gblur.c b/libavfilter/vf_gblur.c
> > index b91a8c074a..4e876bca05 100644
> > --- a/libavfilter/vf_gblur.c
> > +++ b/libavfilter/vf_gblur.c
> > @@ -30,29 +30,11 @@
> >  #include "libavutil/pixdesc.h"
> >  #include "avfilter.h"
> >  #include "formats.h"
> > +#include "gblur.h"
> >  #include "internal.h"
> >  #include "video.h"
> > +#include <immintrin.h>
> 
> Is this header really needed?
No, it is not needed. I forgot to remove it after experimenting with SSE intrinsics.
I will remove it. Thanks!
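
To illustrate why the C file does not need an intrinsics header, here is a minimal standalone sketch of function-pointer dispatch. It is not code from the patch: DemoBlurDSP, demo_scale_row_c, demo_cpu_has_avx2 and demo_blur_init are hypothetical names, and the CPU check assumes GCC or Clang on x86 (__builtin_cpu_supports). The point is only that the C side needs a prototype and a pointer assignment when the optimized routine lives in a separate assembly file.

```c
/* Hypothetical DSP context: one function pointer per optimizable routine. */
typedef struct DemoBlurDSP {
    void (*scale_row)(float *row, int width, float scale);
} DemoBlurDSP;

/* Portable C fallback: multiply every sample in a row by a constant. */
static void demo_scale_row_c(float *row, int width, float scale)
{
    for (int x = 0; x < width; x++)
        row[x] *= scale;
}

/* Runtime CPU check. Assumes GCC/Clang on x86; reports 0 elsewhere. */
static int demo_cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

/* Select the implementation once, at init time. */
static void demo_blur_init(DemoBlurDSP *dsp)
{
    dsp->scale_row = demo_scale_row_c;  /* default: C version */

    if (demo_cpu_has_avx2()) {
        /* A build with an assembly file would assign its external
         * symbol here, for example:
         *     dsp->scale_row = demo_scale_row_avx2;   (hypothetical)
         * That symbol only needs a plain prototype, so no intrinsics
         * header such as <immintrin.h> is required in the C file.
         * This sketch ships no assembly, so the C version stays. */
    }
}
```

A caller would run demo_blur_init(&dsp) once and then call dsp.scale_row(row, width, scale) without knowing which version was selected.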

Ruiling

