[FFmpeg-devel] libavcodec/lossless_videodsp : add add_bytes AVX2

Martin Vignali martin.vignali at gmail.com
Wed Oct 25 23:05:14 EEST 2017


2017-10-25 21:53 GMT+02:00 Paul B Mahol <onemda at gmail.com>:

> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
> >
> >> On 10/21/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> >> > Hello,
> >> >
> >> > Attached is a patch to add an AVX2 version of add_bytes
> >> >
> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
> >> > add AVX2 version
> >> >
> >> > passes the fate tests for me (OS X 10.12, x86_64)
> >> >
> >> > checkasm result: (Kaby Lake) (run 10 times, taking the fastest
> >> > run)
> >> > checkasm: all 2 tests passed
> >> > add_bytes_c: 108.7
> >> > add_bytes_sse2: 26.5
> >> > add_bytes_avx2: 15.5
> >> >
> >> >
> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
> >> > only cosmetic:
> >> > the reference C function declarations are not consistent
> >> > between
> >> > the asm files, and
> >> > I think a better separator between functions makes the file easier to
> >> > read
> >> >
> >> > also adds the C declaration of add_bytes as a comment
> >> >
> >> >
> >> > Martin
> >> >
> >>
> >> Are you sure 32-byte alignment is actually enforced?
> >>
> >>
> > Hello,
> >
> > I think the data used by add_bytes is always aligned,
> > because dst and src are the start of a line of an AVFrame
>
> Yes, but try a width that's not a multiple of 32.
>
Sorry, I am not sure I understand.
According to the doc, AVFrame->linesize is a multiple of the max alignment,

and in the asm, the loop keeps running while val < width.

Can you point me to the part where you think it's not OK?
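
To make the reasoning concrete, here is a minimal scalar C sketch. It is
not the actual asm. It assumes the loop always handles whole 32-byte blocks
while the offset is below width, and that each row buffer is padded to a
multiple of 32 bytes, as linesize suggests. The names BLOCK, padded_width
and add_bytes_blockwise are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 32 /* assumed block size: bytes in one AVX2 register */

/* Hypothetical helper: round w up to the next multiple of BLOCK. */
static ptrdiff_t padded_width(ptrdiff_t w)
{
    return (w + BLOCK - 1) & ~(ptrdiff_t)(BLOCK - 1);
}

/*
 * Scalar model of a block-wise add_bytes (dst[i] += src[i]).
 * Each outer iteration stands in for one 32-byte vector add.
 * The last block may run past w, so both buffers must hold at
 * least padded_width(w) bytes.
 * Returns the number of bytes actually touched.
 */
static ptrdiff_t add_bytes_blockwise(uint8_t *dst, const uint8_t *src,
                                     ptrdiff_t w)
{
    ptrdiff_t i = 0;

    assert(w >= 0);
    while (i < w) {
        for (int j = 0; j < BLOCK; j++)
            dst[i + j] += src[i + j];
        i += BLOCK;
    }
    return i;
}
```

With a width of 100 and a row padded to 128 bytes, the loop runs 4 times and
touches 128 bytes, so the last 28 bytes fall in the row padding. If a row were
not padded to a multiple of 32, those 28 bytes would be outside the buffer,
which is the case the question about non-multiple-of-32 widths is about.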

Martin

