[FFmpeg-devel] libavcodec/lossless_videodsp : add add_bytes AVX2

Paul B Mahol onemda at gmail.com
Wed Oct 25 23:54:12 EEST 2017


On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> 2017-10-25 22:08 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
>
>> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
>> > 2017-10-25 21:53 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
>> >
>> >> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
>> >> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
>> >> >
>> >> >> On 10/21/17, Martin Vignali <martin.vignali at gmail.com> wrote:
>> >> >> > Hello,
>> >> >> >
>> >> >> > Attached is a patch that adds an AVX2 version of add_bytes.
>> >> >> >
>> >> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
>> >> >> > add AVX2 version
>> >> >> >
>> >> >> > Passes the FATE tests for me (os 10.12, x86_64).
>> >> >> >
>> >> >> > checkasm results (Kaby Lake; run 10 times, fastest run kept):
>> >> >> > checkasm: all 2 tests passed
>> >> >> > add_bytes_c: 108.7
>> >> >> > add_bytes_sse2: 26.5
>> >> >> > add_bytes_avx2: 15.5
>> >> >> >
>> >> >> >
>> >> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
>> >> >> > cosmetic only.
>> >> >> > The way the reference C function declaration is written in
>> >> >> > the asm files is not consistent from one asm file to another.
>> >> >> > I think a clearer separator between functions makes the file
>> >> >> > easier to read.
>> >> >> >
>> >> >> > It also adds the C declaration of add_bytes as a comment.
>> >> >> >
>> >> >> >
>> >> >> > Martin
>> >> >> >
>> >> >>
>> >> >> Are you sure 32-byte alignment is actually enforced?
>> >> >>
>> >> >>
>> >> > Hello,
>> >> >
>> >> > I think the data used by add_bytes is always aligned,
>> >> > because dst and src are the start of a line of an AVFrame.
>> >>
>> >> Yes, but try a width that's not a multiple of 32.
>> >>
>> > Sorry, I'm not sure I understand.
>> > According to the doc, AVFrame->linesize is a multiple of the max
>> > alignment,
>> >
>> > and in the asm, the loop repeats as long as val < width.
>> >
>> > Can you point me to the part that you think is not OK?
>>
>> I dunno. You should test it with widths not divisible by 32.
>>
>
> Tested with the FATE sample vsynth3-huffyuvbgra.avi (34x34):
> ./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -
>
> generates the same CRC as
> ./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -
> -cpuflags 0
>
>
>>
>> also try encoding cropped video.
>>
>
> Are you sure that encoding cropped video has anything to do with the
> decoding DSP functions?
>
> These patches only touch the decoding functions.
>
>
> The encoding functions of huffyuvenc (in the huffyuv add add/diff_bytes16
> AVX2 discussion) and of losslessencdsp (not done yet) have a test for the
> alignment of dst and src.
>
>
> Martin


ok then
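
For reference, the concern about widths not divisible by 32 can be sketched in plain C. This is only an illustration and not the assembly from the patch: add_bytes_blocked, add_bytes_ref and add_bytes_matches_ref are hypothetical names, and the "32 bytes per step plus scalar tail" layout is an assumption about how a 32-byte-wide loop could be written safely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain reference: dst[i] += src[i], wrapping modulo 256. */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Hypothetical blocked version: whole 32-byte blocks first (the width of
 * one AVX2 register), then a scalar tail for the remaining w % 32 bytes.
 * The inner loop stands in for a single 32-byte vector add. */
static void add_bytes_blocked(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    ptrdiff_t i = 0;

    for (; i + 32 <= w; i += 32)
        for (int j = 0; j < 32; j++)
            dst[i + j] = (uint8_t)(dst[i + j] + src[i + j]);

    for (; i < w; i++) /* tail: widths such as 34 end up here */
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Returns 1 if both versions agree for width w (0..256), including
 * leaving every byte past w untouched; returns 0 otherwise. */
static int add_bytes_matches_ref(ptrdiff_t w)
{
    uint8_t a[256], b[256], s[256];

    if (w < 0 || w > 256)
        return 0;
    for (int i = 0; i < 256; i++) {
        a[i] = b[i] = (uint8_t)(i * 7 + 3);
        s[i] = (uint8_t)(255 - i * 5);
    }
    add_bytes_ref(a, s, w);
    add_bytes_blocked(b, s, w);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling add_bytes_matches_ref(34) mirrors the 34x34 sample mentioned above: one full 32-byte block followed by a 2-byte tail. A loop that instead rounds the width up to the next multiple of 32 relies on the padding that linesize provides, which is the assumption being questioned in this thread.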
