[FFmpeg-devel] [FFmpeg-cvslog] avutil/mem: Optimize fill32() by unrolling and using 64bit

Carl Eugen Hoyos ceffmpeg at gmail.com
Mon Jan 21 02:09:45 EET 2019


2019-01-20 23:37 GMT+01:00, Michael Niedermayer <michael at niedermayer.cc>:
> On Sun, Jan 20, 2019 at 10:33:26PM +0100, Carl Eugen Hoyos wrote:
>> 2019-01-20 22:22 GMT+01:00, Michael Niedermayer <git at videolan.org>:
>> > ffmpeg | branch: master | Michael Niedermayer <michael at niedermayer.cc> | Thu Jan 17 22:35:10 2019 +0100| [12b1338be376a3e5fb606d9fe41b58dc4a9e62c7] | committer: Michael Niedermayer
>> >
>> > avutil/mem: Optimize fill32() by unrolling and using 64bit
>> >
>> > Reviewed-by: Marton Balint <cus at passwd.hu>
>> > Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
>> >
>> >> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=12b1338be376a3e5fb606d9fe41b58dc4a9e62c7
>> > ---
>> >
>> >  libavutil/mem.c | 12 ++++++++++++
>> >  1 file changed, 12 insertions(+)
>> >
>> > diff --git a/libavutil/mem.c b/libavutil/mem.c
>> > index 6149755a6b..88fe09b179 100644
>> > --- a/libavutil/mem.c
>> > +++ b/libavutil/mem.c
>> > @@ -399,6 +399,18 @@ static void fill32(uint8_t *dst, int len)
>> >  {
>> >      uint32_t v = AV_RN32(dst - 4);
>> >
>> > +#if HAVE_FAST_64BIT
>>
>> I suspect this should be !X86_32
>
>>
>> > +    uint64_t v2= v + ((uint64_t)v<<32);
>> > +    while (len >= 32) {
>> > +        AV_WN64(dst   , v2);
>> > +        AV_WN64(dst+ 8, v2);
>> > +        AV_WN64(dst+16, v2);
>> > +        AV_WN64(dst+24, v2);
>> > +        dst += 32;
>> > +        len -= 32;
>> > +    }
>>
>> How can I test the performance of this function?
>
> with the testcase from the fuzzer (it should be substantially
> faster in this case with the next commit)

> it should also be possible to test it with some fate tests
> as this is used by some.

I cannot measure any speed difference for the (lengthened)
nuv and cscd fate samples with your patch, so I don't think
this question warrants further investigation.
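For anyone who wants to time the loop in isolation rather than through
whole fate samples, a standalone sketch along the following lines may
help. It is not the code in libavutil/mem.c: rd32(), wr32() and wr64()
are hypothetical memcpy-based stand-ins for AV_RN32(), AV_WN32() and
AV_WN64(), the tail loops after the unrolled block are my assumption
about how the remaining bytes get filled, and fill32_plain(),
fill32_unrolled() and time_fill() are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-ins for AV_RN32 / AV_WN32 / AV_WN64:
 * unaligned native-endian access through memcpy. */
static uint32_t rd32(const uint8_t *p) { uint32_t v; memcpy(&v, p, 4); return v; }
static void wr32(uint8_t *p, uint32_t v) { memcpy(p, &v, 4); }
static void wr64(uint8_t *p, uint64_t v) { memcpy(p, &v, 8); }

/* Reference: repeat the 4 bytes in front of dst, one 32-bit store at a time. */
static void fill32_plain(uint8_t *dst, int len)
{
    uint32_t v = rd32(dst - 4);

    while (len >= 4) {
        wr32(dst, v);
        dst += 4;
        len -= 4;
    }
    while (len-- > 0) {          /* 0..3 leftover bytes */
        *dst = dst[-4];
        dst++;
    }
}

/* Same result, with the unrolled 64-bit loop from the quoted patch in front. */
static void fill32_unrolled(uint8_t *dst, int len)
{
    uint32_t v  = rd32(dst - 4);
    uint64_t v2 = v + ((uint64_t)v << 32);   /* both halves hold v */

    while (len >= 32) {
        wr64(dst,      v2);
        wr64(dst +  8, v2);
        wr64(dst + 16, v2);
        wr64(dst + 24, v2);
        dst += 32;
        len -= 32;
    }
    while (len >= 4) {
        wr32(dst, v);
        dst += 4;
        len -= 4;
    }
    while (len-- > 0) {
        *dst = dst[-4];
        dst++;
    }
}

/* CPU seconds for `iters` calls of fn; buf must hold at least 4 + len bytes,
 * with the 4-byte pattern stored in buf[0..3]. */
static double time_fill(void (*fn)(uint8_t *, int), uint8_t *buf, int len, int iters)
{
    volatile uint8_t sink = 0;   /* keeps the compiler from dropping the stores */
    clock_t t0;

    assert(iters > 0);
    t0 = clock();
    for (int i = 0; i < iters; i++) {
        fn(buf + 4, len);
        if (len > 0)
            sink = buf[4 + len - 1];
    }
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling time_fill() once with fill32_plain and once with fill32_unrolled
on a large buffer gives a rough comparison. The numbers depend heavily on
compiler, optimization flags and target, so a difference seen here does
not have to show up in a real decode.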

Thank you for the abort() suggestion, Carl Eugen

