[FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

Marton Balint cus at passwd.hu
Sat Jan 19 01:28:25 EET 2019



On Thu, 17 Jan 2019, Michael Niedermayer wrote:

> On Wed, Jan 16, 2019 at 08:00:22PM +0100, Marton Balint wrote:
>>
>>
>> On Tue, 15 Jan 2019, Michael Niedermayer wrote:
>>
>>> On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
>>>>
>>>>
>>>> On Fri, 28 Dec 2018, Michael Niedermayer wrote:
>>>>
>>>>> On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, 26 Dec 2018, Paul B Mahol wrote:
>>>>>>
>>>>>>> On 12/26/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
>>>>>>>> On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
>>>>>>>>> On 12/25/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
>>>>>>>>>> Fixes: Timeout
>>>>>>>>>> Fixes:
>>>>>>>>>> 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>>>>>>>>>> Before: Executed
>>>>>>>>>> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>>>>>>>>>> in 11294 ms
>>>>>>>>>> After : Executed
>>>>>>>>>> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>>>>>>>>>> in 4249 ms
>>>>>>>>>>
>>>>>>>>>> Found-by: continuous fuzzing process
>>>>>>>>>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>>>>>>>>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
>>>>>>>>>> ---
>>>>>>>>>> libavutil/imgutils.c | 6 ++++++
>>>>>>>>>> 1 file changed, 6 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
>>>>>>>>>> index 4938a7ef67..cc38f1e878 100644
>>>>>>>>>> --- a/libavutil/imgutils.c
>>>>>>>>>> +++ b/libavutil/imgutils.c
>>>>>>>>>> @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, uint8_t *clear,
>>>>>>>>>>        }
>>>>>>>>>>    } else if (clear_size == 4) {
>>>>>>>>>>        uint32_t val = AV_RN32(clear);
>>>>>>>>>> +        uint64_t val8 = val * 0x100000001ULL;
>>>>>>>>>> +        for (; dst_size >= 32; dst_size -= 32) {
>>>>>>>>>> +            AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
>>>>>>>>>> +            AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
>>>>>>>>>> +            dst += 32;
>>>>>>>>>> +        }
>>>>>>>>>>        for (; dst_size >= 4; dst_size -= 4) {
>>>>>>>>>>            AV_WN32(dst, val);
>>>>>>>>>>            dst += 4;
>>>>>>>>>> --
>>>>>>>>>> 2.20.1
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> NAK, implement a special memset function instead.
>>>>>>>>
>>>>>>>> I can move the added loop into a separate function, if that's what you
>>>>>>>> suggest?
>>>>>>>
>>>>>>> No, don't do that.
>>>>>>>
>>>>>>>> All the code is already in a "special" memset though, this is
>>>>>>>> memset_bytes()
>>>>>>>>
>>>>>>>
>>>>>>> I guess the function is less useful if it's static. So any duplication
>>>>>>> should be avoided in the codebase.
>>>>>>
>>>>>> Doesn't av_memcpy_backptr do almost exactly what is needed here? That can
>>>>>> also be optimized further if needed.
>>>>>
>>>>> av_memcpy_backptr() copies data with overlap; it's more like a recursive
>>>>> memmove().
>>>>
>>>> So? As far as I see the memset_bytes function in imgutils.c can be replaced
>>>> with this:
>>>>
>>>>    if (clear_size > dst_size)
>>>>        clear_size = dst_size;
>>>>    memcpy(dst, clear, clear_size);
>>>>    av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
>>>>
>>>> I am not against an av_memset_bytes API addition, but I believe it should
>>>> share code with av_memcpy_backptr to avoid duplication.
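For readers of the archive, the replacement quoted above can be sketched in standalone C: seed the destination with one copy of the pattern, then let an overlapping forward copy replicate it across the rest of the buffer. The names fill_pattern() and copy_backptr() are hypothetical, and copy_backptr() is a plain byte loop standing in for av_memcpy_backptr(), not its actual implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for av_memcpy_backptr(): copy cnt bytes to dst
 * from the location `back` bytes before dst. Copying forward one byte at
 * a time means an overlapping source repeats the last `back` bytes. */
static void copy_backptr(uint8_t *dst, size_t back, size_t cnt)
{
    const uint8_t *src = dst - back;

    while (cnt--)
        *dst++ = *src++;
}

/* Hypothetical helper: fill dst with repetitions of clear[0..clear_size). */
static void fill_pattern(uint8_t *dst, size_t dst_size,
                         const uint8_t *clear, size_t clear_size)
{
    if (clear_size > dst_size)
        clear_size = dst_size;
    if (!clear_size)
        return; /* nothing to write, and back == 0 would be meaningless */

    /* Seed one copy of the pattern, then replicate it by back-reference. */
    memcpy(dst, clear, clear_size);
    copy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
}
```

With a 3-byte pattern {1, 2, 3} and a 10-byte destination, this writes 1 2 3 1 2 3 1 2 3 1; a trailing partial pattern is handled by the same loop.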
>>>
>>> I've implemented this; it does not seem to be really faster in the testcase
>>
>> I guess it is not faster because you have not applied your original
>> optimization to fill32 in libavutil/mem.c. Could you compare the speed after
>> optimizing that the same way your original patch did with the imgutils
>> memset_bytes?
>
> sure, that makes it faster:

Thanks, both patches LGTM.
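
For reference, a minimal standalone sketch of the technique from the original patch: the 4-byte value is duplicated into a 64-bit word so that each loop iteration stores 32 bytes, with a 4-byte loop for the remainder. fill32_sketch() is a hypothetical name, and memcpy() stands in for the AV_RN32/AV_WN32/AV_WN64 macros; this is an illustration under those assumptions, not the committed code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: fill dst with repetitions of the 4 bytes at clear. */
static void fill32_sketch(uint8_t *dst, size_t dst_size, const uint8_t *clear)
{
    uint32_t val;
    uint64_t val8;

    memcpy(&val, clear, 4);          /* native-endian read, like AV_RN32 */
    val8 = val * 0x100000001ULL;     /* same 32-bit value in both halves */

    /* Wide path: four 8-byte stores per iteration. */
    for (; dst_size >= 32; dst_size -= 32) {
        memcpy(dst,      &val8, 8);
        memcpy(dst +  8, &val8, 8);
        memcpy(dst + 16, &val8, 8);
        memcpy(dst + 24, &val8, 8);
        dst += 32;
    }
    /* Narrow path: one 4-byte store per iteration. */
    for (; dst_size >= 4; dst_size -= 4) {
        memcpy(dst, &val, 4);
        dst += 4;
    }
    /* Tail of 0..3 bytes: a partial copy of the pattern. */
    memcpy(dst, clear, dst_size);
}
```

Because both halves of val8 hold the same value, the bytes written are identical on little- and big-endian hosts.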

Regards,
Marton

