[FFmpeg-devel] [PATCH 11/13] ffv1dec: reference the current packet into the main context

Andreas Rheinhardt andreas.rheinhardt at outlook.com
Thu Mar 13 06:57:06 EET 2025


Lynne:
> 
> 
> On 13/03/2025 02:24, Andreas Rheinhardt wrote:
>> Lynne:
>>> On 10/03/2025 18:42, Lynne wrote:
>>>> On 10/03/2025 04:14, Andreas Rheinhardt wrote:
>>>>> Lynne:
>>>>>> ---
>>>>>>    libavcodec/ffv1.h    |  3 +++
>>>>>>    libavcodec/ffv1dec.c | 19 +++++++++++++++++--
>>>>>>    2 files changed, 20 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/libavcodec/ffv1.h b/libavcodec/ffv1.h
>>>>>> index 8c0e71284d..860a5c14b1 100644
>>>>>> --- a/libavcodec/ffv1.h
>>>>>> +++ b/libavcodec/ffv1.h
>>>>>> @@ -174,6 +174,9 @@ typedef struct FFV1Context {
>>>>>>         * NOT shared between frame threads.
>>>>>>         */
>>>>>>        uint8_t           frame_damaged;
>>>>>> +
>>>>>> +    /* Reference to the current packet */
>>>>>> +    AVPacket *pkt_ref;
>>>>>>    } FFV1Context;
>>>>>>    int ff_ffv1_common_init(AVCodecContext *avctx, FFV1Context *s);
>>>>>> diff --git a/libavcodec/ffv1dec.c b/libavcodec/ffv1dec.c
>>>>>> index eaa21eebdf..6396f22f79 100644
>>>>>> --- a/libavcodec/ffv1dec.c
>>>>>> +++ b/libavcodec/ffv1dec.c
>>>>>> @@ -469,6 +469,10 @@ static av_cold int decode_init(AVCodecContext *avctx)
>>>>>>        f->pix_fmt = AV_PIX_FMT_NONE;
>>>>>>        f->configured_pix_fmt = AV_PIX_FMT_NONE;
>>>>>> +    f->pkt_ref = av_packet_alloc();
>>>>>> +    if (!f->pkt_ref)
>>>>>> +        return AVERROR(ENOMEM);
>>>>>> +
>>>>>>        if ((ret = ff_ffv1_common_init(avctx, f)) < 0)
>>>>>>            return ret;
>>>>>> @@ -701,6 +705,10 @@ static int decode_frame(AVCodecContext *avctx, AVFrame *rframe,
>>>>>>        /* Start */
>>>>>>        if (hwaccel) {
>>>>>> +        ret = av_packet_ref(f->pkt_ref, avpkt);
>>>>>> +        if (ret < 0)
>>>>>> +            return ret;
>>>>>> +
>>>>>>            ret = hwaccel->start_frame(avctx, avpkt->data, avpkt->size);
>>>>>>            if (ret < 0)
>>>>>>                return ret;
>>>>>> @@ -720,15 +728,21 @@ static int decode_frame(AVCodecContext *avctx, AVFrame *rframe,
>>>>>>                uint32_t len;
>>>>>>                ret = find_next_slice(avctx, avpkt->data, buf_end, i,
>>>>>>                                      &pos, &len);
>>>>>> -            if (ret < 0)
>>>>>> +            if (ret < 0) {
>>>>>> +                av_packet_unref(f->pkt_ref);
>>>>>>                    return ret;
>>>>>> +            }
>>>>>>                buf_end -= len;
>>>>>>                ret = hwaccel->decode_slice(avctx, pos, len);
>>>>>> -            if (ret < 0)
>>>>>> +            if (ret < 0) {
>>>>>> +                av_packet_unref(f->pkt_ref);
>>>>>>                    return ret;
>>>>>> +            }
>>>>>>            }
>>>>>> +
>>>>>> +        av_packet_unref(f->pkt_ref);
>>>>>>        } else {
>>>>>>            ret = decode_slices(avctx, c, avpkt);
>>>>>>            if (ret < 0)
>>>>>> @@ -827,6 +841,7 @@ static av_cold int ffv1_decode_close(AVCodecContext *avctx)
>>>>>>        ff_progress_frame_unref(&s->last_picture);
>>>>>>        av_refstruct_unref(&s->hwaccel_last_picture_private);
>>>>>> +    av_packet_free(&s->pkt_ref);
>>>>>>        ff_ffv1_close(s);
>>>>>>        return 0;
>>>>>
>>>>> Why not simply use a const AVPacket*?
>>>>
>>>> No reason. Fixed locally.
>>>> Thanks.
>>>
>>> *reverted this change.
>>> We need to ref the packet, since we map its memory and let the GPU use
>>> it directly without copying the contents. 6K 16-bit content at 24 fps is
>>> typically around 2 Gbps when compressed, so avoiding copies is important.
>>
>> How long does the hwaccel need this data?
> 
> Until the frame has been asynchronously decoded. We hand out the output
> frame together with a semaphore that receivers need to wait on to know
> when that has happened.
> 
> On the decoder side, the hardware has a fixed number of queues to which
> submissions can be sent asynchronously. We treat these as a ring buffer
> and keep a reference on our side to all resources for each submission
> until we need to reuse the slot, at which point we wait for the frame's
> decoding to complete (which it usually already has) and release all the
> resources used.
> 
> Output frames also have a bit of state that has to be freed once the
> frame is marked (unreferenced) by the decoder as no longer being needed
> as a reference; this is done in the FFHWAccel.free_frame_priv callback.
> There, we have to wait for the last internal use of the frame to finish
> (done via the vp->wait_semaphores() call in vulkan_decode.c).
> 
> This holds for both ASIC hardware decoders and a compute-shader-based
> implementation, since the two share the same code except for decode
> submissions.

1. If you need a reference to the packet's data, then reference
AVPacket.buf, not the whole AVPacket. This avoids allocating a spare
AVPacket as well as copying side data; see the first sketch after this
list.
2. It sounds very wrong and fragile that the decoder has to keep a
reference just because the hwaccel might need it; there may be future
hwaccels that do not need such a reference at all. It seems better to
extend e.g. the start_frame callback and pass a reference to the input
data (no need to change this for all other start_frame calls; they can
pass NULL until needed). See the second sketch below.
3. If the user closes the decoder (which is allowed at any time, even
without draining it), ff_codec_close() uninitializes the hwaccel only
after calling the decoder's close function, and the latter drops its
reference to the packet. Is this really safe?
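
For 1., roughly what I have in mind (untested sketch; the field name
buf_ref is only illustrative):

    /* in FFV1Context, replacing AVPacket *pkt_ref: */
    AVBufferRef *buf_ref;

    /* in decode_frame(), at the start of the hwaccel path: */
    av_buffer_unref(&f->buf_ref);
    if (avpkt->buf) {
        f->buf_ref = av_buffer_ref(avpkt->buf);
        if (!f->buf_ref)
            return AVERROR(ENOMEM);
    }

    /* in ffv1_decode_close(): */
    av_buffer_unref(&f->buf_ref);

No allocation in decode_init() is needed, and no side data gets copied.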
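
For 2., something along these lines (only a sketch; the exact signature
is of course up for discussion):

    /* in FFHWAccel, a hypothetical extension of start_frame: */
    int (*start_frame)(AVCodecContext *avctx, const AVBufferRef *buf_ref,
                       const uint8_t *buf, uint32_t buf_size);

    /* ffv1dec.c then simply does: */
    ret = hwaccel->start_frame(avctx, avpkt->buf, avpkt->data, avpkt->size);

Other callers can pass NULL for buf_ref until they need it, and the
Vulkan hwaccel can ref the buffer itself for exactly as long as the
asynchronous decode is in flight, so the lifetime is handled by the
component that actually needs the data.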

- Andreas


