[FFmpeg-devel] [PATCH+RFC] AVFrame for audio

Justin Ruggles justin.ruggles
Thu Oct 28 04:13:10 CEST 2010


Michael Niedermayer wrote:

> On Tue, Oct 26, 2010 at 09:31:13PM -0400, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>
>>> On Sun, Oct 17, 2010 at 05:22:54PM -0400, Justin Ruggles wrote:
>>>> Michael Niedermayer wrote:
>>>>
>>>>> On Sat, Oct 16, 2010 at 04:12:26PM -0400, Justin Ruggles wrote:
>>>>>> Michael Niedermayer wrote:
>>>>>>
>>>>>>> On Fri, Oct 15, 2010 at 06:35:01PM -0400, Justin Ruggles wrote:
>>>>>>>> Justin Ruggles wrote:
>>>>>>>>
>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Oct 13, 2010 at 07:52:12PM -0400, Justin Ruggles wrote:
>>>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Oct 06, 2010 at 11:05:26AM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Oct 05, 2010 at 04:55:12PM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Sep 29, 2010 at 09:20:04PM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Peter Ross wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Sep 02, 2010 at 07:11:37PM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>>> @@ -644,29 +677,49 @@
>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>> +#if LIBAVCODEC_VERSION_MAJOR < 53
>>>>>>>>>>>>>>>>>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
>>>>>>>>>>>>>>>>>                           int *frame_size_ptr,
>>>>>>>>>>>>>>>>>                           AVPacket *avpkt)
>>>>>>>>>>>>>>>>>  {
>>>>>>>>>>>>>>>>> +    AVFrame frame;
>>>>>>>>>>>>>>>>> +    int ret, got_frame = 0;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +    avcodec_get_frame_defaults(&frame);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +    ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +    if (ret >= 0 && got_frame) {
>>>>>>>>>>>>>>>>> +        *frame_size_ptr = frame.nb_samples * avctx->channels *
>>>>>>>>>>>>>>>>> +                          (av_get_bits_per_sample_format(avctx->sample_fmt) / 8);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +        /* ensure data will fit in the output buffer */
>>>>>>>>>>>>>>>>> +        if (*frame_size_ptr > AVCODEC_MAX_AUDIO_FRAME_SIZE) {
>>>>>>>>>>>>>>>>> +            av_log(avctx, AV_LOG_WARNING, "avcodec_decode_audio3 samples "
>>>>>>>>>>>>>>>>> +                   "truncated to AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
>>>>>>>>>>>>>>>>> +            *frame_size_ptr = AVCODEC_MAX_AUDIO_FRAME_SIZE;
>>>>>>>>>>>>>>>>> +        }
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +        memcpy(samples, frame.data[0], *frame_size_ptr);
>>>>>>>>>>>>>>>>  the default get_buffer() should return the appropriate
>>>>>>>>>>>>>>>> buffer for this case.
>>>>>>>>>>>>>>> I'm sorry, I don't understand your comment.
>>>>>>>>>>>>>> i mean (non functional psseudocode below to explain the idea)
>>>>>>>>>>>>>> avcodec_decode_audio3(){
>>>>>>>>>>>>>>     avctx->foobar= samples;
>>>>>>>>>>>>>>     ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
>>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>>     assert(samples == frame.data[0]);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>> default_get_buffer(){
>>>>>>>>>>>>>>     if(avctx->foobar)
>>>>>>>>>>>>>>         return avctx->foobar;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (and yes this cannot work for all theoretical decoders)
>>>>>>>>>>>>> I think I get it.  So avctx->foobar would be an optional user-supplied
>>>>>>>>>>>>> buffer (avctx->user_buffer?) that default_get_buffer() would return if
>>>>>>>>>>>>> it is non-NULL, right?
>>>>>>>>>>>> yes
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> The problem I see is this:
>>>>>>>>>>>>> avcodec_decode_audio3() would use avcodec_decode_audio4().
>>>>>>>>>>>>> avcodec_decode_audio4() allocates as large a buffer as is needed through
>>>>>>>>>>>>> get_buffer(), but the avcodec_decode_audio3() API only requires the
>>>>>>>>>>>>> user-supplied buffer to be AVCODEC_MAX_AUDIO_FRAME_SIZE.  Couldn't this
>>>>>>>>>>>>> lead to the decoder writing past the end of a user-supplied buffer if it
>>>>>>>>>>>>> isn't large enough?  I guess we could also add a field
>>>>>>>>>>>>> avctx->user_buffer_size?
>>>>>>>>>>>> yes, of course
>>>>>>>>>>>> it was just a rough idea ...
>>>>>>>>>>> I'm running into some questions trying to implement the rough idea.  The
>>>>>>>>>>> only way I can see this working smoothly is if avcodec_decode_audio3()
>>>>>>>>>>> always sets get/release_buffer to default.  Also, either all audio
>>>>>>>>>>> decoders will have to support CODEC_CAP_DR1 (i.e. they always use
>>>>>>>>>>> get/release_buffer) or there needs to be a fallback that will memcpy
>>>>>>>>>>> into the user buffer if CODEC_CAP_DR1 is not supported.
>>>>>>>>>> old API decoders surely dont need to copy with old API.
>>>>>>>>>> old API decoders surely dont need to copy with new API if the api can provide
>>>>>>>>>> a buffer to the decoder (this can be through function argument like its done
>>>>>>>>>> currently)
>>>>>>>>>> new API decoders surely dont need to copy with new API because otherwise the
>>>>>>>>>> API sucks and needs more work
>>>>>>>>>> whats left is new API decoders and used with old API and for this get_buffer()
>>>>>>>>>> should return the user supplied buffer if its large enough and fail if its not
>>>>>>>>>> large enough.
>>>>>>>>>> The case where the user overrides get_buffer() and supplies a user specified
>>>>>>>>>> buffer which its own code doesnt use is a case that id consider user error.
>>>>>>>>> I think I might have been misinterpreting the API.  For video decoders,
>>>>>>>>> what does it mean as far as buffer allocation when CODEC_CAP_DR1 is not set?
>>>>>>>> So I think I have this worked out and I don't see how we can avoid a
>>>>>>>> memcpy with the old API when CODEC_CAP_DR1 is not set.  There would be
>>>>>>>> no other way to get the data into the correct output buffer.
>>>>>>>>
>>>>>>>> other questions:
>>>>>>>>
>>>>>>>> 1. Should AVCodecContext.user_buffer be supported for video decoders?
>>>>>>> possible but this is seperate, lets not entangle this patch with too many
>>>>>>> other changes
>>>>>>>
>>>>>>>
>>>>>>>> If so, should it be user_buffer[4] and user_buffer_size[4]?
>>>>>>> possible
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> 2. avcodec_default_get_buffer() supports allocating multiple internal
>>>>>>>> buffers.  How should that be handled if the buffer is supplied by the
>>>>>>>> user?  Don't support multiple buffers?  Use the user-supplied buffer
>>>>>>>> just for the first one?
>>>>>>> there are buffer hints (grep for hint in avcodec.h) that specify if a buffer
>>>>>>> will be reused/read/preserved/blah
>>>>>>> the user supplied buffer is likely just valid for this call and cannot be used
>>>>>>> for some cases of the hints. For what remains using the buffer on the first
>>>>>>> call only seems ok
>>>>>> I think I've implemented it in a way that will work even when the
>>>>>> various buffer hints are set.  This implementation will not use memcpy
>>>>>> in avcodec_decode_audio3() in the most common case of the decoder
>>>>>> supporting CODEC_CAP_DR1, only needing 1 buffer, and not needing a
>>>>>> buffer larger than AVCODEC_MAX_AUDIO_FRAME_SIZE.
>>>>>>
>>>>>> One thing I'm unsure of is whether to truncate output if it is too large
>>>>>> for avcodec_decode_audio3() (which is done in this patch) or to return
>>>>>> an error instead.
>>>>> I think its better to tell the user straight through an error that there is a
>>>>> problem instead of generating output that contains randomly truncated packets
>>>> Ok. New patch.
>>>>
>>>> -Justin
>> [...]
>>>> @@ -2763,7 +2813,7 @@ typedef struct AVCodec {
>>>>      int (*init)(AVCodecContext *);
>>>>      int (*encode)(AVCodecContext *, uint8_t *buf, int buf_size, void *data);
>>>>      int (*close)(AVCodecContext *);
>>>> -    int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt);
>>>> +    int (*decode)(AVCodecContext *, void *outdata, int *got_output_ptr, AVPacket *avpkt);
>>>>      /**
>>>>       * Codec capabilities.
>>>>       * see CODEC_CAP_*
>>> cosmetic
>> yeah yeah. I do want to change it though. :)  The size won't be needed
>> by anything after the audio API is changed.  And I still don't know why
>> the video decoders set it to sizeof(AVFrame) or sizeof(AVPicture).
> 
> The original idea probably was for ABI compatibility. That is if AVFrame grows
> over the versions the user app has to know where it ends to know what fields
> it can saftely read
> setting it to sizeof of an internal struct is obviously not a good idea ...

The user isn't supposed to use AVCodec.decode() directly right?  And the
video and subtitle decoding API just has got_*_ptr which is documented
as being zero or non-zero.  So would it be ok for the audio decoders to
just set this to 0 or 1 in the new API?  Changing all the video and
subtitle decoders is obviously not necessary, but if we did do that at
some point it would be less confusing.

> 
>>> [...]
>>>>  int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>>>      int i;
>>>>      int w= s->width;
>>>>      int h= s->height;
>>>> +    int is_video = (s->codec_type == AVMEDIA_TYPE_VIDEO);
>>>>      InternalBuffer *buf;
>>>>      int *picture_number;
>>>>  
>>>> @@ -235,7 +258,8 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>>>          return -1;
>>>>      }
>>>>  
>>>> -    if(av_image_check_size(w, h, 0, s))
>>>> +    if(( is_video && av_image_check_size(w, h, 0, s)) ||
>>>> +       (!is_video && audio_check_size(s->channels, pic->nb_samples, s->sample_fmt)))
>>>>          return -1;
>>>>  
>>>>      if(s->internal_buffer==NULL){
>>>> @@ -253,17 +277,30 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>>>      picture_number= &(((InternalBuffer*)s->internal_buffer)[INTERNAL_BUFFER_SIZE]).last_pic_num; //FIXME ugly hack
>>>>      (*picture_number)++;
>>>>  
>>>> -    if(buf->base[0] && (buf->width != w || buf->height != h || buf->pix_fmt != s->pix_fmt)){
>>>> +    if (buf->base[0]) {
>>>> +        if (is_video && (buf->width != w || buf->height != h || buf->pix_fmt != s->pix_fmt)){
>>>>          for(i=0; i<4; i++){
>>>>              av_freep(&buf->base[i]);
>>>>              buf->data[i]= NULL;
>>>>          }
>>>> +        } else if (!is_video && (buf->channels   != s->channels     ||
>>>> +                                 buf->nb_samples != pic->nb_samples ||
>>>> +                                 buf->sample_fmt != s->sample_fmt)) {
>>>> +            if (buf->base[0] == s->user_buffer) {
>>>> +                s->user_buffer_in_use = 0;
>>>> +                buf->base[0] = NULL;
>>>> +            } else {
>>>> +                av_freep(&buf->base[0]);
>>>> +            }
>>>> +            buf->data[0] = NULL;
>>>> +        }
>>>>      }
>>>>  
>>>>      if(buf->base[0]){
>>>>          pic->age= *picture_number - buf->last_pic_num;
>>>>          buf->last_pic_num= *picture_number;
>>>>      }else{
>>>> +        if (is_video) {
>>>>          int h_chroma_shift, v_chroma_shift;
>>>>          int size[4] = {0};
>>>>          int tmpsize;
>>>> @@ -327,6 +364,28 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>>>          buf->height = s->height;
>>>>          buf->pix_fmt= s->pix_fmt;
>>>>          pic->age= 256*256*256*64;
>>>> +        } else { /* audio */
>>>> +            int buf_size;
>>>> +
>>>> +            buf->last_pic_num = -256*256*256*64;
>>>> +
>>>> +            buf_size = pic->nb_samples * s->channels *
>>>> +                       (av_get_bits_per_sample_format(s->sample_fmt) / 8);
>>>> +
>>>> +            if (s->user_buffer && !s->user_buffer_in_use && s->user_buffer_size >= buf_size) {
>>>> +                buf->base[0] = s->user_buffer;
>>>> +                s->user_buffer_in_use = 1;
>>>> +            } else {
>>>> +                buf->base[0] = av_mallocz(buf_size);
>>>> +                if (!buf->base[0])
>>>> +                    return AVERROR(ENOMEM);
>>>> +            }
>>>> +
>>>> +            buf->data[0]    = buf->base[0];
>>>> +            buf->channels   = s->channels;
>>>> +            buf->nb_samples = pic->nb_samples;
>>>> +            buf->sample_fmt = s->sample_fmt;
>>>> +        }
>>>>      }
>>>>      pic->type= FF_BUFFER_TYPE_INTERNAL;
>>>>
>>>> @@ -360,9 +419,15 @@ void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic){
>>>>      }
>>>>      assert(i < s->internal_buffer_count);
>>>>      s->internal_buffer_count--;
>>>> +    if (buf->base[0] == s->user_buffer) {
>>>> +        assert(s->user_buffer_in_use);
>>>> +        s->user_buffer_in_use = 0;
>>>> +        buf->base[0] = NULL;
>>>> +    } else {
>>>>      last = &((InternalBuffer*)s->internal_buffer)[s->internal_buffer_count];
>>>>  
>>>>      FFSWAP(InternalBuffer, *buf, *last);
>>>> +    }
>>>>  
>>>>      for(i=0; i<4; i++){
>>>>          pic->data[i]=NULL;
>>> i dont see how this could work
>>> the buffer used and returned by the previous decode() is put in a que by the
>>> user app and user_buffer is set to a new buffer.
>>> also you appear to end up calling av_free()
>>> on user supplied buffers
>> Well, I meant to disallow that, but the documentation I put just says
>> the user cannot free or change the data while user_buffer_in_use is set.
>>  I didn't consider the user replacing it with a new buffer.  But at any
>> rate, if that should be allowed, things get more complicated.  I'll need
>> to add a flag or something to indicate each user-supplied buffer.  I'll
>> work on it.
> 
> i think the API is too complex already and i dont see why it is so
> if user buffer is set get_buffer() should return it or fail, if its returned
> it should set user_buffer to NULL
> calling get_buffer() a second time if user_buffer was set should be disallowed
> release_buffer should do nothing
> 
> if we ever have decoders that dont work with this then we need a AVCodec flag
> that indicates them. For this case get_buffer() would then ignore user_buffer
> and avcodec_decode() would copy to the provided user_buffer if any.
> (we do not need this currently though because we do not have such a decoder)
> 
> maybe iam missing something but this seems simpler

Ok I did it a different way.  New patch attached.  I'm not 100% sure
about the way reget_buffer() is handled but it works.  AVFrame.type
seems to be a mask, but I don't know if it was intended to be used that way.

Cheers,
Justin


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: avcodec_decode_audio4.patch
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101027/d78248a4/attachment.txt>



More information about the ffmpeg-devel mailing list