[FFmpeg-soc] G723.1 Frame Parser

Ronald S. Bultje rsbultje at gmail.com
Mon Apr 5 19:02:15 CEST 2010


Hi Mohamed,

On Mon, Apr 5, 2010 at 8:55 AM, Mohamed Naufal <naufal11 at gmail.com> wrote:
> On 1 April 2010 02:14, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> On Wed, Mar 31, 2010 at 10:52 AM, Mohamed Naufal <naufal11 at gmail.com>
>> wrote:
>> > +    // The frame size and codec type is determined from
>> > +    // the least 2 bits of the first byte.
>> > +    frame_len         = frame_sizes[buf[0] & 3];
>> > +    pkt->stream_index = st->index;
>> > +    ptr               = pkt->data;
>> > +
>> > +    if (av_new_packet(pkt, len) < 0) {
>> > +        av_log(ctx, AV_LOG_ERROR, "Out of memory\n");
>> > +        return AVERROR_NOMEM;
>> > +    }
>> > +
>> > +    if (frame_len > len)
>> > +        av_log(ctx, AV_LOG_WARNING, "Too little data in the RTP packet\n");
>> > +
>> > +    memcpy(ptr, buf, len);
>> > +    pkt->size = len;
>> > +
>> > +    if (frame_len < len) {
>> > +        av_log(ctx, AV_LOG_WARNING, "Too much data in the RTP packet\n");
>> > +        ptr += frame_len;
>> > +        memset(ptr, 0, len - frame_len);
>> > +        pkt->size = frame_len;
>> > +    }
>>
>> That looks messy, are you sure that's correct? I figure frames are
>> generally small and RTP packets would be bigger, so this would trigger
>> all the time. Shouldn't this split the packets? Or simply copy them
>> all (as the AMR depacketizer does)?
>
> You're right. Sorry. Corrected now. I suppose I'll need a parser too. I
> believe the (yet to be committed) amr parser can be modified for this too.

It'd be nice to have at some point, but we don't specifically need it
right now. All voice codecs we have support CODEC_CAP_SUBFRAMES, i.e.
you can send them a series of frames in one buffer and they'll just
consume a small part of it, return the decoded data, and then continue
on the same buffer in the next call to the decoder. Also, AFAIK there
isn't a single muxer that expects these codecs to be stored as
individual voice frames, since that'd be highly inefficient. If you do
a voice codec, you'll most likely implement it in the same way.
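
To make the consume-part-of-the-buffer pattern above concrete, here is
a rough sketch, not actual lavc code. decode_one_frame() and
decode_packet() are hypothetical names standing in for the decoder's
decode call and its caller, and the table holds the standard G.723.1
frame sizes (6.3 kbit/s, 5.3 kbit/s, SID, untransmitted), which I'm
assuming match the frame_sizes[] in your patch:

```c
#include <stddef.h>
#include <stdint.h>

/* G.723.1 frame sizes in bytes, indexed by the low 2 bits of the first
 * byte: 6.3 kbit/s, 5.3 kbit/s, SID, untransmitted.
 * Assumed to match frame_sizes[] in the patch. */
static const int frame_sizes[4] = { 24, 20, 4, 1 };

/* Hypothetical stand-in for the decoder's decode call: consume exactly
 * one frame from the front of buf and return the number of bytes used,
 * or -1 if the buffer is empty or the frame is truncated. */
static int decode_one_frame(const uint8_t *buf, int buf_size)
{
    int frame_len;

    if (buf_size < 1)
        return -1;
    frame_len = frame_sizes[buf[0] & 3];
    if (frame_len > buf_size)
        return -1;
    /* Real decoding of buf[0 .. frame_len - 1] would go here. */
    return frame_len;
}

/* Hypothetical caller: keep handing the remainder of the same packet
 * to the decoder until it is used up. Returns the number of frames
 * decoded, or -1 on a truncated frame. */
static int decode_packet(const uint8_t *buf, int buf_size)
{
    int frames = 0;

    while (buf_size > 0) {
        int used = decode_one_frame(buf, buf_size);
        if (used < 0)
            return -1;
        buf      += used;
        buf_size -= used;
        frames++;
    }
    return frames;
}
```

With this shape the depacketizer can copy the whole RTP payload into
one packet and leave the frame splitting to the decode loop.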

Ronald

