[Libav-user] ffmpeg for matlab (mex function)

wm4 nfxjfg at googlemail.com
Tue Apr 8 14:58:18 CEST 2014

On Tue, 8 Apr 2014 15:28:05 +0300
ReSearchIT Eng <researchiteng at gmail.com> wrote:

> Hi,
> I am implementing ffmpeg as a mex function in matlab as part of a research
> project (with main focus on the audio, not video like OpenCV, etc).
> I managed to have an initial version of audio encoding working, but I need
> to clarify a few things before considering it usable by others with other
> codecs that I tried.
> 1. Newbie questions section:
> 1.1: A few definitions which the documentation expects to be very clear to
> the developer (which is not the case here).
> avctx->sample_fmt => is this the input or the output format?  (for my
> current codec I am lucky both input and output formats are the same, but I
> need to know for other situations).
> avctx->sample_rate => for the input or for the output? From what I
> understand it's for output, correct?

Both of these are for the input. But they can influence output too.
Note that encoders usually do little to no conversion of the input
data, other than encoding it. If you have data in a format that the
encoder happens not to support, you must use libswresample or
libavresample to convert it.
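
To make the "convert before encoding" step concrete, here is a minimal
hand-written sketch of one such conversion. It is a stand-in for what a
resampling library does in the simplest case, not a replacement for it.
The assumptions are mine: the input is planar double data in [-1.0, 1.0]
(which is how a MATLAB samples-by-channels matrix sits in memory), the
encoder wants interleaved signed 16-bit samples, and the function name
planar_double_to_s16 is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert planar double samples in [-1.0, 1.0] (one contiguous run per
 * channel) into interleaved signed 16-bit samples.
 * Out-of-range input is clipped. Hypothetical helper, not a libav call. */
void planar_double_to_s16(const double *in, int16_t *out,
                          size_t nb_samples, size_t channels)
{
    assert(in != NULL && out != NULL);
    for (size_t ch = 0; ch < channels; ch++) {
        for (size_t i = 0; i < nb_samples; i++) {
            double v = in[ch * nb_samples + i] * 32768.0;
            if (v > 32767.0)
                v = 32767.0;
            if (v < -32768.0)
                v = -32768.0;
            /* Round to nearest, then store at the interleaved position. */
            out[i * channels + ch] = (int16_t)(v < 0 ? v - 0.5 : v + 0.5);
        }
    }
}
```

If the sample rate or channel layout also has to change, a hand-written
loop like this is not enough and one of the resampling libraries is the
right tool.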

> avctx->bit_rate => of the input or of the output? From what I understand
> it's for output, correct?


> avctx->bits_per_raw_sample => I understand it's for input. Shall I set it,
> or does it default to s16 by itself? In the examples I did not find this
> being set.
> avctx->frame_size => I understand it is for input, correct?

I think this is essentially read-only, and you're supposed to send
frames with this many samples to the encoder. Except for PCM codecs,
which don't really have a frame size.

In general, though, frames should have exactly frame_size samples, and
only the last frame may be shorter. Frames of arbitrary size are only
accepted if the codec has CODEC_CAP_VARIABLE_FRAME_SIZE set.
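
As an illustration of sending frames with frame_size samples each, here
is a small sketch of the bookkeeping for cutting an input buffer into
such frames, with a possibly shorter last one. The helper names
frame_count and frame_samples are hypothetical, and sample counts are per
channel.

```c
#include <assert.h>
#include <stddef.h>

/* Number of frames needed to carry total_samples when each frame holds
 * frame_size samples; the last frame may be shorter. */
size_t frame_count(size_t total_samples, size_t frame_size)
{
    assert(frame_size > 0);
    return (total_samples + frame_size - 1) / frame_size;
}

/* Sample count of frame number `index` (0-based). */
size_t frame_samples(size_t total_samples, size_t frame_size, size_t index)
{
    assert(index < frame_count(total_samples, frame_size));
    size_t start = index * frame_size;
    size_t left = total_samples - start;
    return left < frame_size ? left : frame_size;
}
```

With 1000 input samples and a frame_size of 320, this gives four frames:
three of 320 samples and a last one of 40.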

> 1.2: How to pass things like AVCodecID, which are of enum type, from an
> external program? Is int good enough (size-wise) or should I send int64_t
> to be safe?

IMO it's best to use the codec name, i.e. a string. The codec IDs could
change their numeric value on ABI changes.

> 1.3: bit_rate seems to be defined as int in AVCodecContext. Is it able
> to accept higher bit_rate values as well?

Not sure, probably.

> 1.4: When is it required to flush the encoder, and how is it done? Does
> simply reading the delayed frames at the end, like this:
> avcodec_encode_audio2(c, &pkt, NULL, &got_output), mean flushing the
> encoder?

Yes. I think it needs to be done at the end of encoding. You write NULL
frames until you don't get a packet anymore.
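
The loop shape can be sketched without libavcodec as follows. MockEncoder
and mock_encode are hypothetical stand-ins for the codec context and for
the encode call in the question; they only mimic an encoder that holds
packets back until it is sent NULL frames. Only the structure of
flush_encoder is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an encoder with internal delay: it holds
 * `pending` packets that only come out once NULL frames are sent. */
typedef struct {
    int pending;
} MockEncoder;

/* Same calling shape as the encode call in the question:
 * frame == NULL means "no more input, return what is buffered". */
int mock_encode(MockEncoder *enc, const void *frame, int *got_output)
{
    assert(enc != NULL && got_output != NULL);
    if (frame != NULL) {
        /* Real input: buffer it and emit nothing yet. */
        enc->pending++;
        *got_output = 0;
        return 0;
    }
    *got_output = enc->pending > 0;
    if (*got_output)
        enc->pending--;
    return 0;
}

/* Flush: keep sending NULL until no packet comes back.
 * Returns the number of flushed packets, or -1 on error. */
int flush_encoder(MockEncoder *enc)
{
    int flushed = 0;
    for (;;) {
        int got_output = 0;
        if (mock_encode(enc, NULL, &got_output) < 0)
            return -1;
        if (!got_output)
            break;
        flushed++; /* a real caller would write the packet out here */
    }
    return flushed;
}
```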

> 2. Slightly more advanced questions:
> 2.1.1 I have a loop parsing an input buffer, shall I call inside the loop
> av_init_packet(&pkt) each time before avcodec_encode_audio2, or it's enough
> only once before the loop?
>     I am asking because in the avcodec.c example it shows as if I need to
> run it each time.
> http://www.ffmpeg.org/doxygen/trunk/avcodec_8c-example.html#a43 (note: I am
> focusing on performance of the code as well).

Probably not, but I'd reinit on every iteration. This has nothing to do
with performance. av_init_packet() just sets some struct fields, and it
shouldn't influence the performance of your code.

> 2.1.2   I am writing the output in a buffer, by pointing the pkt.data to
> the output buffer: pkt.data=&audio_outbuf[audio_outbuf_actual_encoded_size];
>     The question: shall I run av_free_packet(&pkt); inside the loop (as in
> question 2.1.1) ? (of course this time after the avcodec_encode_audio2).
>         If so, will av_free_packet tamper with my output buffer to which
> pkt.data is pointing?
>     Currently I do av_free(audio_outbuf) at the end of the application,
> and all seems fine, but I want to make sure.

That should be fine.

> 2.2 With amrwb (like many other codecs), the encoding of each frame depends
> on the adjacent frames. Is there something special I should do while
> feeding the encoder, in order to keep the "context"?
> AVFrame *frame = av_frame_alloc();
> avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt,(const
> uint8_t*)audio_bufin_single_frame, audio_bufin_single_frame_buffer_size, 0);

Not sure if I understand...

> 3. Is there some magic way of allocating the right size of the required
> output buffer? As of now, I am using a non-VBR codec, I encode the 1st
> frame, I get the size of the 1st output packet and I multiply it by the
> number of frames I have to encode in total. Is there some better option?

Isn't this the same question as 1.1?
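
For reference, the estimate described in the question amounts to the
arithmetic below. This is only a sketch under strong assumptions: one
output packet per input frame, and every packet the same size as the
first one, which holds only for constant-size (non-VBR) output. The name
estimate_output_size is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Estimated total output size in bytes, assuming one packet per input
 * frame and all packets as large as the first. Hypothetical helper. */
size_t estimate_output_size(size_t first_packet_size,
                            size_t total_samples, size_t frame_size)
{
    assert(frame_size > 0);
    /* Round up so a shorter last frame still counts as one packet. */
    size_t frames = (total_samples + frame_size - 1) / frame_size;
    return first_packet_size * frames;
}
```

For example, a first packet of 61 bytes, 1000 input samples and a
frame_size of 320 give 4 packets and an estimate of 244 bytes.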

> 4. I have set av_log_set_level(AV_LOG_VERBOSE), but none of the methods I
> call from the ffmpeg libraries show any debug output. Should I expect them
> to say anything or not?

Depends. Most of the time it will probably print something only if
there is a problem of some sort.

> Initial version (as of now only for audio encoding) can be found at:
> https://github.com/ReSearchITEng/ffmpeg_matlab (utils.h-> main_encode2 is
> "the" function) (Open Source, suggestions and contributions are welcome).
> I removed the matlab sections, so I can compile and debug it in an IDE.
> Thanks in advance for replies,
> Sebastian
