[FFmpeg-devel] Some ideas for a tiny set of audio conversion functions..

Andreas Öman andreas
Wed Nov 28 22:34:29 CET 2007


Hello,

Michael Niedermayer wrote:
> my original idea was to use AVFrame for audio as well
> if we choose not to do this then you will first have to find out
> how your AVAFrame can be used in code which should work with all
> codec_types, that is code which wants to access
> key_frame, pts, quality, opaque, ...
> and how direct rendering with audio (get/release_buffer()) could work
> with it

It really doesn't matter much to me. I'll ponder more on that later.

Something that IMO needs more thought is how to come up with the
conversion function to use.

The parameters that may affect the choice of conversion function are:
- Input/Output sample format
- Input/Output number of channels
- Mixing matrix
- Interleaved/Planar format
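
The parameters above could be gathered into a small key structure that
drives the selection. This is only a sketch; the struct and field names
(AConvKey etc.) are invented here, not part of any existing API.

```c
#include <stddef.h>

/* Hypothetical key describing one conversion: everything that may
 * affect which function (and which scale/bias) gets picked. */
typedef struct AConvKey {
    int in_fmt, out_fmt;            /* input/output sample formats */
    int in_channels, out_channels;  /* input/output channel counts */
    const float *matrix;            /* mixing matrix, or NULL for identity */
    int planar;                     /* nonzero for planar, zero for interleaved */
} AConvKey;
```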

We should have a generic function that can handle all cases.

Then there will be a set of special functions for common cases.

After the conversion function has been selected, we will also
know the source-data scaling and biasing required by that
particular function and its conversion parameters.
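
One way to sketch this: the selector returns the chosen routine together
with the scale/bias the codec must fold into its own output stage. All
names here (AConvParams, aconv_select, float_to_s16_prescaled) are made
up for illustration; only one fast path is shown.

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*aconv_fn)(int16_t *dst, const float *src, int samples);

/* Hypothetical result of selecting a conversion function. */
typedef struct AConvParams {
    aconv_fn convert;   /* selected conversion routine */
    float scale;        /* codec multiplies its samples by this ... */
    float bias;         /* ... and adds this, before calling convert */
} AConvParams;

/* Fast path: expects samples already scaled to [-32768, 32767],
 * so the inner loop is a bare cast. */
static void float_to_s16_prescaled(int16_t *dst, const float *src, int samples)
{
    int i;
    for (i = 0; i < samples; i++)
        dst[i] = (int16_t)src[i];
}

/* Pick a routine for the requested formats; only the float->s16
 * case is sketched here. */
static AConvParams aconv_select(int in_is_float, int out_is_s16)
{
    AConvParams p = { NULL, 1.0f, 0.0f };
    if (in_is_float && out_is_s16) {
        p.convert = float_to_s16_prescaled;
        p.scale   = 32767.0f;  /* the codec pre-applies this internally */
    }
    return p;
}
```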

Since the number of channels and the mixing matrix may change over time
for an audio stream, we must be prepared to reselect the conversion
function at any time.

Now, since the scaling+biasing is something that the codec needs to
apply internally (at least for it to make any speed difference), we
cannot make the audio conversion stuff completely invisible to the
codec guts.

I see two variants here.

a) Get rid of the scaling + biasing entirely and let each sample
format have an implicit normalized range.

u8       0           255
s16     -32768       32767
s24     -8388608     8388607  (but stored as int32)
s32     -2147483648  2147483647
float   -1.0         1.0
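
With these implicit ranges, s16 <-> float needs no external scale/bias.
The exact mapping is a choice, though: the sketch below divides by 32768
so that -32768 maps exactly to -1.0 (the positive end then falls just
short of 1.0). That convention is an assumption of this example, not
something prescribed above.

```c
#include <stdint.h>

/* s16 in [-32768, 32767] -> float in roughly [-1.0, 1.0) */
static float s16_to_float(int16_t s)
{
    return s / 32768.0f;
}

/* float in [-1.0, 1.0] -> s16, with saturation for out-of-range input */
static int16_t float_to_s16(float f)
{
    float v = f * 32767.0f;
    if (v < -32768.0f)
        v = -32768.0f;
    else if (v > 32767.0f)
        v = 32767.0f;
    return (int16_t)v;
}
```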

Of course, the downside is a speed decrease, especially for the
C variant of float to s16.

Actually, after composing this mail I ran some tests on my
Pentium M, decoding a Vorbis file.

SSE:                 user    0m3.000s
Prescaled C version: user    0m3.032s
Real dummy:          user    0m4.900s

The first two are the two variants available in dsputil today.
The third is the simplest possible implementation:

static void
float_to_int16_slow(int16_t *dst, const float *src, int samples)
{
    int i;
    float f;

    for (i = 0; i < samples; i++) {
        f = *src++ * 32767.;
        if (f < -32768.)
            f = -32768.;
        else if (f > 32767.)
            f = 32767.;
        *dst++ = f;
    }
}

Thus, it looks like option a) is really not an option.


b) Keep a pointer to the conversion context in avctx and let the
codec itself update the context when necessary.
The downside here is that various tables (depending a bit on the
codec, of course) need to be reinitialized. Also, I think the
final result will be a little uglier.
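
A minimal sketch of what b) could look like, assuming an invented
AConvContext type and aconv_update() helper (none of this exists in
lavc today): the codec calls the update function whenever its output
parameters may have changed, and the context is only rebuilt when
something actually differs.

```c
#include <stdlib.h>

/* Hypothetical conversion context the codec would keep in avctx. */
typedef struct AConvContext {
    int channels;
    int sample_fmt;
    /* ... chosen function pointer, scale/bias, dependent tables ... */
} AConvContext;

/* Reselect only when a parameter actually changed.
 * Returns 1 if the context was (re)initialized, 0 if unchanged,
 * -1 on allocation failure. */
static int aconv_update(AConvContext **pctx, int channels, int sample_fmt)
{
    AConvContext *ctx = *pctx;

    if (ctx && ctx->channels == channels && ctx->sample_fmt == sample_fmt)
        return 0;  /* nothing to do */

    if (!ctx) {
        ctx = malloc(sizeof(*ctx));
        if (!ctx)
            return -1;
        *pctx = ctx;
    }
    ctx->channels   = channels;
    ctx->sample_fmt = sample_fmt;
    /* here the codec would also rebuild any tables that depend
     * on the conversion parameters */
    return 1;
}
```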

Opinions?



