[Libav-user] QTKit -> Libav: has it ever been done?

Brad O'Hearne brado at bighillsoftware.com
Wed Mar 27 20:55:43 CET 2013


On Mar 27, 2013, at 4:44 AM, René J.V. Bertin <rjvbertin at gmail.com> wrote:
> So if the QTSampleBuffers contain non-native endian data the
> FFmpeg-encoded output will inevitably be the "wrong way around" unless
> it is converted before being encoded. Not?

Thanks for the replies, everyone; they all raise new ideas. I want to answer a few questions, and then revisit the situation with a few further questions of my own. Perhaps we can get closer.

First, regarding endianness. QTKit is using native-endian data (which on OS X on Intel is little-endian), but more than that, I'm explicitly checking the endianness of the sample buffer, and it is little-endian. This is the reason I was asking if there was a function in FFmpeg which would report the endianness explicitly, just so I could verify that on the FFmpeg side as well. But with the information I have, it appears we are going from little-endian to little-endian, so endianness shouldn't be an issue.
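
For anyone who wants to repeat the check independently of QTKit and FFmpeg, here is a minimal sketch in plain C. The function names host_is_little_endian and read_f32_le are hypothetical helpers made up for illustration, not part of any library, and the sketch assumes a 32-bit IEEE 754 float:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte first, else 0. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Decodes a 32-bit little-endian float from raw bytes.
 * The bytes are assembled with shifts, so the result does not depend on
 * the byte order of the host. Assumes float is a 32-bit IEEE 754 type. */
static float read_f32_le(const unsigned char *p)
{
    uint32_t bits;
    float value;

    assert(p != NULL);
    assert(sizeof(float) == sizeof(uint32_t));

    bits = (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

If the samples decoded this way look like plausible audio (values mostly within -1.0 to 1.0), the buffer is very likely little-endian float; byte-swapped floats usually decode to wildly out-of-range values or NaNs.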

Second, Paul said: 

> Provide raw output given by QTKit whatever, and I'm sure someone will
> give you solution for your problem.

I'll take you up on that. You'll have to give me a little time to create this, but I'm going to provide two files: the first is just the raw bytes of a QTKit sample buffer, and the second is a compressed FLV file created by FFmpeg containing a short encoded audio stream, so that you can see what I'm working with. I'll post them later today.

Finally, I want to revisit the scenario one more time with a few questions. Maybe there's something in there that will turn on a light bulb (my own) somewhere. So here goes:

Calling av_guess_format with an extension of "flv" and a MIME type of "video/x-flv" gives me an output format whose default audio codec is "adpcm_swf", which requires a sample format of AV_SAMPLE_FMT_S16. The QTSampleBuffer format being captured from QTKit is as follows:

Linear PCM, 32 bit little-endian floating point, 2 channels, 44100 Hz

This format would appear to map to the FFmpeg sample format of AV_SAMPLE_FMT_FLT. However, there's a difference in how QTKit is delivering the sample buffer data -- it isn't interleaved. In other words, all channel 1 samples come before all channel 2 samples. So I then interleave this data (you can see this in the QTFFAVStreamer streamAudioFrame method of my sample app) to put it into AV_SAMPLE_FMT_FLT, prior to attempting any resampling, so that the resampling converts from AV_SAMPLE_FMT_FLT to AV_SAMPLE_FMT_S16. A while back the QuickTime API mailing list referred me to an example that handles this in a very similar way; you can see it here:
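
To make the interleaving step concrete, here is a minimal sketch of the idea. It is not the code from my sample app or from the VLC module; interleave_f32 is a hypothetical helper, and it assumes the channel planes sit back to back in one contiguous buffer (all of channel 1, then all of channel 2), which may not match how a given QTSampleBuffer is actually laid out:

```c
#include <assert.h>
#include <stddef.h>

/* Converts planar float audio (ch0 ch0 ch0 ... ch1 ch1 ch1 ...) into
 * interleaved/packed order (ch0 ch1 ch0 ch1 ...).
 *
 * planar   : input, channels * frames floats, one plane after another
 * packed   : output, channels * frames floats, must not overlap the input
 * channels : number of channels (2 for the stereo case described above)
 * frames   : number of samples per channel
 */
static void interleave_f32(const float *planar, float *packed,
                           size_t channels, size_t frames)
{
    size_t ch, i;

    assert(planar != NULL && packed != NULL);

    for (ch = 0; ch < channels; ch++) {
        for (i = 0; i < frames; i++) {
            packed[i * channels + ch] = planar[ch * frames + i];
        }
    }
}
```

With two channels and three frames, an input of {L0, L1, L2, R0, R1, R2} comes out as {L0, R0, L1, R1, L2, R2}.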

http://git.videolan.org/?p=vlc.git;a=blob;f=modules/access/qtsound.m;h=4ff12309927591b749e40ccca9227fe6ba293711;hb=74a3b3f19f3f15843e913ce347c237eb23375f6f

Unfortunately, it doesn't proceed with resampling or encoding with FFmpeg, so that's as far as I can follow the example. So if I understand the resampling process, here is what should be happening: 

decompressed audio samples in AV_SAMPLE_FMT_FLT -> [FFmpeg resample] -> decompressed audio samples in AV_SAMPLE_FMT_S16 -> [FFmpeg encoding] -> FLV file

If there's any part of that which is inaccurate, please let me know. However, assuming that is accurate, I'm wondering if the resampling step is the problem, specifically the conversion of 32-bit floats to signed 16-bit integers. I could perform that conversion manually, if I knew exactly how it is done. This raises a few questions:

1. Regarding sample formats, what is the difference between AV_SAMPLE_FMT_S32 and AV_SAMPLE_FMT_FLT? Both are signed, both are 32 bits...?

2. How is a 32 bit float being converted to signed 16 bits? Once I know this, I'll write this manually and eliminate that from the equation too. 

3. I have posted another message to the mailing list which hasn't been responded to yet, in which I had several questions about packed samples and the align parameter in several libswresample function calls. In reading through the resampling_audio.c example, it wasn't clear to me what setting this parameter to 0 vs. 1 actually does. I'll bump that message in hopes of directing the dialog on that topic there.
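
To pin down what questions 1 and 2 are asking: AV_SAMPLE_FMT_S32 samples are signed 32-bit integers whose full scale is the integer range, while AV_SAMPLE_FMT_FLT samples are 32-bit floating-point values nominally in the range -1.0 to 1.0, so the same 32 bits mean entirely different things in the two formats. A common convention for float to signed 16-bit conversion is to scale by 32768, round, and clip. The sketch below shows that convention; it is an assumption about the general approach, not a copy of what libswresample does internally (the exact rounding there may differ), and f32_to_s16 and convert_f32_to_s16 are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts one float sample (nominal range -1.0 .. 1.0) to signed 16-bit.
 * Scales by 2^15, clips to the int16_t range, and rounds half away from
 * zero. NaN input is mapped to silence (0). */
static int16_t f32_to_s16(float sample)
{
    float scaled = sample * 32768.0f;

    if (scaled != scaled) {          /* true only for NaN */
        return 0;
    }
    if (scaled >= 32767.0f) {
        return INT16_MAX;            /* +1.0 and above clip to 32767 */
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;            /* -1.0 and below clip to -32768 */
    }
    /* The cast truncates toward zero, so add or subtract 0.5 first. */
    return (int16_t)(int)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Converts a packed buffer of float samples to packed signed 16-bit. */
static void convert_f32_to_s16(const float *in, int16_t *out, size_t count)
{
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < count; i++) {
        out[i] = f32_to_s16(in[i]);
    }
}
```

Because the conversion works sample by sample, it can be applied to an interleaved buffer as-is; the channel layout is unaffected.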

Thanks again for all the discussion and help. I'll get those files posted later today, but in the meantime, the answers to the above questions would really help. 

Cheers, 

Brad

