[Ffmpeg-devel-irc] ffmpeg.log.20190326

burek burek021 at gmail.com
Wed Mar 27 03:05:02 EET 2019


[01:28:51 CET] <damdai> does ffmpeg support converting 5.1 to  stereo ?
[03:02:57 CET] <another> damdai: https://trac.ffmpeg.org/wiki/AudioChannelManipulation#a5.1stereo
[14:04:34 CET] <faLUCE> hello all
[15:19:36 CET] <iank> im doing a picture in picture of 2 vids with ffmpeg (for libreplanet conference),. i want the resulting video to be 24fps. here's the command i've got so far. ffmpeg -i libreplanet2019-room-123_2019-03-23_09-50-48.webm -i libreplanet2019-slides-123_2019-03-23_09-46-26.webm -filter_complex "[0]scale=iw/3:ih/3 [pip]; [1][pip] overlay=main_w-overlay_w-10:main_h-overlay_h-10" -vcodec libvpx-vp9 -to 15 output.webm
[15:20:07 CET] <iank> er, accidentally hit enter too soon. slides input video is 10fps, the other is 24.
[15:30:49 CET] <kepstin> iank: my recommendation would be to put an fps filter on each input to convert them to 24fps before doing the overlay.
[15:31:07 CET] <kepstin> (both inputs, because the fps filter will "clean up" the timestamps and timebase so they match)
[15:31:19 CET] <iank> kepstin awesome, thank you
[15:31:28 CET] <iank> do you happen to know the syntax for that offhand?
[15:32:13 CET] <kepstin> "[0]scale=iw/3:ih/3,fps=24[pip];[1]fps=24[base];[base][pip]overlay=&"
[15:34:42 CET] <iank> thank you!
[15:36:40 CET] <kepstin> without that, the output of the overlay will be variable framerate - you could fix that by putting an fps filter after the overlay to drop the extra frames, but putting it before should be slightly faster in theory.
[15:37:02 CET] <kepstin> probably not a big difference either way tho
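A minimal sketch of the full command kepstin's filtergraph leads to, assembled as an argument list so no shell quoting is involved. The overlay position and `-to 15` are taken from iank's original command; the file names `room.webm`, `slides.webm` and `output.webm`, and the helper names, are placeholders and not from the log.

```python
import shlex

def pip_filtergraph(fps=24, scale_div=3, margin=10):
    """Scale and retime the inset, retime the base, then overlay bottom-right."""
    inset = f"[0]scale=iw/{scale_div}:ih/{scale_div},fps={fps}[pip]"
    base = f"[1]fps={fps}[base]"
    overlay = (f"[base][pip]overlay="
               f"main_w-overlay_w-{margin}:main_h-overlay_h-{margin}")
    return ";".join([inset, base, overlay])

def pip_command(room, slides, output, fps=24):
    """Build the ffmpeg argument list; input 0 is the inset, input 1 the base."""
    return ["ffmpeg", "-i", room, "-i", slides,
            "-filter_complex", pip_filtergraph(fps),
            "-vcodec", "libvpx-vp9", "-to", "15", output]

# Placeholder file names; print the command in copy-pasteable shell form.
print(shlex.join(pip_command("room.webm", "slides.webm", "output.webm")))
```

Putting `fps` on both branches follows kepstin's advice above, so that both inputs reach the overlay with matching timestamps and timebase.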
[16:58:48 CET] <pridkett> does ffmpeg support converting 5.1 to stereo ?
[17:00:35 CET] <pink_mist> didn't you ask that earlier with another nick and get an answer provided to you already?
[17:01:12 CET] <kepstin> looks like damdai's connection dropped a few seconds after they got the answer
[17:01:22 CET] <kepstin> so I could understand it being missed
[17:01:47 CET] <furq> you have the patience of a saint
[17:01:48 CET] <pridkett> sorry my computer crashed
[17:02:07 CET] <pink_mist> kepstin: ah, you're right, I missed that part
[17:02:11 CET] <pink_mist> pridkett: 03:02 <another> damdai: https://trac.ffmpeg.org/wiki/AudioChannelManipulation#a5.1stereo
[17:02:58 CET] <pridkett> that picture says  it splits center into  left and right
[17:03:07 CET] <pridkett> how is that possible
[17:03:19 CET] <pink_mist> mathematics
[17:03:24 CET] <pink_mist> or magic
[17:03:29 CET] <pink_mist> maybe a bit of both
[17:03:46 CET] <pridkett> and that picture also says  LFE is ignored
[17:04:42 CET] <pridkett> is LFE really ignored?
[17:07:01 CET] <pridkett> i wonder how well ffmpeg does splitting "center"
[17:08:00 CET] <pridkett> the topic says This channel is publicly logged'  where do i see this log?
[17:08:21 CET] <kepstin> i assume it center-pans the center channel into stereo (sends the signal to both channels at reduced power, probably somewhere between -3 to -6dB.)
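A rough sketch of what "center-panning" means numerically, assuming only what kepstin guesses here: the center channel is added to both left and right at a reduced gain somewhere between -3 and -6 dB. The exact coefficients ffmpeg uses are not stated in this log, the function names are made up for illustration, and the surround channels are left out to keep the example short.

```python
def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def downmix_center(front_left, front_right, center, center_db=-3.0):
    """Mix one center sample equally into left and right at center_db (assumed value)."""
    gain = db_to_gain(center_db)
    return front_left + gain * center, front_right + gain * center

# Amplitude factors at the two ends of the range mentioned above.
print(round(db_to_gain(-3.0), 3), round(db_to_gain(-6.0), 3))
```

A real downmix would also have to handle the surround channels and guard against clipping once the channels are summed.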
[17:08:38 CET] <pridkett> kepstin i see
[17:08:49 CET] <pridkett> kepstin what about LFE?
[17:09:58 CET] <kepstin> pridkett: https://lists.ffmpeg.org/pipermail/ffmpeg-devel-irc/ has logs for both this channel and #ffmpeg-devel
[17:10:49 CET] <pridkett> kepstin what is the name of the bot that is doing the logging
[17:12:19 CET] <pridkett> it has no  join/part info
[17:12:50 CET] <kepstin> seriously? the bot has literally the most obvious name :)
[17:13:19 CET] <pridkett> boturk?
[17:13:57 CET] <pridkett> shouldn't boturk have @ access?
[17:16:01 CET] <iank> i have 2 vp8 videos im overlaying, should i keep the resulting video as vp8 or make it vp9?
[17:16:48 CET] <pridkett> iank what do you mean by overlaying
[17:17:05 CET] <iank> picture in picture
[17:17:14 CET] <pridkett> make it vp9
[17:17:16 CET] <furq> iank: it makes no difference
[17:17:21 CET] <furq> use whichever one suits you better
[17:17:48 CET] <pridkett> iank if i were you, i'd use vp9, might as well
[17:18:01 CET] <iank> ok, thanks
[17:18:20 CET] <pridkett> iank even youtube uses vp9 over x264
[17:18:36 CET] <iank> nice, ok
[17:19:20 CET] <pridkett> iank is it 50/50 overlaying?
[17:19:48 CET] <iank> no, the picture in picture is like 1/5th the size
[17:20:08 CET] <pridkett> ok
[17:22:53 CET] <kepstin> iank: vp9 generally provides better efficiency (quality per bit) than vp8, and the encoder has better multithreading so it's often faster. All current browsers that can play vp8 can also play vp9. So there's basically no reason to use vp8 any more.
[17:23:17 CET] <iank> thank you! good to know. i will use it
[17:23:48 CET] <pridkett> kepstin agreed
[17:25:18 CET] <pridkett> kepstin did feluce come to you and talk about  fixing ffmpeg for opus
[17:25:45 CET] <kepstin> feluce talked about it a bit.
[17:25:58 CET] <pridkett> feluce said it was easy fix
[17:26:40 CET] <kepstin> from reading the opus channel log, it looks like libopus *specifically* has a resampler implementation that allows it to return the exact number of samples from the original file when the input sample rate doesn't match opus's internal coding rates.
[17:27:22 CET] <kepstin> note that this wouldn't hold true for stuff decoded with ffmpeg's internal opus decoder, which doesn't use the same resampler. (i don't think it's required by the spec, from my look-through?)
[17:27:23 CET] <faLUCE> kepstin: in the opus channel the devs told that the input is not resampled before processing... then ffmpeg should allow other samplerates too
[17:27:38 CET] <pridkett> lol look who shows up
[17:28:23 CET] <kepstin> we could certainly allow sending arbitrary sample rates to the libopus encoder in ffmpeg, to use their internal resampler rather than ffmpeg's. That would be fine
[17:28:30 CET] <kepstin> probably easy to do
[17:28:36 CET] <faLUCE> kepstin: yes
[17:29:01 CET] <faLUCE> kepstin: they also told me that ffmpeg devs are aware of that
[17:29:57 CET] <faLUCE> maybe a ticket has already been opened
[17:30:01 CET] <faLUCE> I don't know
[17:30:02 CET] <kepstin> but for decoding it's a trickier call to make, especially since the internal decoder is used by default i think.
[17:31:15 CET] <kepstin> might need some extra stuff wired up to save the original sample rate as metadata so it can be passed to the decoder, if people care about that.
[17:31:15 CET] <faLUCE> I wonder if other samplerates are not allowed because the same codec is used for output too
[17:31:48 CET] <furq> 16:27:23 ( faLUCE) kepstin: in the opus channel the devs told that the input is not resampled before processing...
[17:31:51 CET] <furq> are you sure they said this
[17:31:58 CET] <faLUCE> furq: yes
[17:32:06 CET] <pridkett> furq i have the logs
[17:32:35 CET] <kepstin> the only thing i got from the linked log is that the opus devs say that it will return the same sample count after decoding as it had on the input, at arbitrary sample rates
[17:32:39 CET] <faLUCE> pridkett: paste the log if you have it. Anyway, you could ask to gmaxwell
[17:32:42 CET] <kepstin> they made no claim about not resampling
[17:32:46 CET] <furq> what kepstin said
[17:32:56 CET] <furq> opusenc stores original sample rate as metadata
[17:33:03 CET] <furq> and then opusdec will resample back to that if possible
[17:33:06 CET] <faLUCE> kepstin: wait, I check the log
[17:33:13 CET] <pridkett> https://pastebin.com/7DpAxMUe
[17:33:16 CET] <furq> you can ffprobe the output of opusenc and see that it's 48k
[17:33:55 CET] <kepstin> opusenc (the command line tool) supports 44.1K input, and opusdec supports 44.1K output, and there's some metadata to pass the information from one to the other.
[17:34:08 CET] <furq> right
[17:34:11 CET] <furq> and they both resample
[17:34:43 CET] <kepstin> they decided that having the decoder match the sample rate of the original input would be less surprising to users than e.g. always getting 48K back
[17:35:03 CET] <kepstin> iirc there was some debate about that?
[17:35:16 CET] <faLUCE> kepstin: [18:58] <faLUCE> gmaxwell: are you sure that the input is not resampled ?
[17:35:26 CET] <faLUCE> [19:00] <+gmaxwell> faLUCE: yes, it is as part of the processing, and?
[17:35:34 CET] <faLUCE> [19:00] <faLUCE> gmaxwell: then ffmpeg people must be informed
[17:35:47 CET] <furq> "yes it is" is him saying it is resampled
[17:35:49 CET] <faLUCE> [19:02] <+gmaxwell> I'm sure some people know.
[17:35:52 CET] <kepstin> here, let me help your reading: "<+gmaxwell> faLUCE: yes, it is [resampled] as part of the processing, and?"
[17:36:20 CET] <kepstin> english negatives are strange :/
[17:36:32 CET] <furq> yeah there are less confusing ways to say that
[17:36:43 CET] <kepstin> it helps if you avoid asking negative questions.
[17:37:09 CET] <kepstin> ask "is the input resampled", not "is the input not resampled" :)
[17:37:34 CET] <faLUCE> kepstin:
[17:37:37 CET] <faLUCE> [18:58] <faLUCE> gmaxwell: are you sure that the input is not resampled ?
[17:37:39 CET] <faLUCE> [18:58] <faLUCE> (before processing, I mean)
[17:37:57 CET] <faLUCE> and gmaxwell says "it's a part of the processing"
[17:38:05 CET] <furq> i mean you can just check this yourself
[17:38:27 CET] <furq> that wav > opusenc > opusdec > wav chain that results in a 44.1k wav will have a verifiably 48k opus file in the middle
[17:38:43 CET] <brimestone> Anyone familiar with -progress url flag here? But not sure how it maps it to header for the receiving URL to parse the data
[17:38:43 CET] <pridkett> [18:27] <damdai> i was told opusenc  does not support  44.1khz : is this true?
[17:38:43 CET] <pridkett> [18:31] <+gmaxwell> No.
[17:38:43 CET] <pridkett> [18:32] <damdai> then why people in #ffmpeg say that
[17:38:43 CET] <pridkett> [18:33] <+gmaxwell> No idea.
[17:38:45 CET] <furq> with "original sample rate" metadata if you check with opusinfo
[17:39:10 CET] <kepstin> in libopus, the resampler is integrated into the other processing it does when encoding. so it's not resampled before processing...
[17:39:24 CET] <furq> the resampler is specifically in opusenc/libopusenc iirc
[17:39:32 CET] <faLUCE> kepstin: then other samplerates should be accepted
[17:39:33 CET] <furq> libopus expects 48k input
[17:39:42 CET] <furq> or 24k or whatever
[17:39:50 CET] <kepstin> huh, does it? I thought the libopus resampler accepted more stuff.
[17:39:55 CET] <furq> i might be wrong about that
[17:40:22 CET] <faLUCE> anyway, gmaxwell is a libopus dev. maybe the main one... you should ask to him
[17:40:26 CET] <kepstin> anyways, the opus spec doesn't require resampling to arbitrary output rates. that's just something opusdec does.
[17:40:56 CET] <furq> https://github.com/xiph/libopusenc/blob/master/src/resample.c
[17:40:57 CET] <furq> yeah there you go
[17:41:21 CET] <kepstin> but yeah, nobody in this channel made the claim that "opusenc (the cli tool) doesn't support 44.1K" input, I know it can do that just fine
[17:41:44 CET] <furq> opus itself doesn't support 44.1k, it has to be resampled
[17:41:47 CET] <furq> but the opus tools will do that for you
[17:41:53 CET] <furq> because otherwise nobody would use them
[17:41:54 CET] <kepstin> heh, it's just the speex resampler, because of course it is
[17:41:56 CET] <faLUCE> kepstin: not only. They also said that 44.1 is NOT resampled before processing
[17:42:07 CET] <furq> faLUCE: again, he said that it is resampled
[17:42:20 CET] <kepstin> faLUCE: they also said it is resampled during processing
[17:42:29 CET] <pridkett> maybe you people should join #opus together and ask
[17:42:41 CET] <pridkett> so much confusion
[17:42:43 CET] <faLUCE> kepstin: furq I asked if it was resampled BEFORE
[17:42:52 CET] <faLUCE> not during processing
[17:43:05 CET] <furq> well then you must have different definitions of processing
[17:43:37 CET] <kepstin> faLUCE: doesn't really matter, it gets resampled at some point between the input and the output. The input is 44.1Khz wave, the output is spec opus (which runs encoding stuff at 48kHz or divisions thereof).
[17:45:00 CET] <faLUCE> kepstin: being resampled before input is different from being resampled before output. I mean: in the first case you lose quality before encoding
[17:45:16 CET] <kepstin> faLUCE: it's resampled when encoding, and resampled again when decoding.
[17:45:21 CET] <faLUCE> but really, you should ask to #opus people
[17:45:43 CET] <furq> it's resampled before encoding and then resampled again after decoding
[17:45:52 CET] <furq> but if you cared about minor quality loss you wouldn't be using a lossy codec anyway
[17:46:02 CET] <faLUCE> furq: no, they told me that it's not resampled BEFORE encoding
[17:46:07 CET] <furq> well it is
[17:46:22 CET] <kepstin> it's not resampled before encoding. it's resampled during encoding.
[17:46:26 CET] <kepstin> the difference doesn't matter
[17:46:32 CET] <furq> right
[17:46:53 CET] <faLUCE> I'm not sure about that, but I will ask them
[17:47:01 CET] <kepstin> 44.1kHz wav ->(opusenc)-> opus (internally 48KHz, with metadata saying "44.1kHz") ->(opusdec)-> 44.1kHz wav
[17:47:28 CET] <faLUCE> kepstin: it could  be resampled just before producing output.
[17:47:46 CET] <furq> why is that different
[17:48:06 CET] <faLUCE> furq: because if it is resampled before encoding, you lose quality before encoding
[17:48:12 CET] <kepstin> faLUCE: it is resampled two times: once either before or during encoding (from 44.1 to 48) and again either during or after decoding (from 48 to 44.1)
[17:48:31 CET] <furq> i don't see how the quality loss is any different
[17:48:34 CET] <furq> it gets resampled twice either way
[17:48:34 CET] <kepstin> faLUCE: you also lose quality during encoding, because it is a *lossy codec*
[17:48:36 CET] <kepstin> so what
[17:49:06 CET] <furq> it also still beats any lossy codec that doesn't resample in actual listening tests
[17:49:07 CET] <faLUCE> kepstin: but if it's resampled before encoding you can lose more quality
[17:49:17 CET] <furq> so maybe, just maybe, it doesn't actually matter
[17:49:41 CET] <kepstin> there's no way you can notice a tiny amount of distortion from a 48 to 44.1 conversion compared to what a lossy codec does.
[17:49:54 CET] <faLUCE> kepstin: anyway, I understand what you say, but I would ask to #opus people to be sure that doesn't make difference, for the quality, inputting 44.1 or 48
[17:50:13 CET] <faLUCE> I mean, for the "theoretical" quality
[17:50:14 CET] <furq> i mean it'll probably be slightly higher quality if you input 48, but not if you just resampled it from 44.1 yourself
[17:50:19 CET] <furq> emphasis on "slightly"
[17:50:41 CET] <pridkett> i don't want my source 44.1khz  being resampled to 48khz  or   32khz  being resampled to 24khz   which ffmpeg's libopus does
[17:50:51 CET] <furq> well bad news, that's what opusenc does
[17:51:24 CET] <kepstin> pridkett: too bad, the opus codec (the codec itself, not talking about any implementations) as standardized by the ietf, does not allow arbitrary sample rates.
[17:52:11 CET] <pridkett> feluce gmaxwell is active in #opus now
[17:52:20 CET] <kepstin> pridkett: if ffmpeg is resampling 32kHz to 24kHz that does sound like a bug tho, that should be using 48K probably
[17:52:23 CET] <pridkett> furq  gmaxwell is active in #opus now
[17:52:28 CET] <furq> cool
[17:52:30 CET] <furq> have fun talking to him
[17:53:09 CET] <pridkett> kepstin really?  but it does
[17:53:17 CET] <pridkett> last time i checked
[17:54:04 CET] <pridkett> <+gmaxwell> Kepstin is incorrect. The opus format doesn't care about sample rates at all.
[17:58:37 CET] <kepstin> right, i was kind of simplifying a bit, the concept of sample rate when combined with an mdct and the other filtering etc. done is kind of iffy
[17:59:01 CET] <kepstin> anyways, "<mark4o> opusenc and libopusenc support 44.1 kHz.  libopus does not." is the relevant quote; ffmpeg does not use libopusenc.
[18:02:20 CET] <faLUCE> I did not even know that there was a libopusenc library :-)
[18:02:36 CET] <furq> it's pretty new
[18:04:02 CET] <faLUCE> anyway, for my ears 48000 and 44100 doesn't make difference
[18:04:10 CET] <faLUCE> I'm not a purist
[18:04:15 CET] <furq> new enough that it's not even mentioned on the opus docs page referenced in the libopusenc readme
[18:04:43 CET] <furq> afaik it's just a rollup of the opus-tools stuff into a library so you don't need to bring your own resampler etc
[18:04:58 CET] <kepstin> yeah, it just adds an ogg muxer, some metadata tools, and a resampler
[18:05:02 CET] <furq> but ffmpeg already has a good one so it's not really of much relevance
[18:16:05 CET] <jmspeex> What exactly is the issue with Opus and 44.1 kHz (I'm one of the main authors)
[18:17:22 CET] <faLUCE> jmspeex: we were discussing if 44.1 sample rate has to be accepted by the input
[18:17:34 CET] <faLUCE> jmspeex: you should join the #opus channel
[18:17:58 CET] <jmspeex> I am obviously, and what would you not accept 44.1 kHz as input?
[18:18:12 CET] <faLUCE> ?
[18:18:31 CET] <jmspeex> s/what/why/
[18:18:39 CET] <furq> jmspeex: the question is whether 44.1 has to be resampled before going to libopus
[18:18:50 CET] <furq> the answer is yes and i'm not sure why it's taken so long
[18:19:02 CET] <jmspeex> it does (unless you use libopusenc which will do it for you)
[18:19:38 CET] <pridkett> furq do you mean "if the answer is yes"?
[18:19:52 CET] <jmspeex> BTW, there's an entry about 44.1 kHz in the Opus FAQ for anyone who hasn't read it yet: https://wiki.xiph.org/index.php?title=OpusFAQ
[18:20:02 CET] <furq> no i mean the answer is yes. i said that exact thing 40 minutes ago
[18:21:21 CET] <faLUCE> jmspeex: the are continuing the discussion on the #opus channel
[18:21:35 CET] <faLUCE> you can see there what is not clear yet
[18:21:41 CET] <furq> i wonder how many times a day this comes up in #opus
[18:21:49 CET] <faLUCE> (they)
[18:21:51 CET] <furq> i'm surprised they don't just have a bot that links to the wiki
[18:23:55 CET] <pridkett> furq sorry but i trust opus dev more than you about opus
[18:24:39 CET] Action: kepstin bookmarks https://wiki.xiph.org/index.php?title=OpusFAQ#How_do_I_use_44.1_kHz_or_some_other_sampling_rate_not_directly_supported_by_Opus.3F for future reference.
[18:25:58 CET] <furq> i accept your apology
[18:26:39 CET] <kepstin> anyways, looking at libopusenc, I think there probably are some improvements that could be made to the ffmpeg libopus wrapper that may  help improve the quality of gapless playback.
[18:26:59 CET] <jmspeex> I guess it comes down to this. Resampler SNR: 90 to 140+ dB. Lossy codec SNR: 15 to 50 dB.
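To put jmspeex's numbers side by side, here is a small sketch that combines two noise sources into one overall SNR. It assumes the resampler noise and the codec noise are uncorrelated and therefore add in power, which is a simplification; the function name is made up for illustration.

```python
import math

def combined_snr_db(*snrs_db):
    """Overall SNR when independent noise sources, each given as an SNR in dB, add in power."""
    noise_power = sum(10 ** (-snr / 10) for snr in snrs_db)
    return -10 * math.log10(noise_power)

# Best-case codec from the range above (50 dB) plus worst-case resampler (90 dB).
print(combined_snr_db(50, 90))
```

Under that assumption, even the worst resampler in the quoted range lowers the best quoted codec SNR by well under a thousandth of a dB.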
[18:27:33 CET] <furq> also your dac probably resamples to 48k anyway
[18:27:48 CET] <furq> or a multiple thereof
[18:28:22 CET] <jmspeex> in the end it's all resampled to a few MHz and one bit (sigma-delta and the like)
[18:28:28 CET] <kepstin> man, it was so annoying back in the AC'97 days when i had to run a software mixer to convert everything to 48kHz and mix multiple concurrent streams :)
[18:28:38 CET] <kepstin> (now it's less annoying)
[18:28:39 CET] <pridkett> furq why do you say that?
[18:28:46 CET] <pridkett> dac probably resamples to 48k anyway
[18:29:11 CET] <furq> because it does
[18:29:19 CET] <pridkett> how do i tell if my DAC resamples to 48k
[18:29:42 CET] <kepstin> i think most modern "intel hda" (azalia) desktop hardware supports being sent samples at a pretty wide range of input rates, no idea what it actually does internally. it's all a black box.
[18:30:27 CET] <jmspeex> The DAC will take either 44.1 or 48 (usually software configurable). From there, the old DACs would up-sample to 96 kHz or 192 kHz (or higher) to make their lives easier. Newer DACs upsample to a few MHz with just one bit samples and noise shaping.
[18:30:50 CET] <pridkett> https://hardforum.com/threads/x-fi-titanium-digital-output-sampling-rate-48khz-or-96khz.1461470/
[18:32:39 CET] <jmspeex> Also, the MDCT that most codecs use *is* a resampler. It resamples a 48 kHz signal into (e.g.) 960 signals sampled at 50 Hz.
[18:38:21 CET] <kepstin> anyways, as far as I can tell, the main difference between ffmpeg's libopus wrapper and libopusenc (other than possibly how some libopus ctls are configured, haven't checked them all) is that ffmpeg uses the same length for the last packet as it had for other packets and 0-fills the extra, while libopusenc may pick a shorter packet length and uses a predictor to generate the padding data.
[18:40:46 CET] <pridkett> there is a huge bug in ffprobe
[18:41:11 CET] <pridkett> Stream #0:0: Audio: dst (DST  / 0x20545344), 352800 Hz, 5.1(side), flt
[18:41:24 CET] <pridkett> that should be 2.8mhz not 3.5
[18:48:19 CET] <kepstin> oh, hmm, there's also the "decision_delay" stuff? i'm not sure exactly what that does from reading the code :/
[18:56:50 CET] <brimestone> Hey guys, im trying to output stderr to a log file, but I'm getting "Unable to find a suitable output format for '2>'" instead
[18:57:50 CET] <kepstin> brimestone: you've got a typo or are using the incorrect syntax for your shell
[18:58:10 CET] <kepstin> (the shell is supposed to interpret that before starting ffmpeg - but for some reason it's getting passed to ffmpeg itself)
[18:58:40 CET] <DHE> yeah, you probably have quotes somewhere they don't belong
[18:58:47 CET] <DHE> (or backslashes?)
[18:59:15 CET] <brimestone> https://gist.githubusercontent.com/brimestoned/1ba401bc1224c847da81c0f67ef320cc/raw/9ee6eda1acbb3f88d5c219ceb6a2bcb46f4b1d21/test.sh
[18:59:58 CET] <kepstin> hmm, mac os x. presumably bash shell?
[19:00:23 CET] <brimestone> Well, it runs in docker (ubuntu) on an osx host
[19:00:41 CET] <kepstin> you've got a bunch of quoting issues there - filenames containing spaces without quotes?
[19:01:11 CET] <kepstin> and if you're running it in docker, depending how you do it it might not use a shell at all, so the redirection wouldn't work
[19:01:16 CET] <brimestone> Yes, but that is handled via an array..
[19:01:39 CET] <kepstin> if you use the array syntax of CMD with docker, it does not use a shell, so shell redirections will not work.
[19:02:05 CET] <kepstin> you may want to use a wrapper script in the docker container.
[19:02:15 CET] <brimestone> Oh.. hmmm
[19:03:01 CET] <brimestone> I tried to do -progress url to get stats, but it does it via sockets
[19:04:46 CET] <kepstin> i.e. have docker ENTRYPOINT [ "ffmpeg-wrapper.sh" ]  and then make that into a script which runs ffmpeg "${@}" 2>/path/to/save/log
[19:05:13 CET] <kepstin> then CMD will just be your regular arguments
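kepstin's suggestion is a shell wrapper script. As an alternative sketch of the same idea without any shell, the redirection can be done by whatever process launches ffmpeg. The helper name `run_logged` and the log path are hypothetical; the only point is that stderr is redirected by the launcher instead of by a `2>` token passed to ffmpeg.

```python
import subprocess

def run_logged(cmd, log_path):
    """Run cmd (an argument list, no shell involved) and write its stderr to log_path."""
    with open(log_path, "wb") as log:
        return subprocess.run(cmd, stderr=log).returncode

# Hypothetical use as a container entry point, forwarding the CMD arguments:
#   import sys
#   sys.exit(run_logged(["ffmpeg", *sys.argv[1:]], "/path/to/save/log"))
```

Because the command is an argument list, file names containing spaces need no extra quoting, which also sidesteps the quoting issues mentioned above.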
[19:06:26 CET] <brimestone> thanks!
[19:10:00 CET] <brimestone> This is a thinker.. I need to refactor my docker payload.. Its currently like this. https://gist.github.com/brimestoned/1ba401bc1224c847da81c0f67ef320cc
[19:17:22 CET] <furq> brimestone: you can just use -report
[19:17:57 CET] <brimestone> furq: doesn't that only happens at the end of ffmpeg run?
[19:19:03 CET] <brimestone> Oh wow... that might have worked!
[19:56:49 CET] <jmspeex> Is it true that ffmpeg resamples 32 kHz audio to 24 kHz when encoding with Opus?
[19:57:39 CET] <jmspeex> If so, that would be pretty dumb because 1) it destroys the 12-16 kHz band and 2) libopus will internally up-sample to 48 kHz anyway
[19:58:21 CET] <furq> http://vpaste.net/hA9nX
[19:58:22 CET] <furq> looks like it
[19:58:28 CET] <furq> unless it got changed since 4.1.1
[19:59:00 CET] <jmspeex> :-(
[19:59:26 CET] <JEEB> wonder if that's just for some reason the automated logic
[19:59:28 CET] <JEEB> 48000, 24000, 16000, 12000, 8000, 0,
[19:59:40 CET] <JEEB> are the sample rates marked supported by the libopus wrapper
[19:59:49 CET] <JEEB> (zero being the end-of-array thing I think)
[20:00:21 CET] <jmspeex> I think the safe thing to do with Opus is to just resample everything to 48 kHz regardless of the original sampling rate
[20:01:06 CET] <jmspeex> Even for 8 kHz, if the signal is music, Opus will internally resample to 48 kHz because that's what CELT operates at.
[20:02:21 CET] <jmspeex> The only case where Opus will not internally resample to 48 kHz is for 8 or 16 kHz voice. And ffmpeg's resampler is likely better than the internal Opus one which is optimized for latency.
[20:02:47 CET] <JEEB> yea, so one fix would just do the same as the internal opus encoder, which marks only 48000 as supported
[20:03:23 CET] <JEEB> yup, remembered correctly
[20:03:24 CET] <JEEB> .supported_samplerates = (const int []){ 48000, 0 },
[20:03:27 CET] <JEEB> for the native one
[20:06:28 CET] <kepstin> there's probably a general fix that could be added to ffmpeg's automatic sample rate selection logic, to make it less likely to downsample stuff.
[20:06:42 CET] <JEEB> yea I did wonder why it picked to go that way
[20:06:55 CET] <JEEB> I see ffmpeg_opt copies the encoder's list into the OutputFilter
[20:07:03 CET] <JEEB> but I didn't get further than that
[20:12:43 CET] <kepstin> the sample rate selection logic is in swap_samplerates_on_filter in avfiltergraph.c i think, it's simply picking the rate with the smallest absolute difference from the requested rate
[20:13:12 CET] <JEEB> yea, probably ffmpeg_filter.c just picks the correct list from things
[20:13:26 CET] <JEEB> and passes it into the filter chain generation in lavfi
[20:14:03 CET] <jmspeex> JEEB: by internal encoder you mean the one atomnuker wrote?
[20:14:09 CET] <JEEB> yup
[20:14:54 CET] <jmspeex> hopefully that's not the one used by default (not that it's terrible, but it doesn't match the libopus one)
[20:15:25 CET] <kepstin> it's marked as experimental right now, i think? so you need to pass an extra flag to use it. We recommend people explicitly specify the libopus encoder.
[20:15:59 CET] <JEEB> I think there was a preference order somehow set for encoders of a format?
[20:16:26 CET] <kepstin> experimental encoders automatically get low preference, but the exact match by name takes priority iirc :/
[20:16:55 CET] <jmspeex> On the plus side, Opus is designed to make it hard to write a really bad encoder
[20:17:05 CET] <kepstin> and most of the internal codec implementations have the same name as the codec name
[20:18:00 CET] <jmspeex> (as in as long as you have a decent transient detector, you'll easily beat MP3 unless you do it on purpose)
[20:19:04 CET] <kepstin> anyways, to fix the 32000→24000 issue, I think all it needs is some better logic in https://git.videolan.org/?p=ffmpeg.git?a=blob;f=libavfilter/avfiltergraph.c;h=a149f8fb6d631adcee9f80217c6faad0f9df0193;hb=HEAD#l873
[20:19:35 CET] <kepstin> it picks 24000 because that's 8000 away, when 48000 is 16000 away.
[20:20:43 CET] <JEEB> right
[20:20:56 CET] <jmspeex> kepstin: I don't know if other codecs use that logic, but it's wrong in almost all cases
[20:21:10 CET] <jmspeex> up-sampling doesn't lose information, but down-sampling definitely does
[20:21:19 CET] <JEEB> yup
[20:21:24 CET] <kepstin> this doesn't have anything to do with codecs, this is with the sample rate conversion in the automatically inserted resampler in the filter chain
[20:21:27 CET] <kepstin> but yes, i agree
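The selection rule described above can be sketched in a few lines. This is an illustration of the behaviour as kepstin describes it, not ffmpeg's actual C code; `prefer_upsample` is a hypothetical alternative rule that follows jmspeex's point that upsampling keeps the information and downsampling does not.

```python
# Rates JEEB quotes for ffmpeg's libopus wrapper (without the 0 terminator).
LIBOPUS_WRAPPER_RATES = [48000, 24000, 16000, 12000, 8000]

def nearest_rate(requested, supported):
    """Pick the supported rate with the smallest absolute difference."""
    return min(supported, key=lambda rate: abs(rate - requested))

def prefer_upsample(requested, supported):
    """Hypothetical rule: lowest supported rate >= requested, else the highest one."""
    at_or_above = [rate for rate in supported if rate >= requested]
    return min(at_or_above) if at_or_above else max(supported)

for source_rate in (32000, 44100):
    print(source_rate,
          nearest_rate(source_rate, LIBOPUS_WRAPPER_RATES),
          prefer_upsample(source_rate, LIBOPUS_WRAPPER_RATES))
```

For 32000 Hz the nearest-difference rule lands on 24000 (8000 away, against 16000 for 48000), which is the downsampling reported in the log, while the upsample-first rule picks 48000.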
[20:21:38 CET] <JEEB> of course there's the "best" rates etc you can do resampling at
[20:21:54 CET] <JEEB> but downwards is probably wrong
[20:21:56 CET] <pridkett> jmspeex doesn't upsample cause  bad quality/size ratio though?
[20:21:56 CET] <JEEB> in any case
[20:22:26 CET] <kepstin> we already have some special logic for pixel formats for video to pick appropriate types in cases like this
[20:22:44 CET] <jmspeex> pridkett: Lossy codecs don't care what the original sampling rate is. They transform to the frequency domain and ignore the useless information
[20:23:05 CET] <JEEB> yet users loove to see numbers 8)
[20:23:12 CET] <pridkett> jmspeex for all lossy codecs?
[20:24:28 CET] <jmspeex> pridkett: all the ones worth using (including MP3) that don't have a braindead encoder
[20:25:18 CET] <kepstin> pridkett: upsampling does in general reduce efficiency of lossless codecs, since the codec has to store more information (however it does it) to get a bit-exact reproduction on decoding.
[20:25:52 CET] <jmspeex> Well, if you've resampled you're already not lossless, so...
[20:26:08 CET] <kepstin> of course :)
[20:26:14 CET] <pridkett> jmspeex so you are saying even IF  opus upsampled every audio to 192khz  it will still have same  quality/size as  what it is now?
[20:26:37 CET] <jmspeex> pridkett: Yes. The first thing it does is ignore everything below 20 kHz
[20:26:51 CET] <kepstin> i hope you mean "above" :)
[20:27:00 CET] <jmspeex> That's (kinda) equivalent to resampling to 40 kHz
[20:27:14 CET] <jmspeex> yes, I mean above
[20:27:23 CET] <pridkett> jmspeex  i find that very hard to believe
[20:27:43 CET] <jmspeex> pridkett: what do you find hard to believe?
[20:27:45 CET] <kepstin> iirc opus already runs a 20khz lowpass filter, even with 48kHz sampling input
[20:27:57 CET] <jmspeex> exactly
[20:28:07 CET] <pridkett> jmspeex but you are the expert, i will take your word
[20:28:25 CET] <jmspeex> pridkett: IIRC opusenc supports input sampling rates up to around 1 GHz or so
[20:28:36 CET] <jmspeex> have fun -- same quality
[20:28:49 CET] <pridkett> i said  quality/size
[20:28:51 CET] <kepstin> well, if you're making a lossy encoder, the first thing you want to do is remove any signal that a human definitely cannot hear
[20:28:54 CET] <jmspeex> same size too
[20:29:24 CET] <pridkett> then why can't this be the same case for lossless encoder/format ?
[20:29:40 CET] <kepstin> because lossless doesn't remove stuff that a person can't hear.
[20:29:56 CET] <pridkett> but it still removes
[20:30:02 CET] <pridkett> what is it removing then
[20:30:04 CET] <jmspeex> pridkett: HAve you watched Monty's video on sampling: https://xiph.org/video/vid2.shtml ?
[20:30:12 CET] <kepstin> lossless doesn't remove anything. otherwise it wouldn't be called lossless
[20:30:32 CET] <pridkett> it has to remove something to shrink down the file size
[20:30:33 CET] <jmspeex> If not, then just go watch it. It's 24 minutes and will answer everything *much* better than we ever will
[20:30:52 CET] <kepstin> lossless codecs don't always make the file smaller - with some inputs, they will actually make it bigger
[20:30:57 CET] <pridkett> jmspeex okay i will watch
[20:31:06 CET] <kepstin> that's usually limited to artificially generated noise, tho
[20:31:12 CET] <pridkett> kepstin no lossless codec makes the file bigger
[20:31:25 CET] <pridkett> otherwise people won't use it
[20:32:35 CET] <kepstin> sounds like you need a lesson on lossless compression technology too :/
[20:32:50 CET] <kepstin> (i don't have a good reference offhand for that)
[20:32:58 CET] <jmspeex> pridkett: If you encode perfectly random noise, the output will be slightly bigger. If you could create a lossless encoder that's guaranteed to make the output smaller, all you'd need to do is run it enough times to get a single bit at the end.
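The counting argument behind jmspeex's remark can be checked directly. This sketch only counts bit strings; it does not model any real codec, and the function names are made up for illustration.

```python
def strings_of_length(n):
    """Number of distinct bit strings that are exactly n bits long."""
    return 2 ** n

def strings_shorter_than(n):
    """Number of distinct bit strings with length 0 to n-1 bits."""
    return sum(2 ** k for k in range(n))

for n in (1, 8, 16):
    print(n, strings_of_length(n), strings_shorter_than(n))
```

There is always one fewer shorter string than there are inputs of length n, so a lossless (reversible) encoder cannot map every input to a strictly shorter output: at least one input has to stay the same size or grow.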
[20:33:25 CET] <jmspeex> So yeah, you need to learn about information/compression theory. But first go watch that video
[20:39:17 CET] <pridkett> jmspeex  higher sample rate isn't =  wanting to have information higher than 20 khz though:   higher sample rate = closer to analog signal
[20:39:48 CET] <jmspeex> pridkett: That statement is proof that you haven't watched the video
[20:40:23 CET] <pridkett> digital can never be true analog signal,  but higher it is; the closer the analog it can be
[20:40:27 CET] <jmspeex> that's *exactly* what that video is about
[20:41:24 CET] <jmspeex> pridkett: either you keep repeating that nonsense (based on not understanding the sampling theorem) or you just watch the video
[20:41:31 CET] <kepstin> pridkett: monty even pulls out an analog oscilloscope to prove you wrong in his video ;)
[20:42:44 CET] <pridkett> if that is true 48khz wouldn't have existed in the first place
[20:42:52 CET] <pridkett> 44.1khz would be enough
[20:43:19 CET] <jmspeex> pridkett: 48 kHz is just more convenient in many respects. That's it
[20:44:14 CET] <jmspeex> after you've watched the video you can search for the weird origins of 44.1 kHz
[20:50:07 CET] <kepstin> I'm actually kind of surprised we didn't end up with 44.056kHz
[20:50:54 CET] <jmspeex> kepstin: you mean because color NTSC?
[20:51:10 CET] <jmspeex> that would definitely have sucked!
[20:51:22 CET] <kepstin> apparently it didn't use the color at all, but with NTSC equipment that was the rate you got yeah
[20:52:27 CET] <kepstin> hmm, you're right, that is an exact match to the 1000/1001 factor added for color ntsc
[20:52:39 CET] <kepstin> so i guess we do blame color :)
[20:53:07 CET] <kepstin> I hate that we still have to deal with that for video stuff
[20:53:12 CET] <jmspeex> At least with 44.1 kHz, the largest prime factor is just 7. For 44.056 kHz it would have been 5507!
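The prime-factor claim is easy to verify with plain trial division. This is a sketch for checking the two numbers mentioned here, not an efficient factoring routine.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repetition."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors

print(prime_factors(44100))
print(prime_factors(44056))
```

Small prime factors give simple integer ratios against other common rates; for example 48000/44100 reduces to 160/147.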
[20:54:11 CET] <jmspeex> where did you get the 44.056 from (other than from the 29.97 color NTSC rate)?
[20:54:42 CET] <kepstin> it would be exactly 44100 * 1000/1001 (not 44056Hz)
[20:55:14 CET] <jmspeex> right
[20:56:14 CET] <jmspeex> I mean, it could have been worse... CDs sampled at 14*pi kHz :-)
[20:57:14 CET] <kepstin> on the plus side, 44100/1.001Hz audio would have an exact number of samples per frame when used with ntsc video, on the minus side... everything else.
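kepstin's samples-per-frame remark can be checked with exact rational arithmetic. The sketch assumes the NTSC frame rate of 30000/1001 fps (the 29.97 figure mentioned above); the names are illustrative only.

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)              # the "29.97" color NTSC frame rate
PULLED_DOWN_RATE = Fraction(44100 * 1000, 1001)  # 44100 / 1.001 Hz

def samples_per_frame(sample_rate, fps):
    """Exact number of audio samples that span one video frame."""
    return Fraction(sample_rate) / fps

print(samples_per_frame(PULLED_DOWN_RATE, NTSC_FPS))
print(samples_per_frame(44100, NTSC_FPS))
```

The 1001 cancels for the pulled-down rate, leaving a whole number of samples per frame, while plain 44100 Hz against the same frame rate gives a fractional count.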
[20:59:29 CET] <kepstin> i guess sony must have standardized on shipping 625line/50hz format video recorders with their pcm adapters.
[00:00:00 CET] --- Wed Mar 27 2019

