[Ffmpeg-devel-irc] ffmpeg-devel.log.20170102

burek burek021 at gmail.com
Tue Jan 3 03:05:02 EET 2017

[00:24:55 CET] <atomnuker> bofh_: oh hey you were here as well
[00:25:13 CET] <atomnuker> do you have your out of tree implementation somewhere?
[00:25:32 CET] <atomnuker> did you modify the codebase or did you just quick-and-dirty link and use private functions?
[02:40:12 CET] <bofh_> atomnuker: made some private functions public, but basically no changes
[02:40:37 CET] <bofh_> atomnuker: i.e. revtab_fwd is just revtab but with "inverse" = 0 , revtab_inv is just revtab for "inverse" = 1
[02:40:54 CET] <bofh_> but the construction is the same
[02:41:17 CET] <bofh_> (sorry, walked back from campus to home, trying to see if I have a more recent copy on here)
[02:46:24 CET] <cone-369> ffmpeg Thomas Turner master:3126ca282515: avutil/tests: added selftest for aes_ctr.c
[03:02:24 CET] <atomnuker> bofh_: k gets a huge value in "k = (i*inv_2*15 + j*inv_1) - q*Ntot;" and segfaults
[03:02:59 CET] <atomnuker> wait, q is uninitialized
[03:05:47 CET] <atomnuker> bofh_: I'm getting a value of l = 52168 which segfaults when calling fft15(s, out + l..., are you sure you use the revtab from the power of two fft context?
[03:16:34 CET] <atomnuker> forgot to initialize the power of two fft with the correct N, doesn't crash now but it's outputting junk
[04:12:04 CET] <atomnuker> bofh_: apparently you're using libfft which is just the ffmpeg fft with a slightly different API
[04:12:28 CET] <atomnuker> (which probably hasn't been updated in god knows how long)
[04:15:30 CET] <atomnuker> ...which you are actually the author of
[04:39:34 CET] <Zeranoe> Have there been any recent changes to DeckLink or argument parsing? https://ffmpeg.zeranoe.com/forum/viewtopic.php?f=7&t=4836
[04:56:41 CET] <bofh_> atomnuker: uh I found the up-to-date version, let me just give you that
[04:56:59 CET] <bofh_> (also if "libfft" got somewhere then that's an accident, that's literally just internal code I used for testing and it's ancient)
[04:58:54 CET] <bofh_> but like the q thing is very stupid: http://pastebin.com/a429XL4E
[04:59:25 CET] <bofh_> (literally q=k fixes it, that was a multiply via asm for div-by-15 in my code and I quickly hacked it out)
[05:16:38 CET] <atomnuker> bofh_: I'm still getting uninitialized data at the output
[05:19:38 CET] <atomnuker> bofh_: this is what I'm using: http://sprunge.us/gSZT?c
[05:20:07 CET] <atomnuker> look at ff_fft15_transform() on line 287
[05:40:06 CET] <cone-369> ffmpeg James Almer master:d800d48fc672: configure: bump year
[10:07:00 CET] <atomnuker> bofh_: I'm skeptical your transform even works, you can't just do 15 uncorrelated ptwo FFTs and then just rearrange the coefficients, you have to butterfly such that the result is one FFT
[11:43:18 CET] <cone-170> ffmpeg Carl Eugen Hoyos master:28307ef7e628: lavc/psd: Support indexed files.
[11:56:28 CET] <BtbN> so that was merged.
[14:11:06 CET] <Chloe> Will ffmpeg get support for editing PSDs? Maybe an official GUI for it too? 
[14:20:37 CET] <durandal_1707> Chloe: patches welcome
[18:17:13 CET] <bofh_> atomnuker: http://csclub.uwaterloo.ca/~pbarfuss/pfa.c
[18:17:27 CET] <bofh_> completely self-contained with test program against fftw3f that works
[18:17:44 CET] <bofh_> (the issue was that my fft15 is not interchangeable with imdct15's, and I forgot this)
[18:18:34 CET] <bofh_> (specifically mine uses unity input stride and n output stride, instead of n input stride and unity output stride)
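The stride mismatch described above can be illustrated with a small sketch. This is not the code from pfa.c or imdct15; `dft_strided`, `SIZE` and `N` are hypothetical names, and a 3-point naive DFT stands in for the 15-point transform. It only shows that the two conventions read and write different elements of the same buffers, so one cannot be dropped in for the other.

```python
import cmath

def dft_strided(src, src_off, src_stride, dst, dst_off, dst_stride, size):
    """Naive DFT of `size` points, read and written with explicit strides."""
    vals = [src[src_off + t * src_stride] for t in range(size)]
    for k in range(size):
        acc = 0j
        for t, v in enumerate(vals):
            acc += v * cmath.exp(-2j * cmath.pi * k * t / size)
        dst[dst_off + k * dst_stride] = acc

SIZE, N = 3, 4  # hypothetical sizes: SIZE stands in for 15, N for the 2^k length
buf = [complex(i, 0) for i in range(SIZE * N)]

out_a = [0j] * (SIZE * N)
out_b = [0j] * (SIZE * N)

# Convention A: unity input stride, N output stride.
dft_strided(buf, 0, 1, out_a, 0, N, SIZE)
# Convention B: N input stride, unity output stride.
dft_strided(buf, 0, N, out_b, 0, 1, SIZE)
```

With convention A the transform consumes buf[0], buf[1], buf[2] and scatters its results N apart; with convention B it gathers buf[0], buf[N], buf[2N] and writes its results contiguously.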
[18:42:56 CET] <cone-382> ffmpeg Michael Niedermayer master:3d8a8fd27e63: avfilter/vf_pad: Fix segfault if reconfiguration fails
[20:22:57 CET] <atomnuker> bofh_: lol, rand() not good enough for ye?
[20:28:04 CET] <atomnuker> bofh_: I still don't understand how it would work, it's only Nx15 FFTs feeding 15xN FFTs, what would a diagram look like?
[20:45:26 CET] <fritsch> there is another approach computing the fft, without using the butterfly for optimization
[20:45:41 CET] <fritsch> which creates the 2D from two folded 1Ds
[20:45:57 CET] <fritsch> perhaps he does not "sum up" coefficients, but folds them
[20:49:45 CET] <cone-382> ffmpeg Steven Liu master:16ff54664a22: avformat/rtmphttp: fix bug for rtmphttp
[20:49:46 CET] <cone-382> ffmpeg John Comeau master:d06518752b84: avcodec/x86/imdct36: fix building with nasm 2.11.05
[20:55:20 CET] <BBB_> michaelni: I think cehoyos is right that it still doesn't build (at least on my machine) b/c of #4918
[21:05:31 CET] <bofh_> atomnuker: https://en.wikipedia.org/wiki/Prime-factor_FFT_algorithm#DFT_re-expression
[21:06:25 CET] <bofh_> atomnuker: I reindex the input, then feed it into N 15-point FFTs (the inner expression), putting each value where it needs to be for the outer FFT. Then I evaluate 15 N-point FFTs. Then I do the outer reindexing.
[21:07:13 CET] <bofh_> The factorization is PFA -> 15 point FFT by hand + 2^k FFT done via Cooley-Tukey
[21:11:07 CET] <bofh_> Because each of the base 15-pt FFTs only cares about a small number of values, I handle the reindexing for the latter at the same time as evaluation. Can't figure out how to do it for the outer reindexing (well, I can for the case when inv_1 = 1, but not when it's 2, 4 or 8).
[21:11:28 CET] <bofh_> And yeah, it's not butterflies, it's an application of the Chinese Remainder Theorem.
[21:12:15 CET] <bofh_> (the expressions in the loops are equivalent to what's on wikipedia, but done that way so no div instructions are generated).
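The sequence of steps described above (input reindexing, N 15-point transforms, 15 N-point transforms, output reindexing) can be sketched as follows. This is a minimal illustration of the Good-Thomas mapping from the linked Wikipedia section, not the code in pfa.c: `dft`, `modinv` and `pfa_dft` are hypothetical names, a naive O(n^2) DFT stands in for both the hand-written 15-point FFT and the ffmpeg 2^k FFT, and modulo operations are used freely where the real code avoids divisions. Note there is no twiddle multiplication between the two stages, only index arithmetic.

```python
import cmath

def dft(x):
    """Naive forward DFT, a stand-in for the optimized base transforms."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def modinv(a, m):
    """Brute-force modular inverse; adequate for the small moduli used here."""
    for c in range(1, m):
        if (a * c) % m == 1:
            return c
    raise ValueError("a is not invertible modulo m")

def pfa_dft(x, n1, n2):
    """DFT of length n1*n2 for coprime n1, n2 via prime-factor reindexing."""
    n = n1 * n2
    assert len(x) == n
    inv2 = modinv(n2, n1)  # inverse of n2 modulo n1
    inv1 = modinv(n1, n2)  # inverse of n1 modulo n2
    # Input reindexing: row i2 holds the n1 samples for one short transform.
    rows = [[x[(i1 * n2 + i2 * n1) % n] for i1 in range(n1)]
            for i2 in range(n2)]
    # Inner stage: n2 transforms of length n1.
    rows = [dft(row) for row in rows]
    # Outer stage: n1 transforms of length n2, taken down the columns.
    cols = [dft([rows[i2][k1] for i2 in range(n2)]) for k1 in range(n1)]
    # Output reindexing through the Chinese Remainder Theorem map.
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[(k1 * inv2 * n2 + k2 * inv1 * n1) % n] = cols[k1][k2]
    return out

# Example: a 120-point transform split as 15 x 8.
signal = [complex(i % 7, -(i % 5)) for i in range(120)]
spectrum = pfa_dft(signal, 15, 8)
```

Because 2^k modulo 15 cycles through 1, 2, 4 and 8, its inverse modulo 15 is always one of those four values, which appears to line up with the inv_1 cases mentioned above, though the variable naming in this sketch is my own.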
[21:13:17 CET] <bofh_> also rand()'s impl tends to vary between systems and I had splitmix64 just lying around so yeah
[21:15:36 CET] <fritsch> bofh_: may I ask why you decided for that one?
[21:15:47 CET] <fritsch> bofh_: your input is always relatively prime?
[21:16:38 CET] <bofh_> fritsch: because 15 is relatively prime to 2^k, yeah
[21:16:53 CET] <bofh_> I'm not sure of any faster algorithm than PFA/Winograd for that case
[21:16:54 CET] <fritsch> ah okay, you use it for a special scenario
[21:17:02 CET] <fritsch> to my knowledge there is none
[21:17:14 CET] <fritsch> as the butterfly needs extra multiplications
[21:17:17 CET] <bofh_> meaningfully reusing the existing ffmpeg 2^k FFT was also a goal, b/c it's nice and fast.
[21:18:01 CET] <bofh_> which is why I don't do the 2^k FFTs first (which is what I *believe* imdct15 is currently doing, except recursively in an odd way).
[21:18:34 CET] <bofh_> fritsch: yeah, special case for Opus.
[21:18:47 CET] <bofh_> because jmspeex hates sanity.
[21:22:08 CET] <fritsch> wondering about the performance in real life
[21:22:13 CET] <fritsch> vs. padded other stuff
[21:22:56 CET] <fritsch> you optimize on operations (mul, add, whatever), right?
[21:24:07 CET] <bofh_> well I mostly reuse the existing implementations for the base FFTs, and PFA/Winograd has no twiddle factors, it's purely a reindexing.
[21:24:56 CET] <bofh_> and uh yeah you do optimize slightly on operations
[21:25:06 CET] <bofh_> but at the cost of more complicated memory accesses
[21:25:12 CET] <bofh_> so, I'm not sure.
[21:25:16 CET] <fritsch> what I wonder is, if doing the chinese rest whatever thing is better than extending the 2d with padding
[21:25:29 CET] <bofh_> yeah, I'm not sure about that.
[21:25:37 CET] <bofh_> FMAs are cheap, L2 misses aren't.
[21:25:56 CET] <bofh_> so it might be the case that just padding a 480-pt FFT to 512 might be faster.
[21:26:03 CET] <fritsch> jep
[21:26:22 CET] <bofh_> (it almost certainly will be until someone SIMDs the 15-pt FFT).
[21:27:11 CET] <bofh_> esp since that's only 32 entries of padding
[21:29:53 CET] <bofh_> also ugh I should go through my list of ffmpeg stuff and submit patches. kind of wanted to figure out a non-slow way to do LDSBR qmf analysis first, but ugh.
[21:30:11 CET] <Rathann> Happy New Year! :)
[22:26:59 CET] <kierank> BBB: any idea where to look for these luma artefact problems?
[22:27:12 CET] <BBB> the edges around the block?
[22:27:34 CET] <BBB> or another one?
[22:27:36 CET] <BBB> happy new year btw
[22:27:41 CET] <kierank> yeah happy new year
[22:27:45 CET] <kierank> edges round the block mainly
[22:30:46 CET] <kierank> 4:4:4 samples are also broken so might be related
[22:33:21 CET] <BBB> hm ...
[22:33:36 CET] <BBB> I don't immediately have any specific ideas
[22:33:53 CET] <BBB> did you try multiple idcts (e.g. float) to make sure it's not related to that?
[22:34:02 CET] <kierank> the binary has mention of a postprocessor so I wonder if the file is meant to look like that
[22:34:08 CET] <kierank> using float idct at the moment
[22:34:13 CET] <kierank> but I will find another one
[22:35:22 CET] <BBB> it looked really odd
[22:35:27 CET] <BBB> if that was intentional, I'd be hugely surprised
[22:36:18 CET] <BBB> if you convert the output from the binary decoder back to yuv and fdct the first few blocks, do you see specific dequantized coeff differences between your bitstream parsing output and the converted/fdcted reconstruction?
[22:36:24 CET] <BBB> that may be one way to look at it
[22:36:32 CET] <BBB> but I don't have generic quick ideas on what it could be :(
[22:36:42 CET] <kierank> I need to find a way of doing rgb (709) -> yuv but yes I can do that
[22:37:21 CET] <BBB> that should be trivial no? :-p
[22:37:37 CET] <kierank> yeah I probably have code somewhere to do it
[22:37:45 CET] <kierank> just too lazy to write it again if i can't find it :)
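For the rgb (709) -> yuv step mentioned above, a minimal sketch might look like the following. It assumes normalized, gamma-encoded R'G'B' values in [0, 1] and produces full-range Y in [0, 1] with Cb and Cr in [-0.5, 0.5]; scaling to 8-bit limited range, chroma subsampling and whatever layout the binary decoder actually emits are left out. The function name `rgb709_to_ycbcr` is hypothetical; the luma coefficients are the BT.709 ones.

```python
KR = 0.2126  # BT.709 luma weight for red
KB = 0.0722  # BT.709 luma weight for blue
KG = 1.0 - KR - KB  # 0.7152, luma weight for green

def rgb709_to_ycbcr(r, g, b):
    """Convert one normalized R'G'B' pixel to full-range Y'CbCr (BT.709)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))  # divisor is 1.8556
    cr = (r - y) / (2.0 * (1.0 - KR))  # divisor is 1.5748
    return y, cb, cr

# Example: convert a mid-grey pixel; chroma should be (near) zero.
grey = rgb709_to_ycbcr(0.5, 0.5, 0.5)
```

The resulting planes could then be run through a forward DCT block by block and compared against the dequantized coefficients, as suggested above.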
[22:38:18 CET] <cone-382> ffmpeg Michael Niedermayer master:aa952920431b: avcodec/x86/vc1dsp_mc: Fix build with NASM 2.09.10
[22:41:07 CET] <BBB> lol 
[00:00:00 CET] --- Tue Jan  3 2017
