[Ffmpeg-devel-irc] ffmpeg-devel.log.20140206

Fri Feb 7 02:05:03 CET 2014

[00:40] <cone-691> ffmpeg.git Janne Grunau master:5a0bccd2810d: fate: force the simple idct for xvid custom matrix test
[00:40] <cone-691> ffmpeg.git Michael Niedermayer master:927696aab258: Merge remote-tracking branch 'qatar/master'
[02:41] <cone-691> ffmpeg.git Timothy Gu master:4a37e2977cb2: libfdk-aacenc: disable hard version requirements
[02:42] <BBB> Skyler_: oh wow thanks
[02:42] <BBB> ubitux: really anything is fair game now, I don't think anything in particular is pressing anymore - really whatever you fancy
[02:45] <BBB> ubitux: I think skyler's idea has merit btw, I was pretty much ready to move the /= 2 out of the loop (see my patchset), and then it may help
[02:46] <BBB> ubitux: and as for simd, perhaps we're at a point where we need to start thinking about avx2
[02:47] <BBB> ubitux: that'll be a whole new world of fun, and things like mc can easily shave off another few percent
[02:53] <BBB> ubitux: also I think the reason it's long in that instruction is because you just loaded ecx from memory 2 lines up, so it's still waiting for cl to finish loading (load latency, basically)
[02:55] <BBB> ubitux: (and Skyler_ also) another thing is prefetch - absolutely nothing done there, and given the massive waits in the load statements of mc, I think that can gain another few %s
[02:59] <cone-691> ffmpeg.git Loren Merritt master:9c978f243a47: flac/x86: add ff_flac_lpc_32_sse4()
[03:00] <michaelni> BBB, btw do you want to mentor anything (in case ffmpeg is accepted this year) ?
[03:00] <BBB> michaelni: it's kind of tricky because I'm only online 30 minutes a day...
[03:01] Action: michaelni is impressed by how much BBB can do in 30 min per day
[03:01] <BBB> it's mostly weekend stuff ;)
[03:02] <BBB> can smarter mentor someone? or does he want to be a student?
[03:03] <michaelni> if he wants yes of course he can
[03:04] <smarter> I don't think I'll have the time to do either :)
[03:04] <michaelni> :(
[03:07] <smarter> if I have the time I may work informally on some stuff, like some basic 32 bits vp9 asm
[05:09] <rcombs> can AVCodec.encode_sub() access the AVPacket for the subtitle being encoded? (Is there one?)
[07:26] <ubitux> BBB: oh sure it probably has merit, i was just merely testing it :p
[07:26] <ubitux> BBB: about avx2 i'm still waiting for the hw to arrive, dunno how long it will take..
[09:17] <Skyler_> BBB: did the patch help at all for you?  I guess I'd divide it into ~2-3 separate ideas
[09:27] <ubitux> anyone a bit familiar with the build system who can explain to me why something like --disable-all --enable-ffmpeg doesn't actually build ffmpeg?
[09:35] <nevcairiel> probably lacks a dep which is not automatically enabled
[09:38] <ubitux> the _select entries are not enabled
[09:41] <ubitux> actually, the _deps aren't either
[09:43] <ubitux> oh well, whatever
[09:55] <ubitux> for some reason, i have fate-vp9 mismatches when building ffmpeg with ./configure --disable-everything --enable-decoder=vp9 --enable-demuxer=matroska --enable-muxer=null,framemd5 --enable-encoder=rawvideo --enable-protocol=file,pipe --disable-programs --disable-doc --enable-ffmpeg
[09:55] <ubitux> like, md5 are mismatching
[09:56] <ubitux> maybe because of something missing in sws
[11:31] <j-b> drv: ping
[11:48] <wm4> so, could OpenMAX (IL or DL) not be supported by libavcodec?
[12:07] <av500> sure
[12:08] <av500> you can glue it in
[12:08] <av500> (and cry)
[12:54] <BBB> Skyler_: will try over the weekend, didn't test yet (hard to do on weekdays)
[12:54] <Skyler_> np, not a big deal, I was just futzing when I was bored
[12:55] <BBB> ubitux: let's do prefetch then! that's actually fun
[12:56] <BBB> ubitux: especially because I think our 2pass decoding solves a central problem we have in vp8/h264 - we don't know the mv yet of 2 blocks ahead (with 2pass, we do!)
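A minimal sketch of the prefetch idea discussed above, assuming GCC/Clang's __builtin_prefetch. The MotionVec type and the function name are hypothetical and not taken from libavcodec. It only illustrates issuing a prefetch for the block two positions ahead, which is possible only if that block's motion vector is already known (the situation BBB attributes to 2pass decoding).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical motion vector record; not a libavcodec type. */
typedef struct MotionVec { int x, y; } MotionVec;

/* Sum one reference pixel per block, prefetching the reference location
 * of the block two positions ahead. The prefetch is only a hint: it does
 * not change the result, only (possibly) the load latency. */
static uint32_t sum_refs_with_prefetch(const uint8_t *ref, ptrdiff_t stride,
                                       const MotionVec *mv, int nb_blocks)
{
    uint32_t sum = 0;

    assert(nb_blocks >= 0);
    for (int i = 0; i < nb_blocks; i++) {
        if (i + 2 < nb_blocks) {
            const uint8_t *next = ref + mv[i + 2].y * stride + mv[i + 2].x;
#if defined(__GNUC__)
            __builtin_prefetch(next, 0, 3); /* 0 = read, 3 = high locality */
#else
            (void)next;                     /* no prefetch hint available */
#endif
        }
        sum += ref[mv[i].y * stride + mv[i].x];
    }
    return sum;
}
```

Whether this gains anything depends on the access pattern and the CPU, so it would need benchmarking before any conclusion is drawn.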
[12:59] <Compn> ubitux : carl usually fixes those build things 
[12:59] <Compn> if you report it to him
[13:23] <pross-au> whose idea was it to have a webp codec that creates an instance of the vp8 decoder
[13:30] <Daemon404> i am impressed that demuxing_decoding.c is architected so rigidly that i cannot fix a small problem with it without refactoring
[13:40] <nevcairiel> a lot of effort went into ensuring that
[13:44] <BBB> Skyler_: also, please be bored again ;) - but seriously, if you have other ideas (coef decoding is indeed big and slow), we're very much open to anything worth exploring in that area (I've tried to do a bunch of stuff with context reading/writing in my vp9-context-opts branch on github, but for coef decoding itself the only idea I had so far was to put the /2 out of the loop by making tx32x32 a separate function from the others)
[13:45] <BBB> (separate function is ofc not duplicate code, just putting inline and having 2 callers, one decode_coeffs_b_not32 and one decode_coeffs_b_32)
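A rough sketch of the "one inline body, two callers" pattern BBB describes, under the assumption that the 32x32 case differs only by the extra halving. All names here are hypothetical stand-ins (the real coefficient decoder is far more involved), and plain static inline is only a request to the compiler, not a guarantee that the flag gets folded away.

```c
#include <assert.h>
#include <stdint.h>

/* Shared body. is_tx32x32 is a literal constant at each call site, so a
 * compiler that inlines this can drop the test from the loop entirely. */
static inline int dequant_block(int16_t *coef, const int16_t *levels,
                                int n, int qmul, const int is_tx32x32)
{
    int nonzero = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int v = levels[i] * qmul;
        if (is_tx32x32)
            v /= 2;              /* only the 32x32 path pays for the /2 */
        coef[i] = (int16_t)v;
        nonzero += v != 0;
    }
    return nonzero;
}

/* Two thin wrappers: no duplicated source, two specialised loops. */
static int dequant_block_not32(int16_t *coef, const int16_t *levels,
                               int n, int qmul)
{
    return dequant_block(coef, levels, n, qmul, 0);
}

static int dequant_block_32(int16_t *coef, const int16_t *levels,
                            int n, int qmul)
{
    return dequant_block(coef, levels, n, qmul, 1);
}
```

The design choice is to keep a single source of truth while letting each caller get a loop without the per-coefficient size check.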
[13:47] <Skyler_> most of my changes there were to eliminate the 1 in "top + left + 1 >> 1"
[13:47] <Skyler_> via hackery and changing the shift to a 2
[13:47] <Skyler_> also things like avoiding the qmul multiplication when val = 1
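The log does not show how the +1 was actually removed, so the following is only one possible way to read "changing the shift to a 2", not Skyler_'s patch. It assumes the neighbour values are stored in a biased form 2*v + 1, in which case (2*top + 1 + 2*left + 1) >> 2 equals (top + left + 1) >> 1 for non-negative inputs. The second function sketches the val == 1 shortcut. All function names are hypothetical.

```c
#include <assert.h>

/* Reference form: rounded average of the two neighbour values. */
static int ctx_ref(int top, int left)
{
    assert(top >= 0 && left >= 0);
    return (top + left + 1) >> 1;
}

/* Assumed storage format: each neighbour kept as 2*v + 1. */
static int bias(int v)
{
    return 2 * v + 1;
}

/* With biased inputs the "+ 1" disappears and the shift becomes 2:
 * (2*top + 1) + (2*left + 1) = 2 * (top + left + 1). */
static int ctx_biased(int top_b, int left_b)
{
    return (top_b + left_b) >> 2;
}

/* Skip the multiplication in the common case val == 1. */
static int dequant_one(int val, int qmul)
{
    return val == 1 ? qmul : val * qmul;
}
```

Whether either change is a net win depends on how often the biased values must be converted back and on how predictable the val == 1 test is.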
[13:48] <Compn> pross-au : i think google
[13:48] <Skyler_> I get the feeling there's something to be gained via optimizing branches and cmovs, like.  the cmov version is actually slower on my system overall, which is weird?  since HAVE_FAST_CMOV should be fast on a core i7
[13:49] <Skyler_> but it seems to depend on which arithdecode call.  like.  I would /guess/ that you'd want to make a prob=250 call branchy, since it'll be super predictable.
[13:49] <Skyler_> and so cmov maybe isn't worth it there?  I'm not sure
[13:49] <Compn> Skyler_ : cmov stuff was benchmarked for mpeg4 asp codecs back in the p4 days. i dont think it has been rebenched as of i7 days. i could be wrong...
[13:49] <Skyler_> it might also be because with the cmov version of the arithdecode code, it can't replace "bit = 1" with "bit = 1 << 2" or something like that
[13:49] <Daemon404> [12:23] <@pross-au> whose idea was it to have a webp codec that creates an instance of the vp8 decoder <-- only for lossy :P
[13:49] <Daemon404> lossless webp != webp
[13:49] <Skyler_> Compn: I mean specifically the vp56 asm, which is predicated on that
[13:49] <Skyler_> (it's a lot newer, I think I wrote it?  not quite sure)
[13:49] <Compn> ah
[13:50] <BBB> hm, it's certainly true I didn't look very closely which ones should be _branchy and which ones shouldn't
[13:50] <Skyler_> on my system it didn't actually configure with cmov by default?? I wonder what march you need to set for that
[13:51] <Skyler_> but setting it in config.h slowed things down, so, I was curious
[13:51] <Compn> so whats the question? is cmov on i7 a good idea ?
[13:51] <Skyler_> well, more like in what contexts is it?  like.  which arithdecode calls ~really~ want cmov, and which don't
[13:51] <Skyler_> and if cmov is slower, why, and is it just a problem with not being able to propagate constants or what?
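To make the branchy versus cmov-style question concrete, here is a toy binary range-decoder step written both ways. It is a simplified illustration with hypothetical names (no renormalisation, no bitstream refill) and is not the vp56 arithmetic decoder. The compiler is free to turn either form into a branch or a conditional move, so this shows the two source-level shapes and makes no claim about which is faster.

```c
#include <assert.h>

/* Toy decoder state: current range and code value (code < high). */
typedef struct ToyRange { unsigned high, code; } ToyRange;

/* Branchy form: cheap when the branch is highly predictable,
 * e.g. for a very skewed probability. */
static int get_bit_branchy(ToyRange *c, unsigned prob)
{
    unsigned low = 1 + (((c->high - 1) * prob) >> 8);

    assert(prob > 0 && prob < 256);
    if (c->code >= low) {
        c->high -= low;
        c->code -= low;
        return 1;
    }
    c->high = low;
    return 0;
}

/* Branchless form: selects the new state with a mask, the kind of code
 * that tends to compile to cmov or pure arithmetic. */
static int get_bit_branchless(ToyRange *c, unsigned prob)
{
    unsigned low  = 1 + (((c->high - 1) * prob) >> 8);
    unsigned bit  = c->code >= low;
    unsigned mask = 0u - bit;             /* 0 or all ones */

    assert(prob > 0 && prob < 256);
    c->high  = (mask & (c->high - low)) | (~mask & low);
    c->code -= mask & low;
    return (int)bit;
}
```

Both functions produce the same bit and the same updated state; only the control flow differs, which is what a per-call-site benchmark would have to compare.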
[13:51] <Compn> ah, way above my skills to figure that out :)
[13:54] <Compn> intel has some docs on optimization reference
[13:54] <Compn> http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
[13:54] <Compn> is it what you want, probably not 
[13:55] <Compn> code optimizations to eliminate branches using cmov
[14:04] <BBB> Skyler_: this is just vp9 right?
[14:04] <BBB> Skyler_: I'm pretty sure it's just me using _branchy vs non-branchy wrong
[14:05] <Skyler_> I think you're using it right
[14:05] <Skyler_> any time you have a control flow branch you want to use the _branchy version
[14:06] <Skyler_> because it'll be merged
[14:06] <BBB> right that's what I tried to follow in the code
[14:06] <BBB> maybe cmov is indeed slow on 64bit
[14:06] <BBB> or on i7
[14:06] <BBB> or both
[14:06] <BBB> or rather, branches are fast
[14:06] <Skyler_> I don't think it's slow, it's just a question of when it's worth it, I think?
[14:07] <Keestu> what is this libpostproc? i dont get this .a file while building with latest ffmpeg ?
[14:07] <ubitux> Keestu: --enable-gpl
[14:09] <Keestu> ubitux, there you go ;).  may i ask what it is used for ?
[14:09] <BBB> ubitux: predecessor of libavfilter, has some video post-processing filters (like deblock, etc.)
[14:09] <ubitux> deblocking filters
[14:10] <ubitux> Keestu: see vf_pp for instance
[14:10] <ubitux> which uses it
[14:10] <Keestu> ubitux, Ok.  if i want to enable every option by default, is there any way to do that while building?
[14:12] <ubitux> --enable-gpl should add quite a bunch of things, remaining are external libs, and you need to specify them manually
[14:12] <ubitux> like libx264, libvpx, etc
[14:14] <Keestu> makes sense for the external libs, when i enable for example  --enable-libx264, it enables only the wrapper that are in ffmpeg, it doesn't require x264 while compiling right ?
[14:14] <Daemon404> uh
[14:14] <Daemon404> yes it does
[14:14] <Daemon404> linking is a thing.
[14:14] <Keestu> not loading dynamically the library ?
[14:15] <Keestu> ie: x264 lib.
[14:15] <ubitux> there is no dynamic loading
[14:15] <Daemon404> no
[14:16] <Keestu> Ah. then it is yes, we of course need to compile with libx264.  :). Sorry if my questions are very basic. 
[14:16] <Keestu> somehow i have to start from the ground ;). 
[14:21] <Daemon404> holy shit a vp7 decoder
[14:22] <JEEB> yeh pross-au was talking about it
[14:26] <pross-au> the supply of "new" codecs is dwindling.
[14:28] <BBB> oh sweet vp7
[14:30] <j-b> No fucking way...
[14:32] <BBB> lol
[14:32] <j-b> It's like a joke, gone bad
[14:36] <BBB> pross-au: have you confirmed the changes to vp8dsp don't slow down vp8?
[14:36] <BBB> pross-au: otherwise I'd love to give this some review over the weekend, don't wait for me before you commit it, but it surely looks interesting
[14:36] <BBB> and non-bitexact mc, omg back to mpeg12days
[14:36] <BBB> anyway work, bbl
[14:45] <Compn> pross-au : no supply of "new" codecs? so many binary codecs in mplayer....
[14:45] <Compn> redcode codec too?
[14:45] <Daemon404> dont forget vc-5!
[14:45] <Daemon404> Compn, the problem with redcode isnt the codec
[14:45] <Daemon404> its the colorspace
[14:45] <Daemon404> which i bayer.
[14:45] <Daemon404> is*
[14:46] <Daemon404> (codec is just jpeg2000)
[14:46] <Compn> i think someone committed the bayer 
[14:47] <Daemon404> no
[14:47] <Compn> i asked someone to resend the bayer...
[14:47] <pross-au> i have to cleanup those bayer patches
[14:47] <pross-au> rings a bell ...
[14:47] <Compn> yes, i keep bugging pross-au about it :)
[14:48] <Compn> http://ffmpeg.org/pipermail/ffmpeg-devel/2012-December/135778.html
[14:48] <Compn> so olddd
[14:50] <Compn> Daemon404 : but thanks, i didnt know r3d was just jp2k + bayer. all i knew was the bayer
[14:51] <Daemon404> it has a custom container
[14:51] <Compn> they always do
[15:13] <smarter> so, what's missing to complete the collection of VP* codecs? :)
[15:16] <Compn> vp4 ?
[15:20] <Compn> and whatever duck tm20 vp2x stuff from way back in the day
[15:24] <Compn> there may be one or two truemotion 2 variants not supported
[15:39] <Compn> kostya is the one to talk to about vp* :)
[16:36] <cone-595> ffmpeg.git Martin Storsjö master:49ec55159564: vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter
[16:36] <cone-595> ffmpeg.git Michael Niedermayer master:c73445a45cdb: Merge remote-tracking branch 'qatar/master'
[16:36] <cone-595> ffmpeg.git Timothy Gu master:474db7a696a3: doc/texi2pod: make references bold
[20:17] <kurosu_> michaelni: do you remember why you ever split out synth_filter out of dca ?
[20:18] <kurosu_> I would have expected it to end up in dca dsp stuff, except if it is reusable
[20:18] <kurosu_> by what is my question
[20:30] <michaelni> kurosu_, thats quite long ago, but yes i think the idea was to reuse it
[20:33] <michaelni> sadly i dont really remember details 
[20:36] <kurosu_> it was suggested to me that this is because of the similarity to mp3 apply_window
[20:36] <kurosu_> but on the other hand, we still want to compile them independently so it's better if they aren't together
[20:38] <michaelni> mp3 and dca use slightly different dcts IIRC
[20:38] <kurosu_> yeah, there's some pretwiddling in dca
[20:40] <kurosu_> note, there has been some reviewing already, and good suggestions were made to improve the first patches
[20:40] <kurosu_> so if you ever had time to look at them, this is already something you can skip
[20:45] <michaelni> pretwiddling ?
[20:45] <michaelni> not sure what code you talk about
[20:46] <michaelni> theres some code in the encoder that converts between FFT and 2x MDCT i think and
[20:46] <michaelni> some code that does MDCT lapping but some will argue about my terminology here i know
[20:47] <michaelni> also what patches ?
[20:54] <kurosu_> A series of 11 patches
[20:54] <kurosu_> sent yesterday
[20:54] <kurosu_> to ffmpeg-devel at ffmpeg.org
[20:54] <michaelni> ehm right :)
[20:55] <kurosu_> (or maybe very early this morning, like 1am)
[20:55] <kurosu_> I was wondering if I had broken my setup (trying to use aliases for git send mail)
[20:55] <michaelni> i was up till 8 o clock today and apparently didnt sleep enough for my brain to fully work
[20:56] <kurosu_> no problem
[20:56] <kurosu_> I don't have further plans but this code has been aging far too much
[20:57] <kurosu_> pretwiddling: negating half the coeffs
[21:00] <michaelni> that sounds like fft<->mdct stuff
[21:02] <kurosu_> anyway, I'm not about to rewrite the foundation around this code :)
[21:10] <michaelni> kurosu_, why do you change int8x8_fmul_int32() in arm from static inline to static ?
[21:11] <kurosu_> oh I did ?
[21:11] <michaelni> yes in first patch
[21:11] <kurosu_> anyway, the code will not change anymore
[21:12] <kurosu_> let me verify and push my current code to show what it should be
[21:21] <kurosu_> the first 2 patches should become:
[21:21] <kurosu_> https://github.com/kurosu/libav/commit/a69b3cdbf5f441f0050d99b29473cffaabe1fdfb
[21:22] <kurosu_> https://github.com/kurosu/libav/commit/c3fa79549d1ab14113abd09e8e5a2a6ac72dd364
[21:40] <michaelni> kurosu_, can you cross post patches when there are new versions? i dont want to review code that has already been changes
[21:40] <michaelni> s/changes/changed?/
[21:46] <BtbN> How does the ffmpeg configure system detect encoders and stuff? It just picked up the encoder i added, and i don't understand how.
[21:47] <JEEB> http://git.videolan.org/?p=ffmpeg.git;a=commit;h=1ab5a780424ae8755858e153def1173a50a44e4c you can see this as an example of an added video encoder
[21:47] <JEEB> allcodecs/Makefile
[21:47] <JEEB> there's a step-by-step checklist somewhere too
[21:48] <BtbN> yeah, i added it successfully, but i expected that i'd need to edit more files
[21:48] <BtbN> and i don't see how it does that
[21:48] <JEEB> unless you have some specific needs, it's pretty simple
[21:48] <BtbN> the nvenc encoder, the way i implement it, has no dependencies
[21:50] <BtbN> What's the SKIPHEADERS part for? Is it for excluding some header from being installed?
[21:53] <BtbN> The only thing i'm really unsure about is what to do with this file: https://github.com/BtbN/FFmpeg/blob/nvenc/libavcodec/nvenc_api.h
[22:21] <kurosu_> michaelni: sure, but those haven't been submitted anywhere yet
[22:22] <michaelni> kurosu_, thanks
[22:23] <kurosu_> as no review has been done yet, I think I'll just resend the whole
[22:39] <michaelni> kurosu_, ok thx
[22:56] <kurosu_> michaelni: about to send, but nothing really new (except first 2 patches)
[22:56] <kurosu_> I'm thinking of moving back the synth dsp into the dca dsp
[22:57] <michaelni> feel free to just send the 2 then
[22:57] <michaelni> kurosu_, about synth do what you prefer
[22:58] <kurosu_> do you prefer I postpone sending the whole set until this is settled? I know one patch in the current set will be dropped
[22:59] <michaelni> kurosu_, fine with me, i am quite busy anyway currently
[23:00] <kurosu_> ok I'll just wait then
[23:16] <kierank> BtbN: doesn't seem gpl compatible
[23:17] <BtbN> yes, that's the problem
[23:18] <BtbN> but there is not really an alternative, as there is only this one nvenc api
[23:21] <BtbN> it's only the header, the library is dynloaded at runtime
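For readers unfamiliar with "dynloaded at runtime", a minimal POSIX sketch of the pattern follows. It assumes dlopen/dlsym from <dlfcn.h> (Windows would use LoadLibrary/GetProcAddress instead), and the helper name is hypothetical. It is not BtbN's code, and the library and symbol names are left to the caller rather than guessed here. On older glibc versions this needs linking with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Open a shared library at runtime and look up one symbol in it.
 * On success, returns the symbol address and stores the library handle
 * in *handle_out (to be released later with dlclose()).
 * On any failure, returns NULL and sets *handle_out to NULL. */
static void *load_symbol(const char *lib, const char *sym, void **handle_out)
{
    void *handle, *addr;

    assert(lib && sym && handle_out);
    *handle_out = NULL;

    handle = dlopen(lib, RTLD_LAZY);
    if (!handle)
        return NULL;            /* library not installed: feature unavailable */

    addr = dlsym(handle, sym);
    if (!addr) {
        dlclose(handle);        /* library present but entry point missing */
        return NULL;
    }

    *handle_out = handle;
    return addr;
}
```

With this approach only the header is needed at build time; whether the library exists is discovered when the encoder is first initialised.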
[23:53] <nevcairiel> could require the license to be supplied externally, like for other APIs
[23:53] <nevcairiel> eh, the header
[23:53] <nevcairiel> not the license :)
[23:56] <BtbN> that would add the requirement for a several GB CUDA and NVENC SDK, which is extremely annoying to setup
[23:57] <BtbN> which both have no default path or something like that
[23:59] <JEEB> CUDA_PATH - C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\
[23:59] <JEEB> seems to exist on windows at least
[23:59] <BtbN> In my case it's in D:\Lib\CUDA
[23:59] <BtbN> the path can be freely configured in the installer
[23:59] <JEEB> yes, the installer should set it correctly
[23:59] <JEEB> CUDA_PATH being the env var
[23:59] <BtbN> i don't have that var
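Since the environment variable is evidently not always set, any code relying on it needs a fallback. A small sketch using the standard getenv, with a hypothetical helper name and a caller-supplied default:

```c
#include <assert.h>
#include <stdlib.h>

/* Return the value of env_name if it is set and non-empty,
 * otherwise the caller-supplied fallback (which may be NULL). */
static const char *sdk_root(const char *env_name, const char *fallback)
{
    const char *value;

    assert(env_name);
    value = getenv(env_name);
    return (value && *value) ? value : fallback;
}
```

A caller could pass "CUDA_PATH" as env_name and treat a NULL result as "SDK location unknown, ask the user".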
[00:00] --- Fri Feb  7 2014