[Ffmpeg-devel-irc] ffmpeg-devel.log.20140105

Mon Jan 6 02:05:02 CET 2014

[00:07] <cone-115> ffmpeg.git Alexander Strasser master:77015443a84b: lavf/file: file_check: Handle file URLs that start with "file:"
[00:25] <ubitux> ok, found my bug
[00:25] <nevcairiel> \o/
[00:25] <ubitux> i'm obviously not unpacking correctly
[00:25] <ubitux> i need a signed b->w unpacking
[00:25] <ubitux> i was using the <0 mask as an unpack 2nd operand
[00:26] <ubitux> (so it was unpacking with ff with <0 and 0 with >=0)
[00:26] <ubitux> but obviously it's not the correct representation for <0 ones
[00:27] <ubitux> still not sure how to do that
[00:27] <ubitux> mmh actually it should be
[00:27] <nevcairiel> yes, it should
[00:28] <nevcairiel> extending a signed value means you put the sign bit in all the new bits
[00:28] <nevcairiel> so 0 or ff should be fine
[00:28] <ubitux> yeah well something is wrong around here anyway
[00:34] <kurosu_> punpack{h,l}bw with itself, psraw 8
[00:35] <kurosu_> or the bxb->w instruction whatever its name is
[00:36] <kurosu_> pmaddusbw or something
[00:39] <ubitux> wtf is going on...
[00:45] <ubitux> oh FFS
[00:46] <ubitux> this time i got the bug for real
[00:47] <ubitux> i'm doing clip_i8(i8 - i8) instead of clip_i8(u8 - u8)
[00:47] <ubitux> how am i going to do this now.
[00:48] <ubitux> kurosu_: no that trick was fine, my brain is just melting
[00:48] <nevcairiel> putting it in the top bits and doing a signed shift may be easier though
[00:48] <nevcairiel> arithmetic shift or how its officially called
[00:49] <ubitux> well it's just one additional instr., it's really fine
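
A minimal sketch of the two byte-to-word sign-extension approaches discussed above, written as scalar C that models a single byte lane rather than real SIMD code. The function names are made up for illustration. The mask variant stands in for unpacking against a "<0" compare mask, and the shift variant stands in for unpacking a register with itself followed by psraw 8.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 16-bit pattern as a signed value without relying on
 * implementation-defined conversions. */
static int16_t as_i16(unsigned w)
{
    return (int16_t)(w < 0x8000 ? (int)w : (int)w - 0x10000);
}

/* Variant 1: the high byte comes from a sign mask (0xFF if the byte is
 * negative, 0x00 otherwise), as when the mask is the 2nd unpack operand. */
int16_t sext_with_mask(uint8_t b)
{
    unsigned mask = (b & 0x80) ? 0xFF : 0x00;
    return as_i16((mask << 8) | b);
}

/* Variant 2: unpack the byte with itself so it also sits in the top 8 bits,
 * then shift right arithmetically by 8 (modelled as floor division by 256). */
int16_t sext_with_shift(uint8_t b)
{
    int v = as_i16(((unsigned)b << 8) | b);
    return (int16_t)(v >= 0 ? v / 256 : -((-v + 255) / 256));
}

/* Both variants should agree with plain signed interpretation for all bytes. */
void check_sign_extension(void)
{
    for (int b = 0; b < 256; b++) {
        int expected = b < 128 ? b : b - 256;
        assert(sext_with_mask((uint8_t)b) == expected);
        assert(sext_with_shift((uint8_t)b) == expected);
    }
}
```

Calling check_sign_extension() from a test harness exercises all 256 byte values; both variants give the same words, which matches the point that the 0/ff mask is a valid high byte.
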
[00:56] <cone-115> ffmpeg.git Stefano Sabatini master:98ecbf0093e5: doc/protocols/file: document general file protocol URL syntax
[00:56] <cone-115> ffmpeg.git Stefano Sabatini master:92e145acbb9b: doc/protocols/file: fix semantical reverse
[00:56] <cone-115> ffmpeg.git Stefano Sabatini master:22fa50d15938: lavf/file: fix help message first character casing for trunc option
[01:02] <ubitux> brain is off, i'll solve that tomorrow i think
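
To illustrate the clip_i8(i8 - i8) versus clip_i8(u8 - u8) mix-up mentioned above, here is a small scalar sketch. The helper names are hypothetical and this is not the actual filter code; it only shows that the same two pixel bytes give different clipped differences depending on whether they are read as unsigned or signed.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the signed 8-bit range. */
int clip_i8(int v)
{
    return v < -128 ? -128 : v > 127 ? 127 : v;
}

/* Same byte, read as a signed 8-bit value. */
static int as_i8(uint8_t x)
{
    return x < 128 ? (int)x : (int)x - 256;
}

/* Intended form: difference of unsigned pixels, then clip. */
int diff_u8(uint8_t a, uint8_t b)
{
    return clip_i8((int)a - (int)b);
}

/* Mistaken form: pixels treated as signed bytes before subtracting. */
int diff_i8(uint8_t a, uint8_t b)
{
    return clip_i8(as_i8(a) - as_i8(b));
}

void check_clip_difference(void)
{
    /* Both pixels below 128: the two forms agree. */
    assert(diff_u8(100, 90) == 10 && diff_i8(100, 90) == 10);
    /* One pixel at or above 128: 200 reads as -56 when signed. */
    assert(diff_u8(200, 100) == 100);
    assert(diff_i8(200, 100) == -128);
}
```

The results only diverge once a pixel value reaches 128 or more, which is one reason a bug like this can pass some samples and fail others.
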
[01:11] <cone-115> ffmpeg.git Michael Niedermayer master:7546ac2fee40: avformat/mp3dec: fix start time in light of initial skip samples
[01:49] <cone-115> ffmpeg.git Alex Converse master:e2096e2eaa9e: fate: Add a downsampled SBR testvector
[01:49] <cone-115> ffmpeg.git Michael Niedermayer master:c69906151b50: Merge commit 'e2096e2eaa9e75663d6bf0c37d342752aa5a146d'
[02:11] <cone-115> ffmpeg.git Alex Converse master:b2212dec0f01: aac: Fix TNS decoding for the 512 sample window family.
[02:11] <cone-115> ffmpeg.git Michael Niedermayer master:4195ae0fd815: Merge commit 'b2212dec0f011893ec68eecaa990170fa24050d7'
[02:20] <cone-115> ffmpeg.git Alex Converse master:42d1b4198397: fate: Add a test vector for AAC ELD with TNS.
[02:20] <cone-115> ffmpeg.git Michael Niedermayer master:f200ec20b815: Merge commit '42d1b41983971da63302ac3d12091cad1f3d6324'
[02:26] <cone-115> ffmpeg.git Alex Converse master:7f29644108c5: aac: Fix low delay windowing.
[02:26] <cone-115> ffmpeg.git Michael Niedermayer master:fd53f9d985c9: Merge commit '7f29644108c5fbd80f160930b31b78b8704c1a49'
[02:39] <cone-115> ffmpeg.git Alex Converse master:9d18a7d3ec09: fate: Update AAC ELD 5.1 ref for recent bugfixes.
[02:39] <cone-115> ffmpeg.git Michael Niedermayer master:a57da850c0ae: Merge commit '9d18a7d3ec09d6d933d648570643fde924aa391a'
[02:45] <cone-115> ffmpeg.git Martin Storsjö master:82b9799bb211: sdp: Check that fmt->oformat is non-null before accessing it
[02:45] <cone-115> ffmpeg.git Michael Niedermayer master:70937d9708bc: Merge remote-tracking branch 'qatar/master'
[04:53] <cone-115> ffmpeg.git Peter Ross master:f5f6e59495a3: avformat/matroskaenc: warn when muxing video codec not supported by format
[13:14] <ubitux> fixed, but another mismatch :(
[13:14] <ubitux> it never ends :p
[13:14] <ubitux> hopefully the last one this time
[13:28] <nevcairiel> where would be the fun if it worked right away
[13:33] <ubitux> :(
[13:33] <ubitux> it's easier to debug though now
[13:33] <ubitux> since i have a shitload of c printing everywhere
[13:38] <BBB> that's useful
[13:39] <BBB> do any of the samples decode correctly now (and still actually invoke this code)?
[13:39] <ubitux> yes, the first one at least passes
[13:39] <ubitux> (07 one)
[13:39] <ubitux> i still have a single off-by-one in 08
[13:40] <ubitux> dunno about the rest, maybe hiding some other surprises
[13:44] <cone-251> ffmpeg.git Diego Biurrun master:52ccc4a0ece8: configure: Support preprocessor macros as header names
[13:44] <cone-251> ffmpeg.git Michael Niedermayer master:05286b6a9cc3: Merge remote-tracking branch 'qatar/master'
[13:59] <ubitux> ah, found the bug...
[13:59] <ubitux> annoying one :(
[15:53] <kurosu_> BBB: probably you aren't interested, but do you think the tricks with few dct coeffs for idct would work for h264? or is its ict too simple to show benefits?
[15:54] <kurosu_> haven't checked in details, but I guess you just check the number of coeffs. In the jpeg implementation, they were doing such things for vertical/horizontal-only coeffs
[15:55] <wm4> Daemon404: whatever happened to your idea to add libcurl to libavformat?
[16:09] <BBB> kurosu_: it's worth trying, I've had very mixed results in vp9
[16:09] <BBB> kurosu_: the gains in h264 will not be as big since even the biggest transform (8x8) still fits in a single iteration (unlike 16x16, which requires 2 16x8s)
[16:10] <BBB> 16x8 as in 1d idcts over half the rows/cols of coefficients
[16:10] <BBB> kurosu_: it's made harder b/c h264 has things like doing 2 4x4s at a time for inter blocks, and then this gets very hard very quickly
[16:11] <kurosu_> 2 4x4 at a time? that's a decoder trick, right ?
[16:12] <kurosu_> in that case, I guess so
[16:13] <kurosu_> and probably wouldn't apply to hevc/vp9 because 4x4 usually means high bitrate/intra and therefore the trick hardly applies
[17:05] <BBB> kurosu_: right, or lossless
[17:05] <BBB> kurosu_: it may work for the 8x8 in h264, but I don't know how common that is
[17:05] <BBB> I guess it can happen at higher resolution
[17:06] <BBB> like I said, worth a try but I'm not sure it'll be as transferable
[17:19] <BBB> ubitux: ok sub8x8 is now on github also, I think I'm done with this
[17:19] <BBB> ubitux: let me know when lf is done and we'll do some blogpost to announce that it's "good enough". other simd (intra pred and iadst) and other minor optimizations can be done later after that
[17:20] <BBB> I have many ideas, but we can call it finished anytime after lf is done I think
[17:20] <ubitux> i found the bug, i need to think a bit about it, maybe done tonight, then cleanups and stuff, and i'll need to do the transpose so the horizontal lpf can work
[17:23] <ubitux> (remaining bug is that i'm re-reading already processed input in filter14(), so i need to review my "read-dep tree" :p)
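
The "re-reading already processed input" kind of bug can be sketched with a toy in-place filter. This is an assumed 3-tap smoothing example with hypothetical function names, not the real filter14() code: the buggy version reads a sample it has already overwritten, while the fixed version loads every input before writing anything.

```c
#include <assert.h>

/* Buggy: p[1] is overwritten first, then read again as an input for p[2]. */
void smooth_inplace_buggy(int p[4])
{
    p[1] = (p[0] + 2 * p[1] + p[2] + 2) >> 2;
    p[2] = (p[1] + 2 * p[2] + p[3] + 2) >> 2; /* p[1] is already filtered */
}

/* Fixed: load all inputs into locals, then write the outputs. */
void smooth_inplace_ok(int p[4])
{
    const int p0 = p[0], p1 = p[1], p2 = p[2], p3 = p[3];
    p[1] = (p0 + 2 * p1 + p2 + 2) >> 2;
    p[2] = (p1 + 2 * p2 + p3 + 2) >> 2;
}

void check_inplace_filter(void)
{
    int a[4] = { 0, 0, 100, 100 };
    int b[4] = { 0, 0, 100, 100 };

    smooth_inplace_ok(a);
    smooth_inplace_buggy(b);

    assert(a[1] == 25 && a[2] == 75); /* both outputs use original inputs */
    assert(b[1] == 25 && b[2] == 81); /* second output saw the new p[1] */
}
```

In SIMD code the same issue shows up as a question of which registers still hold original pixels and which already hold filtered ones, so tracking the read dependencies of each output is the thing to review.
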
[17:25] <BBB> ah right
[17:25] <BBB> shall I re-send patches to ML
[17:25] <BBB> or is github OK?
[17:25] <BBB> (also for michaelni once we're ready to merge)
[17:26] <ubitux> github is fine with me
[17:26] <ubitux> but i've something to do right now, bbl
[17:26] <cone-251> ffmpeg.git Federico Simoncelli master:b1ad93123317: v4l2: setting device parameters early
[17:28] <BBB> yeah no hurry
[17:28] <cone-251> ffmpeg.git Michael Niedermayer release/1.2:71b3235cea40: avformat/oggdec: dont read timestamps from EOS pages of ogm videos
[17:28] <cone-251> ffmpeg.git Michael Niedermayer release/2.1:8763aca389f0: avformat/oggdec: dont read timestamps from EOS pages of ogm videos
[18:18] <Daemon404> [14:55] <+wm4> Daemon404: whatever happened to your idea to add libcurl to libavformat? <-- on my Things I'd Love To Do list
[18:54] <cone-251> ffmpeg.git Michael Niedermayer master:d04aceb7d032: avformat/nutdec: remove unused variable
[18:54] <cone-251> ffmpeg.git Federico Simoncelli master:b13d6c837fd7: pulse: set time_base as multiple of sample_rate
[18:54] <cone-251> ffmpeg.git Federico Simoncelli master:b53d6ce3fdef: pulse: get latency only when needed
[19:16] <ubitux> yay it's now passing 6 of them.
[19:16] <ubitux> ...and yet another mismatch :))
[19:22] <BBB> ubitux: \o/ gj
[19:22] <ubitux> not yet :(
[19:22] <BBB> 6 is pretty good
[19:25] <ubitux> YAY
[19:25] <ubitux> IT FUCKING WORKS
[19:26] <JEEB> \o/
[19:26] <ubitux> now comes the cleanup and shit
[19:29] <BBB> ubitux: weeh
[19:30] <BBB> uh weird
[19:30] <BBB> can I not log with a parsing context?
[19:32] <ubitux> ?
[19:32] <ubitux> av_log(NULL ... works for me
[19:32] Action: ubitux is using av_log(0,0,...) all the time
[19:35] <BBB> see ML (or that trac ticket from ami_stuff just now)
[19:35] <BBB> I know you filed one also but I didn't care yet so far
[19:35] <BBB> I can look if you want
[19:36] <BBB> hm frame size change
[19:36] <BBB> I hate it when that happens
[19:37] <BBB> I sort of expected that to not work, I'll fix it
[19:37] <BBB> ideally we'd make scalable work at some point, but I just couldn't care less about scalable
[19:49] <ubitux> 17553 decicycles in loop_filter_v_16_16_c, 8387632 runs, 976 skips
[19:49] <ubitux> 3977 decicycles in ff_vp9_loop_filter_v_16_16_ssse3, 8387836 runs, 772 skips
[19:50] <ubitux> i think that's a bit better :)
[19:50] <ubitux> i can probably win a few more cycles
[19:51] <ubitux> BBB: weird error :p
[19:51] <ubitux> probably stack thrashed or something, exploding elsewhere
[19:53] <BBB> ?
[19:53] <BBB> what error?
[19:53] <BBB> oh in ami_stuff's bug?
[19:53] <ubitux> yes
[19:53] <BBB> no I was using an AVCodecParserContext as log ctx
[19:53] <BBB> that's not legal
[19:53] <BBB> so it died
[19:54] <ubitux> ah
[19:54] <BBB> so I changed it to an avcodeccontext and then it's fine
[19:54] <BBB> we should probably fix that in avparsectx
[19:54] <BBB> but that's a abi break so not now
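
A self-contained toy model of why the log context matters. av_log() expects either NULL or a pointer to a struct whose first field is a pointer to an AVClass. The Toy* types and toy_log_name() below are hypothetical stand-ins, not FFmpeg API; the assumption being illustrated is that a parser-style context does not begin with such a class pointer, so a logger that dereferences it reads the wrong thing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyClass {
    const char *class_name;
} ToyClass;

/* Valid log context: the class pointer is the first member. */
typedef struct ToyCodecContext {
    const ToyClass *klass;
    int width;
} ToyCodecContext;

/* Not a valid log context: the first member is unrelated private data. */
typedef struct ToyParserContext {
    void *priv_data;
    int flags;
} ToyParserContext;

/* Models the logger: read a class pointer from the start of the struct. */
const char *toy_log_name(void *ctx)
{
    if (!ctx)
        return "NULL";
    const ToyClass *c = *(const ToyClass **)ctx;
    return c->class_name;
}

void check_log_context(void)
{
    static const ToyClass codec_class = { "toy codec" };
    ToyCodecContext avctx = { &codec_class, 1920 };

    assert(offsetof(ToyCodecContext, klass) == 0);
    assert(offsetof(ToyParserContext, priv_data) == 0);
    assert(strcmp(toy_log_name(&avctx), "toy codec") == 0);
    assert(strcmp(toy_log_name(NULL), "NULL") == 0);
    /* Passing a ToyParserContext here would treat priv_data as a class
     * pointer, which is undefined behaviour, so it is deliberately not done. */
}
```

This also matches the observation that logging with a NULL context works: the logger checks for NULL before looking for a class pointer.
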
[19:55] <ubitux> LGTM :P
[19:55] <BBB> I think I also know how to fix your crash but it's a little more involved
[19:55] <BBB> review my github when lf cleanup is done, I'll review your code
[19:55] <BBB> and then we're getting somewhere
[19:55] <BBB> how much overall speedup?
[19:55] <BBB> in wallclock decoding time
[19:56] <kierank> 6:50 PM <ubitux> 17553 decicycles in loop_filter_v_16_16_c, 8387632 runs, 976 skips
[19:56] <kierank> 6:50 PM <ubitux> 3977 decicycles in ff_vp9_loop_filter_v_16_16_ssse3, 8387836 runs, 772 skips
[19:56] <kierank> nice
[19:58] <BBB> kierank: we'll get another 10% per function with avx :)
[19:58] <BBB> (or something like that)
[19:58] <BBB> at least for the idcts
[19:58] <ubitux> BBB: 8.32s → 9.32s on ped1080p.webm with 1 thread
[19:59] <BBB> huh
[19:59] <ubitux> huh sorry
[19:59] <BBB> it got slower?
[19:59] <ubitux> the other way around
[19:59] <ubitux> ofc
[19:59] <BBB> lol
[19:59] <BBB> ok cool
[19:59] <ubitux> (tested c after)
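
For reference, the arithmetic behind the numbers quoted above, as a small C sketch with hypothetical helper names. It takes the figures from the log as given: 17553 versus 3977 decicycles per call (a decicycle being a tenth of a CPU cycle) and 9.32s versus 8.32s of single-threaded decoding time, read in the corrected order (C first, SSSE3 second).

```c
#include <assert.h>

/* How many times faster "after" is than "before" (same unit for both). */
double speedup(double before, double after)
{
    return before / after;
}

/* Share of the original time that was saved, in percent. */
double saved_percent(double before, double after)
{
    return 100.0 * (before - after) / before;
}

void check_speedup_numbers(void)
{
    /* Per-call cost of the 16_16 vertical loop filter: C vs SSSE3. */
    double f = speedup(17553.0, 3977.0);
    assert(f > 4.41 && f < 4.42);

    /* Whole-decode wall-clock time: 9.32s down to 8.32s. */
    double w = saved_percent(9.32, 8.32);
    assert(w > 10.7 && w < 10.8);
}
```

So the function itself is roughly 4.4 times faster, while the decode as a whole gets a bit under 11% shorter, since the loop filter is only one part of the total work.
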
[19:59] <kierank> if there was a place in london I could colo a 1U box I could give you access to an avx2 machine
[19:59] <BBB> I think with my and your patch we'll be beating libvpx then
[19:59] <BBB> and we're not even done with simd
[19:59] <BBB> so this is really very cool
[20:00] <ubitux> :)
[20:00] <ubitux> that's only the vertical one vtw
[20:00] <ubitux> btw
[20:00] <ubitux> i need to write the transpose so we can have about 1 less second with horizontal
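
A rough scalar sketch of the transpose idea: a filter that works down columns can be reused for the horizontal direction by transposing the block, filtering, and transposing back. The 8x8 block size, the toy edge filter, and all function names are assumptions made for illustration and are not the VP9 loop filter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { N = 8 };

void transpose8x8(uint8_t *dst, const uint8_t *src)
{
    for (int y = 0; y < N; y++)
        for (int x = 0; x < N; x++)
            dst[x * N + y] = src[y * N + x];
}

/* Toy filter across the edge between rows 3 and 4 (works down each column). */
void toy_filter_v(uint8_t *blk)
{
    for (int x = 0; x < N; x++) {
        int a = blk[3 * N + x], b = blk[4 * N + x];
        blk[3 * N + x] = (uint8_t)((3 * a + b + 2) >> 2);
        blk[4 * N + x] = (uint8_t)((a + 3 * b + 2) >> 2);
    }
}

/* Direct version of the same filter across the edge between columns 3 and 4. */
void toy_filter_h(uint8_t *blk)
{
    for (int y = 0; y < N; y++) {
        int a = blk[y * N + 3], b = blk[y * N + 4];
        blk[y * N + 3] = (uint8_t)((3 * a + b + 2) >> 2);
        blk[y * N + 4] = (uint8_t)((a + 3 * b + 2) >> 2);
    }
}

/* Horizontal filtering built from transpose + vertical filter + transpose. */
void toy_filter_h_via_transpose(uint8_t *blk)
{
    uint8_t tmp[N * N];
    transpose8x8(tmp, blk);
    toy_filter_v(tmp);
    transpose8x8(blk, tmp);
}

void check_transpose_filter(void)
{
    uint8_t a[N * N], b[N * N];
    for (int i = 0; i < N * N; i++)
        a[i] = (uint8_t)((i / N * 37 + i % N * 11) & 0xFF);
    memcpy(b, a, sizeof(a));

    toy_filter_h(a);
    toy_filter_h_via_transpose(b);
    assert(memcmp(a, b, sizeof(a)) == 0);
}
```

In SIMD the point of the transpose is that the pixels on either side of a vertical edge end up in separate registers, so the existing vertical code path can process them lane by lane.
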
[20:01] <BBB> cool
[20:02] <BBB> which ones does this include?
[20:02] <BBB> just 16_16?
[20:02] <BBB> or 16_16, 16_8?
[20:02] <BBB> or all 6 8/16 ones and the 2 mix ones?
[20:02] <BBB> or some other combination?
[20:03] <ubitux> just 16_16
[20:03] <ubitux> vertical
[20:03] <BBB> oh whoa
[20:03] <BBB> that's a massive speedup really for just that one
[20:03] <BBB> good job
[20:04] <BBB> I see about 50% of calls to v_16_8_c coming from v_16_16, and the other 50% from decode_frame
[20:05] <BBB> which means if you make a mmx version that operates on 8 pixels (if that's possible; else just work on full sse registers and ignore the upper halves) you'll get another big gain
[20:05] <BBB> (if you didn't already)
[20:05] <BBB> hm I'm not sure that makes sense, this shouldn't happen except on picture edges
[20:05] <BBB> ohwell
[20:05] <BBB> what do I know
[20:05] <BBB> anyway
[20:05] <BBB> g
[20:05] <ubitux> :p
[20:06] <BBB> ood job
[20:06] <BBB> maybe instruments isn't so good after all
[20:16] <cone-251> ffmpeg.git Michael Niedermayer master:ee16e0cacc16: avfilter/vf_format: check that the format list is not empty
[21:16] <smarter> <kurosu_> BBB: probably you aren't interested, but do you think the tricks with few dct coeffs for idct would work for h264? or is its ict too simple to show benefits?
[21:16] <smarter> what trick?
[21:38] <ubitux> 3936 decicycles in ff_vp9_loop_filter_v_16_16_ssse3, 8387211 runs, 1397 skips
[21:38] <ubitux> faster \o/
[21:38] <ubitux> now i'm going to try < 3900
[21:40] <ubitux> oh yeah i can do something cool
[21:48] <beastd> ubitux: what is it?
[21:48] <ubitux> avoid 1 or 2 write
[21:48] <ubitux> that might allow me to reach < 3900
[21:51] <cone-766> ffmpeg.git Ronald S. Bultje master:847072873c95: vp9_parse: don't use AVCodecParserContext as av_log context.
[21:56] <ubitux> 3905 decicycles in ff_vp9_loop_filter_v_16_16_ssse3, 8387654 runs, 954 skips
[21:56] <ubitux> aha!
[21:57] <nevcairiel> your goal, you did not reach
[21:58] <ubitux> yeah but i micro-optimized 300/500 lines
[21:58] <ubitux> maybe there is some room for improvements later on
[22:10] <cone-766> ffmpeg.git Tim Walker master:5c437fb672b6: lavu: Add values for various Dolby flags to the AVMatrixEncoding enum.
[22:10] <cone-766> ffmpeg.git Michael Niedermayer master:751385fe3f65: Merge commit '5c437fb'
[22:20] <cone-766> ffmpeg.git Tim Walker master:5b4797a21db9: avframe: add AV_FRAME_DATA_MATRIXENCODING side data type.
[22:20] <cone-766> ffmpeg.git Michael Niedermayer master:4cf4da9dc581: Merge commit '5b4797a21db900b7d509660b7a4d49829089b004'
[22:33] <cone-766> ffmpeg.git Tim Walker master:6bfdb2de8813: dcadec: set the output channel mode more accurately.
[22:33] <cone-766> ffmpeg.git Michael Niedermayer master:bc7f76377c1e: Merge commit '6bfdb2de881372048be7fbda643417e1fd3ce93c'
[22:38] <cone-766> ffmpeg.git Tim Walker master:30d70e79a6b4: dcadec: set AV_FRAME_DATA_MATRIXENCODING side data.
[22:39] <cone-766> ffmpeg.git Michael Niedermayer master:ae01af24756d: Merge commit '30d70e79a6b4ac7f4eb66446a9da275161ef6ea7'
[22:52] <ubitux> ok i don't think i'll reach below
[22:52] <ubitux> :(
[22:58] <cone-766> ffmpeg.git Tim Walker master:4b7f1a7ced0e: mlp: Parse TrueHD decoder channel modifiers and set the AVMatrixEncoding for each substream.
[22:58] <cone-766> ffmpeg.git Michael Niedermayer master:b4107f7805be: Merge commit '4b7f1a7ced0e98f2cc698d896f7ebab8d30eaa09'
[23:06] <cone-766> ffmpeg.git Tim Walker master:e92123093dfd: mlpdec: set AV_FRAME_DATA_MATRIXENCODING side data.
[23:06] <cone-766> ffmpeg.git Michael Niedermayer master:85b424a45e38: Merge commit 'e92123093dfdca0ef6608998240e2f9345d63bff'
[23:14] <cone-766> ffmpeg.git Tim Walker master:13345fc1f86f: (e)ac3: parse and store the Dolby Surround, Surround EX and Headphone mode flags.
[23:14] <cone-766> ffmpeg.git Michael Niedermayer master:7b3c78b5e6de: Merge commit '13345fc1f86fc3615789e196d5a339c1c27c9068'
[23:21] <cone-766> ffmpeg.git Tim Walker master:7840c40445c9: (e)ac3dec: set AV_FRAME_DATA_MATRIXENCODING side data.
[23:21] <cone-766> ffmpeg.git Michael Niedermayer master:d18234f31915: Merge commit '7840c40445c9f52aeccba96de3d27613398bfbf2'
[23:29] <cone-766> ffmpeg.git Johan Andersson master:7ce88e5ec414: cmdutils: update copyright year to 2014.
[23:29] <cone-766> ffmpeg.git Michael Niedermayer master:67999d3d1275: Merge remote-tracking branch 'qatar/master'
[00:00] --- Mon Jan  6 2014