[Ffmpeg-devel-irc] ffmpeg-devel.log.20170615

burek burek021 at gmail.com
Fri Jun 16 03:05:04 EEST 2017

[00:09:37 CEST] <J_Darnley> FFS!  This shit is impossible to get working.  It's not the same as C.  It's not the same as MMX.
[00:09:53 CEST] <cone-862> ffmpeg Mark Thompson master:88a2e4504d18: hevc: Fix scaling list prediction delta for the 32x32 inter matrix
[00:10:27 CEST] <durandal_1707> J_Darnley: really?
[00:11:43 CEST] <J_Darnley> As far as I can tell
[00:12:27 CEST] Action: J_Darnley will slap together a commit and push that somewhere
[00:12:32 CEST] <nevcairiel> did you guys ever figure out why mmx and C differ? That alone seems like a bad sign of the hacks to me
[00:13:22 CEST] <J_Darnley> No
[00:13:23 CEST] <kierank> no questioning old neckbeard code nevcairiel!
[00:13:53 CEST] <J_Darnley> I never went looking for emails from 2002 which I think is where the 16383 comes from.
[00:16:54 CEST] <J_Darnley> If you only skimmed the discussion earlier I think it could be summed up as "16383 was a better match for other encoders at the time"
[00:18:45 CEST] <cone-862> ffmpeg Michael Niedermayer master:900fe8ee5d17: avcodec/dnxhdenc: Assert that frame size is not assigned an error code
[00:18:46 CEST] <cone-862> ffmpeg Michael Niedermayer master:0a87be404ab7: avcodec/mpeg4videodec: Fix integer overflow in num_sprite_warping_points=2 case
[00:18:47 CEST] <cone-862> ffmpeg Michael Niedermayer master:12245ab1f677: avcodec/mpeg4videodec: Check sprite delta upshift against overflowing.
[00:20:11 CEST] <iive> J_Darnley: I thought that 16384 has been kept in the C code, because it is done by shift and is faster (with int)
[00:20:29 CEST] <iive> on mmx it doesn't matter, since it uses mul anyway.
[00:20:54 CEST] <iive> ops...
[00:21:06 CEST] <J_Darnley> In the "dc only" bit it will use 16384 but in the actual idct it will use 16383
[00:21:23 CEST] <iive> c or mmx ?
[00:22:13 CEST] <cone-862> ffmpeg James Almer master:37388b119cf8: checkasm: add a checkasm_checked_call function that doesn't issue emms
[00:22:14 CEST] <cone-862> ffmpeg James Almer master:5b10f484e2b3: checkasm: add float_dsp tests
[00:22:15 CEST] <cone-862> ffmpeg James Almer master:e53c9065ca08: avutil/tests: remove float_dsp test
[00:22:21 CEST] <J_Darnley> Both I think.
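The 16384 versus 16383 point can be illustrated with a little integer arithmetic. The sketch below is not FFmpeg's code: the function names, the row shift of 11 bits and the rounding term are assumptions chosen for illustration. It compares a "dc only" shortcut that shifts (equivalent to multiplying by 16384) with a full pass that multiplies by 16383.

```python
# Illustrative only: names, ROW_SHIFT and the rounding term are assumptions,
# not taken from FFmpeg's simple IDCT source.
ROW_SHIFT = 11          # assumed fixed-point shift after the row pass
W4_FULL = 16383         # constant mentioned in the log for the full IDCT
W4_DC = 16384           # constant mentioned in the log for the "dc only" path


def dc_only(dc: int) -> int:
    """Shortcut path: 16384 >> 11 == 8, so this is a plain shift by 3."""
    return dc << 3      # same value as (dc * W4_DC) >> ROW_SHIFT


def full_path(dc: int) -> int:
    """General path: multiply by 16383, add the rounding term, shift down."""
    return (dc * W4_FULL + (1 << (ROW_SHIFT - 1))) >> ROW_SHIFT


def first_mismatch(limit: int = 4096):
    """Return the smallest positive dc where the two paths disagree."""
    for dc in range(1, limit):
        if dc_only(dc) != full_path(dc):
            return dc
    return None
```

Under these assumed parameters the two paths agree for dc up to 1024 and differ by one from 1025 onward, so a mismatch of this kind would only show up on particular inputs.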
[00:27:37 CEST] <iive> so... if both do the same, what causes the difference
[00:27:58 CEST] <BBB> J_Darnley: wait I thought with my changes it matches?
[00:28:02 CEST] <J_Darnley> I wish I knew
[00:28:05 CEST] <BBB> J_Darnley: or is there another issue that I caused?
[00:28:33 CEST] <J_Darnley> BBB: dct is the same on all 3 tests
[00:28:44 CEST] <BBB> ok, so thats promising, right?
[00:28:44 CEST] <J_Darnley> but I get an error with a vsynth2 test
[00:28:48 CEST] <J_Darnley> Yes!
[00:28:52 CEST] <BBB> hm...
[00:29:28 CEST] <BBB> can you set aborts in each of put, add and regular idct to see which ones it calls?
[00:29:39 CEST] <BBB> or breakpoints
[00:29:43 CEST] <BBB> I guess I can do it myself as well
[00:29:47 CEST] <BBB> which vsynth2 test?
[00:30:47 CEST] <J_Darnley> I think it was fate-vsynth1-wmv1
[00:30:55 CEST] <J_Darnley> (so not vsynth2)
[00:33:26 CEST] <iive> J_Darnley: so... there is difference between C and MMX version and you don't know what causes it.
[00:33:53 CEST] <iive> and there is difference between MMX and SSE2 version and you don't know what causes that either?
[00:34:07 CEST] <BBB> do you know from the multiple commands it executes (see V=1) whether its the encode or decode that changed?
[00:34:22 CEST] <J_Darnley> I think it must be the decode
[00:34:32 CEST] <J_Darnley> the input video is generated
[00:34:56 CEST] <J_Darnley> oh but the encoder needs to decode to produce ref frames
[00:35:45 CEST] <iive> J_Darnley: can you setup a bit of glue code, that uses both mmx and sse idct and checks if their result differ?
[00:35:45 CEST] <J_Darnley> so in that case... both?
[00:36:03 CEST] <J_Darnley> that is libavcodec/tests/dct
[00:36:21 CEST] <J_Darnley> It doesn't directly compare, that is our job
[00:36:31 CEST] <J_Darnley> I guess I could modify it.
[00:36:35 CEST] <iive> J_Darnley: do you have a specific input that causes a difference
[00:36:53 CEST] <iive> you don't need statistics, you need a specific input.
[00:36:54 CEST] <J_Darnley> Not with tool
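The "glue code" idea iive describes, feeding the same input to two implementations and stopping at the first input where they differ, can be sketched as below. This is a generic harness under stated assumptions, not libavcodec/tests/dct: `find_first_difference`, `reference` and `candidate` are hypothetical names, and the two stand-in transforms are simple placeholders for the real MMX and SSE2 functions.

```python
import random


def find_first_difference(impl_a, impl_b, trials=10000, seed=0,
                          size=64, lo=-2048, hi=2047):
    """Feed identical random blocks to both implementations.

    Returns the first input block whose outputs differ, or None.
    """
    rng = random.Random(seed)          # fixed seed so a failure is reproducible
    for _ in range(trials):
        block = [rng.randint(lo, hi) for _ in range(size)]
        # Pass copies so an in-place implementation cannot corrupt the input.
        out_a = impl_a(list(block))
        out_b = impl_b(list(block))
        if out_a != out_b:
            return block
    return None


# Stand-in transforms (hypothetical): they differ only for large coefficients.
def reference(block):
    return [(v * 16384) >> 11 for v in block]


def candidate(block):
    return [(v * 16383 + 1024) >> 11 for v in block]
```

Calling `find_first_difference(reference, candidate)` returns a concrete block that can then be replayed under a debugger, which is the "specific input" rather than statistics.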
[00:37:22 CEST] <BBB> J_Darnley: can you run md5 on the generated file to see whether encode changed before/after your changes?
[00:37:28 CEST] <BBB> thats actually helpful to know
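BBB's md5 check amounts to hashing the encoded file before and after the change and comparing the digests. A minimal sketch, assuming the two encoded outputs have been saved to separate paths (the helper names and paths are hypothetical):

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def encode_changed(before_path, after_path):
    """True if the two encoded files are not byte-identical (by MD5)."""
    return file_md5(before_path) != file_md5(after_path)
```

If the digests match, the encoder output is unchanged and any remaining difference has to come from the decode side.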
[00:38:22 CEST] Action: J_Darnley swears
[00:38:26 CEST] <J_Darnley> In a moment
[00:51:44 CEST] <BBB> lol
[00:51:49 CEST] <BBB> I think youre pretty close
[00:51:58 CEST] <BBB> if wmv failed, that means its probably a type check missing somewhere
[00:52:05 CEST] <BBB> but it suggests the larger body of work is finished
[00:52:39 CEST] <jkqxz> wm4:  The obvious change is not going to work because of the videotoolbox_pixfmt thing (which looks like some sort of crazy hack).  I think it needs someone who can actually compile it to look.
[00:53:22 CEST] <wm4> what thing?
[00:53:53 CEST] <jkqxz> <http://git.videolan.org/?p=ffmpeg.git;a=blob;f=ffmpeg_opt.c;h=bb6001f534d0c82592a56b3d4efe315466b49d70;hb=HEAD#l3609>
[00:54:11 CEST] <jkqxz> Which gets to <http://git.videolan.org/?p=ffmpeg.git;a=blob;f=ffmpeg_videotoolbox.c;h=e9039654b93de84cfb1fe4c1387dc036fa777249;hb=HEAD#l155>.
[00:54:19 CEST] <wm4> oh that
[00:54:29 CEST] <wm4> you can just set the format on the hwframes ctx
[00:54:36 CEST] <jkqxz> So it's -hwaccel_output_format except using the native name?
[00:54:45 CEST] <jkqxz> Or something else.
[00:54:59 CEST] <jkqxz> Ha, does that need init_hw_frames then?
[00:55:06 CEST] <wm4> yeah I think it uses a native name
[00:55:12 CEST] <wm4> hm probably
[00:55:28 CEST] <wm4> videotoolbox is pretty "special" in that it can output any format, and the decoder/whatever converts
[00:55:43 CEST] <wm4> so this is atypical
[00:56:03 CEST] <wm4> (another +1 for VT as standalone decoder instead of hwaccel)
[00:57:19 CEST] <nevcairiel> wasnt the problem with that that the VT API is dumb as fuck and requires a lot of code, including re-ordering?
[01:09:23 CEST] Action: J_Darnley facepalms
[01:09:42 CEST] <J_Darnley> I will address *that* bug later
[01:10:29 CEST] <wm4> nevcairiel: yes, we'd have to do reordering ourselves
[01:10:32 CEST] <J_Darnley> Okay, good.  Back on course
[01:11:12 CEST] <nevcairiel> you would also end up duplicating all sorts of SEI handling and whatnot, which is why hwaccels are so nice, you get all of that for free
[01:11:25 CEST] <wm4> videotoolbox is really a hysteric API... it's neither hwaccel nor full decoder, and just makes your life harder for no reason at all
[01:11:35 CEST] <iive> J_Darnley: do you have a debugger where you can step through the asm code and see the registers?
[01:11:43 CEST] <J_Darnley> Yes, gdb
[01:12:09 CEST] <iive> does gdb have a mode where it outputs the changed register automatically?
[01:12:31 CEST] <J_Darnley> Maybe, I don't know it though.
[01:12:57 CEST] <iive> i've looked and i haven't found it.
[01:13:15 CEST] <iive> last time i tried ddd also didn't have it.
[01:14:42 CEST] <iive> J_Darnley: you might want to give a try to `edb` it is assembler oriented and i find it quite useful, despite needing some polish.
[01:15:12 CEST] <J_Darnley> noted
[01:15:14 CEST] <iive> i'm also not sure if it supports avx and ymm, but it works for sse
[01:15:54 CEST] <J_Darnley> BBB: the encoded video differs
[01:16:22 CEST] Action: J_Darnley can't believe that took him 30 minutes
[01:16:48 CEST] <J_Darnley> Also, you should use the mpeg2_asm5 branch on my gitlab
[01:17:30 CEST] <J_Darnley> It avoids a bug I've got in add/put and has your patch working.
[01:17:37 CEST] <kierank> J_Darnley: what was the problem
[01:18:02 CEST] <J_Darnley> We don't know yet
[01:18:28 CEST] <kierank> ok
[01:18:52 CEST] <J_Darnley> But my 30 minute problem was trying to make a clean commit
[01:19:13 CEST] <J_Darnley> That didn't make a step backwards
[01:22:08 CEST] <kierank> anyway it will make a good blog
[01:55:42 CEST] <BBB> J_Darnley: okiedokie, good to know. I will check wmv2enc
[01:58:54 CEST] <BBB> oh, not wmv2enc
[01:58:56 CEST] <BBB> hm...
[02:01:22 CEST] <BBB> J_Darnley: why FF_IDCT_SIMPLE as idct_algo in your assignment?
[02:01:28 CEST] <BBB> J_Darnley: the mmx function doesn't
[02:02:05 CEST] <J_Darnley> So that it runs more in fate
[02:04:00 CEST] <BBB> but its not compatible with FF_IDCT_SIMPLE, I think
[02:07:04 CEST] <BBB> (I dont know why, but the mmx function is also not assigned for algo==simple)
[02:08:41 CEST] <BBB> its possible that it doesnt have to do much with the idct itself, but rather with the fact that perm_type is not default for the non-C version
[02:08:43 CEST] <BBB> I dont know
[02:19:39 CEST] <J_Darnley> Can I ask you to revert that and then test fate?
[02:25:47 CEST] <philipl> jkqxz: So what's the generic hwaccel syntax I should try for cuda? Is it documented in your changes?
[02:33:10 CEST] <J_Darnley> I get no errors in that case.  But I also got no errors on fate before michaelni made us go looking for problems.
[02:41:10 CEST] <philipl> jkqxz: Looks like something like "ffmpeg -y -hwaccel cuvid -hwaccel_output_format cuda -c:v h264_cuvid -i sample.mp4 -c:v h264_nvenc -map 0:0 -f rawvideo /dev/null"
[02:44:17 CEST] <J_Darnley> What the fuck have I been doing this week?
[02:44:33 CEST] <J_Darnley> Wasting everyone's time by the look of it.
[02:47:20 CEST] <philipl> BtbN, jkqxz: https://github.com/philipl/FFmpeg/commit/951e804c33bf77d87c67e11b9354c3c26721c69d
[03:10:31 CEST] <cone-862> ffmpeg Michael Niedermayer master:1cb4ef526dd1: avcodec/hevc_refs: Check nb_refs in add_candidate_ref()
[03:10:32 CEST] <cone-862> ffmpeg Michael Niedermayer master:bc4067446207: avcodec/hevcdec: Check nb_sps
[03:11:54 CEST] <J_Darnley> Christ.  I think I might have finished.
[03:12:07 CEST] <J_Darnley> I do have a bug in my add function.
[03:12:41 CEST] <J_Darnley> the sse2 branch didn't store correctly
[03:13:13 CEST] <J_Darnley> I added a new fate test based on that matrixbench clip michaelni pointed out.
[03:13:59 CEST] <J_Darnley> That does manage to get identical output for C, MMX, and now SSE2
[03:17:04 CEST] <J_Darnley> Maybe this week hasn't been wasted
[03:21:03 CEST] <J_Darnley> Pull my mpeg2_asm5 branch and make fate-simpleauto
[03:21:11 CEST] <J_Darnley> Good night
[11:21:26 CEST] <cone-854> ffmpeg Paul B Mahol master:9b667f609c50: avfilter/af_headphone: fix possible memory leaks on failure
[12:10:15 CEST] <BBB> should --enable-libfdk-aac      enable AAC de/encoding via libfdk-aac [no] disappear from configure?
[12:10:20 CEST] <BBB> http://git.videolan.org/?p=ffmpeg.git;a=blob;f=configure;h=e3941f9dfd79557af12ef95a0da7373955b5df48;hb=HEAD#l225
[12:13:44 CEST] <BBB> J_Darnley: I just noticed your msgs yesterday, good idea to add a fate test
[12:14:03 CEST] <BBB> J_Darnley: does the fate test execute all three (idct, idct_put and idct_add) versions of the function?
[12:14:35 CEST] <BBB> J_Darnley: and a potential problem may be that the fate test does not give the same results across platforms ...
[12:20:27 CEST] <J_Darnley> It does do both add and put.  I forgot to check whether the plain idct is used.
[12:25:13 CEST] <J_Darnley> Ah.  A quick check suggests that it does not use the plain idct.
[14:33:59 CEST] <atomnuker> durandal_1707: bored? there's still an atrac9 decoder to write
[14:39:06 CEST] <JEEB> :D
[14:39:23 CEST] <JEEB> I had a dump of the reference binary somewhere
[17:06:04 CEST] <BBB> J_Darnley: hurray
[17:06:26 CEST] <BBB> hey hey hey now now now
[17:06:32 CEST] <BBB> hush
[17:06:43 CEST] <BBB> I think dc-only is a fine name btw
[17:06:44 CEST] <J_Darnley> I somehow dropped the 32 byte stack space change
[17:06:56 CEST] <BBB> in a 2d context, dc is top/left
[17:07:00 CEST] <BBB> in a 1d context, dc is just the first
[17:07:01 CEST] <RiCON> BBB: what's wrong with fdk-aac?
[17:07:06 CEST] <BBB> it was dropped
[17:07:12 CEST] <BBB> but the configure option is still there
[17:07:16 CEST] <RiCON> it was?
[17:07:19 CEST] <BBB> I believe so
[17:07:29 CEST] <RiCON> i don't believe so
[17:07:35 CEST] <BBB> oh wait its still there
[17:07:36 CEST] <BBB> nevermind
[17:07:38 CEST] <BBB> <- stupid
[17:07:41 CEST] <RiCON> faac and aacplus were
[17:07:58 CEST] <BBB> aha ok
[17:08:27 CEST] <atomnuker> michaelni: why does ffv1 have a max packet size?
[17:08:34 CEST] <BBB> I believe next, we should drop support for all hevc decoders and deprecate hevc :-p
[17:08:39 CEST] <BBB> hevc encoders*
[17:08:52 CEST] <atomnuker> it breaks encoding of high resolution (e.g. 15000x9000) images
[17:09:38 CEST] <atomnuker> and ffv1 and png are the only ones capable of lossless rgb I think
[17:09:50 CEST] <atomnuker> (and png is slow and very inefficient)
[17:10:35 CEST] <RiCON> doesn't x264 do lossless rgb too?
[17:10:38 CEST] <BBB> J_Darnley: is simple_auto maybe a poor fate test because its results will differ from arch to arch?
[17:15:38 CEST] <BBB> J_Darnley: the rest looks good to me, nice!
[17:15:48 CEST] <BBB> J_Darnley: I look forward to the angry blog post :D
[17:17:46 CEST] <J_Darnley> I will endeavour to leave out the swearing rant sections.
[17:18:04 CEST] <J_Darnley> And I forgot the message-id for that email.
[17:18:08 CEST] <BBB> its ok
[17:18:23 CEST] <BBB> maybe you shouldnt blog, but do a VDD talk about this work
[17:25:05 CEST] <jkqxz> But the swearing rant sections are the best bits!
[17:26:39 CEST] <jkqxz> On the subject of swearing rants, do people actually approve of the NV12 warning patch or not?
[17:29:36 CEST] <BBB> Im indifferent to it
[17:29:39 CEST] <BBB> I think its fine
[17:43:19 CEST] <cone-959> ffmpeg Tyler Jones master:5a2ad7ede33b: vorbisenc: Separate copying audio samples from windowing
[17:43:20 CEST] <cone-959> ffmpeg Tyler Jones master:f57f66518359: vorbisenc: Apply and output correct length window and mdct
[17:43:21 CEST] <cone-959> ffmpeg Tyler Jones master:752dd1952a7b: vorbisenc: Stop tracking number of samples per frame
[18:10:55 CEST] <kierank> BBB: should we submit it to demuxed or is it too technical for them?
[18:11:20 CEST] <BBB> I thought the goal of demuxed was to be technical? :-p
[18:11:26 CEST] <BBB> peloverde: ^^
[18:20:31 CEST] <kierank> I've submitted something myself but in a different area (uncompressed video transport)
[18:59:10 CEST] <philipl> jkqxz: I lost track - what's the plan for not having to specify the output format to ffmpeg to make transcoding work without download/upload by default?
[19:01:57 CEST] <jkqxz> No plan yet.  I was wondering whether we could do it by detecting fake hwaccels, or maybe there could be a new flag in the HWAccel table?
[19:04:18 CEST] <wm4> fake hwaccels? wut?
[19:33:32 CEST] <cone-959> ffmpeg Rostislav Pehlivanov master:b52b398c30a7: vc2enc: decrease default strictness level
[19:48:21 CEST] <michaelni> atomnuker, the max packet size is the worst case size, its smaller in version > 3. It could be reduced in older versions if some reallocation support is added 
[19:49:02 CEST] <michaelni> it would be trivial to realloc if it wasnt for the slice / thread code
[19:49:03 CEST] <atomnuker> but it can be exceeded, so its not the worst case
[19:49:23 CEST] <michaelni> it should not be exceeded
[19:49:33 CEST] <atomnuker> so its a bug then?
[19:49:56 CEST] <michaelni> dunno, but sounds like one, how can it be reproduced?
[19:50:14 CEST] <atomnuker> "[ffv1 @ 0x3770f00] Cannot allocate worst case packet size, the encoding could fail"
[19:50:29 CEST] <michaelni> "could fail" doesnt mean it does fail
[19:50:30 CEST] <atomnuker> makes it sound like I'm running out of memory
[19:50:40 CEST] <atomnuker> "Assertion bytes < (1 << 24) failed at libavcodec/ffv1enc.c:1216"
[19:50:53 CEST] <michaelni> assert failure is always a bug
[19:51:13 CEST] <nevcairiel> worst case is quite huge, it could overflow 32-bit, maybe you should try a 64-bit build
[19:52:23 CEST] <atomnuker> ./ffmpeg_g -f rawvideo -s 14027x9924 -pix_fmt rgb24 -i /dev/urandom -frames 1 -c:v ffv1 -f null -
[19:52:31 CEST] <atomnuker> this always fails here
[19:52:40 CEST] <nevcairiel> worst case for ffv1 1 and 3 is w*h*37*4
[19:52:45 CEST] <nevcairiel> so thats 20602184304 bytes
[19:53:15 CEST] <atomnuker> happens with -level 4 as well
[19:53:25 CEST] <nevcairiel> its w*h*3*4 for 4
[19:54:01 CEST] <nevcairiel> that should in theory fit into an int though
[19:55:29 CEST] <atomnuker> seems like it doesn't though, it exceeds INT_MAX - 64
[19:57:10 CEST] <nevcairiel> its still a huge buffer and easy to run out of memory from
[19:57:51 CEST] <atomnuker> you might be right actually
[19:58:10 CEST] <atomnuker> didn't realize the worst case packet is *that* big
[19:58:37 CEST] <atomnuker> 19 gigabytes, damn
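The figures quoted above can be checked with plain arithmetic. The sketch below uses only the formulas stated in the log (w*h*37*4 and w*h*3*4) and the 14027x9924 frame from the test command; the function name is hypothetical, and any fixed overhead the encoder adds on top is not modelled.

```python
INT_MAX = 2**31 - 1


def worst_case_bytes(width, height, per_pixel_factor):
    """Worst-case packet size as quoted in the log: w * h * factor * 4."""
    return width * height * per_pixel_factor * 4


w, h = 14027, 9924
older = worst_case_bytes(w, h, 37)   # factor quoted for the older versions
newer = worst_case_bytes(w, h, 3)    # factor quoted for version 4

print(older, round(older / 2**30, 2))   # size in bytes and in GiB
print(newer, newer <= INT_MAX)          # size in bytes, and whether it fits an int
```

The first figure reproduces the 20602184304 bytes given in the log, roughly 19 GiB. By this formula alone the second figure stays below INT_MAX, so the formula by itself does not explain the failure reported with -level 4.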
[20:00:01 CEST] <atomnuker> I think there should be an option which instead uses a lower worst case packet size and reallocs if the buffer runs out, though it'll be slower and messier
[20:01:24 CEST] <atomnuker> or maybe something like running the encoder once to determine the exact buffer size needed
[20:04:49 CEST] <nevcairiel> at least 4 has a PCM mode that greatly reduces the worst case
[20:06:32 CEST] <nevcairiel> and you could probably reduce the worst case further if you made it dependent on the pixel format used, instead of a generic value for all
[20:27:56 CEST] <michaelni> yes, there also seems to be a bug in slices with huge resolutions
[22:50:03 CEST] <BBB> wbs: do you have any recollection what type of decoding speed I could expect for regular content at typical resolutions using e.g. 1 core on aarch64?
[22:50:16 CEST] <BBB> wbs: e.g. 10fps? 50fps? 5000 fps?
[22:53:46 CEST] <atomnuker> on an odroid c2 for a standard 1080p24 bluray m2ts I get around 14fps with 1 thread
[22:54:39 CEST] <atomnuker> divide that by 4 or 5 for a raspberry pi 3 even if you somehow get some aarch64 image running
[22:58:00 CEST] <atomnuker> 10 fps now after letting it run for 5 minutes
[22:58:59 CEST] <atomnuker> no audio decoding, just ~30 Mbps h264 (or were you interested in vp9?)
[23:01:27 CEST] <durandal_1707> atomnuker: filter status asap!
[23:03:07 CEST] <atomnuker> nothing new since tuesday, but I'll have something to review by sunday
[23:03:49 CEST] <atomnuker> durandal_1707: how is the ambisonics gsoc going?
[23:05:29 CEST] <durandal_1707> well, its mainly figuring proper stuff for making decoder ambisonic regarding various speakers layouts
[23:06:48 CEST] <atomnuker> isn't it going to only support outputting to headphones initially?
[23:07:14 CEST] <durandal_1707> that would be too trivial to do
[23:08:55 CEST] <J_Darnley> Did someone recently flush the moderation queue for ffmpeg-user?
[23:09:15 CEST] <durandal_1707> why?
[23:11:37 CEST] <J_Darnley> I just noticed a few old emails that were marked unread
[23:22:17 CEST] <BBB> atomnuker: vp9 obviously :-p
[23:22:39 CEST] <BBB> atomnuker: all these devices have hw h264 decoders ;)
[23:24:11 CEST] <TD-Linux> unfortunately there's a lot of variance for aarch64... the a57 has double the throughput per-clock as the a53, for example
[23:31:13 CEST] <atomnuker> BBB: around 24 fps for a 1080p30 at ~3.5Mbps youtube rip on a single thread
[23:32:06 CEST] <atomnuker> for a month old 3.3 release
[23:32:21 CEST] <BBB> wow :)
[23:33:14 CEST] <JEEB> atomnuker: vp9? nice
[23:34:17 CEST] <atomnuker> will try git master if I can get it to build without much trouble
[23:34:41 CEST] <atomnuker> to hell with gas and that damned script
[23:35:23 CEST] <jamrial> oh neat, we have aarch64 vp9 asm
[23:35:39 CEST] <BBB> wbs is a hero
[23:44:04 CEST] <BBB> atomnuker: thanks for testing
[00:00:00 CEST] --- Fri Jun 16 2017
