[Ffmpeg-devel-irc] ffmpeg-devel.log.20170612

Tue Jun 13 03:05:03 EEST 2017

[00:53:52 CEST] <michaelni> durandal_1707, the "Fix runtime error: left shift of negative value -123", for example, is the literal error from the sanitizer that the commit fixes
[01:00:16 CEST] <jamrial> michaelni: i agree though that it would be better as part of the message but not as the subject
[01:00:29 CEST] <jamrial> subject could be limited to "fix left shift of negative value" or "fix undefined behavior"
[01:01:14 CEST] <michaelni> hmm, ok
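For context on the sanitizer message quoted above: in C, left-shifting a negative signed value is undefined behavior. The following is a minimal sketch of one common way to avoid it, not the change made in the commit under discussion. The helper name `shl_signed` is hypothetical. It performs the shift on the unsigned representation and converts back, assuming a two's complement target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: shift a possibly negative
 * value left without triggering "left shift of negative value" reports.
 * The shift is done in unsigned arithmetic, where it is well defined
 * (bits shifted out are discarded, so the result wraps modulo 2^32).
 * Converting the result back to int32_t is implementation-defined before
 * C23, but yields the expected value on two's complement targets. */
static int32_t shl_signed(int32_t v, unsigned shift)
{
    assert(shift < 32);  /* shifting by the type width or more is also UB */
    return (int32_t)((uint32_t)v << shift);
}
```

With this helper, `shl_signed(-123, 2)` evaluates to -492 without the sanitizer complaining. Multiplying by `(1 << shift)` is another option when the result is known not to overflow.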
[01:05:05 CEST] <DHE> I've had the sanitizer point to unaligned access in the past, but since it's behind #ifdefs that do hardware capability checks it should be fine.
[01:37:31 CEST] <cone-048> ffmpeg Michael Niedermayer master:07339a45a04e: avcodec/avpacket: Limit iterations in ff_packet_split_and_drop_side_data()
[01:37:31 CEST] <cone-048> ffmpeg Michael Niedermayer master:f8593c2f492a: avcodec/libvpxdec: Check that display dimensions fit in the storage dimensions
[04:02:36 CEST] <alevinsn> anyone here?
[04:09:04 CEST] <Fenrirthviti> Usually works better if you just ask your question and be patient for someone to respond.
[04:22:22 CEST] <alevinsn> ok, sure, the implementation of avio_close() appears to have a bug--just wanted to confirm my hypothesis with someone else with more experience
[04:22:50 CEST] <alevinsn> the code frees s->opaque, and it also assumes that opaque is of type AVIOInternal *
[04:23:14 CEST] <alevinsn> but, this is only the case when the AVIOContext is allocated the "standard" route
[04:23:43 CEST] <alevinsn> if the user creates a custom AVIOContext using avio_alloc_context(), then opaque, if even passed in,
[04:23:51 CEST] <alevinsn> is almost certainly not an AVIOInternal object
[04:24:10 CEST] <alevinsn> and regardless, it wouldn't be memory that ought to be freed when avio_close() is called
[04:24:45 CEST] <alevinsn> anyway, this should be easy to fix, and I'm going to work on a patch, but I wanted someone else's opinion with more experience in ffmpeg
[04:25:41 CEST] <alevinsn> hmm
[04:25:54 CEST] <alevinsn> according to the documentation, such a context is supposed to be freed with av_free()....
[04:26:18 CEST] <alevinsn> so, it's not a bug--it's user error
[04:26:33 CEST] <alevinsn> and avio_close() is only supposed to be used when avio_open() is used
[04:35:14 CEST] <alevinsn> but, it seems in that case, it will leak AVIOContext::buffer if allocated
[04:58:17 CEST] <factor> What is the deal in image utilities with image and uncached image utilities? I was wanting to do an uncached image version, or one reloadable on demand, anyway.
[05:36:18 CEST] <kevmark> ffmpeg supports inputs over SSH. TIL
[11:49:02 CEST] <cone-189> ffmpeg Henrik Gramner master:aad1b6786e73: x86inc: Add some additional cpuflag relations
[11:52:00 CEST] <J_Darnley> There we go.  I pushed the 5th in that set now that x264 has added aesni in the same place as ours.
[11:53:29 CEST] <nevcairiel> well we did have aesni before
[11:53:44 CEST] <J_Darnley> Maybe "re-added" then
[11:54:02 CEST] <nevcairiel> apparently all the numbers just got jumbled around
[12:00:35 CEST] <durandal_1707> atomnuker: how do you plan to detect silent parts in noisy audio?
[12:03:05 CEST] <atomnuker> just retune the ffopus psy transient detector, which would be simple because all you need to do would be to adjust the filter frequency down
[12:03:38 CEST] <atomnuker> thinking more, though, it's not a good idea
[12:04:15 CEST] <atomnuker> it requires a large lookahead and I'd like the filter to not have a latency of more than say 30 or 40 msec
[12:07:05 CEST] <atomnuker> I'll take the abs of the coefficients and compare them to both the original noise signature and a pink noise distribution with the same energy
[12:07:39 CEST] <atomnuker> then a threshold will decide whether to use this as the new signature or stay with the old one
[12:08:27 CEST] <atomnuker> response time isn't important so I'll only accept a new one if the last 10 or so frames have been above the threshold
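The acceptance rule described above (only switch to a new noise signature after roughly 10 consecutive frames above the threshold) can be sketched as a small counter. This is an illustration under stated assumptions, not atomnuker's code: the names `SignatureGate`, `gate_update` and `REQUIRED_FRAMES` are hypothetical, and the per-frame `score` is assumed to be computed elsewhere from the comparison against the old signature and the pink noise distribution.

```c
#include <assert.h>
#include <stdbool.h>

/* "10 or so" in the discussion; the exact value is an assumption. */
#define REQUIRED_FRAMES 10

/* Hypothetical state: how many frames in a row exceeded the threshold. */
typedef struct SignatureGate {
    int consecutive;
} SignatureGate;

/* Returns true when the candidate signature should replace the current one.
 * `score` is assumed to be a per-frame measure computed by the caller. */
static bool gate_update(SignatureGate *g, float score, float threshold)
{
    assert(g);
    if (score > threshold)
        g->consecutive++;
    else
        g->consecutive = 0;   /* any frame at or below threshold restarts the run */

    if (g->consecutive >= REQUIRED_FRAMES) {
        g->consecutive = 0;   /* start counting afresh after accepting */
        return true;
    }
    return false;
}
```

A caller would keep one `SignatureGate` per stream, call `gate_update()` once per frame, and copy the candidate signature over the current one whenever it returns true. Because only a counter is kept, this adds no lookahead and therefore no latency, at the cost of a slow response, which the discussion says is acceptable.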
[13:31:17 CEST] <J_Darnley> Here's an interesting tidbit: the first pmaddwd after the transpose in simple idct is 3x hotter than any other instruction.
[13:39:05 CEST] <iive> J_Darnley: is the code in patch 2/5?
[13:40:19 CEST] <J_Darnley> I think so
[13:40:25 CEST] <iive> the op being hot usually means that the cpu stalls there until the data dependency is resolved
[13:41:38 CEST] <J_Darnley> I guess it might be waiting for the transpose to finish.
[13:44:38 CEST] <iive> could you give me a direct link to the code in web git or something...
[13:49:17 CEST] <J_Darnley> If you really want to look, I think it is this line: https://gitlab.com/J_Darnley/ffmpeg/blob/mpeg2_asm2/libavcodec/x86/simple_idct10_template.asm#L89
[14:33:58 CEST] <iive> still looking at it, but while browsing up/down i spotted a sequence of min/max ops
[14:35:30 CEST] <iive> it might be faster if you do the min/max in pairs, e.g. m0,m1; m2,m3; then m01,m23 ... you get the idea
[14:39:58 CEST] <J_Darnley> I think so
[15:02:59 CEST] <iive> gah... you are doing a saturation there, not looking for min/max, so there is no data dependency. ignore what i said.
[15:11:01 CEST] <iive> i can't see much room for shuffling instructions around to fill up the latency...
[15:15:29 CEST] <J_Darnley> Thank you for looking
[15:19:21 CEST] <iive> and punpck ops usually have very short latencies, 1 or 2. the reason might be elsewhere
[15:20:15 CEST] <iive> do you have a free register? try to load the memory operand in it and see where the hot spot would move
[15:21:33 CEST] <iive> keep in mind that memory operands always add one more micro-op for loading the constant into a temp register.
[16:13:19 CEST] <Gramner> J_Darnley: is the pmaddwd with a memory arg? if so it's probably cache misses
[16:15:04 CEST] <J_Darnley> Yes, an RO constant
[16:16:02 CEST] <iive> well, how many times should constant be used in order to remain in L0 cache?
[16:16:34 CEST] <Gramner> profiling stuff can be a bit misleading. it can appear that a lot of cycles are spent in some function (or instruction) but you're actually just waiting on memory, so making the code itself faster may not improve performance
[16:17:06 CEST] <Gramner> if the function is truly executed a lot it should probably end up in L1 by itself
[16:18:33 CEST] <Gramner> optimizing cache usage is probably a whole subsection of comp.sci. by itself
[16:20:46 CEST] <Gramner> I like that avx-512 added broadcasts from GPRs, so you can load an immediate to a GPR and broadcast it from there instead of loading constants from rodata when you expect cache misses
[16:21:23 CEST] <nevcairiel> that's pretty neat
[16:23:15 CEST] <J_Darnley> I wasn't expecting people to go looking for a solution.  I was only pointing out something I found curious.
[17:09:50 CEST] <wm4> I wonder if anyone RE'd yet how to use HEVC from videotoolbox
[17:10:49 CEST] <iive> J_Darnley: what are you using for profiling?
[17:11:00 CEST] <durandal_1707> wm4: RE'd ? isn't that thing documented?
[17:11:34 CEST] <wm4> durandal_1707: no, apple is very bad at documentation, it's a shit show
[17:15:57 CEST] <kierank> wm4: is it a software encoder?
[17:16:03 CEST] <kierank> would be interesting to see strings from it
[17:16:09 CEST] <kierank> i would guess licensed from ittiam
[17:16:11 CEST] <kierank> or ateme
[17:16:32 CEST] <wm4> I meant decoder... but surely they'll have an encoder too
[17:38:04 CEST] <J_Darnley> iive: perf
[17:38:45 CEST] <J_Darnley> perf record ./ffmpeg_g -i FILE -an -f null -
[18:10:35 CEST] <cone-069> ffmpeg Paul B Mahol master:d4d1fc823f99: avfilter: add native headphone spatialization filter
[18:10:36 CEST] <cone-069> ffmpeg Paul B Mahol master:1a30bf60be92: tools: add sofa2wavs
[18:49:32 CEST] <cone-069> ffmpeg Ilia Valiakhmetov master:81fc617c1257: avcodec/vp9: ipred_dr_16x16_16 avx2 implementation
[19:01:38 CEST] <ubitux> jamrial: your aacpsdsp hybrid test passes on arm btw
[19:02:59 CEST] <jamrial> ubitux: cool
[19:04:28 CEST] <jamrial> ubitux: can you test this one as well? https://pastebin.com/hT5dZiKg
[19:04:34 CEST] <jamrial> i assume it will fail for most i values since the arm function seems optimized to the values aacdec explicitly uses, but just want to be sure
[19:05:01 CEST] <ubitux> isn't it the one you made me test?
[19:05:12 CEST] <ubitux> or did you change something?
[19:05:14 CEST] <jamrial> no, the one i made you test tried i values 3 and 5
[19:05:27 CEST] <jamrial> this one tests 0 to 63
[19:05:43 CEST] <ubitux> ah, ok
[19:07:14 CEST] <jamrial> i'm fairly sure it will fail, at least from reading the arm code, but just want to be sure
[19:07:47 CEST] <jamrial> i don't think adapting the arm function to work for all i values is worth it. aacdec is unlikely to be changed
[19:10:35 CEST] <ubitux> bus error
[19:13:36 CEST] <jamrial> alright, just squash the first one then
[20:12:50 CEST] <BBB> J_Darnley: is michael's mpeg problem in patch 3 related to not using patch 6?
[20:13:01 CEST] <BBB> J_Darnley: i.e. do you need to invert patch 3 and 6 to make sure stuff never breaks in commit order?
[20:13:09 CEST] <BBB> or just put 6 before 3/4/5
[21:03:53 CEST] <J_Darnley> I'm not sure yet
[21:05:50 CEST] <atomnuker> jamrial: why the sudden interest in making aacps as fast as possible? it was pretty quick before IIRC
[21:06:29 CEST] <atomnuker> or is this the last easy remaining audio asm left :)?
[21:06:45 CEST] <jamrial> atomnuker: none in particular. i generally just work on what i find interesting or fun
[21:07:23 CEST] <atomnuker> yeah, good point, opus assembly isn't fun at all
[21:08:07 CEST] <atomnuker> you write it, it works on scalar but the extra code to handle non-multiples of 2 makes you slower than c :(
[21:17:33 CEST] <BBB> michaelni: does 6/6 fix the issue you're seeing in 3/6 from j_darnley's patch set?
[21:17:47 CEST] <BBB> michaelni: (I'm wondering if the patch ordering is just wrong)
[21:21:54 CEST] <BBB> J_Darnley: the action in 2/6, it's a little confusing, but it is only invoked for 10-bit, right?
[21:22:00 CEST] <BBB> J_Darnley: i.e. for 8bit, action is always absent?
[21:22:48 CEST] <BBB> J_Darnley: and can you fix the indenting of simple_idct8_add? within %if cpuflag(sse4) - %endif
[21:22:50 CEST] <J_Darnley> No.  For the IDCT only it is used as "store"
[21:23:03 CEST] <BBB> oh right of course
[21:23:04 CEST] <BBB> duh
[21:23:11 CEST] <BBB> but for 8bit add/put it's nothing, right?
[21:23:15 CEST] <J_Darnley> Yes
[21:23:17 CEST] <BBB> ok cool
[21:23:23 CEST] <J_Darnley> "leave in registers"
[21:23:32 CEST] <BBB> right, cool
[21:23:35 CEST] <BBB> if this works, passes fate and doesn't break mpeg, it's good with me
[21:23:38 CEST] <BBB> let's wait for michael also
[21:23:53 CEST] <BBB> if you can reproduce the breakage he sees with 3/6 but it's gone with 6/6, I'd just re-order the patches
[22:12:58 CEST] <J_Darnley> BBB michaelni: I can reproduce the problem with patches 1-3
[22:13:11 CEST] <BBB> cool
[22:13:13 CEST] <BBB> does 6 fix it?
[22:13:20 CEST] <michaelni> not fully
[22:13:26 CEST] <michaelni> sorry for the late reply
[22:13:27 CEST] <J_Darnley> Just trying
[22:13:35 CEST] <michaelni> make -j12 ffmpeg && ./ffmpeg -i fate-suite/lena.pnm -y test.jpg && xview test.jpg
[22:13:41 CEST] <michaelni> shows artifacts with all 6
[22:13:50 CEST] <michaelni> matrix seems ok i think with 6
[22:13:59 CEST] <J_Darnley> damn
[22:15:09 CEST] <cone-069> ffmpeg Paul B Mahol master:6e09e1264116: tools/sofa2wavs: add license header
[22:23:20 CEST] <J_Darnley> Fuck I hate the autocomplete for make that system has!  I don't want to make a target I want to make a file!
[00:00:00 CEST] --- Tue Jun 13 2017