[Ffmpeg-devel-irc] ffmpeg-devel.log.20190916

burek burek at teamnet.rs
Tue Sep 17 03:05:06 EEST 2019


[00:11:41 CEST] <Lynne> philipl: the only fence used is for queue submission; the only way for the fence not to trigger is if the queue hasn't finished because it is waiting for something
[00:11:54 CEST] <Lynne> which it does; the semaphore the image has
[00:12:52 CEST] <Lynne> so maybe something is going wrong with cuda signalling the imported semaphore?
[00:21:45 CEST] <Lynne> vulkan defines some synchronization scopes for semaphores which really absolutely no one understands
[00:38:51 CEST] <philipl> Yeah. I think this is beyond where I have meaningful knowledge.
[00:39:02 CEST] <philipl> haasn: feeling generous and want to take a look?
[00:39:35 CEST] <philipl> I can say you are deleting the semaphore on each map that re-uses a frame. You shouldn't do that. Keep the semaphore around for the lifetime of the frame.
[00:39:56 CEST] <BtbN> Why is this such a mess, wow.
[00:40:52 CEST] <philipl> So if I do a simpler command line with hwmap and no hwdownload, and I don't destroy semaphores every time, it will run to completion and then deadlock destroying the external semaphores during final cleanup
[00:41:10 CEST] <philipl> That happens even if I don't signal the semaphores. Which seems illogical.
[00:41:37 CEST] <philipl> Lynne: are you testing with the nvidia card as your primary GPU? That's really hard as it tends to lock your desktop up when it deadlocks
[00:41:59 CEST] <philipl> BtbN: quite
[00:43:02 CEST] <BtbN> You never can be sure if you just hit a driver bug or are doing something wrong with this.
[00:43:44 CEST] <BtbN> At least Vulkan is generally quite actively developed and Nvidia is probably interested in feedback.
[00:45:09 CEST] <philipl> It's hard for me to compare how you're doing it vs how libplacebo does it; I see a bunch of small things which might be a problem. eg: libplacebo does an explicit transition for the image to an 'exported' state and then when you transition on the transfer_to, you have to set the right src properties (like there's an external transfer queue family)
[00:45:50 CEST] <philipl> I know when haasn and I were working on this in libplacebo, you could cut a lot of corners and it appeared to work but you never know when the driver will start caring.
[00:46:14 CEST] <BtbN> Most of this is potentially a noop internally at the moment lol
[00:47:34 CEST] <philipl> Yeah, Until it isn't.
[00:48:00 CEST] <philipl> I can't say if this is a driver bug, but libplacebo is an existence proof that you can write it in a way that works.
[00:48:11 CEST] <philipl> Lynne: never too late to depend on libplacebo :-P
[00:48:43 CEST] <BtbN> I just hope they don't throw the current infra over board at some point in the future, for something new...
[00:49:16 CEST] <philipl> replacing vulkan? Hopefully that doesn't happen too soon.
[00:49:28 CEST] <BtbN> No, replacing the current Vulkan extensions to do that stuff.
[00:50:03 CEST] <philipl> BtbN: well, if they successfully define and implement the vulkan api for video decode/encode, then all this external api shit will go away.
[00:50:20 CEST] <BtbN> At least on the decode side.
[00:50:33 CEST] <BtbN> I doubt Vulkan Encode would get a lot of traction from any vendor except Intel.
[00:51:02 CEST] <philipl> Unfortunately, probably true. and decode probably would only get traction from AMD.
[00:51:23 CEST] <BtbN> I can see Nvidia implementing Vulkan Decode
[00:51:35 CEST] <JEEB> I just learned yesterday that openMAX was actually a khronos thing for hwdec/-enc
[00:51:44 CEST] <JEEB> definitely did not know that before
[00:51:50 CEST] <philipl> With a new team, new implementation, and new feature set vs nvdec and vdpau. *sigh*
[00:51:50 CEST] <BtbN> And on the Encode side it's not too important. And it makes sense that every vendor wants to expose the full capabilities of the HW, and not some agreed upon subset that all vendors can support.
[00:52:09 CEST] <Lynne> philipl: not depending on libplacebo, shut up about this.
[00:52:22 CEST] <philipl> BtbN: that's why vulkan is so great. Every vendor bullshit idea can be an official extension.
[00:52:31 CEST] <JEEB> Lynne: he did not attempt to make you change :P
[00:52:44 CEST] <JEEB> just noted that he wasn't sure which of the differences between the two caused any issues
[00:52:55 CEST] <JEEB> if any
[00:52:57 CEST] <Lynne> of course a simpler command line with just a hwmap doesn't work, the semaphore is in theory in a signalled state so it would be illegal to destroy it
[00:53:16 CEST] <Lynne> putting a wait before semaphore destruction didn't help, tried that
[00:53:20 CEST] <philipl> Lynne: except, as I said, if I comment out signalling it still deadlocks.
[00:53:25 CEST] <philipl> on destroy
[00:54:11 CEST] <Lynne> we really can't keep semaphores around, they're awful and if a client gives a semaphore in a bad state it could freeze everything
[00:55:19 CEST] <Lynne> so the simplest thing to do is to wait on them the first possible chance then destroy them and rely on events instead
[00:55:19 CEST] <philipl> maybe special case the exported cuda semaphores to retain them. Those aren't coming from a client and the hwcontext should be able to control their state.
[00:56:07 CEST] <philipl> Of course, pragmatically, you'd never see a synchronisation issue if you ignored the cuda semaphores. The pool means it's very unlikely to reuse a frame too soon.
[00:58:08 CEST] <Lynne> the synchronization issues start earlier if the memcpy2d hasn't finished by the time you pass the frame via a command buffer to a queue
[00:58:43 CEST] <philipl> (and you don't have semaphores for cuda to wait on yet anyway). I never saw an actual issue with the vulkan side reading too soon after the memcpy before I had semaphores working in mpv. FWIW.
[00:59:23 CEST] <philipl> We had the opposite problem. Using a single shared VkImage, we'd see multiple writes complete before a read.
[00:59:36 CEST] <philipl> That's when you need to wait on the cuda side for the read to complete.
[01:00:44 CEST] <cone-499> ffmpeg gxw master:92fc0bfa54d8: avutil/mips: refactor msa SLDI_Bn_0 and SLDI_Bn macros.
[01:00:44 CEST] <cone-499> ffmpeg Michael Niedermayer master:9fd62b84d57e: tools/target_dec_fuzzer: Adjust motionpixels threshold
[01:00:44 CEST] <cone-499> ffmpeg Michael Niedermayer master:61b055bed096: libavcodec/utils: Free threads on init failure
[01:00:44 CEST] <cone-499> ffmpeg Michael Niedermayer master:a9fae76370ba: avcodec/gdv: Replace assert() checking bitstream by if()
[01:00:44 CEST] <cone-499> ffmpeg Michael Niedermayer master:c80715f15359: doc/examples/decode_audio: Fix "warning: ISO C90 forbids mixed declarations and code"
[01:00:45 CEST] <cone-499> ffmpeg Michael Niedermayer master:24e52709112e: avformat/hcom: Tell the compiler about set but not read variables
[01:00:45 CEST] <cone-499> ffmpeg Michael Niedermayer master:fccc37ca85a7: repeat an even number of characters in occured
[01:00:46 CEST] <cone-499> ffmpeg Michael Niedermayer master:d2d8e797cc4f: avcodec/hevcdec: repeat character in skiped
[01:00:46 CEST] <cone-499> ffmpeg Michael Niedermayer master:305f6dbb060f: tools/target_dec_fuzzer: increase snows threshold
[01:00:47 CEST] <cone-499> ffmpeg Michael Niedermayer master:9fac243744c6: avcodec/cfhd: Check that cropped size is smaller than full
[01:00:48 CEST] <cone-499> ffmpeg Michael Niedermayer master:5c5575c8dc89: avformat/cdxl: Fix integer overflow in intermediate
[01:00:49 CEST] <cone-499> ffmpeg Michael Niedermayer master:08dc354ef729: avformat/vividas: remove dead assignment
[01:00:50 CEST] <cone-499> ffmpeg Michael Niedermayer master:8e8fd25272c5: avformat/vividas: Remove align offset which is always masked off
[01:02:12 CEST] <philipl> Lynne: you could also do sync memcpy2d but that probably cripples performance.
[01:02:28 CEST] <BtbN> We use CUstreams pretty much everywhere now
[01:02:32 CEST] <BtbN> they exist precisely for this
[01:02:42 CEST] <BtbN> You sync on the stream at the latest possible moment
[01:04:23 CEST] <BtbN> cudaStreamAddCallback could also be handy to run code precisely after the memcpy has finished. But it's deprecated.
[01:04:32 CEST] <BtbN> cudaLaunchHostFunc is the replacement.
[01:08:59 CEST] <Lynne> even if I don't wait on any semaphores when downloading and don't destroy them ever, there's still a hang
[01:09:10 CEST] <Lynne> I don't get this
[01:11:50 CEST] <philipl> yeah. very strange
[01:21:59 CEST] <Lynne> holy shit
[01:22:10 CEST] <Lynne> cu->cuDestroyExternalSemaphore(&dst_int->cu_sem[i])
[01:22:13 CEST] <Lynne> spot the error
[01:22:38 CEST] <Lynne> then explain why the compiler did not warn
[01:23:11 CEST] <JEEB> wonder if the other one from the two of clang vs gcc would barf at you
[01:35:35 CEST] <philipl> Lynne: ouch.
[01:35:44 CEST] <philipl> I didn't see the warning here either.
[01:37:18 CEST] <philipl> Lynne: so that explains why destroy was always taking a shit.
[01:43:19 CEST] <Lynne> works fine once I fixed that and fixed the vkCmdSetEvent (for some reason bottom of pipe makes it freeze)
[01:55:42 CEST] <philipl> Sounds believable, and nicely done.
[01:56:11 CEST] <philipl> I still recommend not constantly destroying the semaphores. I think it will hurt performance.
[02:00:01 CEST] <Lynne> the thing I'm worried about is the definition of semaphores and how they act within the defined scopes
[02:00:25 CEST] <Lynne> it's not explained well, and if it's not explained well how can driver devs even implement this properly
[02:00:58 CEST] <Lynne> I've read various opinions whether you can rely that a signalled semaphore in one queue will be respected on another submission
[02:01:22 CEST] <Lynne> because reading the scopes you'd think that you can't rely on that
[02:02:12 CEST] <Lynne> which means you can't do any synchronization at all with semaphores since they'll only really work for multiple commands submitted at the same time
[02:12:08 CEST] <Lynne> philipl: yeah ok, destroying and creating semaphores again is super expensive
[02:12:23 CEST] <Lynne> not even waiting on them, this is all userspace overhead
[02:12:29 CEST] <philipl> Yeah.
[02:12:41 CEST] <philipl> nvidia definitely want them to stick around and be reused.
[02:13:15 CEST] <philipl> and at least empirically, with mpv, the reuse works fine.
[02:14:11 CEST] <Lynne> which means my previous version of the synchronization system I had worked fine and I need to revert back
[02:27:59 CEST] <Lynne> is hardware frame creation synchronized, e.g. will pool_alloc be called from multiple threads?
[02:38:48 CEST] <rcombs> Lynne: alright I'll bite, why doesn't it warn? (I'm assuming the bug is the &?)
[02:51:35 CEST] <Lynne> rcombs: "note: expected CUexternalSemaphore {aka void *}"
[02:51:58 CEST] <rcombs> ah, I'd looked up the reference and it indicated that was an actual type
[02:52:38 CEST] <rcombs> I once caught a case where apple's AudioToolbox code did exactly the same thing (and the compiler of course did not warn them either)
[02:53:01 CEST] <rcombs> passed a void* to a function taking void* but mistakenly added a &
[02:53:22 CEST] <rcombs> they wouldn't believe me until I pointed them to the line of assembly where they had an lea that should've been a mov
[02:53:57 CEST] <rcombs> and gave example code where passing in a pointer to a 4-byte buffer failed, but passing in (void*)(intptr_t)*(int*)buf worked
[02:54:47 CEST] <Lynne> vulkan does it right btw, "note: expected VkSemaphore {aka struct VkSemaphore_T *}"
[02:56:36 CEST] <rcombs> yeah, and macOS's libc has opaque types for all pthread stuff, but they're actually user-visible structs containing a char opaque[size];, rather than just void
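The stray `&` discussed above can be reproduced in plain C. The following is a minimal sketch, not the FFmpeg or CUDA code: `LooseHandle`, `StrictHandle`, `destroy_loose`, `destroy_strict` and the helper functions are hypothetical names chosen for illustration. It assumes only standard C conversion rules: any object pointer, including a `void **`, converts implicitly to `void *`, so the buggy call compiles silently, while a pointer to an opaque struct pointer is an incompatible type that the compiler must diagnose.

```c
#include <stddef.h>

/* Hypothetical handle types, for illustration only. */
typedef void *LooseHandle;                    /* shaped like "CUexternalSemaphore {aka void *}" */
typedef struct StrictHandle_T *StrictHandle;  /* shaped like "VkSemaphore {aka struct VkSemaphore_T *}" */

const void *last_loose;  /* records what destroy_loose() actually received */

void destroy_loose(LooseHandle h)   { last_loose = h; }
void destroy_strict(StrictHandle h) { (void)h; }

/* Returns 1 if the buggy call handed over the address of the slot
 * instead of the handle stored in the slot. */
int buggy_call_passes_slot_address(void)
{
    int object;                  /* stands in for the real resource */
    LooseHandle slot = &object;  /* the handle we meant to pass */

    destroy_loose(&slot);        /* BUG, yet valid C: void ** converts to void * */

    return last_loose == (const void *)&slot &&
           last_loose != (const void *)&object;
}

void strict_usage(void)
{
    StrictHandle s = NULL;

    destroy_strict(s);           /* correct call */
    /* destroy_strict(&s); would be diagnosed as an incompatible pointer
     * type: struct StrictHandle_T ** is not struct StrictHandle_T *. */
}
```

Calling `buggy_call_passes_slot_address()` returns 1, showing that the callee received a pointer to the local variable rather than the handle, which is the kind of mistake an opaque struct typedef lets the compiler catch.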
[03:20:31 CEST] <philipl> BtbN: https://github.com/philipl/nv-codec-headers/commit/bfb7e2d35b1eb899029442fa7e7d466e5ed5c642
[03:20:35 CEST] <philipl> will push if you're ok with it.
[03:20:56 CEST] <philipl> Lynne: that void * problem is on us. See diff
[03:57:20 CEST] <haasn> Lynne: are you using validation layers?
[03:58:25 CEST] <haasn> but yeah just use libplacebo :^)
[11:14:03 CEST] <BtbN> philipl, why the change from structs to void*?
[11:16:22 CEST] <nevcairiel> it's the opposite, to avoid accidental errors since compilers will warn about much more this way
[11:16:45 CEST] <BtbN> oh
[11:17:02 CEST] <BtbN> hm, in that case I wonder if that'll break some code. What do the official headers do?
[11:17:56 CEST] <BtbN> I mean, I don't really care if it's void* or struct*, but what's the reason for the change?
[11:18:03 CEST] <nevcairiel> the official headers do what the changed code does
[11:25:20 CEST] <BtbN> Yeah, that's probably fine then.
[11:49:42 CEST] <cone-112> ffmpeg Paul B Mahol master:921eb21b1d1b: avfilter/x86/vf_360: add most of >8 depth asm
[11:49:43 CEST] <cone-112> ffmpeg Paul B Mahol master:dc3325076597: avfilter/af_headphone: return on error immediately
[11:49:44 CEST] <cone-112> ffmpeg Paul B Mahol master:7a7aa4f79e50: avfilter/vf_avgblur: remove dupe assignment
[11:49:45 CEST] <cone-112> ffmpeg Paul B Mahol master:654601dd1d3b: avfilter/vf_v360: add missing av_assert0()
[11:49:46 CEST] <cone-112> ffmpeg Paul B Mahol master:fa045c3ce288: avfilter/window_func: clarify intention in dolph window calculation
[11:49:47 CEST] <cone-112> ffmpeg Paul B Mahol master:f70690e8ece8: avfilter/vf_ciescope: remove dead assignments
[11:49:48 CEST] <cone-112> ffmpeg Paul B Mahol master:ea8391e519f4: avfilter/vf_shuffleplanes: remove not needed line
[11:49:49 CEST] <cone-112> ffmpeg Paul B Mahol master:34a12b99788d: avfilter/vf_stereo3d: merge same code in case branches
[11:49:50 CEST] <cone-112> ffmpeg Paul B Mahol master:94f187d38267: avfilter/vf_stereo3d: assert that out variable is valid
[12:10:08 CEST] <Lynne> haasn: yes, of course, although they're of little help when importing images from APIs
[12:10:21 CEST] <haasn> fair enough
[12:10:29 CEST] <haasn> I guess the layers haven't gotten to any of those extensions yet
[12:11:01 CEST] <Lynne> synchronization issues right now are purely theoretical though, at least with filters
[12:11:32 CEST] <Lynne> since imageviews have to be around until the command buffer has finished executing, I can't help but block on every queue submission until it's complete
[12:12:33 CEST] <haasn> I don't know anything about your design/code but make sure you take a look at https://code.videolan.org/videolan/libplacebo/blob/master/demos/video-filtering.c
[12:12:46 CEST] <haasn> though iirc ffmpeg filter design requires blocking?
[12:14:51 CEST] <Lynne> on the cpu, only for downloading and uploading
[12:22:32 CEST] <haasn> doing a very quick analysis of avfilter.h it seems like activate() roughly translates to api2_process()
[12:22:47 CEST] <haasn> with input/output links roughly translating to my get_frame / put_frame
[12:22:58 CEST] <haasn> so it should definitely be possible to do nonblocking processing here
[13:22:18 CEST] <Lynne> haasn: yeah, but how do you deal with imageview lifetimes?
[13:23:16 CEST] <haasn> I'm not entirely sure what the context is
[13:23:16 CEST] <Lynne> vulkan doesn't give you any callbacks after a command buffer has completed, so you'd need to store imageviews on the frame and free them after a subsequent command buffer submission has completed, since the previous one would be guaranteed to have finished
[13:23:35 CEST] <haasn> what images are those imageviews attached to?
[13:23:46 CEST] <Lynne> the frames you give lavfi
[13:25:58 CEST] <haasn> assuming I do need to keep track of a unique VkImage on every frame and am not able to reuse/recycle them (depends on cooperation with the provider of images), we would indeed be forced to hold on to imageviews as long as needed for validity
[13:26:08 CEST] <haasn> (that's sort of what happens under the hood if you pl_tex_destroy with libplacebo)
[13:26:43 CEST] <haasn> that doesn't mean we have to block until this is the case before we start sending more work to the GPU, though, if that's what you were implying
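The idea of holding on to imageviews without blocking can be sketched as a small deferred-destruction queue. This is a simulation in plain C under stated assumptions, not code from FFmpeg or libplacebo: `Resource`, `DeferQueue`, `defer_destroy` and `collect` are hypothetical names, and submissions are assumed to be numbered so that a completed id (learned, for example, by polling a fence with `vkGetFenceStatus`) implies every earlier submission has also finished.

```c
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical stand-in for a resource such as a VkImageView. */
typedef struct Resource {
    int destroyed;
} Resource;

typedef struct Pending {
    Resource *res;
    unsigned long long submit_id;  /* submission that last used the resource */
} Pending;

typedef struct DeferQueue {
    Pending items[MAX_PENDING];
    size_t count;
} DeferQueue;

/* Record that res must stay alive until submission submit_id has finished.
 * Returns 0 on success, -1 if the queue is full. */
int defer_destroy(DeferQueue *q, Resource *res, unsigned long long submit_id)
{
    if (q->count == MAX_PENDING)
        return -1;
    q->items[q->count].res = res;
    q->items[q->count].submit_id = submit_id;
    q->count++;
    return 0;
}

/* Destroy every resource whose submission id is <= completed_id and keep
 * the rest queued. Returns the number of resources destroyed. */
size_t collect(DeferQueue *q, unsigned long long completed_id)
{
    size_t kept = 0, freed = 0;

    for (size_t i = 0; i < q->count; i++) {
        if (q->items[i].submit_id <= completed_id) {
            /* Real code would call vkDestroyImageView() here. */
            q->items[i].res->destroyed = 1;
            freed++;
        } else {
            q->items[kept++] = q->items[i];
        }
    }
    q->count = kept;
    return freed;
}
```

A caller would invoke `collect()` opportunistically, for instance right before each new submission, so that cleanup happens as a side effect of normal work and the CPU never has to wait on the GPU solely to free a view.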
[15:12:04 CEST] <cone-852> ffmpeg sharpbai master:6966548c1bd8: avcodec/videotoolboxenc: fix encoding frame crash on iOS 11
[15:12:04 CEST] <cone-852> ffmpeg Limin Wang master:57951f301906: avcodec/videotoolboxenc: add H264 Extended profile and level
[15:12:04 CEST] <cone-852> ffmpeg Rick Kern master:1db6e47e8599: avcodec/videotoolboxenc: warn user when output will use a different profile/level than requested.
[16:12:55 CEST] <Lynne> >av_map_videotoolbox_format_from_pixfmt2
[16:13:04 CEST] <JEEB> 33
[16:13:09 CEST] <Lynne> we really should have a hwcontext api to map pixfmts
[16:13:22 CEST] <Lynne> I think there was a patch even
[16:24:23 CEST] <JEEB> also I wonder if we should do something about the high time base by default for subtitles in mp4?
[16:24:35 CEST] <JEEB> it shows up currently if you have a subtitle that is over 37 or so minutes long
[16:24:38 CEST] <JEEB> :P
[16:48:49 CEST] <BBB> i want to tell my compiler that left-shifting a negative value is just fine; can i do that?
[16:49:03 CEST] <BBB> -std=c99-with-leftshift-defined or so
[16:49:38 CEST] <jamrial> send patch to gcc :p
[16:51:10 CEST] <jamrial> i suppose there should already be an option to silence related warnings that can be detected at compile time, but these reports are for runtime with ubsan, so...
[17:02:39 CEST] <BBB> i probably need to send a patch to the c standards committee first
[17:09:37 CEST] <Lynne> do it; they're already considering classes from c++ I think
[17:10:10 CEST] <Lynne> or was it raii
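Short of changing the standard, a common way to avoid the undefined behaviour BBB mentions is to do the shift on an unsigned value. This is a minimal sketch, not the fix from any of the commits in this log: `shl_s32` is a hypothetical helper, and it assumes a two's-complement target, because converting the out-of-range unsigned result back to `int32_t` is implementation-defined rather than undefined in C99.

```c
#include <stdint.h>

/* Left-shift a possibly negative value without undefined behaviour.
 * The shift happens on the unsigned bit pattern, where it is fully
 * defined and wraps modulo 2^32. The shift count n must be below 32. */
int32_t shl_s32(int32_t v, unsigned n)
{
    return (int32_t)((uint32_t)v << n);
}
```

With this helper, `shl_s32(-1, 4)` yields -16 on two's-complement platforms, which is the result most code expects from `-1 << 4`, and a runtime checker such as ubsan has no signed shift to flag.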
[18:16:14 CEST] <philipl> BtbN: pushed to all branches
[18:17:48 CEST] <Lynne> tnx
[19:25:34 CEST] <cone-616> ffmpeg Paul B Mahol master:d87db83e1cad: avfilter/vf_v360: rename r_tmp variables
[19:25:34 CEST] <cone-616> ffmpeg Paul B Mahol master:c271d88257e7: avfilter/vf_v360: move some local variables to private filter context
[19:25:34 CEST] <cone-616> ffmpeg Paul B Mahol master:cf62110a8381: avfilter/vf_v360: simplify allocating remap data
[19:25:34 CEST] <cone-616> ffmpeg Paul B Mahol master:a09213da23e7: avfilter/vf_v360: reverse order of remap for loops
[19:25:34 CEST] <cone-616> ffmpeg Paul B Mahol master:05ffaa252ee8: avfilter/vf_v360: refactor creation of remap data
[19:25:35 CEST] <cone-616> ffmpeg Paul B Mahol master:6f4ec4d909cc: avfilter/vf_v360: add slice threading to remap calculation
[20:20:51 CEST] <durandal_1707> cehoyos: have you committed that audio decoder yet?
[20:25:43 CEST] <cehoyos> "That audio decoder?"
[20:26:05 CEST] <durandal_1707> you wrote only one
[20:27:41 CEST] <durandal_1707> the acelp.kelvin
[20:32:38 CEST] <cone-616> ffmpeg James Almer master:dc0806dd2588: avcodec/allcodecs: make libdav1d the preferred AV1 decoder
[20:32:55 CEST] <jamrial> BBB: ^
[20:33:38 CEST] <BBB> \o/
[20:33:39 CEST] <cehoyos> I wanted to write "of course" but apparently not;-)
[20:58:34 CEST] <cone-616> ffmpeg Carl Eugen Hoyos master:551fcbbccbca: lavc/g729dec: Support decoding Sipro ACELP.KELVIN.
[21:14:20 CEST] <cehoyos> durandal_1707: Shouldn't this be: "ACT flag != frame header is not supported, please ...."?
[21:14:53 CEST] <durandal_1707> yes
[21:15:09 CEST] <cehoyos> Or maybe "ACT flag != frame header (expect wrong colours) is not supported"
[21:17:28 CEST] <durandal_1707> it is random block flag vs frame header
[21:22:17 CEST] <cehoyos> Something that tells the user that wrong colours are possible (or expected)
[22:14:25 CEST] <cone-616> ffmpeg Andreas Rheinhardt master:3ab488a5407f: avcodec/ttaenc: Fix undefined shift
[00:00:01 CEST] --- Tue Sep 17 2019

