[Ffmpeg-devel-irc] ffmpeg-devel.log.20170807

burek burek021 at gmail.com
Tue Aug 8 03:05:04 EEST 2017


[12:19:41 CEST] <durandal_1707> is there a flag for fate to automatically update checksums?
[12:26:16 CEST] <J_Darnley> GEN=1
[12:26:41 CEST] <J_Darnley> It will use the results as the new reference values.
[12:40:59 CEST] <J_Darnley> Now on an unrelated topic...
[12:43:41 CEST] <J_Darnley> actually ignore that for now
[12:52:38 CEST] <funman> maybe to test scatter/gather, dunno
[12:52:56 CEST] Action: funman looks for his shell
[13:16:47 CEST] <J_Darnley> Has anyone here ever started work on writing code to allow avcodec to encode anything less than a whole frame?
[13:17:04 CEST] <J_Darnley> Or do you know of similar work elsewhere?
[13:18:47 CEST] <J_Darnley> Perhaps someone had a use for separated fields?
[13:23:52 CEST] Action: J_Darnley thinks he should send an email to the ML
[13:35:51 CEST] <cone-093> ffmpeg Paul B Mahol master:5621a99e2751: avfilter/drawutils: support gbrap10 too
[13:35:51 CEST] <cone-093> ffmpeg Paul B Mahol master:ab6d89d7ee5c: libavutil: add GRAY9 pixel format
[13:35:51 CEST] <cone-093> ffmpeg Paul B Mahol master:de48710c11ce: libswscale: add gray9 support
[13:35:51 CEST] <cone-093> ffmpeg Paul B Mahol master:7bfbc2d787a4: avfilter/vf_extractplanes: add 9 bitdepth support
[13:35:51 CEST] <cone-093> ffmpeg Paul B Mahol master:86222a7ea071: avfilter/vf_waveform: add support for 9 bit depth lowpass
[13:35:52 CEST] <cone-093> ffmpeg Paul B Mahol master:bac508fec1c4: avfilter: add support for GRAY9 and GBRAP10
[13:36:48 CEST] <durandal_1707> J_Darnley: that's a bad idea
[13:37:11 CEST] <J_Darnley> :) which bit in particular?
[13:37:29 CEST] <durandal_1707> encoding fields
[13:37:38 CEST] <J_Darnley> It sure is.
[13:37:53 CEST] <JEEB> I don't know, the only problem is that the rest of things don't really check if they're getting a field or frame
[13:37:57 CEST] <JEEB> they just take a picture in/out
[13:38:09 CEST] <JEEB> like I think yadif et al currently fail with HEVC
[13:38:21 CEST] <JEEB> because the HEVC decoder outputs field pictures
[13:38:29 CEST] <JEEB> if the video is field coded
[13:38:34 CEST] <J_Darnley> Forget about the fields then
[13:38:57 CEST] <J_Darnley> I want to pass 16 lines to an encoder and get 16 lines out almost immediately.
[13:39:43 CEST] <J_Darnley> But separated fields would be a half-decent, real-world example.
[13:40:26 CEST] <J_Darnley> Anyway, I'm going to lunch
[14:01:49 CEST] <J_Darnley> Maybe I shouldn't mention fields.  It always triggers people who hate interlacing, like me.
[14:16:14 CEST] <JEEB> don't worry, it happens to the best of us
[15:11:37 CEST] <kierank> J_Darnley: i would say just see if you can get vc2enc to accept it, forget about other encoders and forget about fate
[15:14:21 CEST] <J_Darnley> Sure, I wouldn't expect to solve it for every encoder
[15:14:39 CEST] <J_Darnley> but I have no idea where to even begin with that "small" task
[15:14:57 CEST] <J_Darnley> I have no idea how data goes from outside to inside the encoder.
[15:15:58 CEST] <J_Darnley> Even some shitty hack will need a minimum of support from the rest of the API otherwise you won't even be able to compile
[15:17:32 CEST] <J_Darnley> ^^ or then how data goes from inside to back outside
[15:22:39 CEST] <kierank> J_Darnley: did you see https://git.videolan.org/?p=ffmpeg.git;a=blob;f=doc/examples/encode_video.c;h=8cd13219bb560065f4195c6b2327bab6991ab47e;hb=HEAD
[15:23:07 CEST] <kierank> try it with vc2 set to encode to 1920x1080
[15:23:13 CEST] <kierank> send avframes of 32x8 or whatever
[15:23:17 CEST] <J_Darnley> No.  I have no such file.
[15:23:17 CEST] <kierank> and then see where it gets rejected
[15:23:28 CEST] <kierank> that's the example for encoding video
[15:23:58 CEST] <J_Darnley> Oh dammit.  I'm still on atomnuker's ancient dirac branch
[15:25:47 CEST] <J_Darnley> I expect that to call encode_frame of vc2enc many more times than 24
[15:26:04 CEST] <J_Darnley> Which I expect to produce rubbish
[15:26:24 CEST] <BBB> J_Darnley: about x264, yes at some point it had slice threading
[15:26:31 CEST] <BBB> J_Darnley: it was replaced with frame threading at some point
[15:26:46 CEST] <BBB> J_Darnley: if you read their threading docs they'll point you to the exact commits
[15:28:50 CEST] <J_Darnley> Yes, thank you for helpful comment. /s
[15:29:23 CEST] <J_Darnley> I clearly cannot explain myself
[15:32:25 CEST] <BBB> J_Darnley: about libavcodec, I don't think we have API to input less than a frame; we have decoder API for that (avctx->draw_horiz_band()), but I don't believe we have encoder API for that
[15:32:34 CEST] <BBB> (assuming you want to reduce latency)
[15:32:54 CEST] <J_Darnley> Yes, that is kierank's goal
[15:32:57 CEST] <BBB> I don't think it's particularly hard to model the encoder API to optionally be like avctx->draw_horiz_band
[15:33:15 CEST] <BBB> (in special/limited use cases)
[15:33:39 CEST] <BBB> so this isn't something generic that would be available for all encoders, just a small limited subset that set a special flag and are opened with a special option set etc.
[15:34:34 CEST] <BBB> and then the internal implementation of that in the encoder (in terms of load balancing between threads - or even whether threading is enabled at all) is entirely up to the encoder, i.e. up to you
[15:34:59 CEST] <BBB> (i.e. feed_horiz_band and slice threading are entirely independent of each other, although you'd probably want to use both in most typical use cases)
[15:35:14 CEST] <BBB> (or maybe not, hard to say without knowing the use case :) )
[15:36:24 CEST] <J_Darnley> Sounds close enough to me.
[15:38:15 CEST] <J_Darnley> I don't know what I can and can't talk about here but I think I can say that lowering the latency is the goal.
[15:44:20 CEST] <iive> won't it be easier to use an existing codec that already has slices support?
[15:45:38 CEST] <iive> and rtp already has a ton of crap to work on slice level.
[15:46:40 CEST] <J_Darnley> I have no idea.
[15:47:29 CEST] <kierank> I just want to encode slices as they arrive
[15:47:33 CEST] <J_Darnley> Isn't rtp a muxer or protocol?  Just how does that relate to an encoder?
[15:48:47 CEST] <kierank> I would ignore those weird ones
[15:49:44 CEST] <iive> if you want to avoid patents, then mpeg1/2 should be ideal, as they should have expired already.
[15:50:12 CEST] <iive> real time protocol, network transmission.
[15:51:18 CEST] <iive> an old one. rtsp is based on top of it.
[16:04:05 CEST] <J_Darnley> Ugh.  The encode_video example doesn't appear to work.
[16:04:21 CEST] <J_Darnley> At first it was flat grey so I hiked up the bitrate
[16:04:53 CEST] <J_Darnley> Now ffplay only shows 1 frame.
[16:06:42 CEST] <J_Darnley> I guess I should check whether ffmpeg does it any better.
[16:18:43 CEST] <J_Darnley> No, it isn't.
[16:19:10 CEST] <J_Darnley> Now is that because the bitstream cannot code more than 1 image?
[16:19:25 CEST] <J_Darnley> *raw bitstream
[16:20:00 CEST] <J_Darnley> Does the demuxer just not like it?
[16:33:18 CEST] <BBB> I think encode_video doesn't mux
[16:33:24 CEST] <BBB> so it doesn't work for all use cases
[16:34:05 CEST] <J_Darnley> Yeah, you only give an encoder name.
[16:47:40 CEST] <kierank> raw vc2 should work
[16:52:53 CEST] <J_Darnley> To clear up confusion: neither ffmpeg nor encode_video can produce a raw vc2 bitstream that ffplay can play beyond the first frame.
[16:56:41 CEST] <kierank> J_Darnley: it should work
[16:56:48 CEST] <durandal_170> J_Darnley: no parser?
[16:57:08 CEST] <J_Darnley> I don't know, don't care.  I might have to care later.
[18:21:40 CEST] <cone-093> ffmpeg Paul B Mahol master:181c9abd47cb: avfilter/vf_premultiply: add inplace mode
[18:41:15 CEST] <cone-093> ffmpeg Paul B Mahol master:1bef0088dce7: avfilter/drawutils: add gray9/10/12 support
[18:48:02 CEST] <J_Darnley> Hacking doc/examples/encode_video was not very instructive.
[18:48:48 CEST] <J_Darnley> Allocating full size frames gets the encoder encoding zeros after the initial band drawn in the example
[18:49:37 CEST] <J_Darnley> Allocating less than the full frame results in a segfault because the encoder doesn't check the height of the given frame.
[18:49:47 CEST] <J_Darnley> The first is hardly surprising.
[18:50:19 CEST] <J_Darnley> I guess the second shows that there isn't some check hidden away somewhere rejecting frames.
[19:03:18 CEST] <kierank> J_Darnley: that's good
[19:03:41 CEST] <kierank> if you add your own x, y position stuff in AVFrame then you can do things
[20:35:39 CEST] <durandal_170> vdd open sauce
[20:35:56 CEST] <BBB> if it's free, it must be bad
[20:36:17 CEST] <BBB> you should come though
[20:36:18 CEST] <BBB> :-p
[20:36:22 CEST] <BBB> [tm]
[20:36:38 CEST] <J_Darnley> Open Sores!
[20:37:29 CEST] <durandal_170> what will be presented this year?
[20:39:30 CEST] <BBB> I'd like to do something with vp9
[20:39:33 CEST] <BBB> but I don't know what yet
[20:40:21 CEST] <durandal_170> vp9? there is nothing new, everything is known
[20:40:46 CEST] <BBB> one test says it's 30% worse than x265, another says it's 10% better
[20:40:48 CEST] <BBB> which test is right?
[20:41:00 CEST] <durandal_170> neither
[20:41:03 CEST] <BBB> :-p
[20:42:39 CEST] <durandal_170> yea do vmaf/psnr/ssim comparison of various derf samples, once it finishes encoding
[20:46:52 CEST] <kierank> J_Darnley: want to present at VDD like BBB said before?
[20:47:44 CEST] <J_Darnley> Present what?  mpeg2 stuff?  Not really.
[20:47:56 CEST] <kierank> k
[20:48:12 CEST] Action: J_Darnley sighs
[20:48:18 CEST] <durandal_170> make some asm workshop with vlc guys
[20:48:33 CEST] <J_Darnley> I still didn't start the third part of the blog post series.
[20:49:55 CEST] <BBB> I can also do something about x86inc.asm/x264asm
[20:50:08 CEST] <BBB> but I fear that it would have very limited interest for most people
[20:50:19 CEST] <BBB> most people like asm as long as someone else writes it
[20:59:36 CEST] <kierank> J_Darnley: no rush
[21:27:38 CEST] <atomnuker> BBB: talk about how it's the best puzzle game and it's a shame newspapers don't put SIMD problems next to crosswords
[21:28:54 CEST] <kierank> lol
[21:29:00 CEST] <atomnuker> people just need to get hooked up with it once to know it's crack cocaine and soon they'll be running out of complex functions to simd
[21:29:28 CEST] <BBB> intel will just come up with wider registers and newer strange instruction names
[21:29:37 CEST] <BBB> vphfksjdnklrelnzxsw
[21:29:42 CEST] <BBB> it does magic things
[21:29:57 CEST] <BBB> how to abuse this magic new instruction to make your function 10% faster?
[21:30:03 CEST] <BBB> read about it next week @ VDD!
[21:30:12 CEST] <RiCON> wouldn't be surprised if that's actually a real name now or in the future
[21:31:01 CEST] <kierank> atomnuker: does that make me your drug dealer?
[21:31:43 CEST] <atomnuker> yep, yes it does
[21:31:48 CEST] <BBB> ...
[21:31:56 CEST] <kiroma> I joined this channel to see how professional programmers write code to a big project
[21:32:09 CEST] <JEEB> welcome to open source
[21:40:24 CEST] <jamrial> BBB: it does magic things, but alone takes 5x as much time to finish as the combination of 10 separate instructions that together do the same thing
[21:41:43 CEST] <jamrial> see sse4 dpps, or avx2 vgather
[21:41:53 CEST] <jamrial> it seems to be the norm for anything past ssse3 :p
[21:42:02 CEST] <BBB> I like avx2
[21:42:13 CEST] <BBB> pmovz/sx* are quite useful
[21:42:18 CEST] <BBB> even if they aren't exactly novel
[21:42:51 CEST] <BBB> I wish we had more variants like pmulhrsw
[21:42:57 CEST] <jamrial> it's neat, yes
[21:42:58 CEST] <BBB> e.g. a rounded shift from dword to word would be very useful
[21:43:14 CEST] <atomnuker> jamrial: dpps seems crazy, why is there even a need for a mask argument?
[21:43:30 CEST] <BBB> like paddd a, [something], paddd b, [something], psrad a, something, psrad b, something, packssdw a, b
[21:43:30 CEST] <atomnuker> this could be just a mulps and an andps
[21:43:45 CEST] <BBB> I bet you could do that in a single call like pmulhrsw
[21:44:07 CEST] <BBB> (in a more limited sense would be fine, e.g. just rounded shift by a fixed integer number without the multiply would be fine)
[21:44:15 CEST] <jamrial> i have no idea, never used it. just read a blog post of someone trying to implement it and finding that it's just too slow compared to using a combination of other instructions. just like vgather on anything pre skl-x
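For readers following the dpps question above, here is a rough scalar sketch of what the instruction computes, written in Python purely for illustration. The function name `dpps` and the list-based vectors are stand-ins, and Python floats are doubles rather than the single-precision lanes of the real instruction. The high nibble of the immediate selects which products enter the sum, and the low nibble selects which output lanes receive it; note that there is also a horizontal add, which a mulps plus an andps would not produce by themselves.

```python
def dpps(a, b, imm8):
    """Illustrative model of SSE4.1 dpps on two 4-element vectors.

    Bits 4..7 of imm8 choose which lane products are summed.
    Bits 0..3 of imm8 choose which output lanes get the sum (others get 0.0).
    Uses Python doubles, so rounding differs from real single precision.
    """
    products = [a[i] * b[i] if (imm8 >> (4 + i)) & 1 else 0.0 for i in range(4)]
    # Pairwise summation order: (p0 + p1) + (p2 + p3).
    total = (products[0] + products[1]) + (products[2] + products[3])
    return [total if (imm8 >> i) & 1 else 0.0 for i in range(4)]
```

With `a = [1.0, 2.0, 3.0, 4.0]` and `b = [5.0, 6.0, 7.0, 8.0]`, an immediate of `0xF1` gives the full dot product in lane 0 only, while `0x7F` gives a three-element dot product broadcast to every lane.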
[21:44:47 CEST] <jamrial> BBB: yeah, an instruction like that could be used for example in jpeg2000dsp
[21:44:48 CEST] <atomnuker> is it actually faster than doing it individually on skl-x?
[21:45:02 CEST] <BBB> an instruction like that could be used anywhere
[21:45:03 CEST] <atomnuker> still waiting on agner fog's to see what skl-x does good
[21:45:05 CEST] <BBB> it would be hugely useful
[21:45:08 CEST] <jamrial> gather is fast only in skl-x
[21:45:16 CEST] <BBB> but intel don't listen to me
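The rounded dword-to-word shift BBB wishes for can be written out per lane as a small scalar model. This is only a sketch in Python, not SIMD code: `round_shift_pack`, `sat16` and `wrap16` are hypothetical helper names, and the model assumes the input plus the rounding bias still fits in 32 bits (the real paddd would wrap). `pmulhrsw` is modelled alongside for comparison, since it is the existing instruction that already folds a rounding step into one operation.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def sat16(x):
    """Clamp to the signed 16-bit range, as packssdw does per lane."""
    return max(INT16_MIN, min(INT16_MAX, x))


def wrap16(x):
    """Reinterpret the low 16 bits of x as a signed 16-bit value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def round_shift_pack(dword, shift):
    """One lane of paddd (bias) + psrad (shift) + packssdw (saturate).

    Assumes shift >= 1 and that dword + bias does not overflow 32 bits.
    """
    bias = 1 << (shift - 1)
    return sat16((dword + bias) >> shift)


def pmulhrsw(a, b):
    """One 16-bit lane of pmulhrsw: a * b scaled by 2**-15 with rounding."""
    return wrap16((((a * b) >> 14) + 1) >> 1)
```

For example, `round_shift_pack(100, 3)` yields 13 and `pmulhrsw(16384, 16384)` yields 8192, i.e. 0.5 times 0.5 in Q15.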
[21:46:25 CEST] <jamrial> they are busy adding weird instructions for deep learning
[21:47:09 CEST] <BBB> deep learning is the 2020 version of lomoso
[21:48:17 CEST] <BBB> apparently it's SoLoMo, not LoMoSo (anyway...)
[22:09:11 CEST] <Gramner> atomnuker: official skl-x instruction latencies and throughput are released. it's missing some stuff and other things are outright wrong, but still
[22:10:07 CEST] <Gramner> the best one is vpconflictd with zmm regs. it's 67/17.5
[22:10:42 CEST] <Gramner> https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf
[22:12:29 CEST] <atomnuker> vgathers don't seem much better, they just have halved throughput compared to skylake
[22:13:30 CEST] <Gramner> gathers have fine throughput. they just have a bunch of extra latency over doing individual loads for who knows what reason
[22:14:25 CEST] <atomnuker> I bet it's wasting time doing the mask
[22:15:37 CEST] <atomnuker> the fact it modifies it to return error status is particularly wrong
[22:15:49 CEST] <iive> What?!
[22:16:30 CEST] <Gramner> it's implemented in the most sensible way
[22:17:05 CEST] <Gramner> how else would you do it efficiently in a way that's compatible with x86 fault semantics?
[22:17:50 CEST] <atomnuker> it already triggers a segfault on out of bounds indices
[22:18:50 CEST] <Gramner> page faults are the relevant fault type for gathers
[22:19:13 CEST] <Gramner> segfaults are irrelevant. it can take another million cycles if that happens and nobody would care
[22:20:28 CEST] <atomnuker> page faults? as in the soft faults when the page you need to load from is only on ram atm or something?
[22:20:31 CEST] <Gramner> if you do a gather of 16 elements and hit a page fault on the 15th, the OS will handle the page fault and restart the instruction. but you obviously don't want to have to redo the already completed 13 elements which is why the mask is updated to reflect that they have already been gathered
[22:20:49 CEST] <Gramner> 14*
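Gramner's point about the mask can be illustrated with a toy model. This is a simplified sketch in Python, not how any CPU or OS is implemented: `gather`, `run_gather`, `PageFault` and the dictionary used as memory are all invented for illustration, lanes are processed strictly in order, and the fault handler is reduced to marking the page as mapped. What it shows is that clearing a lane's mask bit on completion lets the restarted instruction skip work that is already done.

```python
PAGE = 4096


class PageFault(Exception):
    """Raised by the toy gather when it touches an unmapped page."""

    def __init__(self, page):
        super().__init__(page)
        self.page = page


def gather(dest, memory, mapped_pages, indices, mask, loads):
    """One attempt at a masked gather, lane 0 upward.

    A completed lane has its mask bit cleared, so a restart skips it.
    """
    for lane, addr in enumerate(indices):
        if not mask[lane]:
            continue  # masked off, or already gathered before a fault
        if addr // PAGE not in mapped_pages:
            raise PageFault(addr // PAGE)
        dest[lane] = memory[addr]
        mask[lane] = False
        loads.append(lane)  # record every load actually performed


def run_gather(memory, mapped_pages, indices):
    """Toy 'OS': on a fault, map the page and restart the same gather."""
    dest = [0] * len(indices)
    mask = [True] * len(indices)
    loads, faults = [], 0
    while True:
        try:
            gather(dest, memory, mapped_pages, indices, mask, loads)
            return dest, loads, faults
        except PageFault as fault:
            mapped_pages.add(fault.page)
            faults += 1
```

If 16 addresses are gathered and only the 15th lies on an unmapped page, `run_gather` reports one fault and exactly 16 loads in total: the 14 lanes completed before the fault are not loaded again after the restart.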
[22:21:33 CEST] <iive> Gramner: given how many cycles a page fault takes to be handled, that would be irrelevant.
[22:22:24 CEST] <atomnuker> aren't page faults only relevant if you're starting to run out of CPU?
[22:22:35 CEST] <Gramner> and I doubt updating a bitflag takes 20 cycles
[22:22:35 CEST] <iive> you mean ram
[22:23:10 CEST] <iive> and that could involve waiting for HDD.
[22:33:52 CEST] <Gramner> anyway, I really doubt that gathers are slow due to how the instruction is defined. it's most likely just a suboptimal implementation (they have been made faster every cpu generation since they were introduced)
[22:34:35 CEST] <Gramner> it wouldn't surprise me if they ignore L1 and just go straight for L2 or something
[22:40:46 CEST] <Gramner> a more interesting thing about SKL-X is that some instruction latencies vary depending on which port they execute on. zmm floating-point ops and multiplies for example have 4 cycles latency on p0 but 6 cycles on p5
[22:42:42 CEST] <Gramner> also vpermw et al. are a lot slower than dword and qword equivalents. I'm guessing that will get improved at the same time as avx512vbmi is introduced though
[22:43:23 CEST] <TD-Linux> BBB, x86inc.asm + rust! (it's pretty boring though)
[22:43:35 CEST] <BBB> rust?
[22:44:10 CEST] <Gramner> kostya did some blog post about rust in relation to multimedia stuff
[22:47:23 CEST] <TD-Linux> yeah I've read it, it's reasonable
[22:47:39 CEST] <Blubberbub_> https://codecs.multimedia.cx/2017/07/rust-not-so-great-for-codec-implementing/ <- this?
[22:47:48 CEST] <Gramner> yeah
[22:48:29 CEST] <Gramner> there was also some later one about performance vs C I think
[22:50:16 CEST] <Gramner> https://codecs.multimedia.cx/2017/08/rust-optimising-decoder-experience/
[22:57:26 CEST] <J_Darnley> Wow.  Someone is actually doing the work instead of just saying "why don't you rewrite in rust"
[22:58:54 CEST] <JEEB> that happens sometimes
[23:34:12 CEST] <BBB> omg I just wasted 5 minutes of my life reading these two blog posts
[23:34:14 CEST] <BBB> poor kostya
[23:34:28 CEST] <BBB> so his goal was really just to prove ignorant fools wrong?
[23:34:47 CEST] <BBB> but ignorant fools cannot be convinced of obvious things, that's why you don't start trying to convince them
[23:35:08 CEST] <BBB> you just keep ignoring them and they will continue to use your software anyway
[23:35:13 CEST] <thardin> is improving rustc really an insurmountable task?
[23:35:14 CEST] <BBB> poor kostya
[23:35:45 CEST] <BBB> thardin: I think the sad but harsh reality is that it doesn't matter
[23:36:10 CEST] <thardin> so? nothing matters
[23:36:13 CEST] <thardin> who cares?
[23:36:18 CEST] <BBB> until someone specifically focuses and demonstrates that mylanguage is optimized for multimedia-like problem areas, it won't be appropriate for it
[23:36:24 CEST] <thardin> it's an interesting experiment
[23:36:53 CEST] <BBB> well…
[23:37:00 CEST] <BBB> here's my take on it
[23:37:14 CEST] <BBB> an ignorant fool can ask any question at a rate of about 60 questions per minute
[23:37:52 CEST] <BBB> demonstrating the foolishness of the question - as meta-demonstrated by kostya - takes a few days per question
[23:38:08 CEST] <BBB> that asymmetry means that we need more than zero evidence before we approach a potentially-foolish question
[23:38:55 CEST] <BBB> just because otherwise all we would ever do (and get done) is to address a fool's concerns, but we wouldn't be writing software anymore
[23:39:00 CEST] <BBB> akak we'd be dinosaurs
[23:39:08 CEST] <BBB> akak=a.k.a.
[23:40:06 CEST] <BBB> (that's probably very cynical…)
[23:40:16 CEST] <wm4> so when will ffmpeg be converted to rust (serious question)
[23:40:34 CEST] <BBB> it sounds like kostya is almost done with it
[23:40:37 CEST] <kierank> avformat is not an unreasonable candidate
[23:40:56 CEST] <kierank> and then decoding done via hw decoders
[23:40:56 CEST] <BBB> ffmpeg.c is probably a good candidate
[23:41:18 CEST] <wm4> modern demuxers are already written in JS and thus safe
[23:41:49 CEST] <thardin> it's only 3x slower than C!
[23:41:59 CEST] <thardin> ffmpeg electrum app when?
[23:43:50 CEST] <thardin> sleep time soon, so I can continue work on my patchset
[23:58:45 CEST] <thardin> I do hope kostya gets better. he seems burned out
[00:00:00 CEST] --- Tue Aug  8 2017

