burek021 at gmail.com
Tue Sep 12 03:05:04 EEST 2017
[00:02:01 CEST] <atomnuker> only issue is you can't directly hwdownload,format=bgr0 them but oh well
[00:28:01 CEST] <atomnuker> jkqxz: I think it might be better to wait for the major bump to apply the opencl stuff since then we could add and remove them in a single patchset
[00:32:03 CEST] <jkqxz> Yeah, the tiled hwdownload is annoying. You get around it by mapping to any other API, though.
[00:32:30 CEST] <jkqxz> There isn't any interdependency, so the bump is irrelevant for addition. Only removal is a problem.
[00:37:38 CEST] <atomnuker> I don't want to end up with 2 opencl boilerplates lying around
[00:51:31 CEST] <jamrial> Technically, the old opencl stuff, being public api, can't be removed right away
[00:52:17 CEST] <jamrial> you could however argue that if no downstream project uses it, we could just remove it and skip the deprecation period
[00:52:34 CEST] <jamrial> not sure how popular would that be
[00:54:24 CEST] <atomnuker> isn't it avpriv?
[00:54:52 CEST] <jkqxz> No, it's all "av_". "This interface is considered still experimental and its API and ABI may change without prior notice." wins, though, I think.
[00:56:21 CEST] <jkqxz> (Since early 2013, of course. Google indeed suggests exactly zero downstream users.)
[00:57:34 CEST] <kode54> I have a fate test now
[00:57:45 CEST] <kode54> using the "gapless" module used for mp3
[00:57:58 CEST] <kode54> strangely, output 1 and output 2 produce the same md5
[00:58:14 CEST] <kode54> maybe because it already skips the leading sample data without seeking
[00:59:09 CEST] <kode54> how do I submit a fate change, in addition to the tiny test file?
[01:01:02 CEST] <atomnuker> jkqxz: yeah, that note's enough to convince anyone I think
[01:10:54 CEST] <jamrial> kode54: add the changes to gapless.mak and the new files created by GEN=1 that should be in the tests/ref folder to your patch
[01:11:09 CEST] <jamrial> or send them in a separate patch to be applied after the mp3dec change
[01:11:39 CEST] <jamrial> upload the sample somewhere and link it in the test patch as well
[01:12:06 CEST] <jamrial> the file will be uploaded to the samples repo
[01:13:06 CEST] <jamrial> jkqxz: oh hey, if it's still tagged as experimental then it can probably be removed without having to deprecate it
[04:48:03 CEST] <TimothyGu> BtbN: do you know why sudo is "required" by FFmpeg-Coverity?
[04:48:09 CEST] <TimothyGu> https://github.com/FFmpeg/FFmpeg-Coverity/blob/master/.travis.yml#L1
[07:28:36 CEST] <atomnuker> hardware encoders ruined video encoding in general and are behind the disinterest of people in encoding
[07:31:59 CEST] <atomnuker> they have crap quality but people use them because they are faster, and since they can't change them and have no one to complain to, there's no motivation to improve them
[07:46:47 CEST] <graphitemaster> dude, same with hardware decoders
[08:11:32 CEST] <atomnuker> really? they can be okay as long as they don't do color conversion, don't interlace non-interlaced video, don't deinterlace and are faster for most practical videos
[08:13:02 CEST] <atomnuker> though you may have a point given how pathetically underpowered some are and can't do higher levels of a profile so encoders that target them can't use certain features
[08:14:58 CEST] <atomnuker> hardware people are all living in the past, they refuse to admit time has passed at all and technology has gotten better
[08:15:56 CEST] <atomnuker> you'd think that current hardware decoders for h264 could handle absolutely every level and profile since the codec is more than 10 years old
[08:18:15 CEST] <atomnuker> but no, even though current technology can fit 10, 20, 30 times as many transistors in a given amount of space, with around the same reduction in power and for cheaper, they still keep the same feature set (and speed too probably)
[09:42:23 CEST] <ubitux> atomnuker: it's hard to get around hardware encoding on embedded platforms (mobile...) unfortunately
[09:42:50 CEST] <ubitux> or even decoding fwiw
[09:42:55 CEST] <JEEB> yea
[09:43:05 CEST] <graphitemaster> atomnuker, I think it's a matter of cost more than anything. There's basically three ways to make hardware decoders. True hardware decoding, which takes die space (and they have to implement several variants as it is since there are so many conflicting codecs) and that also costs wires, and it's hard enough for GPUs which pack so many transistors (several orders of magnitude more than, say, CPUs do). Then there's just software based decoders
[09:43:05 CEST] <graphitemaster> that leverage the parallel/programmable aspect of the GPUs themselves (see Cuda, OpenCL, etc), and then there's hybrid approaches.
[09:43:25 CEST] <JEEB> although you get an extra layer of shit in any case, due to the disparity in quality
[09:45:30 CEST] <graphitemaster> I wonder if anyone has actually compared hardware encoders/decoders by dumping their results against some reference raw set and calculating the RMSE of each frame, averaging that over however many frames there are, taking special care to plot each frame's RMSE difference on a bar graph, one pixel wide for each frame. That could be a useful metric for looking at hardware differences.
[09:46:16 CEST] <graphitemaster> I bet one can easily do that with ffmpeg
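The per-frame RMSE comparison described above can be sketched in a few lines of Python. This is only an illustration, not anything posted in the channel: it assumes both outputs have already been dumped as raw 8-bit video with identical dimensions and pixel format (for example with ffmpeg's `-f rawvideo` output), and the function names `frame_rmse`, `per_frame_rmse` and `mean_rmse` are hypothetical.

```python
import math
from typing import List


def frame_rmse(ref: bytes, test: bytes) -> float:
    """Root-mean-square error between two equally sized 8-bit frames."""
    if len(ref) != len(test):
        raise ValueError("frames must have the same size")
    if not ref:
        raise ValueError("frames must not be empty")
    squared = sum((a - b) ** 2 for a, b in zip(ref, test))
    return math.sqrt(squared / len(ref))


def per_frame_rmse(ref_stream: bytes, test_stream: bytes, frame_size: int) -> List[float]:
    """Cut two raw streams into fixed-size frames and score each pair."""
    # Only compare frames that are complete in both streams.
    count = min(len(ref_stream), len(test_stream)) // frame_size
    return [
        frame_rmse(
            ref_stream[i * frame_size:(i + 1) * frame_size],
            test_stream[i * frame_size:(i + 1) * frame_size],
        )
        for i in range(count)
    ]


def mean_rmse(scores: List[float]) -> float:
    """Average the per-frame scores; an empty list yields 0.0."""
    return sum(scores) / len(scores) if scores else 0.0
```

For 8-bit yuv420p the frame size would be width * height * 3 / 2 bytes. The list returned by `per_frame_rmse` is what one would plot as the one-pixel-per-frame bar graph.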
[09:46:18 CEST] <nevcairiel> decoders are bit-exact, we can easily test that with ffmpeg FATE
[09:46:35 CEST] <JEEB> yea, at least that if you can get the YCbCr out of them
[09:46:37 CEST] <nevcairiel> encoders of course produce varying quality and some people have measured SSIM and the like
[09:47:17 CEST] <nevcairiel> i don't consider those crappy decoders that give you RGB or something to be "valid" =p
[09:47:42 CEST] <JEEB> :D
[09:47:45 CEST] <graphitemaster> I think it's just my exposure to video playback programs then because I can see a noticeable difference in say vdpau playback of a video in mplayer vs software, is there some stupid bilinear filtering happening? I just don't quite get it. Or maybe it's all placebo :P
[09:48:44 CEST] <JEEB> if you want the least insane playback chain I recommend taking a look at mpv's opengl renderer
[09:48:46 CEST] <nevcairiel> i cant tell you what mpv might actually do, but at least PC based hardware decoders in the 3 mainstream GPU vendors cards are all bitexact with the possibility of extracting that image untouched
[09:48:55 CEST] <graphitemaster> maybe the decoders are bit-exact, but the presentation of frames isn't because there's so many ways to get it out, e.g OpenGL textures can have filtering applied, even global settings in something like nvidia-settings can affect that.
[09:49:14 CEST] <JEEB> and then you can read the caveats @ https://mpv.io/manual/master/#options-hwdec
[09:49:29 CEST] <graphitemaster> like, even if the application developer asks for nearest min/mag on everything, global settings can override it :|
[09:50:00 CEST] <graphitemaster> and then you have compositors that can change that if you don't playback full screen (proper, not windowed fullscreen or windowed)
[09:51:13 CEST] <nevcairiel> i dont have much experience with opengl interop, i do know however that you can get perfect zero-copy rendering with d3d11. of course if you have a opengl renderer and/or linux, that doesn't really help you much :)
[09:51:55 CEST] <graphitemaster> yeah, my day job is rendering and I can tell you some horror stories about how awful texture copies can get.
[09:52:50 CEST] <graphitemaster> the real problem with opengl is the lack of native texture formats like ycbcr, so you have to do three luminance textures instead, and then some form of a shader pass to combine that into rgb, something the hardware can natively sample ... there's extensions of course.
[09:53:12 CEST] <graphitemaster> and shaders are subject to shader aliasing depending on quality/performance settings
[09:53:13 CEST] <nevcairiel> had the same problem with d3d9, luckily d3d11 added them
[09:53:24 CEST] <nevcairiel> (the texture formats)
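The "three single-channel textures plus a shader pass" approach mentioned above boils down to the per-pixel arithmetic below. This is a CPU-side sketch in Python rather than a shader, and it makes assumptions the chat does not state: full-range BT.601 (JFIF-style) coefficients, 4:2:0 chroma, and nearest-neighbour chroma fetches. The names `ycbcr_to_rgb` and `planes_to_rgb` are hypothetical.

```python
def clamp8(value: float) -> int:
    """Round and clamp a value to the 0..255 range."""
    return max(0, min(255, int(round(value))))


def ycbcr_to_rgb(y: int, cb: int, cr: int):
    """Convert one full-range BT.601 YCbCr sample to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def planes_to_rgb(y_plane, cb_plane, cr_plane, width: int, height: int):
    """Combine a full-size luma plane with two half-size chroma planes.

    Each chroma sample covers a 2x2 block of luma samples (4:2:0) and is
    fetched with nearest-neighbour lookup, i.e. no filtering at all.
    """
    chroma_width = (width + 1) // 2
    pixels = []
    for py in range(height):
        for px in range(width):
            ci = (py // 2) * chroma_width + (px // 2)
            pixels.append(
                ycbcr_to_rgb(y_plane[py * width + px], cb_plane[ci], cr_plane[ci])
            )
    return pixels
```

A real renderer would also have to pick the right matrix (BT.601 vs BT.709), the right range and a chroma upsampling filter, which is one of the places where presentation can differ even when the decoder output is bit-exact.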
[09:53:33 CEST] <GoD-Tony> threaded AAC encoding, thought you guys might find this interesting, https://github.com/enzo1982/superfast
[09:55:02 CEST] <JEEB> when is AAC encoding the bottleneck of your chain?
[09:55:48 CEST] <GoD-Tony> my tv setup only allows AAC, so im converting to that format all the time
[09:56:22 CEST] <ubitux> atomnuker: oh, and i forgot all the licensing fees shit
[09:56:49 CEST] <ubitux> when you use the mobile encoders, you don't have to pay anything, the company does it for you
[09:57:06 CEST] <JEEB> GoD-Tony: yea but even the libavcodec encoder as far as I can tell is quite fast, and if you're converting files you can just run them parallel
[09:57:07 CEST] <ubitux> if you want to use your h264/hevc encoder OTOH, good luck with the legal thing
[09:57:43 CEST] <graphitemaster> nevcairiel, the irony here isn't so much that OpenGL can't do it or shouldn't do it. Nearly all hardware vendors support it, if not by virtue of the fact that it's needed for other things they already advertise. The problem is no one has standardized an extension for it yet. The ARB is just not interested. Meanwhile we get insane texture formats like shared 5-bit exponent and 9 bit RGB channels for cheaper HDR, which becomes
[09:57:44 CEST] <graphitemaster> standard. That's not even sensical from a purity standpoint.
[09:58:13 CEST] <graphitemaster> YCbCr makes more sense than RGB9E5
[09:58:51 CEST] <nevcairiel> odd
[09:59:15 CEST] <graphitemaster> It's ironic because the argument here is that these optimized formats are meant to reduce bandwidth. Which is exactly what YCbCr is meant to do as well, half resolution Cb/Cr.
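The bandwidth point can be checked with simple arithmetic. The sketch below assumes 8-bit samples, packed RGB at 3 bytes per pixel, and planar 4:2:0 with chroma at half resolution in both directions; the function names are hypothetical.

```python
def frame_bytes_rgb8(width: int, height: int) -> int:
    """Packed 8-bit RGB: three bytes for every pixel."""
    return width * height * 3


def frame_bytes_yuv420p8(width: int, height: int) -> int:
    """Planar 8-bit 4:2:0: full-size luma plus two half-size chroma planes."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return width * height + 2 * chroma_w * chroma_h
```

At 1920x1080 this gives 6220800 bytes for RGB and 3110400 bytes for 4:2:0, so the subsampled layout moves half as much data per frame.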
[10:00:10 CEST] <GoD-Tony> JEEB: it can take a couple minutes to transcode a full film of audio, so it's fast, but could be 4+ times faster
[10:00:53 CEST] <JEEB> where you just put multiple things next to each other on a higher level
[10:00:56 CEST] <nevcairiel> or you could just transcode 4 movies at a time and have the same speed without hacks :)
[10:01:12 CEST] <JEEB> I mean, threading is fun but I don't see the major positive side effect here
[10:01:20 CEST] <JEEB> because audio encoding is so fast already
[10:01:25 CEST] <graphitemaster> how fast can you chop a movie up into 4 pieces :P
[10:01:41 CEST] <JEEB> doesn't even have to be the same thing chopped does it?
[10:01:59 CEST] <JEEB> you can just work your back log multiple at once :P
[10:01:59 CEST] <GoD-Tony> lol, yea all good points
[10:02:52 CEST] <nevcairiel> unless you can do actual transparent multithreading, which in a codec like aac might not be really easy, its probably not worth it
[10:03:31 CEST] <nevcairiel> i mean, that tool does just that, it slices the file into chunks and encodes them in parallel, using multiple encoder instances
[10:03:42 CEST] <nevcairiel> it doesn't actually multithread the encoder
[10:04:09 CEST] <nevcairiel> you could apply that technique to any encoder in ffmpeg
[10:04:17 CEST] <JEEB> hah, why did I guess it was like that :P
[10:04:20 CEST] <nevcairiel> (or any encoder anywhere, for that matter)
[10:05:08 CEST] <nevcairiel> some people use that for video encoding as well, because the frame-threading in video encoders doesnt scale properly for many cores
[10:05:57 CEST] <graphitemaster> you're not going to get the optimal file size this way though
[10:06:21 CEST] <nevcairiel> yeah, but the difference should be very minimal
[10:06:25 CEST] <graphitemaster> extra reference frames for nothing
[10:06:40 CEST] <graphitemaster> depends how much you parallelize it
[10:06:45 CEST] <nevcairiel> if you're smart you find a scene cut to cut on
[10:06:54 CEST] <nevcairiel> to reduce that effect
[10:06:54 CEST] <graphitemaster> split the movie up into 1024 threads
[10:07:02 CEST] <graphitemaster> yeah I guess
[10:09:36 CEST] <graphitemaster> I've been toying with writing my own video codec for work because we have some shitty intel compute sticks to decode video on that claim to support hardware accelerated video decoding, just not at the resolutions we need, and also hard to get in an OpenGL application as a texture applied on a surface in a 3D scene.
[10:10:01 CEST] <graphitemaster> We only need it for short 30 second advertisements
[10:10:57 CEST] <graphitemaster> so I've been toying with converting frames into yuv planes, and taking each plane, and subdividing it as a quad tree, and giving each node of the quad tree an intensity value, then representing frame changes as quad tree changes
[10:11:08 CEST] <graphitemaster> and then entropy encoding the quad trees
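A minimal sketch of the quad tree idea, for one plane only. It is not graphitemaster's codec: the splitting rule (subdivide while max minus min exceeds a threshold) is an assumption, the tuple layout and the names `build_quadtree`, `render` and `count_leaves` are hypothetical, and the inter-frame tree deltas and the entropy coder are left out entirely.

```python
def build_quadtree(plane, stride, x, y, w, h, threshold):
    """Recursively describe a rectangle of an 8-bit plane.

    A region whose samples differ by at most `threshold` becomes a leaf
    holding its mean intensity; anything else is split into up to four
    child quadrants.
    """
    values = [plane[(y + j) * stride + x + i] for j in range(h) for i in range(w)]
    if max(values) - min(values) <= threshold or (w == 1 and h == 1):
        return ("leaf", x, y, w, h, sum(values) // len(values))
    half_w, half_h = max(1, w // 2), max(1, h // 2)
    quadrants = (
        (x, y, half_w, half_h),
        (x + half_w, y, w - half_w, half_h),
        (x, y + half_h, half_w, h - half_h),
        (x + half_w, y + half_h, w - half_w, h - half_h),
    )
    # Quadrants can be empty when a side is only one sample long.
    children = [
        build_quadtree(plane, stride, cx, cy, cw, ch, threshold)
        for cx, cy, cw, ch in quadrants
        if cw > 0 and ch > 0
    ]
    return ("node", children)


def render(tree, out, stride):
    """Paint the leaves of a tree back into a plane buffer."""
    if tree[0] == "leaf":
        _, x, y, w, h, value = tree
        for j in range(h):
            for i in range(w):
                out[(y + j) * stride + x + i] = value
    else:
        for child in tree[1]:
            render(child, out, stride)


def count_leaves(tree) -> int:
    """Number of leaves, a rough proxy for how much there is to encode."""
    if tree[0] == "leaf":
        return 1
    return sum(count_leaves(child) for child in tree[1])
```

With a threshold of 0 the reconstruction is lossless; raising it trades quality for fewer leaves, which is why content with large flat areas compresses so well with this kind of scheme.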
[10:11:31 CEST] <graphitemaster> it works surprisingly well, just not happy with the file size still
[10:12:21 CEST] <nevcairiel> i'm sure someone could recommend an existing codec with decent size and fast software decode on those tiny CPUs
[10:12:33 CEST] <graphitemaster> doing some sort of dither at the beginning to reduce the palette helps file size but it's ugly quality wise
[10:13:08 CEST] <graphitemaster> it beats motion jpeg in file size and quality though, so I got that going for it :P
[10:15:10 CEST] <graphitemaster> the idea came from some existing work done by this guy http://d00m.org/~someone/qtc/
[10:18:34 CEST] <graphitemaster> his demo videos are kind of unfair since he's only showing things that tend to have lots of areas of constant color
[10:18:56 CEST] <graphitemaster> which this technique works really well with obviously
[12:08:03 CEST] <atomnuker> GoD-Tony: nope, I discussed this in opus, the consensus was it's insane
[12:08:15 CEST] <atomnuker> it doesn't touch anything on the encoder itself
[12:08:36 CEST] <atomnuker> instead it creates multiple encoder contexts and feeds them all at the same time, with an offset to each other
[12:09:58 CEST] <atomnuker> the only codec you can thread sanely is opus, I think, since you can put up to 6 long/48 short frames in a packet
[12:10:29 CEST] <atomnuker> (which I did do, but I dropped it until I had proper analysis)
[12:29:56 CEST] <GoD-Tony> yea fair enough. basically the same as splitting my audio in 4 big chunks and encoding them all at once, then concat
[12:29:59 CEST] <GoD-Tony> probably wont do that
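The split, encode in parallel, then concatenate approach discussed above could look roughly like this in Python. It is a sketch under assumptions, not the superfast tool and not GoD-Tony's script: the chunk file names, the `chunks.txt` list file and all function names are hypothetical, the source duration is passed in by the caller, and the code ignores the audible seams that encoder priming at each chunk boundary may introduce. The ffmpeg options used (`-ss`, `-t`, `-vn`, `-c:a aac`, and the concat demuxer with `-safe 0` and `-c copy`) are standard ones.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def plan_chunks(duration: float, parts: int):
    """Return (start, length) pairs that cover the input in equal pieces."""
    if parts < 1 or duration <= 0:
        raise ValueError("need a positive duration and at least one part")
    step = duration / parts
    chunks = []
    for i in range(parts):
        start = i * step
        # The last chunk absorbs any rounding error.
        length = duration - start if i == parts - 1 else step
        chunks.append((start, length))
    return chunks


def encode_command(src: str, start: float, length: float, dst: str):
    """ffmpeg command line that encodes one time slice of the audio to AAC."""
    return ["ffmpeg", "-y", "-ss", f"{start:.3f}", "-t", f"{length:.3f}",
            "-i", src, "-vn", "-c:a", "aac", dst]


def concat_list(paths) -> str:
    """Contents of the list file read by ffmpeg's concat demuxer."""
    return "".join(f"file '{p}'\n" for p in paths)


def concat_command(list_path: str, dst: str):
    """ffmpeg command line that joins the encoded chunks without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", dst]


def encode_parallel(src: str, duration: float, parts: int, dst: str) -> None:
    """Run one ffmpeg process per chunk, then stitch the results together."""
    names = [f"chunk{i:03d}.m4a" for i in range(parts)]
    commands = [encode_command(src, start, length, name)
                for (start, length), name in zip(plan_chunks(duration, parts), names)]
    # Threads only wait on the child processes; the encoding happens in ffmpeg.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))
    with open("chunks.txt", "w") as handle:
        handle.write(concat_list(names))
    subprocess.run(concat_command("chunks.txt", dst), check=True)
```

Something like `encode_parallel("film.flac", 7200.0, 4, "film.m4a")` would start four encoders at once. As noted in the discussion, encoding four different files in parallel gives the same throughput without any splitting.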
[13:19:31 CEST] <cone-638> ffmpeg Michael Niedermayer master:b5995856a423: avcodec/diracdec: Fix overflow in DC computation
[13:19:32 CEST] <cone-638> ffmpeg Michael Niedermayer master:c225da68cffb: avcodec/hevcdsp_template: Fix undefined shift in put_hevc_pel_bi_w_pixels
[13:19:33 CEST] <cone-638> ffmpeg Michael Niedermayer master:f0efd795f460: avcodec/clearvideo: Only output a frame if one is coded in the packet
[13:19:34 CEST] <cone-638> ffmpeg Michael Niedermayer master:2d025e742843: avcodec/jpeg2000dsp: Fix multiple integer overflows in ict_int()
[14:08:11 CEST] <cone-638> ffmpeg Ronald S. Bultje master:9bab39dee52a: vp9: fix compilation with threading disabled.
[14:08:12 CEST] <cone-638> ffmpeg Ilia Valiakhmetov master:86eb50549a57: Changelog: add vp9 tile threading support
[15:24:40 CEST] <GoD-Tony> heh, out of curiosity i wrote a script that does the above, encodes the audio with 8 ffmpeg processes and throws it all back together. ends up being a little over 2.5x faster
[15:26:08 CEST] <nevcairiel> i would think you can get almost linear scaling, well, minus the cost of splitting and re-assembling, which is a bunch of disc IO i guess
[15:33:16 CEST] <GoD-Tony> yea you're right, ill play around with it more a bit later to see if im missing something
[15:34:21 CEST] <rcombs> up to whatever [total available CPU time] / [CPU time used by a single encode] is
[15:35:51 CEST] <nevcairiel> of course if you run out of cpus it will stop
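The gap between near-linear scaling and an observed speedup well below the process count can be reasoned about with Amdahl's law. The sketch below is a toy model, not a measurement: it assumes the split and re-assembly I/O runs serially while the encode time divides evenly across workers, and the function name and the numbers in the note are hypothetical.

```python
def estimated_speedup(serial_seconds: float, encode_seconds: float, workers: int) -> float:
    """Amdahl-style estimate for a job with a serial part and a parallel part.

    serial_seconds: time that does not shrink with more workers (e.g. disk I/O)
    encode_seconds: CPU time of a single-process encode
    workers:        number of parallel encoder processes
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    before = serial_seconds + encode_seconds
    after = serial_seconds + encode_seconds / workers
    return before / after
```

With no serial part, 8 workers give 8x. With 20 seconds of I/O next to 80 seconds of encoding, the same 8 workers give about 3.3x, which is the kind of shortfall that points at the disk rather than the CPUs.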
[15:55:05 CEST] <BBB> anyone feel like fixing trivial tsan issues? http://fate.ffmpeg.org/report.cgi?time=20170906040551&slot=x86_64-archlinux-gcc-tsan
[15:55:16 CEST] <BBB> c->exit should be mutex-protected or atomic
[15:55:49 CEST] <BBB> making it atomic is probably what the code intended it to be
[15:59:07 CEST] <BBB> or I'll do it myself, let's be not-lazy
[16:10:58 CEST] <jamrial> ubitux: your fate slots haven't run in five days
[16:41:44 CEST] <cone-638> ffmpeg Steven Liu master:ad0d016f1cd2: MAINTAINERS: Add myself to maintainer of dashdec
[18:28:19 CEST] <GoD-Tony> conclusion is the new bottleneck is IO on my hdd, which I dont mind. was cool to see the audio encode crazy fast anyway :P
[19:45:56 CEST] <cone-638> ffmpeg Michael Niedermayer release/3.3:b590758298cc: avcodec/scpr: optimize shift loop.
[19:45:57 CEST] <cone-638> ffmpeg Michael Niedermayer release/3.3:a295d1870a8c: avcodec/diracdec: Fix overflow in DC computation
[19:45:58 CEST] <cone-638> ffmpeg Michael Niedermayer release/3.3:32fa6ce64afd: avcodec/hevcdsp_template: Fix undefined shift in put_hevc_pel_bi_w_pixels
[19:45:59 CEST] <cone-638> ffmpeg Michael Niedermayer release/3.3:4f97556f5489: avcodec/jpeg2000dsp: Fix multiple integer overflows in ict_int()
[19:46:00 CEST] <cone-638> ffmpeg Michael Niedermayer release/3.3:eca53fd52bdc: Update for 3.3.4
[20:51:31 CEST] <ubitux> jamrial: mmh yeah, it's stalled in asan
[20:51:41 CEST] <ubitux> i'll restart asap
[21:29:16 CEST] <jamrial> ubitux: cool
[21:30:04 CEST] <jamrial> oh, you took the chance to run pacman. good since it should have updated glibc as well
[21:30:58 CEST] <jamrial> i think there was a ticket related to some compilation issue with glibc 2.26
[21:31:12 CEST] <jamrial> but it might have been related to cuda instead
[21:38:34 CEST] <jamrial> BBB: http://fate.ffmpeg.org/log.cgi?time=20170911192454&log=compile&slot=x86_64-archlinux-gcc-ddebug the error at the end
[21:39:12 CEST] <BBB> hm?
[21:39:34 CEST] <BBB> lemmecheck
[21:39:36 CEST] <BBB> sorry about that
[21:39:58 CEST] <jamrial> i wonder why it doesn't fail on all cases
[21:40:49 CEST] <jamrial> assert() seems to be ignored during compilation outside of that one fate client
[21:41:20 CEST] <jamrial> it should be av_assert0 or 2 for that matter
[21:42:02 CEST] <BBB> oh
[21:42:03 CEST] <BBB> I chose 1
[21:42:08 CEST] <BBB> it was a random coin toss
[21:42:09 CEST] <BBB> is 1 ok?
[21:45:12 CEST] <jamrial> BBB: it should be ok i guess. i always see 0 used for checks that must be always done and 2 for checks in speed critical code
[21:45:25 CEST] <jamrial> hence my suggestion. i didn't read the doxy in avassert.h
[21:55:19 CEST] <BBB> jamrial: if it's important, please review patch so I can push it
[22:01:07 CEST] <BBB> jamrial: it's only in a per-frame loop (2-pass decoding), so it executes once or twice per frame, and this is only in 1-pass mode-related code
[22:01:28 CEST] <BBB> so I think av_assert1 is more correct
[22:01:32 CEST] <jamrial> ok, 1 should be good then
[22:01:33 CEST] <BBB> it's not per-block coding or anything like that
[22:01:36 CEST] <BBB> ty
[22:01:38 CEST] <BBB> will push
[22:02:50 CEST] <BBB> pushed
[22:02:58 CEST] <cone-638> ffmpeg Ronald S. Bultje master:4ce99e96d611: vp9: assert -> av_assert and fix associated compile error.
[22:08:42 CEST] <atomnuker> jkqxz: are you going to push the kmsgrabber changes soon?
[22:35:47 CEST] <jkqxz> atomnuker: Does that constitute approving the patch series? No one has said anything about the wrapped_avframe stuff.
[22:38:23 CEST] <jamrial> jkqxz: i think wm4 pointed it was a security concern. not sure if you addressed that
[22:39:47 CEST] <jkqxz> Muhammad Faiz did. Is he here? The second version should have addressed that.
[22:42:14 CEST] <jamrial> i don't think he is
[22:42:24 CEST] <jamrial> maybe ping the thread?
[22:42:56 CEST] <atomnuker> jkqxz: yep, it does, I've tested it and looked at the code
[22:45:07 CEST] <atomnuker> jkqxz: though I'm not sure if the trusted flag would do any good
[22:47:36 CEST] <jkqxz> It prevents the described attack which invokes the decoder externally; there might be some more subtle problem (which is why I'd like someone to actually look at it).
[00:00:00 CEST] --- Tue Sep 12 2017