[Ffmpeg-devel-irc] ffmpeg-devel.log.20140630

Tue Jul 1 02:05:02 CEST 2014

[00:01] <BBB> kierank: yeah it sounds like they mean they change some types from uint8_t to uint16_t, that doesn't mean 16 bits in terms of hw will require all 16 bits
[00:01] <BBB> kierank: it just means a marketing dude rather than a tech dude
[00:01] <BBB> but what do I know, I don't work there
[00:01] <kierank> heh
[00:02] <BBB> Daemon404: also, if you have stuff you'd rather have me work on, feel free to suggest it, I'm open to different stuff
[01:20] <J_Darnley> Does git allow you to define global hooks like its global settings?
[01:22] <J_Darnley> I guess not.  There's only a file .gitconfig in ~
[04:34] <cone-388> ffmpeg.git Michael Niedermayer master:418e5768c68b: swresample/resample_template: move division out of loop for float/double swri_resample_linear()
[05:21] <cone-388> ffmpeg.git Michael Niedermayer master:ca384d708b74: avformat/rtsp: use av_malloc_array()
[05:21] <cone-388> ffmpeg.git Michael Niedermayer master:f054d1e7aedb: avformat/hdsenc: Use av_mallocz_array()
[10:06] <Varuh> hi guys. tell me please, can Windows Media Player correctly switch SPS/PPS? I've made an mp4/h264 file from several pieces.
[10:46] <anshul> Varuh, wrong mailing list; ask it at #ffmpeg if it's related to using ffmpeg
[10:57] <Varuh> anshul, I want to improve the mp4 multiplexer (movenc.c) to make WMP-compatible mp4 files.
[11:23] <anshul> Varuh, sorry for the previous one
[11:26] <anshul> Varuh, I am not sure, but I have heard that WMP does not have any codec of its own; it uses some external codecs.
[11:27] <anshul> If WMP is using some xyz encoder for h264, then look at the xyz encoder spec; it might be helpful to go in that direction
[11:33] <anshul> Varuh, http://www.mediacodec.org/ from here I got to know that WMP uses ffdshow; it looks like it's open source, and maybe your solution lies here
[11:35] <Varuh> I don't have ffdshow. WMP uses the Microsoft DTV-DVD Video decoder. (native ms decoder)
[11:36] <Varuh> the question is how to satisfy it
[11:38] <Varuh> sometimes it eats the new SPS/PPS from the stream, sometimes not.)
[11:38] <av500> sometimes its hungry
[11:39] <Varuh> :)
[11:57] <anshul> Is there any difference between the PPS/SPS in the cases where it accepts them?
[12:08] <Varuh> everything is much more strange.
[12:08] <Varuh> I have 2 files. the first contains only a video stream, the second contains video and audio.
[12:08] <Varuh> the video stream was created from 2 files with the same resolution but different FPS and SPS/PPS.
[12:08] <Varuh> so WMP switches SPS/PPS on the v+a file and can't do it on the video-only one.
[12:08] <Varuh> the video bitstream and timestamps in these files are the same.
[12:24] <anshul> Varuh, it looks like the issue is at the muxing level, not the encoding level
[12:27] <anshul> It may be a variable fps issue; did you try again after converting the fps of both videos to a common value?
[12:46] <Varuh> you're right, anshul. fps alignment fixes it. thank you) 
[13:51] <plepere> BBB : is there some strange voodoo magic to benefit from AVX2 speed-ups ?
[13:51] <plepere> because I'm not getting any. :/
[13:56] <BBB> lol
[13:56] <BBB> I don't have a test machine, so I don't know
[13:58] <nevcairiel> Aren't you using Intel's emulation thing? Maybe that's just not suited for benchmarking
[13:58] <plepere> ubitux gave me access to an AVX2 laptop
[13:58] <plepere> through SSH
[13:58] <nevcairiel> Ah
[13:59] <plepere> the times weren't changing, so I did a valgrind and the cycles are just too similar
[13:59] <plepere> less than 1% difference, but in favor of SSE2
[14:00] <nevcairiel> Did you confirm that it's really calling the new code?
[14:00] <BBB> valgrind?
[14:00] <plepere> I get errors when I put bugs in it. :p
[14:00] <BBB> wait
[14:00] <BBB> what?
[14:01] <BBB> you have to cycle-count
[14:01] <BBB> START_TIMER/STOP_TIMER
[14:01] <BBB> don't use valgrind or time, that's insane
[14:01] <plepere> valgrind --tool=callgrind 
[14:01] <BBB> use START_TIMER/STOP_TIMER
[14:01] <plepere> gives cycle count
[14:01] <BBB> again, use START_TIMER/STOP_TIMER
[14:01] <plepere> ok ok
[14:01] <plepere> I'll try that
[14:01] <BBB> you really want to learn to use the tools that others use
[14:01] <BBB> if 100 people use one tool and you use another
[14:01] <BBB> you're probably wrong
[14:02] <BBB> even if you're not, it's better to conform to majority rule in this case
[14:02] <plepere> ok, I'll do a timer test.
[14:02] <BBB> thanks
[14:02] <BBB> now, as for why avx2 does not give speedups: need to see code (old sse2/whatever code and new avx2 code)
[14:02] <BBB> you should get speedups from 3-op instructions on things like idct, and from double register width as long as you don't cross lanes
[14:03] <BBB> (plus sse2 was slow on core duo also, slower than mmx2)
[14:03] <nevcairiel> That lane thing is so lame
[14:03] <BBB> yay intel
[14:06] <plepere> the worst thing for me with AVX2 is the unpacks and packs that are done on 128b registers, not 256b. :/
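
The lane restriction discussed above can be modelled in plain C. This is only a scalar sketch of how the byte interleave behaves (punpcklbw on a 128-bit register, vpunpcklbw on a 256-bit one); the function names are hypothetical and nothing here is taken from the FFmpeg sources or from plepere's code. The 256-bit form does not interleave the low 16 bytes of each input. It repeats the 128-bit operation once per lane, which is why moving data across lanes needs extra permutes.

```c
#include <stdint.h>
#include <stddef.h>

/* Scalar model of SSE2 punpcklbw: interleave the low 8 bytes of a and b
 * into one 16-byte result (a0 b0 a1 b1 ... a7 b7). */
static void unpacklo8_128(uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
    for (size_t i = 0; i < 8; i++) {
        dst[2 * i]     = a[i];
        dst[2 * i + 1] = b[i];
    }
}

/* Scalar model of AVX2 vpunpcklbw on 256-bit registers: the 128-bit
 * operation is applied independently to each 128-bit lane, so bytes
 * 8..15 of the inputs never reach the result and the upper half of the
 * output starts at input byte 16, not at input byte 8. */
static void unpacklo8_256(uint8_t dst[32], const uint8_t a[32], const uint8_t b[32])
{
    unpacklo8_128(dst,      a,      b);       /* lane 0: input bytes 0..7   */
    unpacklo8_128(dst + 16, a + 16, b + 16);  /* lane 1: input bytes 16..23 */
}
```

With a[i] = i and b[i] = 100 + i, byte 16 of the result is 16 rather than 8, which is the in-lane behaviour being complained about.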
[14:14] <plepere> http://pastebin.com/zPSYM8G5
[14:15] <plepere> it's for epel_hv on 16x16
[14:19] <plepere> nothing significant
[14:19] <BBB> code? (specifically, disassembly; not source)
[14:26] <plepere> BBB : http://pastebin.com/r5HqWC3h
[14:27] <plepere> good luck.
[14:27] <plepere> the sse2 version is in hevc_mc.asm
[15:37] <cone-445> ffmpeg.git Clément Bœsch master:ec94c52e9724: doc: remove trailing ':' at the end of sections
[16:01] <Daemon404> shitty thing of the day: i can't apply a cmyk profile to a cmyk jpeg decoded with libavcodec because it converts to rgb
[16:02] <Daemon404> god i hate image format crap.
[18:10] <cone-445> ffmpeg.git John Stebbins master:253d0be6a1ec: pgssubdec: handle more complex PGS scenarios
[18:11] <cone-445> ffmpeg.git John Stebbins master:376f353e3d76: avcodec/pgssubdec: rename PICTURE_SEGMENT
[18:11] <cone-445> ffmpeg.git John Stebbins master:ca7f2a737256: avcodec/pgssubdec: do not fail when part of the packet is faulty unless AV_EF_EXPLODE is set
[18:11] <cone-445> ffmpeg.git John Stebbins master:4701f7676ce9: avcodec/pgssubdec: split out flush_cache()
[18:11] <cone-445> ffmpeg.git John Stebbins master:5c019ec91d94: avcodec/pgssubdec: Pass AVSubtitleRect to decode_rle()
[18:11] <cone-445> ffmpeg.git John Stebbins master:fc7da418ff5d: avcodec/pgssubdec: better error codes
[18:11] <cone-445> ffmpeg.git John Stebbins master:066a4819cc1b: avcodec/pgssubdec: Bail out of decode_rle() if error and AV_EF_EXPLODE is set
[18:11] <cone-445> ffmpeg.git John Stebbins master:0c911c8fbc87: avcodec/pgssubdec: fix end display time
[18:11] <cone-445> ffmpeg.git Michael Niedermayer master:ae9a73de2a2f: Merge commit '253d0be6a1ecc343d29ff8e1df0ddf961ab9c772'
[18:26] <ubitux> fun, i made dctdnoiz free of the avcodec dep, but it's now almost twice as slow (yeah that's possible...), but the funny thing is that going from 16x16 to 8x8 is like more than 7 times faster
[18:27] <nevcairiel> so its now 3.5 times faster?
[18:29] <J_Darnley> Cripes!  What will I replace GetTickCount() with?
[18:29] <cone-445> ffmpeg.git Michael Niedermayer master:89bcb77726e2: avcodec/pgssubdec: Check input buffer size in parse_presentation_segment()
[18:29] <cone-445> ffmpeg.git Michael Niedermayer master:e429d6c1971d: avcodec/pgssubdec: remove unused variables
[18:30] <ubitux> nevcairiel: yeah about that
[18:32] <nevcairiel> J_Darnley: there are plenty of timing functions, most non-portable; any particular environment you're targeting? :p
[18:33] <J_Darnley> I don't want to replace it with another real time function
[18:34] <J_Darnley> I will replace it with something like a timestamp
[18:39] <J_Darnley> Now...  Do I plan ahead and use 64-bit integers or stick with 32?  I guess I will stick with 32 and add 64 to the list of future improvements.
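
One thing that bears on the 32-versus-64 choice above: a 32-bit counter wraps around, and if it counts milliseconds the way GetTickCount() does, that happens roughly every 49.7 days. The sketch below is a hedged illustration, not J_Darnley's code. The names ticks_elapsed, TickClock and tick_clock_update are made up for this example, and it assumes the counter wraps at most once between two readings. Under that assumption, unsigned subtraction already gives the right interval, and a 64-bit total can be layered on top later without changing the 32-bit source.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a 32-bit counter. Unsigned
 * arithmetic is modulo 2^32, so the result stays correct across a
 * single wrap-around of the counter. */
static uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return (uint32_t)(now - start);
}

/* Hypothetical helper: widens a wrapping 32-bit counter to 64 bits by
 * accumulating the deltas between successive readings. */
typedef struct TickClock {
    uint64_t total; /* ticks accumulated so far   */
    uint32_t last;  /* most recent 32-bit reading */
} TickClock;

static uint64_t tick_clock_update(TickClock *c, uint32_t now)
{
    c->total += ticks_elapsed(c->last, now);
    c->last   = now;
    return c->total;
}
```

For example, a reading of 0x00000010 taken after a reading of 0xFFFFFFF0 yields an elapsed count of 0x20, even though the raw value went down.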
[20:09] <cone-445> ffmpeg.git Lukasz Marek master:e5c806fd676f: lavd/xv: handle delete window message
[20:41] <iive> ubitux: which codecs have a 16x16 dct?
[20:46] <cone-445> ffmpeg.git Ronald S. Bultje master:847bb638c098: swr: convert resample_common/linear_int16_mmx2/sse2 to yasm.
[20:48] <ubitux> iive: vp9? probably hevc too
[20:49] <iive> i know hevc doesn't have asm accelerated dct, so it can't be fast(er)
[20:55] <iive> ubitux: btw, does dctdnoiz work in a way similar to spp? how do they compare in speed?
[21:00] <jamrial> fate is very red right now
[21:11] <Daemon404> collect2: ld terminated with signal 11 [Segmentation fault], core dumped
[21:11] <Daemon404> nice
[21:13] <jamrial> there are like 20 slots failing with libavdevice/xv.c
[21:15] <Daemon404> yeah i saw
[21:15] <Daemon404> theyre all prerelease 4.9 though
[21:15] <Daemon404> and 4.9.0 is known to miscompile ffmpeg
[21:15] <Daemon404> so i am suspicious of gcc
[21:15] <jamrial> the file is missing a {
[21:15] <Daemon404> ah
[21:16] <Daemon404> just a coincidence i saw a slew of 4.9s then
[21:16] <jamrial> yeah, it's just that the rest didn't get to run again just yet
[21:16] <jamrial> the commit that introduced this is recent. less than two hours old
[21:33] <cone-445> ffmpeg.git Michael Niedermayer master:9efa7f82ce4a: avdevice/xv: fix missing {
[22:31] <ubitux> for a 8-len dct, do we do better than 11 mult and 29 add nowadays?
[22:32] <ubitux> (just wondering if i should look into this: http://www3.matapp.unimib.it/corsi-2007-2008/matematica/istituzioni-di-analisi-numerica/jpeg/papers/11-multiplications.pdf or if there is a better reference/strategy i should consider)
[22:41] <cone-445> ffmpeg.git Muhammad Faiz master:46563af79c65: avfilter/showcqt: adding freetype support
[22:47] <michaelni> ubitux, AAN does it in fewer multiplies, also if you don't limit yourself to doing all row transforms and then all column transforms separately then an 8x8 dct can be done with even fewer multiplies IIRC
[22:47] <michaelni> ubitux, see libavcodec/faandct.c
[22:48] <ubitux> isn't aan more approximative than the others?
[22:48] <michaelni> no
[22:48] <michaelni> its exact with real numbers 
[22:48] <ubitux> ah, ok
[22:48] <michaelni> or floats more or less
[22:48] <michaelni> with integers it might be less accurate
[22:49] <ubitux> i'm looking for an integer version
[22:51] <michaelni> for 8x8 ? 
[22:51] <michaelni> for that using the existing code would make most sense i guess
[22:52] <michaelni> as its optimized
[22:53] <ubitux> i'm actually somehow trying to understand the reasoning behind this common butterfly algorithm
[22:53] <ubitux> so i can apply it to a 16x16
[22:55] <ubitux> ...but strangely i don't see that much documentation on AAN or the other various variants
[22:58] <michaelni> dunno about how it was derived but the 2 obvious paths i could imagine are to start with a DCT implemented via a FFT or to directly factorize the matrix used in the vector matrix multiply that represents the (i)dct
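
The matrix-factorization route mentioned above can be illustrated with the first butterfly stage of an 8-point DCT-II. This is a hedged sketch, not AAN and not the code in libavcodec/faandct.c; the names cos16, dct8_naive and dct8_butterfly are invented for the example, and the transform is left unnormalized. It relies on the symmetry cos(pi*(2*(7-n)+1)*k/16) = (-1)^k * cos(pi*(2n+1)*k/16): even outputs depend only on the sums x[n] + x[7-n] and odd outputs only on the differences x[n] - x[7-n], so one stage of additions halves the multiply count from 64 to 32. Applying the same idea again to the half-size even part, and extending it to N = 16, is where the usual butterfly structure comes from.

```c
/* cos(pi * m / 16) for m = 0..8; other angles follow by symmetry. */
static const double cos16_tab[9] = {
    1.0,
    0.98078528040323043,
    0.92387953251128674,
    0.83146961230254524,
    0.70710678118654752,
    0.55557023301960218,
    0.38268343236508977,
    0.19509032201612825,
    0.0
};

/* cos(pi * m / 16) for any m >= 0, using periodicity and symmetry. */
static double cos16(int m)
{
    m %= 32;                 /* period of cos(pi*m/16) is 32      */
    if (m > 16)
        m = 32 - m;          /* cos(2*pi - t) = cos(t)            */
    return m > 8 ? -cos16_tab[16 - m] : cos16_tab[m]; /* cos(pi - t) = -cos(t) */
}

/* Reference: unnormalized 8-point DCT-II as a plain matrix-vector
 * product, 64 multiplies. */
static void dct8_naive(const double x[8], double X[8])
{
    for (int k = 0; k < 8; k++) {
        double acc = 0.0;
        for (int n = 0; n < 8; n++)
            acc += x[n] * cos16((2 * n + 1) * k);
        X[k] = acc;
    }
}

/* Same transform after one butterfly stage: 8 additions, then two
 * 4x4 products (32 multiplies in total). */
static void dct8_butterfly(const double x[8], double X[8])
{
    double s[4], d[4];
    for (int n = 0; n < 4; n++) {
        s[n] = x[n] + x[7 - n];  /* feeds the even outputs */
        d[n] = x[n] - x[7 - n];  /* feeds the odd outputs  */
    }
    for (int k = 0; k < 4; k++) {
        double even = 0.0, odd = 0.0;
        for (int n = 0; n < 4; n++) {
            even += s[n] * cos16((2 * n + 1) * (2 * k));
            odd  += d[n] * cos16((2 * n + 1) * (2 * k + 1));
        }
        X[2 * k]     = even;
        X[2 * k + 1] = odd;
    }
}
```

Both functions should agree to within floating-point rounding on any input, which makes the naive version a convenient reference when factorizing further.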
[00:00] --- Tue Jul  1 2014