[Ffmpeg-devel-irc] ffmpeg-devel.log.20140517

Sun May 18 02:05:02 CEST 2014

[00:33] <BBB> Voicu: java is pretty, but tries to solve a problem that we don't have, it's higher-level
[01:20] <wm4> <Voicu> [15:55:50] whoever designed the java API must have been drunk and/or high
[01:20] <wm4> <Voicu> [15:56:27] there is no way a sane person would unleash something like that on a fellow programmer
[01:20] <wm4> Voicu: is it specific to the java binding, or the ffmpeg API (assuming you're talking about a java ffmpeg binding)
[03:02] <jamrial> just sent some hevc patches to fix and improve some minor things
[03:23] <jamrial> BBB: sorry for the AVX inline :P
[03:23] <jamrial> i tried to refactor the function two times, and as michaelni mentioned the best i could get was 14% slower than inline because of the calling overhead
[03:24] <jamrial> only solution is probably rewriting the entire loop in yasm, and i'm not really up for that
[03:30] <compn> jamrial : hehe its ok, there was a large inline asm purge a few years ago
[03:30] <compn> converted a lot of stuff to yasm/nasm
[03:30] <jamrial> i don't like inline either. which is why i tried to port that stuff to yasm
[03:32] <jamrial> but 14% is indeed a performance hit too big to swallow
[03:32] <compn> it might be worth having the 14% hit just for msvc anyhow
[03:33] <compn> for msvc only i mean
[03:33] <compn> the code is still faster than C right ?
[03:33] <compn> than the c-code i mean
[03:33] <compn> if you can also provide it
[03:35] <jamrial> yes, even with that hit it would still be faster than the C version
[03:35] <compn> then we'll have both versions, one for if_msvc and one for if_!msvc
[03:35] <compn> :)
[03:38] <jamrial> the yasm port i wrote is on the ml in case someone wants to make it work alongside the c and the inline version
[03:38] <compn> ok good
[03:39] <compn> we have to bend over backwards to appease the windows users :P
[03:40] <compn> but its good to be faster on any platform too
[03:40] <compn> optimize everything! 
[03:40] <BBB> jamrial: well yeah obviously inlining makes a massive difference
[03:41] <BBB> jamrial: I can look at that, at least try to suggest on how to do it, not sure, don't have tons of time lately :/
[03:41] <BBB> anyway
[03:41] <BBB> inline avx is beyond omg
[03:48] <jamrial> and gcc 4.9.0 added AVX-512 inline :P
[04:02] <wm4> why not use intrinsics
[04:20] <bloody123> hi all, there's a minor description bug in   http://ffmpeg.org/ffmpeg-filters.html#concat    - anyone wanna fix? it's about replacing one word...
[04:25] <bloody123> ...in option 'a', it reads: "that is also the number of video streams in each segment."   - should be "audio streams", not "video streams", right?
[05:11] <michaelni> j-b, is cone ok? she is a bit silent lately ...
[06:41] <jamrial> for the record, my recent patchset did not attempt to address the problems with plepere's third patch
[06:41] <jamrial> it's mostly the sse2 stuff i mentioned in the ml, plus some misc stuff
[06:42] <jamrial> i still have to send the patch that adds the sse2 version of the luma functions in question
[10:32] <Voicu> wm4: I was referring to the java binding for ffmpeg
[10:32] <Voicu> wm4, I can't say I'm 100% happy with the ffmpeg API (C++) but that on the whole is pretty ok
[10:59] <kurosu> mraulet, do you know if Pierre-Edouard finished working on MC?
[12:40] <cone-664> ffmpeg.git Carl Eugen Hoyos master:4c49d0824a10: Fix alaw and mulaw muxing in caf.
[12:52] <wm4> Voicu: well, you could write a better one
[13:17] <Voicu> wm4, I really did think about a solution like that
[13:18] <Voicu> i.e. make an API for my C++ code and use the C++ code instead of porting it to java
[13:20] <Voicu> wm4, are you familiar with the java wrappers ?
[13:20] <wm4> no
[13:20] <Voicu> I'm asking because I think there might be a bug and if it's not a bug it's a really weird behavior that should be rectified
[13:25] <wm4> I'm just saying... this java wrapper might be some of the many projects depending on ffmpeg that don't receive enough maintenance
[13:25] <wm4> because ffmpeg's API is a relatively fast moving target
[13:26] <Voicu> wm4: I don't have anything against the ffmpeg API
[13:26] <Voicu> it's not perfect but then again no API is perfect
[13:27] <Voicu> and ffmpeg itself is awesome, it can do a lot of things 
[13:27] <Voicu> I just got frustrated with this project :D
[13:27] <Voicu> android and ffmpeg apparently don't mix that well (yet!)
[13:29] <Voicu> I certainly will make a post/article about my findings and troubles I had
[13:30] <Voicu> and maybe even try and improve some of them
[13:30] <Voicu> so others won't have to feel the same pain :D
[13:41] <mraulet> kurosu, more or less yes
[13:41] <mraulet> we dont exactly have the same performance than intrinsics for some functions
[13:47] <mraulet> kurosu what do you have in mind?
[13:47] <wm4> Voicu: what kind of insane stuff does the java binding do? I'm just curious
[13:48] <Voicu> wm4, well I had to use an AVIOContext with a custom read procedure - I need to feed data from Android's MediaCodec into a ffmpeg stream
[13:49] <Voicu> in C++ I did it with a callback of the form int (*p)(void *ptr, uint8_t *buffer, int buffer_size)
[13:50] <Voicu> on the java side there are 4 different classes, one of which (I assumed) I had to extend
[13:50] <Voicu> but... having more than 1 class (i.e. 2 callback) is not possible because somehow, somewhere a pointer gets overwritten
[13:50] <Voicu> so all my avio contexts call the same callback
[13:51] <Voicu> (the last one initialized)
[13:51] <Voicu> so I said OK, I'll use one callback and use the pointer to differentiate between different callback behaviors
[13:51] <Voicu> but! 
[13:52] <Voicu> I don't get the exact pointer object I provide. Instead I get a copy
[13:52] <Voicu> so I had to resort to a horrible hack - I use the .address field of the Pointer object (which is meant to refer to a C++ side object) 
[13:53] <Voicu> the .address field is apparently the only one that gets preserved from the object I provide to the one I get in the callback
[13:54] <Voicu> and that's just part of it
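A minimal sketch of the single-callback approach Voicu describes, in plain C and without linking FFmpeg: one function with the quoted signature serves several streams, and the opaque pointer selects the per-stream state. MemSource, read_cb and demo_two_streams are hypothetical names used only for illustration; a real callback handed to avio_alloc_context() would return FFmpeg's own error codes instead of -1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-stream state; not an FFmpeg type. */
typedef struct MemSource {
    const uint8_t *data;  /* bytes still to be delivered */
    size_t         left;  /* number of bytes remaining   */
} MemSource;

/* Same shape as the callback quoted above: opaque pointer, destination
 * buffer, buffer size. Returns the number of bytes copied, or -1 when the
 * source is exhausted or buf_size is not positive. */
int read_cb(void *opaque, uint8_t *buf, int buf_size)
{
    MemSource *src = opaque;
    size_t n;

    if (buf_size <= 0 || src->left == 0)
        return -1;
    n = src->left < (size_t)buf_size ? src->left : (size_t)buf_size;
    memcpy(buf, src->data, n);
    src->data += n;
    src->left -= n;
    return (int)n;
}

/* Two independent streams sharing the one callback. */
void demo_two_streams(void)
{
    MemSource audio = { (const uint8_t *)"audio", 5 };
    MemSource video = { (const uint8_t *)"video", 5 };
    uint8_t buf[8];

    assert(read_cb(&audio, buf, 3) == 3 && memcmp(buf, "aud", 3) == 0);
    assert(read_cb(&video, buf, 8) == 5 && memcmp(buf, "video", 5) == 0);
    assert(read_cb(&audio, buf, 8) == 2 && memcmp(buf, "io", 2) == 0);
    assert(read_cb(&audio, buf, 8) == -1);
}
```

Because each stream's position lives behind the opaque pointer, the callback itself holds no global state, which is the property the Java binding reportedly fails to preserve.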
[13:55] <Voicu> in C++ I can do avformat_alloc_output_context2(context=NULL, ... )
[13:55] <BBB> wm4: intrinsics have the problem of not knowing what goes in registers or memory (stack)
[13:56] <BBB> wm4: plus it depends on the compiler doing well
[13:56] <Voicu> in java I have to do context=avformat_alloc_context() first
[13:56] <BBB> wm4: if compilers were so good, wouldn't we all use java or python?
[13:56] <Voicu> otherwise the whole thing crashes
[13:56] <wm4> BBB: well, it seems totally standard to use intrinsics on MSVC/C++
[13:56] <BBB> wm4: compilers suck, they are created using automated algorithms that work a little bit sometimes, but they typically generate inefficient code
[13:56] <BBB> wm4: sure, you can, it's better than c; but yasm typically beats intrinsics
[13:56] Action: Daemon404 smashes BBB's stack
[13:57] <BBB> wm4: yasm written by a knowledgeable person
[13:57] <BBB> hm, stack smashing, yummy
[13:57] <ubitux> http://b.pkh.me/writing_filters.txt
[13:57] <nevcairiel> MSVC is pretty good at translating intrinsics to asm, as long as you are aware of how many registers you use
[13:57] <ubitux> review welcome, maybe i'll submit it as a patch in doc/ soon
[13:57] <nevcairiel> i usually browse through the generated asm for my intrinsics to check
[13:58] <BBB> yasm has a higher chance of doing the correct thing
[13:58] <BBB> apple's gcc used to do horrible on inline asm
[13:58] <nevcairiel> also takes like double the time to write it though
[13:58] <BBB> basically inserting movs after and before each and every single inline asm block
[13:58] <BBB> kinda like windows 32bit inline asm
[13:58] <BBB> nevcairiel: I find it fun ..
[13:59] <nevcairiel> intrinsics take all the C crazy and just handle it for you, so you can focus on the actual algorithm
[13:59] <wm4> intrinsics also don't require you to track down weird crashes because the yasm code might violate the ABI or do something invalid
[14:00] <nevcairiel> might end up slightly less optimized
[14:00] <BBB> people use python, people use java, people use c, people use asm
[14:00] <BBB> likewise, people use inline, yasm, intrinsics, even crazy win32 inline asm
[14:00] <BBB> use whatever suits you
[14:00] <BBB> in ffmpeg, we use yasm :-p
[14:01] <nevcairiel> someone should invent a way to inline a yasm block
[14:01] <Daemon404> perl mypreproc.pl
[14:01] <Voicu> hehe, when I was in highschool I made a graphics library in Turbo Pascal with inline assembly (16 bit) :D
[14:01] <BBB> oh I love perl - and bash
[14:02] <BBB> bash is so sweet, like honey
[14:02] <Voicu> it was probably pretty bad code but fun to make
[14:02] <BBB> and then this happens: http://stilldrinking.org/programming-sucks
[14:02] <nevcairiel> actual inline algorithms have their right to live in many parts of the code, sadly yasm can't help us there
[14:03] <BBB> if it's actually inlined... sure
[14:03] <BBB> but if it's just another dsp function
[14:04] <BBB> and you may think that, oh, this is just 1% here or there
[14:04] <BBB> but all these 1%s add up to ffvp9 considerably beating libvpx-vp9
[14:04] <BBB> we didn't implement another algorithm, it's exactly the same decoder algo
[14:05] <BBB> it's just more efficient, so yes, 1%s matter
[14:05] <BBB> a lot
[14:05] <mraulet> BBB: it is not yet the case for MC, intrinsics beat yasm on my laptop
[14:16] <cone-664> ffmpeg.git Clément Bœsch master:11e490334e0d: avfilter/edgedetect: reuse already defined ctx.
[14:21] <BBB> mraulet: so fix the assembly :-p
[14:22] <mraulet> let see what plepere can do
[14:22] <mraulet> :p too
[14:40] <mraulet> openhevc/ffmpeg/rext_new_mc (5.8s) ffmpeg/master (9.5s)
[14:41] <mraulet> still some ASM to do on IDCT and SAO
[14:54] <cone-664> ffmpeg.git Marton Balint master:ae6fe159f299: ffplay: increase AV_SYNC_THRESHOLD_MIN to 0.04
[14:54] <cone-664> ffmpeg.git Marton Balint master:1fab67b6851f: ffplay: fix compilation with Visual Studio
[14:54] <cone-664> ffmpeg.git John Peebles master:e11697759d06: cmdutils: replace usages of "#ifdef __MINGW32__" with "#ifdef _WIN32" because MSVC only defines _WIN32
[14:54] <cone-664> ffmpeg.git Marton Balint master:0c8d8c0c8067: ffplay: try multiple sample rates if audio open fails
[14:54] <cone-664> ffmpeg.git Marton Balint master:affc41047e96: ffplay: fix typo in docs
[14:54] <cone-664> ffmpeg.git Marton Balint master:a583e2bebe77: ffplay: add support for toggling between multiple video filters with the w key
[14:54] <cone-664> ffmpeg.git Michael Niedermayer master:5b0c7052fb2c: Merge remote-tracking branch 'cus/stable'
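The reasoning in the cmdutils commit above can be sketched in a few lines of C: MinGW toolchains define both _WIN32 and __MINGW32__, while MSVC defines only _WIN32, so a guard on _WIN32 covers every Windows build. The function names here are hypothetical and do not come from cmdutils.c.

```c
#include <assert.h>

/* Returns 1 when built for Windows by any toolchain, 0 otherwise. */
int built_for_windows(void)
{
#ifdef _WIN32
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 only for MinGW builds; MSVC does not define __MINGW32__. */
int built_with_mingw(void)
{
#ifdef __MINGW32__
    return 1;
#else
    return 0;
#endif
}

/* A MinGW build is always a Windows build, but not the other way round. */
void check_platform_macros(void)
{
    assert(!built_with_mingw() || built_for_windows());
}
```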
[14:55] <mraulet> BBB 5.9s with MC intrinsics compared to 6.3s MC asm on BasketBallDrive QP 27 RA with one core
[14:58] <BBB> mraulet: you're a university student, right?
[14:58] <mraulet> not really student ^^
[14:58] <BBB> whatever, you've got an academic background
[14:58] <mraulet> Research engineer :-) at a Engineering school
[14:58] <BBB> so apply your academic principles on this problem
[14:59] <BBB> your end goal shouldn't be to convince me that intrinsics are better than asm
[14:59] <BBB> a) they're not, and b) it's irrelevant, we already decided to not use intrinsics in ffmpeg
[14:59] <BBB> your goal should be to learn _why_ your particular intrinsics implementation behaves better than your particular asm implementation
[15:00] <BBB> then use what you just learned, improve the asm
[15:00] <BBB> and voila, you've got better-performing asm
[15:00] <BBB> and everyone is happy
[15:02] <wm4> didn't they recently decide to use intrinsics for some arm code in libav?
[15:03] <cone-664> ffmpeg.git Carl Eugen Hoyos master:ef2713747f9d: Fix compilation of libavcodec/x86/hevc_deblock.asm with nasm.
[15:03] <cone-664> ffmpeg.git Michael Niedermayer master:a7320c1574d2: Merge remote-tracking branch 'cehoyos/master'
[15:03] <nevcairiel> probably because there is no proper inline support
[15:03] <nevcairiel> arm is all kinds of crazy
[15:08] <BBB> I believe they also found it to be slower (on arm 32bit) than asm
[15:14] <kurosu> mraulet, some of the assembly is kind of lazy - suboptimal argument loading and register allocation
[15:16] <kurosu> I have an example with 8 preloaded gprs, 10 needed and 11 xmms
[15:16] <kurosu> turns out, it can probably be done in 5/5/7
[15:17] <mraulet> kurosu, do you plan to work on MC?
[15:17] <kurosu> not really, I was mostly looking out of curiosity
[15:32] <nevcairiel> 64-bit makes people lazy
[15:32] <nevcairiel> so many registers to waste
[15:37] <kurosu> on the other hand, I might be picky
[15:38] <cone-664> ffmpeg.git Olivier Langlois master:0eec06ed8747: lavu: add av_gettime_relative()
[15:45] <kurosu> mraulet, what would be a good sample to test the uni_w functions ?
[15:46] <mraulet> I am using sample from here
[15:46] <mraulet> ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/ra_main/
[15:47] <mraulet> BasketballDrive_1920x1080_50_qp27.bin
[15:53] <kurosu> I'm getting no output on stderr or stdout with this command-line: ./ffmpeg.exe -threads 1 -i <file> -loglevel debug -f null -
[15:53] <kurosu> Anybody has an idea why that would happen with mingw ?
[15:54] <BBB> -v debug?
[15:54] <BBB> not sure though
[15:55] <kurosu> nope
[15:55] <mraulet> your command line works for me
[15:55] <kurosu> libav's avconv is ok
[15:55] <mraulet> macos
[15:56] <kurosu> well, my changes pass fate, but I can't get a START_TIMER to work
[15:56] <BBB> does -v error work?
[15:56] <BBB> instead of -v debug
[15:56] <BBB> maybe someone broke -v debug to mean "only debug", not "all until debug"
[15:56] <BBB> (START/STOP_TIMER is error)
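A rough sketch of the "all until debug" behaviour BBB expects: the log level is a threshold, so a message is shown when its own level is at or below it. The numeric values mirror the AV_LOG_* constants in libavutil/log.h; the enum names and should_print() are hypothetical and only illustrate the comparison.

```c
#include <assert.h>

/* Values as in libavutil/log.h: a lower number means a more severe message. */
enum { LOG_ERROR = 16, LOG_WARNING = 24, LOG_INFO = 32, LOG_DEBUG = 48 };

/* Threshold semantics: print everything up to and including the threshold. */
int should_print(int msg_level, int threshold)
{
    return msg_level <= threshold;
}

void check_threshold(void)
{
    /* An error-level message (such as timer output) shows at -v debug ... */
    assert(should_print(LOG_ERROR, LOG_DEBUG));
    /* ... and still shows at -v error, while debug messages do not. */
    assert(should_print(LOG_ERROR, LOG_ERROR));
    assert(!should_print(LOG_DEBUG, LOG_ERROR));
}
```

Under this rule an error-level message cannot be hidden by either -v debug or -v error, which fits the later finding that the missing output had a different cause.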
[15:57] <kurosu> -v error -loglevel error doesn't work either
[15:57] <mraulet> but the stderr output has changed recently (it does not look really nice now :) )
[15:57] <BBB> sorry no idea :(
[15:57] <BBB> change STOP_TIMER to do printf :-p
[15:57] <kurosu> actually, I'm having strictly no output
[15:57] <cone-664> ffmpeg.git Olivier Langlois master:b052bccbe4b5: lavc: Use av_gettime_relative()
[15:57] <BBB> maybe your terminal is broken
[15:57] <cone-664> ffmpeg.git Olivier Langlois master:41120e6e40e3: tools: Use av_gettime_relative()
[15:57] <kurosu> not even the program prolog
[15:57] <BBB> why don't you kill the terminal and restart it
[15:57] <kurosu> yeah, the pity avconv works
[15:57] <BBB> oh right
[15:57] <kurosu> *is avconv
[15:58] <BBB> ...
[15:58] <BBB> I have no idea
[15:58] <kurosu> mingw's terminal
[15:58] <BBB> does printf work?
[15:58] <BBB> like, if you add a printf somewhere, does it print?
[15:58] <michaelni> kurosu, does it work with older ffmpeg ? if so which commit broke it ?
[15:59] <kurosu> I believe not: I'm not even getting a program output (like ffmpeg (c) something or compiled with as I think it used to)
[16:00] <kurosu> michaelni, need to bisect I guess, but given slowness of configure + compile... what would be a minimal build to test just that ?
[16:01] <michaelni> i think there were some tricks to make configure faster on win32, not sure which though
[16:01] <nevcairiel> kurosu: that means your ffmpeg was linked with SDL, which caused it to be a GUI app and not have stdout/stderr
[16:02] <kurosu> even with --disable-outdev=sdl,opengl ?
[16:02] <michaelni> i think configure is faster with dash than bash
[16:02] <kurosu> nevcairiel, yeah, funky -Dmain=SDLmain
[16:02] <kurosu> gonna remove that from config.mak
[16:02] <nevcairiel> the real problem is -mwindows somewhere
[16:03] <nevcairiel> which SDL enforces
[16:03] <kurosu> I thought they were redirecting stdout/err to files also ?
[16:04] <michaelni> btw configure with bash 12sec, configure with dash 4sec (on linux)
[16:04] <nevcairiel> should really only apply the SDL CFLAGS to ffplay and not to ffmpeg
[16:05] <nevcairiel> stupid -mwindows
[16:05] <nevcairiel> i just dont have SDL installed anymore :p
[16:05] <kurosu> I used to have another project relying a lot on SDL
[16:05] <kurosu> but no longer today, so that's probably the best option
[16:06] <kurosu> but really, yes, it should only apply to ffplay (patch welcome :D)
[16:06] <nevcairiel> you could also mess with its pkgconfig file and strip the -mwindows
[16:07] <ubitux> kurosu: well, ffmpeg needs libavdevice, and libavdevice has a sdl device
[16:07] <ubitux> kurosu: you can use sdl with ffmpeg
[16:07] <ubitux> typically, you can follow an encode by doing a -f sdl -
[16:07] <kurosu> ubitux, I'm running configure with --disable-outdev=sdl,opengl
[16:08] <kurosu> and --disable-ffplay
[16:08] <nevcairiel> i never managed to get rid of SDL cflags in the build
[16:08] <ubitux> then it shouldn't link against sdl :p
[16:08] <nevcairiel> as long as SDL is installed, it'll add them
[16:08] <kurosu> detecting and using SDL shouldn't even happen
[16:08] <nevcairiel> so.. byebye sdl
[16:08] <ubitux> i guess that's a bug :(
[16:08] <kurosu> the problem is sdlconfigure passes -Dmain=SDL_main and relevant LDFLAGS/CFLAGS
[16:09] <kurosu> *sdl_config iirc
[16:09] <kurosu> if it's not through some pkgconfig detection
[16:09] <nevcairiel> i suppose both ways exist
[16:09] <kurosu> anyway, it's recompiling but I'll get rid of sdl then
[16:10] <nevcairiel> the weird thing is that configure adds the sdl cflags globally to the main cflags, but then additionally to the ffplay one as well, as if someone tried to fix it once, but then it broke again :D
[16:20] <ubitux> i guess the problem is that sdl is auto detected
[16:22] <cone-664> ffmpeg.git Clément Bœsch master:2ca97c7802b6: configure: make sure pkg-config flags are populated in FT test.
[16:22] <cone-664> ffmpeg.git Clément Bœsch master:9986e50a6ec8: configure: make vp9 decoder select the parser.
[16:25] <kurosu> I can't find a sample testing it
[16:29] <kurosu> ok, WP_A_Toshiba_3.bit it was
[16:29] <ubitux> the sdl stuff in configure is pretty sick
[17:06] <kurosu> I don't really intend to work more on the hevc code, but on closer inspection, this one struck me as odd
[17:06] <kurosu> I don't expect everything to be doable so as to make it work on 32bits systems, but still
[17:11] <kurosu> nevcairiel: thanks for the hint btw, last time I noticed that, I thought it was a fluke
[20:23] <cone-612> ffmpeg.git Olivier Langlois master:f78bc96b7c1c: lavf: Use av_gettime_relative()
[20:27] <cone-612> ffmpeg.git nu774 master:584f88409062: riff: Pass block_align to estimate frame duration
[20:27] <cone-612> ffmpeg.git Michael Niedermayer master:43c18fec6e7c: Merge commit '584f88409062f7a134e7391887899e8e723ab6ff'
[21:05] <Daemon404> 300 get
[21:11] <nevcairiel> we should just take the patch and apply it, can't get any worse
[21:12] <cone-612> ffmpeg.git Olivier Langlois master:0ca0b4c29cf7: ffplay: Use av_gettime_relative()
[23:02] <jamrial> michaelni: kurosu's patch compiles for me on mingw64 with Yasm 1.2.0, 1.1.0, 0.8.0, and NASM 2.11.2
[23:02] <jamrial> what are you using?
[23:07] <michaelni> jamrial, yasm 1.1.0.2352 on ubuntu
[23:09] <jamrial> odd
[23:11] <mraulet> jamrial, it does not compile for me too
[23:11] <mraulet> on osx
[23:11] <jamrial> i'm booting a linux vm to check there
[23:12] <mraulet> yasm 1.2.0
[23:13] <mraulet> exactly the same messages than michael
[23:14] <jamrial> yeah, same here on Gentoo x86_64
[23:15] <jamrial> weird that it compiles fine in win64
[23:15] <jamrial> kurosu also tested it there now that i read his email again
[23:25] <jamrial> changing the two r4q to r4d seems to fix it, and doesn't break mingw64
[23:25] <jamrial> can't run fate to test it, though
[23:30] <mraulet> compilation ok but fate fails
[23:58] <cone-612> ffmpeg.git Michael Niedermayer master:46380e8d2619: avformat/aviobuf/ff_get_line: also accept \r as end of line character
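For illustration only, a line reader that accepts \r as well as \n might look like the sketch below. This is not FFmpeg's ff_get_line(): get_line() is a hypothetical helper working on a NUL-terminated string, and treating "\r\n" as a single terminator is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Copies one line from src into buf (without the terminator, always
 * NUL-terminated, truncated to maxlen - 1 characters). Returns the number
 * of characters consumed from src, terminator included. */
size_t get_line(const char *src, char *buf, size_t maxlen)
{
    size_t in = 0, out = 0;

    if (maxlen == 0)
        return 0;
    while (src[in] && src[in] != '\n' && src[in] != '\r') {
        if (out < maxlen - 1)
            buf[out++] = src[in];
        in++;
    }
    if (src[in] == '\r') {
        in++;
        if (src[in] == '\n')   /* assumption: CRLF counts as one terminator */
            in++;
    } else if (src[in] == '\n') {
        in++;
    }
    buf[out] = '\0';
    return in;
}

void check_get_line(void)
{
    char buf[16];

    assert(get_line("ab\rcd", buf, sizeof buf) == 3);   /* lone CR */
    assert(buf[0] == 'a' && buf[1] == 'b' && buf[2] == '\0');
    assert(get_line("ab\r\ncd", buf, sizeof buf) == 4); /* CRLF */
    assert(get_line("ab\ncd", buf, sizeof buf) == 3);   /* LF */
}
```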
[00:00] --- Sun May 18 2014