[FFmpeg-devel] [PATCH 09/10] avfilter/vsrc_mandelbrot: use hypot()

Michael Niedermayer michaelni at gmx.at
Tue Nov 24 03:21:16 CET 2015


On Mon, Nov 23, 2015 at 03:48:40PM -0500, Ganesh Ajjanagadde wrote:
> On Mon, Nov 23, 2015 at 2:13 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Mon, Nov 23, 2015 at 01:57:24PM -0500, Ganesh Ajjanagadde wrote:
> >> On Mon, Nov 23, 2015 at 1:02 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> > On Mon, Nov 23, 2015 at 12:43:52PM -0500, Ganesh Ajjanagadde wrote:
> >> >> On Sun, Nov 22, 2015 at 3:56 PM, Ganesh Ajjanagadde <gajjanag at mit.edu> wrote:
> >> >> > On Sun, Nov 22, 2015 at 3:07 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> >> On Sun, Nov 22, 2015 at 12:05:49PM -0500, Ganesh Ajjanagadde wrote:
> >> >> >>> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
> >> >> >>> ---
> >> >> >>>  libavfilter/vsrc_mandelbrot.c | 2 +-
> >> >> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >> >>>
> >> >> >>> diff --git a/libavfilter/vsrc_mandelbrot.c b/libavfilter/vsrc_mandelbrot.c
> >> >> >>> index 950c5c8..a0c101e 100644
> >> >> >>> --- a/libavfilter/vsrc_mandelbrot.c
> >> >> >>> +++ b/libavfilter/vsrc_mandelbrot.c
> >> >> >>> @@ -291,7 +291,7 @@ static void draw_mandelbrot(AVFilterContext *ctx, uint32_t *color, int linesize,
> >> >> >>>
> >> >> >>>              use_zyklus= (x==0 || s->inner!=BLACK ||color[x-1 + y*linesize] == 0xFF000000);
> >> >> >>>              if(use_zyklus)
> >> >> >>> -                epsilon= scale*1*sqrt(SQR(x-s->w/2) + SQR(y-s->h/2))/s->w;
> >> >> >>> +                epsilon= scale*hypot(x-s->w/2, y-s->h/2)/s->w;
> >> >> >>
> >> >> >> old:
> >> >> >>  704 decicycles in hypo, 1048570 runs,      6 skips
> >> >> >>
> >> >> >> new:
> >> >> >> 1075 decicycles in hypo, 1048566 runs,     10 skips
> >> >> >>
> >> >> >> that is from START/STOP_TIMER over hypot()
> >> >> >>
> >> >> >> the code is speed relevant as it's executed per pixel
> >> >> >
> >> >> > Thanks for testing. Looking more closely, I see no reason for
> >> >> > expensive sqrt calls anyway: one can simply square both sides; it
> >> >> > should be cheaper. Will rework, post benchmark if it is indeed faster
> >> >> > and does not suffer from floating point overflow, else will simply
> >> >> > push a trivial removal of the "1".
> >> >>
> >> >> It seems like getting rid of the sqrt altogether has a very slight
> >> >> positive impact (if any at all). I can post the patch, but would like
> >> >> to know what to benchmark. There are numerous choices, e.g.
> >> >> draw_mandelbrot as a whole, the outer loop, or the inner loop.
> >> >> I personally think the inner x loop (lines 268-388) is a good place to
> >> >> look at, since the difference is very small anyway, and further
> >> >> localization is impossible.
> >> >
> >
> >> > please post the patch
> >>
> >> bench posted first to see if it is considered interesting enough.
> >> Bench over the whole of draw_mandelbrot using START/STOP_TIMER on x86-64,
> >> Haswell, GNU/Linux, command line:
> >> ffmpeg -v error -f lavfi -i mandelbrot -f null -
> >> new (draw_mandelbrot):
> > [...]
> >> 20857881401 decicycles in draw_mandelbrot,    1024 runs,      0 skips
> >>
> >> old (draw_mandelbrot):
> > [...]
> >> 21393227201 decicycles in draw_mandelbrot,    1024 runs,      0 skips
> >
> > if this is consistent over several tries then it's interesting
> 
> There is a reason why I am posting a full vector, since it is very
> hard to judge. I ran for a longer duration below. I do see a downward
> trend, but unfortunately the magnitude of the effect is unclear.
> Furthermore, there seem to be runtime variations in the actual numbers
> compared to the previous run, though they ran on the same hardware. I
> did not use any fancy tricks like core pinning etc, which could have
> helped in ensuring minimal background task interference.
> 

> BTW, this filter is terribly slow as it zooms in, and it prints a
> bunch of info-level "Mandelbrot cache is too small!" messages
> that do not seem very user-friendly to me.

fixed

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data