[FFmpeg-devel] [PATCH] force dnxhd encoder to be independent of qsort internals

Reimar Döffinger Reimar.Doeffinger
Sun Sep 20 13:33:59 CEST 2009


On Sun, Sep 20, 2009 at 12:28:17PM +0200, Michael Niedermayer wrote:
> On Sun, Sep 20, 2009 at 12:25:57PM +0200, Reimar D?ffinger wrote:
> > On Sun, Sep 20, 2009 at 12:10:24PM +0200, Michael Niedermayer wrote:
> > > On Sun, Sep 20, 2009 at 11:39:26AM +0200, Reimar D?ffinger wrote:
> > > > On Sun, Sep 20, 2009 at 11:08:45AM +0200, Reimar D?ffinger wrote:
> > > > > On Sat, Sep 19, 2009 at 01:16:51PM +0200, Reimar D?ffinger wrote:
> > > > > > this change makes sure there are no equal elements in the data passed to
> > > > > > qsort, thus making sure the result is independent of implementation
> > > > > > internals.
> > > > > 
> > > > > Very slightly different, swapped the mb comparison so the original order
> > > > > is kept by default. Seems nicer and might decreases needed memory
> > > > > bandwidth if a lot of values are identical (very unlikely I think).
> > > > 
> > > > qsort itself is nearly 5% slower for the Linux implementation:
> > > > 31596660 dezicycles in encode_fast, 1 runs, 0 skips
> > > > 30286580 dezicycles in encode_fast, 2 runs, 0 skips
> > > > 29324712 dezicycles in encode_fast, 4 runs, 0 skips
> > > > 28957121 dezicycles in encode_fast, 8 runs, 0 skips
> > > > 28722058 dezicycles in encode_fast, 16 runs, 0 skips
> > > > 28665462 dezicycles in encode_fast, 32 runs, 0 skips
> > > > 28597622 dezicycles in encode_fast, 64 runs, 0 skips
> > > > 
> > > > 32579940 dezicycles in encode_fast, 1 runs, 0 skips
> > > > 31456890 dezicycles in encode_fast, 2 runs, 0 skips
> > > > 30567462 dezicycles in encode_fast, 4 runs, 0 skips
> > > > 30156817 dezicycles in encode_fast, 8 runs, 0 skips
> > > > 30013356 dezicycles in encode_fast, 16 runs, 0 skips
> > > > 29935421 dezicycles in encode_fast, 32 runs, 0 skips
> > > > 29924871 dezicycles in encode_fast, 64 runs, 0 skips
> > > > 
> > > > But difference for the whole encode_fast function is within the
> > > > variance, at most 0.2 % slower I'd say as in this case (again,
> > > 
> > > why dont you just do a radix sort? Should only be a few lines
> > > of C, faster and portable
> > 
> > Yes, but I like to have stable make test before changing code.
> > As simple as radix sort may be, I am almost certain to mess it up on
> > the first try.
> 
> for(all elements - 1)
>     assert(x[i] < x[i+1]);

That's not the only thing you can mess up. Also, unfortunately radix
sort seems to be a rather big mess to do in-place, so it requires some
more code changes or a malloc...



More information about the ffmpeg-devel mailing list