[FFmpeg-devel] [PATCH 2/5] lavfi/gradfun: fix rounding in MMX code.
Clément Bœsch
ubitux at gmail.com
Tue Dec 18 03:00:57 CET 2012
On Tue, Dec 18, 2012 at 02:31:24AM +0100, Clément Bœsch wrote:
> On Fri, Dec 07, 2012 at 11:49:40PM +0000, Loren Merritt wrote:
> > On Fri, 7 Dec 2012, Reimar Döffinger wrote:
> > > On 7 Dec 2012, at 16:28, Loren Merritt <lorenm at u.washington.edu> wrote:
> > >> On Fri, 7 Dec 2012, Reimar Döffinger wrote:
> > >>> "Clément Bÿÿsch" <ubitux at gmail.com> wrote:
> > >>>
> > >>>> Current code divide before increasing precision.
> > >>>
> > >>> Are you sure the shift can _never_ overflow? It is definitely very tight.
> > >>
> > >> I expect it can overflow, but the case where it does so also has m=0 for
> > >> sane filter strengths. So the appropriate fix would be to limit strength,
> > >> not change the asm.
> > >
> > > Since the proposal is to change the asm, do you mean the patch is right, it just should be combined with limiting the strength?
> >
> > Yes.
> >
> > > I have no objections about that, I just kind of assumed there was a good reason for doing things in this order.
> >
> > The original order probably was about avoiding overflow, but limiting
> > strength is better.
> >
>
> Just to make sure I understand the issue correctly:
>
> In the ASM, delta is expressed as u16, so delta<<2 will overflow for
> values > 0x3fff. In these cases we want that m, defined by:
> int m = abs(delta) * thresh >> 16; (1)
> m = FFMAX(0, 127 - m);
> ...to be 0, so whatever the value of delta, the expression:
> m = m * m * delta >> 14;
> ...will always be 0.
>
> To do so, we just need to make sure that abs(delta) * thresh >> 16 to be
> always >= 127.
>
> We currently have the threshold in the range from (1<<15)/0.51 to
> (1<<15)/255, with [0.51;255] being the range the user can specify. The
> higher the user specified threshold, the closer we get to 127 (from +oo).
>
> We need to solve:
>
> delta * ((1<<15)/x) >> 16 >= 127 (based on (1)),
> with delta > 0x3fff and x >= 0.51
>
> => delta * (1<<15)/x / (1<<16) >= 127
> delta / (2x) >= 127
>
> so we have 0.51 <= x <= 2 * delta / 127
>
Here is where is my mistake, the solution is 0.51 <= x <= 129/2, which is
indeed not in the range.
Does a threshold > 64 makes sense? If so, Michael proposed to fallback on
the C version in these cases. What do people prefer?
--
Clément B.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 490 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20121218/687cd5c7/attachment.asc>
More information about the ffmpeg-devel
mailing list