[FFmpeg-devel] [PATCH] Make MMX2 put_no_rnd_pixels _x2 and _y2 bitexact
Sat May 29 00:25:37 CEST 2010
On 05/28/2010 11:49 PM, David Conrad wrote:
> The mmx2/3dnow put_no_rnd functions don't always round correctly, since they compensate for the +1 in pavgb by subtracting 1 from one of the inputs. This causes our theora decoder to not be bitexact to libtheora, though I haven't found any real source where the error accumulates enough to be visible.
> This fixes it by using the property that (a+b)>>1 is equivalent to ~((~a+~b+1)>>1), i.e. the complement of pavgb applied to the complemented inputs. This makes these functions 5 cycles slower on my Penryn, but on my Atom the additional instructions appear to be free, probably due to load stalls.
Wouldn't it be worth creating new bitexact functions, but still
override them with the old/faster ones if BITEXACT is not set?
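
For reference, the identity from the patch description and the override idea above can be checked with a small scalar model. This is only a sketch, not code from the patch: pavgb() models the instruction on a single byte, avg_no_rnd_old() assumes the old code subtracts 1 with saturation (psubusb-style) from one input, and select_avg_no_rnd() is a hypothetical selector standing in for whatever the dsputil init code would do when BITEXACT is set.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of pavgb on one byte: average that rounds up. */
static uint8_t pavgb(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Reference result: truncating ("no_rnd") average. */
static uint8_t avg_no_rnd_ref(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) >> 1);
}

/* ASSUMPTION: model of the old approach, subtracting 1 with
 * saturation from one input before pavgb. */
static uint8_t avg_no_rnd_old(uint8_t a, uint8_t b)
{
    uint8_t b1 = b ? (uint8_t)(b - 1) : 0;
    return pavgb(a, b1);
}

/* Complement trick: (a+b)>>1 == ~((~a + ~b + 1) >> 1) on bytes. */
static uint8_t avg_no_rnd_exact(uint8_t a, uint8_t b)
{
    return (uint8_t)~pavgb((uint8_t)~a, (uint8_t)~b);
}

typedef uint8_t (*avg_fn)(uint8_t, uint8_t);

/* Count input pairs where f disagrees with the reference. */
static int count_mismatches(avg_fn f)
{
    int n = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (f((uint8_t)a, (uint8_t)b) !=
                avg_no_rnd_ref((uint8_t)a, (uint8_t)b))
                n++;
    return n;
}

/* HYPOTHETICAL selector: exact version when bitexact is requested,
 * otherwise the old/faster one. */
static avg_fn select_avg_no_rnd(int bitexact)
{
    return bitexact ? avg_no_rnd_exact : avg_no_rnd_old;
}

/* Self-check, callable from main() or a test harness. */
static void self_check(void)
{
    assert(count_mismatches(avg_no_rnd_exact) == 0);
    assert(count_mismatches(avg_no_rnd_old) > 0);
}
```

Under this model the old variant only goes wrong when the saturating subtract clamps at 0 (for example a=1, b=0 gives 1 instead of 0), which would fit the observation that the error rarely accumulates enough to be visible.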