[FFmpeg-devel] [PATCH] MMX VP3 Loop Filter
Sun Oct 12 02:40:25 CEST 2008
On Oct 11, 2008, at 5:14 AM, Jason Garrett-Glaser wrote:
> On Sat, Oct 11, 2008 at 1:53 AM, David Conrad <lessen42 at gmail.com> wrote:
>> filter_limit *= 0x02020202;
>> "movd "#flim", %%mm5 \n\t" \
>> "punpcklbw %%mm5, %%mm5 \n\t" \
> Which is faster, this, or SPLATB in the form of punpcklbw + pshufw +
> psllw (psllw because the filter_limit values are guaranteed to be <
> 128, so a word left shift is equivalent to a byte left shift)?
> The SPLATB would avoid the integer multiply, and perhaps also as
> importantly avoid the register->mm move, since you'll be able to load
> it directly off the stack.
I couldn't measure any difference between these, but I'm
precalculating the *2 and loading it from memory now anyway.
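
For readers following along, here is a scalar C sketch of why the two
approaches in the quoted text give the same packed value. It is not code
from the patch: the function names splat2_mul and splat2_shift are
hypothetical, and a 32-bit integer stands in for the low half of an MMX
register. It assumes, as stated above, that filter_limit is below 128.

```c
#include <stdint.h>

/* Variant 1: one integer multiply puts 2*flim in each of the four bytes.
 * Assumes flim < 128 so that 2*flim still fits in a byte. */
static uint32_t splat2_mul(uint32_t flim)
{
    return flim * 0x02020202u;
}

/* Variant 2: replicate the byte first, then shift each 16-bit lane left by
 * one, as psllw would. The lane mask discards bits shifted out of a word. */
static uint32_t splat2_shift(uint32_t flim)
{
    uint32_t splat = flim | (flim << 8) | (flim << 16) | (flim << 24);
    uint32_t lo = ((splat & 0xFFFFu) << 1) & 0xFFFFu;
    uint32_t hi = ((splat >> 16) << 1) & 0xFFFFu;
    return (hi << 16) | lo;
}
```

For any flim from 0 to 127 both functions return the same value, because
the top bit of every byte is clear and nothing carries into the
neighbouring byte. From 128 upward that no longer holds, and neither
result matches a true per-byte shift.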