[FFmpeg-devel] [PATCH] MMX2/SSSE3 VC1 loop filter
David Conrad
lessen42
Mon Jul 5 23:25:45 CEST 2010
On Jul 5, 2010, at 5:10 PM, Jason Garrett-Glaser wrote:
> On Mon, Jul 5, 2010 at 2:04 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> Hi,
>>
>> On Mon, Jul 5, 2010 at 5:02 PM, Jason Garrett-Glaser
>> <darkshikari at gmail.com> wrote:
>>> On Mon, Jul 5, 2010 at 1:30 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>>> On Mon, Jul 5, 2010 at 1:44 AM, David Conrad <lessen42 at gmail.com> wrote:
>>>>> +%macro STORE_4_WORDS_SSE4 6
>>>>> + pextrw %1, %5, %6+0
>>>>> + pextrw %2, %5, %6+1
>>>>> + pextrw %3, %5, %6+2
>>>>> + pextrw %4, %5, %6+3
>>>>> +%endmacro
>>>> [..]
>>>
>>> I don't recall pextrw being SSE4...
>>
>> Awesome, I should be able to use that for VP8 simple H loopfilter as
>> well (it's mmxext/sse2).
>
> Make sure it's actually faster; do benches on a few CPUs. pextr/pinsr
> are not always as fast as one wants.
Indeed, IIRC pextrw to a register (i.e. the pre-SSE4 form) was slower than doing lots of shift+mov.
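
For context on the naming question above: pextrw with a register destination has been available since MMX2/SSE2, while the form that writes straight to memory was added in SSE4.1, which is presumably why the macro carries the SSE4 suffix. Below is a rough C sketch of the two pre-SSE4 strategies being compared, written with SSE2 intrinsics rather than the yasm syntax of the patch. The function names, the dst/stride layout, and the self-check are made up for illustration and are not taken from the patch. It only shows that the two variants store the same bytes; which one is faster has to be benchmarked per CPU, as Jason says.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Variant A (hypothetical name): one pextrw to a register per word,
 * each followed by a 16-bit store. */
static void store_4_words_pextrw(uint8_t *dst, size_t stride, __m128i v)
{
    uint16_t w[4] = {
        (uint16_t)_mm_extract_epi16(v, 0),
        (uint16_t)_mm_extract_epi16(v, 1),
        (uint16_t)_mm_extract_epi16(v, 2),
        (uint16_t)_mm_extract_epi16(v, 3),
    };
    for (int i = 0; i < 4; i++)
        memcpy(dst + i * stride, &w[i], 2);
}

/* Variant B (hypothetical name): two movd transfers to general-purpose
 * registers, then shift+mov to split each dword into its two words. */
static void store_4_words_shift(uint8_t *dst, size_t stride, __m128i v)
{
    uint32_t lo = (uint32_t)_mm_cvtsi128_si32(v);                    /* words 0, 1 */
    uint32_t hi = (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(v, 4)); /* words 2, 3 */
    uint16_t w[4] = {
        (uint16_t)lo, (uint16_t)(lo >> 16),
        (uint16_t)hi, (uint16_t)(hi >> 16),
    };
    for (int i = 0; i < 4; i++)
        memcpy(dst + i * stride, &w[i], 2);
}

/* Sanity check: both variants must write identical bytes. */
static void check_variants_agree(void)
{
    uint8_t a[32] = {0}, b[32] = {0};
    __m128i v = _mm_setr_epi16(0x1122, 0x3344, 0x5566, 0x7788, 0, 0, 0, 0);
    store_4_words_pextrw(a, 8, v);
    store_4_words_shift(b, 8, v);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Timing the two variants in a loop on each target CPU would be the way to settle which one to keep.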