[FFmpeg-cvslog] r25254 - trunk/libavcodec/x86/h264dsp_mmx.c

Ronald S. Bultje rsbultje
Wed Sep 29 20:10:23 CEST 2010


Hi,

On Wed, Sep 29, 2010 at 1:41 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> On Wed, Sep 29, 2010 at 1:11 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> On Wed, Sep 29, 2010 at 12:19:45PM -0400, Ronald S. Bultje wrote:
>>> On Wed, Sep 29, 2010 at 12:03 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>> > On Wed, Sep 29, 2010 at 04:02:32PM +0200, rbultje wrote:
>>> >> Author: rbultje
>>> >> Date: Wed Sep 29 16:02:32 2010
>>> >> New Revision: 25254
>>> >>
>>> >> Log:
>>> >> Remove d_idx as a variable, and instead load it as a constant in the asm.
>>> >> This has no measurable speed effect because the surrounding code doesn't
>>> >> take advantage of this yet.
>>> > [...]
>>> >> @@ -125,34 +124,41 @@ static av_always_inline void h264_loop_f
>>> >>                          "por           %%mm1, %%mm0 \n"
>>> >>                          "pshufw $0x4E, %%mm0, %%mm1 \n"
>>> >>                          "pminub        %%mm1, %%mm0 \n"
>>> >> -                        ::"r"(d_idx),
>>> >> -                          "r"(ref[0]+b_idx),
>>> >> -                          "r"(mv[0]+b_idx)
>>> >> +                        ::"r"(ref[0]+b_idx),
>>> >> +                          "r"(mv[0]+b_idx),
>>> >> +                          "i"(d_idx),
>>> >> +                          "i"(d_idx+40),
>>> >> +                          "i"(d_idx*4),
>>> >> +                          "i"(d_idx*4+8),
>>> >> +                          "i"(d_idx*4+160),
>>> >> +                          "i"(d_idx*4+168)
>>> >
>>> > It appears that some gccs have difficulty with constant propagation, so I
>>> > suspect that this has to be changed to a macro instead of an always_inline
>>> > function
>>>
>>> Grmbl, starting to think yasm is easier after all... Patch attached.
>>
>> I am starting to think that too in this case ...
>> either way patch ok
>
> Let's leave it as-is now, it was 3 cycles faster after all... If more
> stuff breaks or becomes strange, we can change it whenever we want...
> I was positively surprised that clang/icc worked with this code. :-).

Hmm, sunCC is still unhappy...

http://fate.ffmpeg.org/x86_32-linux-suncc-5.11/20100929174548/compile
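
For anyone following along, here is a rough sketch of the macro route Michael
mentions (this is not the attached patch). The "i" constraint needs an operand
the compiler can prove constant. Inside an always_inline function that depends
on constant propagation after inlining, whereas a macro pastes the literal
expression straight into the asm operand. LOAD_IMM, SCALED_OFFSET, the two
wrapper functions and the d_idx values -1 and -8 are all made up for
illustration, and the non-x86 branch is a plain C fallback.

```c
/* Hypothetical helper: load a compile-time constant through an "i"
 * (immediate) operand. On x86 with GCC-style inline asm this emits a
 * single movl with the constant encoded in the instruction. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define LOAD_IMM(dst, value) \
    __asm__ ("movl %1, %0" : "=r"(dst) : "i"(value))
#else
/* Assumed fallback for other targets: ordinary assignment. */
#define LOAD_IMM(dst, value) ((dst) = (value))
#endif

/* Because this is a macro, d_idx is substituted textually, so the
 * operand is an integer constant expression at every call site and
 * does not rely on the optimizer propagating a function argument. */
#define SCALED_OFFSET(dst, d_idx) LOAD_IMM(dst, (d_idx) * 4 + 8)

/* One expansion per direction, each with its own literal d_idx. */
static int offset_for_dir0(void)
{
    int off;
    SCALED_OFFSET(off, -1);   /* operand becomes (-1) * 4 + 8 */
    return off;
}

static int offset_for_dir1(void)
{
    int off;
    SCALED_OFFSET(off, -8);   /* operand becomes (-8) * 4 + 8 */
    return off;
}
```

The cost is the usual one for macros: no type checking on the arguments, and
the body gets duplicated at every expansion.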

Ronald


