[FFmpeg-devel] Patch: Inline asm fixes for Intel compiler on Windows

Michael Niedermayer michaelni at gmx.at
Mon Mar 17 00:26:05 CET 2014


On Sun, Mar 16, 2014 at 03:10:40PM +1100, Matt Oliver wrote:
> > In the very first patch you seem to have a space at the start of the
> > line before "extern"?
> 
> 
> Fixed, thanks.
> 
> > And for the CLTD patch: inline asm code is normal strings, there is no
> > point in writing
> > ""CLTD
> > just use
> > CLTD
> 
> 
> Fixed. I knew the double quotes were unnecessary, but I initially left them
> in to make it very clear to other devs that CLTD is a string. Of course,
> any dev playing around with the asm would probably have worked that out
> anyway, so they are removed.
> 
> The patches have been rebased again. Recent configure changes moved the way
> items are listed, so the changes are now added to TOOLCHAIN_FEATURES
> instead of HAVE_LIST directly, in line with those recent upstream
> modifications (the upstream merge of Diego's "K&R formatting cosmetics"
> also necessitated another rebase). Again, all of them are included together
> for simplicity. Constantly rebasing and fixing merge conflicts is time
> consuming, so if there is no interest in this then let me know so I can save
> some effort.
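
To illustrate the CLTD point discussed above: adjacent C string literals are concatenated at compile time, so a macro that expands to a string literal can sit directly in the asm template, and a leading "" adds nothing. The sketch below is only an illustration; the definition of CLTD as "cltd" and the names with_empty, without_empty and sketch_check_cltd are assumptions made for this example, not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Assumed definition, for illustration only; the real macro may expand
 * to a different mnemonic depending on the toolchain. */
#define CLTD "cltd"

/* Adjacent string literals are merged by the compiler, so the empty
 * literal in front of the macro contributes no characters. */
static const char with_empty[]    = "" CLTD "\n\t";
static const char without_empty[] =    CLTD "\n\t";

/* Hypothetical helper: confirms both spellings give the same template. */
static void sketch_check_cltd(void)
{
    assert(strcmp(with_empty, without_empty) == 0);
    assert(sizeof(with_empty) == sizeof(without_empty));
}
```

Both arrays hold the same bytes, which is why the plain CLTD spelling is sufficient inside an __asm__ template.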

[...]
> diff --git a/libavcodec/x86/h264_i386.h b/libavcodec/x86/h264_i386.h
> index f6a8043..da112c4 100644
> --- a/libavcodec/x86/h264_i386.h
> +++ b/libavcodec/x86/h264_i386.h
> @@ -55,6 +55,7 @@ static int decode_significance_x86(CABACContext *c, int max_coeff,
>      __asm__ volatile(
>          "lea   "MANGLE(ff_h264_cabac_tables)", %0      \n\t"
>          : "=&r"(tables)
> +        : NAMED_CONSTRAINTS(ff_h264_cabac_tables)
>      );
>  #endif
>  
> @@ -130,6 +131,7 @@ static int decode_significance_8x8_x86(CABACContext *c,
>      __asm__ volatile(
>          "lea    "MANGLE(ff_h264_cabac_tables)", %0      \n\t"
>          : "=&r"(tables)
> +        : NAMED_CONSTRAINTS(ff_h264_cabac_tables)
>      );
>  #endif
>  
> diff --git a/libavcodec/x86/idct_sse2_xvid.c b/libavcodec/x86/idct_sse2_xvid.c
> index af4790c..e1878fa 100644
> --- a/libavcodec/x86/idct_sse2_xvid.c
> +++ b/libavcodec/x86/idct_sse2_xvid.c
> @@ -147,7 +147,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = {
>  
>  #endif
>  
> -#define ROUND(x) "paddd   "MANGLE(x)
> +#define ROUND(x) "paddd   "x
>  
>  #define JZ(reg, to)                         \
>      "testl     "reg","reg"            \n\t" \
> @@ -347,13 +347,13 @@ inline void ff_idct_xvid_sse2(short *block)
>  {
>      __asm__ volatile(
>      "movq     "MANGLE(m127)", %%mm0                              \n\t"
> -    iMTX_MULT("(%0)",     MANGLE(iTab1), ROUND(walkenIdctRounders),      PUT_EVEN(ROW0))
> -    iMTX_MULT("1*16(%0)", MANGLE(iTab2), ROUND(walkenIdctRounders+1*16), PUT_ODD(ROW1))
> -    iMTX_MULT("2*16(%0)", MANGLE(iTab3), ROUND(walkenIdctRounders+2*16), PUT_EVEN(ROW2))
> +    iMTX_MULT("(%0)",     MANGLE(iTab1), ROUND(MANGLE(walkenIdctRounders)),      PUT_EVEN(ROW0))
> +    iMTX_MULT("1*16(%0)", MANGLE(iTab2), ROUND("1*16+"MANGLE(walkenIdctRounders)), PUT_ODD(ROW1))
> +    iMTX_MULT("2*16(%0)", MANGLE(iTab3), ROUND("2*16+"MANGLE(walkenIdctRounders)), PUT_EVEN(ROW2))
>  
>      TEST_TWO_ROWS("3*16(%0)", "4*16(%0)", "%%eax", "%%ecx", CLEAR_ODD(ROW3), CLEAR_EVEN(ROW4))
>      JZ("%%eax", "1f")
> -    iMTX_MULT("3*16(%0)", MANGLE(iTab4), ROUND(walkenIdctRounders+3*16), PUT_ODD(ROW3))
> +    iMTX_MULT("3*16(%0)", MANGLE(iTab4), ROUND("3*16+"MANGLE(walkenIdctRounders)), PUT_ODD(ROW3))
>  
>      TEST_TWO_ROWS("5*16(%0)", "6*16(%0)", "%%eax", "%%edx", CLEAR_ODD(ROW5), CLEAR_EVEN(ROW6))
>      TEST_ONE_ROW("7*16(%0)", "%%esi", CLEAR_ODD(ROW7))
> @@ -368,20 +368,20 @@ inline void ff_idct_xvid_sse2(short *block)
>      "2:                                                          \n\t"
>      iMTX_MULT("4*16(%0)", MANGLE(iTab1), "#", PUT_EVEN(ROW4))
>      "3:                                                          \n\t"
> -    iMTX_MULT("5*16(%0)", MANGLE(iTab4), ROUND(walkenIdctRounders+4*16), PUT_ODD(ROW5))
> +    iMTX_MULT("5*16(%0)", MANGLE(iTab4), ROUND("4*16+"MANGLE(walkenIdctRounders)), PUT_ODD(ROW5))
>      JZ("%%edx", "1f")
>      "4:                                                          \n\t"
> -    iMTX_MULT("6*16(%0)", MANGLE(iTab3), ROUND(walkenIdctRounders+5*16), PUT_EVEN(ROW6))
> +    iMTX_MULT("6*16(%0)", MANGLE(iTab3), ROUND("5*16+"MANGLE(walkenIdctRounders)), PUT_EVEN(ROW6))
>      JZ("%%esi", "1f")
>      "5:                                                          \n\t"
> -    iMTX_MULT("7*16(%0)", MANGLE(iTab2), ROUND(walkenIdctRounders+5*16), PUT_ODD(ROW7))
> +    iMTX_MULT("7*16(%0)", MANGLE(iTab2), ROUND("5*16+"MANGLE(walkenIdctRounders)), PUT_ODD(ROW7))

The change to ROUND could be in a separate patch.
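
For context on what the ROUND change does at the string level, here is a minimal sketch. It assumes that, on compilers without direct symbol references, MANGLE() expands to a named asm operand of the form "%[name]"; that assumption, and the SKETCH_* and ROUND_OLD/ROUND_NEW names, are hypothetical stand-ins and not the real FFmpeg definitions. Under that assumption, mangling "symbol+offset" as a whole would put the offset inside the operand name, while the new form keeps the offset outside the mangled symbol.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins, NOT FFmpeg's real macros. */
#define SKETCH_MANGLE_DIRECT(a) #a           /* direct symbol reference   */
#define SKETCH_MANGLE_NAMED(a)  "%[" #a "]"  /* assumed named-operand form */

/* Old shape: ROUND mangles its whole argument, offset included. */
#define ROUND_OLD(mangle, x) "paddd   " mangle(x)
/* New shape: the caller mangles the bare symbol and prepends the offset. */
#define ROUND_NEW(x) "paddd   " x

static const char *old_named(void)
{
    return ROUND_OLD(SKETCH_MANGLE_NAMED, walkenIdctRounders+1*16);
}

static const char *new_named(void)
{
    return ROUND_NEW("1*16+" SKETCH_MANGLE_NAMED(walkenIdctRounders));
}

static const char *new_direct(void)
{
    return ROUND_NEW("1*16+" SKETCH_MANGLE_DIRECT(walkenIdctRounders));
}

/* Hypothetical helper: the offset ends up inside the brackets only in
 * the old shape. */
static void sketch_check_round(void)
{
    assert(strstr(old_named(), "+1*16]") != NULL);
    assert(strstr(new_named(), "1*16+%[") != NULL);
}
```

With these stand-ins, old_named() yields "paddd   %[walkenIdctRounders+1*16]", whereas new_named() yields "paddd   1*16+%[walkenIdctRounders]" and new_direct() yields "paddd   1*16+walkenIdctRounders".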

[...]

>  mpegvideoenc_template.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 2b12c0ec5941fd82fe2c38c5646db8c6583ad574  0005-5-6-Fixed-64bit-conformance-with-mvzbl.patch
> From 754b9dc487260dedfae4be38c88dc60734561f67 Mon Sep 17 00:00:00 2001
> From: Matt Oliver <protogonoi at gmail.com>
> Date: Sun, 9 Feb 2014 17:10:12 +1100
> Subject: [PATCH 5/6] [5/6] Fixed 64bit conformance with mvzbl.

patch applied

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus