[FFmpeg-devel] [PATCH] Detect and check for CMOV.

Michael Niedermayer michaelni at gmx.at
Sat Feb 11 21:15:11 CET 2012


On Sat, Feb 11, 2012 at 04:07:10PM +0100, Reimar Döffinger wrote:
> Some MMX-only CPUs do not have support for CMOV.
> All SSE/MMX2 CPUs should be fine, thus no check was
> added to those functions.
> See also https://sourceforge.net/tracker/?func=detail&aid=3358347&group_id=205275&atid=992986
> 
> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> ---
>  libavcodec/x86/h264_intrapred_init.c |    3 ++-
>  libavcodec/x86/h264dsp_mmx.c         |    3 ++-
>  libavutil/cpu.h                      |    1 +
>  libavutil/x86/cpu.c                  |    2 ++
>  4 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/libavcodec/x86/h264_intrapred_init.c b/libavcodec/x86/h264_intrapred_init.c
> index 540ec87..58740e2 100644
> --- a/libavcodec/x86/h264_intrapred_init.c
> +++ b/libavcodec/x86/h264_intrapred_init.c
> @@ -188,7 +188,8 @@ void ff_h264_pred_init_x86(H264PredContext *h, int codec_id, const int bit_depth
>                  if (chroma_format_idc == 1)
>                      h->pred8x8  [PLANE_PRED8x8] = ff_pred8x8_plane_mmx;
>                  if (codec_id == CODEC_ID_SVQ3) {
> -                    h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_svq3_mmx;
> +                    if (mm_flags & AV_CPU_FLAG_CMOV)
> +                        h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_svq3_mmx;
>                  } else if (codec_id == CODEC_ID_RV40) {
>                      h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_rv40_mmx;
>                  } else {
> diff --git a/libavcodec/x86/h264dsp_mmx.c b/libavcodec/x86/h264dsp_mmx.c
> index b337462..063e3de 100644
> --- a/libavcodec/x86/h264dsp_mmx.c
> +++ b/libavcodec/x86/h264dsp_mmx.c
> @@ -361,7 +361,8 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, const int chrom
>          if (chroma_format_idc == 1)
>              c->h264_idct_add8       = ff_h264_idct_add8_8_mmx;
>          c->h264_idct_add16intra     = ff_h264_idct_add16intra_8_mmx;
> -        c->h264_luma_dc_dequant_idct= ff_h264_luma_dc_dequant_idct_mmx;
> +        if (mm_flags & AV_CPU_FLAG_CMOV)
> +            c->h264_luma_dc_dequant_idct= ff_h264_luma_dc_dequant_idct_mmx;
>  
>          if (mm_flags & AV_CPU_FLAG_MMX2) {
>              c->h264_idct_dc_add    = ff_h264_idct_dc_add_8_mmx2;
> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> index 5f7eed2..564d76f 100644
> --- a/libavutil/cpu.h
> +++ b/libavutil/cpu.h
> @@ -38,6 +38,7 @@
>  #define AV_CPU_FLAG_SSE4         0x0100 ///< Penryn SSE4.1 functions
>  #define AV_CPU_FLAG_SSE42        0x0200 ///< Nehalem SSE4.2 functions
>  #define AV_CPU_FLAG_AVX          0x4000 ///< AVX functions: requires OS support even if YMM registers aren't used
> +#define AV_CPU_FLAG_CMOV        0x10000 ///< supports cmov instruction
>  #define AV_CPU_FLAG_XOP          0x0400 ///< Bulldozer XOP functions
>  #define AV_CPU_FLAG_FMA4         0x0800 ///< Bulldozer FMA4 functions
>  #define AV_CPU_FLAG_IWMMXT       0x0100 ///< XScale IWMMXT

Please use a value more distant from the existing ones, so that the
chance of ABI conflicts is reduced.
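
To illustrate the point about spacing, here is a minimal sketch (not part
of the patch) that checks a candidate flag value against the x86 flags
visible in the quoted hunk. EXAMPLE_CPU_FLAG_CMOV and its value,
EXAMPLE_KNOWN_X86_FLAGS and flag_is_free() are hypothetical names chosen
for illustration only; the real cpu.h defines more flags than the hunk
shows.

```c
#include <stdint.h>

/* x86 flag values exactly as they appear in the quoted cpu.h hunk. */
#define AV_CPU_FLAG_SSE4   0x0100
#define AV_CPU_FLAG_SSE42  0x0200
#define AV_CPU_FLAG_XOP    0x0400
#define AV_CPU_FLAG_FMA4   0x0800
#define AV_CPU_FLAG_AVX    0x4000

/* HYPOTHETICAL value, picked only to show a bit that sits well away
 * from the ones above. It is an assumption, not the value that was
 * eventually committed. */
#define EXAMPLE_CPU_FLAG_CMOV 0x1000000

/* Mask of the flags listed above (assumption: incomplete on purpose,
 * since only the quoted hunk is known here). */
#define EXAMPLE_KNOWN_X86_FLAGS \
    (AV_CPU_FLAG_SSE4 | AV_CPU_FLAG_SSE42 | AV_CPU_FLAG_XOP | \
     AV_CPU_FLAG_FMA4 | AV_CPU_FLAG_AVX)

/* Returns 1 if `flag` is exactly one bit and that bit is not set in
 * `taken`, 0 otherwise. */
static int flag_is_free(uint32_t flag, uint32_t taken)
{
    int single_bit = flag != 0 && (flag & (flag - 1)) == 0;
    return single_bit && (flag & taken) == 0;
}
```

Note that flag_is_free(0x10000, EXAMPLE_KNOWN_X86_FLAGS) also returns 1:
the proposed value does not collide with anything today. The request is
about leaving room, so that a flag added independently elsewhere is less
likely to land on the same bit.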

The rest of the patch LGTM if you volunteer to maintain it.
(Maintaining here probably means reverting it in case someone changes
 the asm so that it works without cmov.)
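
For reference, since the cpu.c hunk is not quoted above: a minimal sketch
of how CMOV support can be queried on x86, assuming GCC or Clang and their
<cpuid.h> helper __get_cpuid(). CPUID leaf 1 reports CMOV in EDX bit 15.
EXAMPLE_CPUID_EDX_CMOV and example_cpu_has_cmov() are hypothetical names;
this is not the code from the patch.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID leaf 1, EDX bit 15: conditional move (CMOV) support. */
#define EXAMPLE_CPUID_EDX_CMOV (1u << 15)

/* Returns 1 if the CPU reports CMOV, 0 otherwise. On non-x86 builds,
 * or if CPUID leaf 1 is unavailable, this conservatively returns 0. */
static int example_cpu_has_cmov(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx & EXAMPLE_CPUID_EDX_CMOV) != 0;
#else
    return 0;
#endif
}
```

A caller would set the new flag only when this returns 1, and the init
functions in the patch would then skip the cmov-using asm on CPUs where
the flag is absent.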

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data