[FFmpeg-devel] [PATCH] enable auto vectorization for gcc 7 and higher

James Almer jamrial at gmail.com
Wed Jul 27 20:39:22 EEST 2022

On 7/27/2022 2:34 PM, Swinney, Jonathan wrote:
> I recognize that this patch is going to be somewhat controversial. I'm submitting it mostly to see what the opinions are and evaluate options. I am working on improving performance for aarch64. On that architecture, there are fewer hand-written assembly implementations of hot functions than there are for x86_64, and allowing gcc to auto-vectorize yields noticeable improvements.
> Gcc vectorization has improved recently and it hasn't been evaluated on the mailing list for a few years. This is the latest discussion I found in my searches: http://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html

Every time this was done, it was inevitably reverted after complaints and 
crash reports started piling up, because gcc can't really handle all the 
inline code our codebase has, among other things.

> If the community is not comfortable accepting a patch like this outright, would you be willing to accept a new option to the configure script, something like --enable-auto-vectorization?

--extra-cflags can be used for this.

> Thanks!
> Signed-off-by: Jonathan Swinney <jswinney at amazon.com>
> ---
>   configure | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> diff --git a/configure b/configure
> index 6629d14099..c63c9348ad 100755
> --- a/configure
> +++ b/configure
> @@ -7173,7 +7173,9 @@ if enabled icc; then
>               disable aligned_stack
>       fi
>   elif enabled gcc; then
> -    check_optflags -fno-tree-vectorize
> +    case $gcc_basever in
> +        2|2.*|3.*|4.*|5.*|6.*) check_optflags -fno-tree-vectorize ;;
> +    esac
>       check_cflags -Werror=format-security
>       check_cflags -Werror=implicit-function-declaration
>       check_cflags -Werror=missing-prototypes
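
To make the effect of the version check in the quoted hunk easier to see, here is a minimal standalone sketch of the same case patterns. The function name wants_no_tree_vectorize and the sample version strings are hypothetical and not part of configure; the only thing taken from the patch is the pattern list that $gcc_basever is matched against.

```sh
#!/bin/sh
# Hypothetical helper, not part of configure: succeeds (returns 0) when the
# version string matches the patterns from the patch, i.e. the cases where
# -fno-tree-vectorize would still be added.
wants_no_tree_vectorize() {
    case $1 in
        2|2.*|3.*|4.*|5.*|6.*) return 0 ;;
    esac
    return 1
}

# Sample version strings chosen for illustration only.
for v in 4.8.5 6.3.0 7 7.5.0 12.2.0; do
    if wants_no_tree_vectorize "$v"; then
        echo "$v: add -fno-tree-vectorize"
    else
        echo "$v: leave vectorization at the compiler default"
    fi
done
```

Note that the patterns are plain shell globs on the version string, so a two-digit major version such as 12.2.0 does not match 2.* and falls through to the default branch.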
