[FFmpeg-devel] [RFC] Loop unrolling in C code for 'vector_fmul_*' functions

Alexander Strange astrange
Mon Apr 21 00:08:22 CEST 2008


On Sun, Apr 20, 2008 at 6:01 PM, Siarhei Siamashka
<siarhei.siamashka at gmail.com> wrote:
> [..]
>
>  Getting back to this issue.
>
>  It is good that I did not submit a report to the gcc devels, otherwise I would
>  have made an idiot of myself by submitting an invalid report :)
>
>  The problem is that
>
>  void vector_fmul_c_unrolled(float *dst, const float *src, int len)
>  {
>     int i;
>     for(i = 0; i < len; i += 8) {
>         dst[i + 0] *= src[i + 0];
>         dst[i + 1] *= src[i + 1];
>         dst[i + 2] *= src[i + 2];
>         dst[i + 3] *= src[i + 3];
>         dst[i + 4] *= src[i + 4];
>         dst[i + 5] *= src[i + 5];
>         dst[i + 6] *= src[i + 6];
>         dst[i + 7] *= src[i + 7];
>     }
>  }
>
>  and
>
>  void vector_fmul_c_other_unrolled(float *dst, const float *src, int len)
>  {
>     int i;
>     register float tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7;
>     for(i = 0; i < len; i += 8) {
>         tmp0 = src[i + 0];
>         tmp1 = src[i + 1];
>         tmp2 = src[i + 2];
>         tmp3 = src[i + 3];
>         tmp4 = src[i + 4];
>         tmp5 = src[i + 5];
>         tmp6 = src[i + 6];
>         tmp7 = src[i + 7];
>         dst[i + 0] *= tmp0;
>         dst[i + 1] *= tmp1;
>         dst[i + 2] *= tmp2;
>         dst[i + 3] *= tmp3;
>         dst[i + 4] *= tmp4;
>         dst[i + 5] *= tmp5;
>         dst[i + 6] *= tmp6;
>         dst[i + 7] *= tmp7;
>     }
>  }
>
>  are not actually identical.
>
>  The compiler has to allow for the case where the 'dst' and 'src'
>  buffers overlap: a store to dst[i + 0] might change src[i + 1], so
>  each load has to stay after the preceding store. That makes it
>  impossible to schedule the instructions in 'vector_fmul_c_unrolled'
>  the way they are ordered in 'vector_fmul_c_other_unrolled'.
>
>  The fact that 'dst' and 'src' buffers don't overlap is one more useful
>  constraint which can be exploited when doing optimizations.
>
>  Those who are interested in this issue can look at the '-fargument-alias',
>  '-fargument-noalias' and '-fargument-noalias-global' gcc options.
>
>  Too bad that I did not find any gcc function attribute that tells the
>  compiler that the pointer arguments of one particular function do not
>  alias. The options above apply to the whole project, which risks
>  breaking something.
>
>  Anyway, at least in this case gcc was not at fault :)
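
To make the difference concrete, here is a minimal sketch that runs both
quoted functions on overlapping buffers. The two function bodies are copied
from the quote; 'fmul_fn' and 'run_overlapped' are hypothetical helpers
added only for this illustration, and the choice of 'dst' sitting one
element ahead of 'src' is an assumption picked to make the effect visible.

```c
/* The two functions quoted above, reproduced unchanged. */
void vector_fmul_c_unrolled(float *dst, const float *src, int len)
{
    int i;
    for (i = 0; i < len; i += 8) {
        dst[i + 0] *= src[i + 0];
        dst[i + 1] *= src[i + 1];
        dst[i + 2] *= src[i + 2];
        dst[i + 3] *= src[i + 3];
        dst[i + 4] *= src[i + 4];
        dst[i + 5] *= src[i + 5];
        dst[i + 6] *= src[i + 6];
        dst[i + 7] *= src[i + 7];
    }
}

void vector_fmul_c_other_unrolled(float *dst, const float *src, int len)
{
    int i;
    register float tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7;
    for (i = 0; i < len; i += 8) {
        tmp0 = src[i + 0];
        tmp1 = src[i + 1];
        tmp2 = src[i + 2];
        tmp3 = src[i + 3];
        tmp4 = src[i + 4];
        tmp5 = src[i + 5];
        tmp6 = src[i + 6];
        tmp7 = src[i + 7];
        dst[i + 0] *= tmp0;
        dst[i + 1] *= tmp1;
        dst[i + 2] *= tmp2;
        dst[i + 3] *= tmp3;
        dst[i + 4] *= tmp4;
        dst[i + 5] *= tmp5;
        dst[i + 6] *= tmp6;
        dst[i + 7] *= tmp7;
    }
}

/* Hypothetical helper type, not part of the original post. */
typedef void (*fmul_fn)(float *dst, const float *src, int len);

/* Hypothetical helper: fill buf[0..8] with 2.0f, then call fn with
 * dst = buf + 1 and src = buf, so the two ranges overlap by 7 floats. */
void run_overlapped(fmul_fn fn, float buf[9])
{
    int i;
    for (i = 0; i < 9; i++)
        buf[i] = 2.0f;
    fn(buf + 1, buf, 8);
}
```

With 'vector_fmul_c_unrolled', each store feeds the next load, so the
products accumulate along the buffer. With 'vector_fmul_c_other_unrolled',
all eight loads happen before any store, so every element is just multiplied
by the original 2.0f. A compiler that cannot rule out this kind of call is
not allowed to turn the first function into the second.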

The C99 keyword "restrict" will do this. gcc has some problems with it -
it's ignored for char*, so we can't use it to fix cases like
get_cabac* where char* aliasing causes a lot of unnecessary stores -
but it might work here. If not, you can sometimes fix it by inlining
the function into a caller where the original definitions of src and
dst are both visible.
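
A minimal sketch of what that could look like, assuming a C99 compiler. The
name 'vector_fmul_c_restrict' is made up for this example and is not an
existing FFmpeg function. Whether gcc actually schedules the loads ahead of
the stores here has not been measured, so check the generated assembly.

```c
/* 'restrict' on both pointers is a promise made by the caller that the
 * memory accessed through dst is not also accessed through src.
 * Calling this with overlapping buffers is undefined behavior.
 * As in the quoted functions, len is assumed to be a multiple of 8. */
void vector_fmul_c_restrict(float *restrict dst,
                            const float *restrict src, int len)
{
    int i;
    for (i = 0; i < len; i += 8) {
        dst[i + 0] *= src[i + 0];
        dst[i + 1] *= src[i + 1];
        dst[i + 2] *= src[i + 2];
        dst[i + 3] *= src[i + 3];
        dst[i + 4] *= src[i + 4];
        dst[i + 5] *= src[i + 5];
        dst[i + 6] *= src[i + 6];
        dst[i + 7] *= src[i + 7];
    }
}
```

Outside C99 mode gcc also accepts the spelling '__restrict__'. Unlike the
'-fargument-noalias' options, the qualifier only affects this one function.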



