[FFmpeg-cvslog] ppc: dsputil: Merge some declarations and initializations
Clément Bœsch
u at pkh.me
Fri Mar 21 10:44:41 CET 2014
On Thu, Mar 20, 2014 at 09:57:17PM +0100, Diego Biurrun wrote:
> ffmpeg | branch: master | Diego Biurrun <diego at biurrun.de> | Wed Jan 15 14:36:28 2014 +0100| [b7d24fd4b2213104c001ed504074495568600b9c] | committer: Diego Biurrun
>
> ppc: dsputil: Merge some declarations and initializations
>
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=b7d24fd4b2213104c001ed504074495568600b9c
> ---
>
> libavcodec/ppc/dsputil_altivec.c | 403 +++++++++++++++++---------------------
> libavcodec/ppc/dsputil_ppc.c | 9 +-
> libavcodec/ppc/fdct_altivec.c | 3 +-
> libavcodec/ppc/gmc_altivec.c | 31 ++-
> libavcodec/ppc/idct_altivec.c | 37 ++--
> libavcodec/ppc/int_altivec.c | 6 +-
> 6 files changed, 219 insertions(+), 270 deletions(-)
>
> diff --git a/libavcodec/ppc/dsputil_altivec.c b/libavcodec/ppc/dsputil_altivec.c
> index 2091023..a8985fd 100644
> --- a/libavcodec/ppc/dsputil_altivec.c
> +++ b/libavcodec/ppc/dsputil_altivec.c
[...]
> @@ -903,6 +862,33 @@ static int hadamard8_diff16x8_altivec(/* MpegEncContext */ void *s, uint8_t *dst
> register vector signed short line3C = vec_add(line3B, line7B);
> register vector signed short line7C = vec_sub(line3B, line7B);
>
> + register vector signed short line0S = vec_add(temp0S, temp1S);
> + register vector signed short line1S = vec_sub(temp0S, temp1S);
> + register vector signed short line2S = vec_add(temp2S, temp3S);
> + register vector signed short line3S = vec_sub(temp2S, temp3S);
> + register vector signed short line4S = vec_add(temp4S, temp5S);
> + register vector signed short line5S = vec_sub(temp4S, temp5S);
> + register vector signed short line6S = vec_add(temp6S, temp7S);
> + register vector signed short line7S = vec_sub(temp6S, temp7S);
> +
> + register vector signed short line0BS = vec_add(line0S, line2S);
> + register vector signed short line2BS = vec_sub(line0S, line2S);
> + register vector signed short line1BS = vec_add(line1S, line3S);
> + register vector signed short line3BS = vec_sub(line1S, line3S);
> + register vector signed short line4BS = vec_add(line4S, line6S);
> + register vector signed short line6BS = vec_sub(line4S, line6S);
> + register vector signed short line5BS = vec_add(line5S, line7S);
> + register vector signed short line7BS = vec_sub(line5S, line7S);
> +
> + register vector signed short line0CS = vec_add(line0BS, line4BS);
> + register vector signed short line4CS = vec_sub(line0BS, line4BS);
> + register vector signed short line1CS = vec_add(line1BS, line5BS);
> + register vector signed short line5CS = vec_sub(line1BS, line5BS);
> + register vector signed short line2CS = vec_add(line2BS, line6BS);
> + register vector signed short line6CS = vec_sub(line2BS, line6BS);
> + register vector signed short line3CS = vec_add(line3BS, line7BS);
> + register vector signed short line7CS = vec_sub(line3BS, line7BS);
> +
> vsum = vec_sum4s(vec_abs(line0C), vec_splat_s32(0));
> vsum = vec_sum4s(vec_abs(line1C), vsum);
> vsum = vec_sum4s(vec_abs(line2C), vsum);
> @@ -912,33 +898,6 @@ static int hadamard8_diff16x8_altivec(/* MpegEncContext */ void *s, uint8_t *dst
> vsum = vec_sum4s(vec_abs(line6C), vsum);
> vsum = vec_sum4s(vec_abs(line7C), vsum);
>
> - line0S = vec_add(temp0S, temp1S);
> - line1S = vec_sub(temp0S, temp1S);
> - line2S = vec_add(temp2S, temp3S);
> - line3S = vec_sub(temp2S, temp3S);
> - line4S = vec_add(temp4S, temp5S);
> - line5S = vec_sub(temp4S, temp5S);
> - line6S = vec_add(temp6S, temp7S);
> - line7S = vec_sub(temp6S, temp7S);
> -
> - line0BS = vec_add(line0S, line2S);
> - line2BS = vec_sub(line0S, line2S);
> - line1BS = vec_add(line1S, line3S);
> - line3BS = vec_sub(line1S, line3S);
> - line4BS = vec_add(line4S, line6S);
> - line6BS = vec_sub(line4S, line6S);
> - line5BS = vec_add(line5S, line7S);
> - line7BS = vec_sub(line5S, line7S);
> -
> - line0CS = vec_add(line0BS, line4BS);
> - line4CS = vec_sub(line0BS, line4BS);
> - line1CS = vec_add(line1BS, line5BS);
> - line5CS = vec_sub(line1BS, line5BS);
> - line2CS = vec_add(line2BS, line6BS);
> - line6CS = vec_sub(line2BS, line6BS);
> - line3CS = vec_add(line3BS, line7BS);
> - line7CS = vec_sub(line3BS, line7BS);
> -
> vsum = vec_sum4s(vec_abs(line0CS), vsum);
> vsum = vec_sum4s(vec_abs(line1CS), vsum);
> vsum = vec_sum4s(vec_abs(line2CS), vsum);
Is it OK to move all the "register" initializations to the top when the
values are not needed right away? Won't that put extra pressure on the
compiler and make it do nasty things with the stack? Maybe it's smart
enough, but I would guess this wasn't tested.
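
To make the question concrete, here is a minimal scalar sketch of the two
styles, assuming plain int arithmetic as a stand-in for the AltiVec vector
ops. The had4_* names are hypothetical and do not exist in libavcodec; the
sketch only shows that both forms compute the same value, and says nothing
about what a given compiler does with register allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Post-patch style: each intermediate is declared and initialized in one
 * statement, so all of them are introduced before the final summation. */
static int had4_abs_sum_merged(const int16_t t[4])
{
    const int a0 = t[0] + t[1];
    const int a1 = t[0] - t[1];
    const int a2 = t[2] + t[3];
    const int a3 = t[2] - t[3];

    const int b0 = a0 + a2;
    const int b2 = a0 - a2;
    const int b1 = a1 + a3;
    const int b3 = a1 - a3;

    return abs(b0) + abs(b1) + abs(b2) + abs(b3);
}

/* Pre-patch style: declarations first, assignments later in the body. */
static int had4_abs_sum_split(const int16_t t[4])
{
    int a0, a1, a2, a3;
    int b0, b1, b2, b3;

    a0 = t[0] + t[1];
    a1 = t[0] - t[1];
    a2 = t[2] + t[3];
    a3 = t[2] - t[3];

    b0 = a0 + a2;
    b2 = a0 - a2;
    b1 = a1 + a3;
    b3 = a1 - a3;

    return abs(b0) + abs(b1) + abs(b2) + abs(b3);
}

/* Returns nonzero when both variants give the same sum for this input. */
static int had4_variants_agree(const int16_t t[4])
{
    return had4_abs_sum_merged(t) == had4_abs_sum_split(t);
}

/* Small self-check on one sample input. */
static void had4_self_check(void)
{
    const int16_t sample[4] = { 1, 2, 3, 4 };
    assert(had4_variants_agree(sample));
}
```

Identical results are expected here; whether the generated code differs is
a separate question, and one way to look into it would be to compare the
assembly emitted for the real AltiVec functions before and after the patch
(for example with gcc -S).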
[...]
--
Clément B.