[Ffmpeg-devel] SSE load and store doubts

Balatoni Denes dbalatoni
Thu Oct 6 11:44:08 CEST 2005


Hi!

If do_stuff doesn't use the MMX/SSE intrinsics, then SSE instructions usually 
won't be generated (the compiler itself may still decide to use SSE, but then 
it is not because you used the intrinsics). 
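
To make that concrete, here is a minimal sketch of both styles. It is not 
code from ffmpeg: do_stuff() below is a made-up stand-in that just doubles 
four floats, and the two function names are invented for the illustration. 
Once do_stuff really uses an intrinsic, both loops work on packed floats; the 
pointer version simply leaves the load and store to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_load_ps, _mm_store_ps, _mm_add_ps */

/* Hypothetical stand-in for do_stuff(): doubles four packed floats. */
static inline __m128 do_stuff(__m128 v)
{
    return _mm_add_ps(v, v);
}

/* Style 1: walk the array through a __m128 pointer.
 * Dereferencing the pointer is an implicit aligned load/store. */
void process_via_pointer(float *a, size_t n)
{
    /* Assumption: a is 16-byte aligned and n is a multiple of 4. */
    assert(((uintptr_t)a & 15) == 0 && n % 4 == 0);
    __m128 *reg = (__m128 *)a;
    for (size_t i = 0; i < n; i += 4) {
        *reg = do_stuff(*reg);
        reg++;  /* skip to the next 4 floats */
    }
}

/* Style 2: explicit aligned load and store around the same work. */
void process_via_load_store(float *a, size_t n)
{
    /* Same assumption: _mm_load_ps/_mm_store_ps need 16-byte alignment. */
    assert(((uintptr_t)a & 15) == 0 && n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        __m128 reg = _mm_load_ps(&a[i]);
        reg = do_stuff(reg);
        _mm_store_ps(&a[i], reg);
    }
}
```

With optimization enabled (for example gcc -O2) a compiler will typically 
emit much the same packed instructions for both functions. The callq to 
_mm_load_ps in your listing suggests it came from an unoptimized build, where 
the intrinsic wrappers are not inlined, so it says little about the speed of 
either style.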

On another note I didn't actually find anything resembling your example in 
current ffmpeg.

bye
Denes

On Thursday 06 October 2005 at 11:10, Roberto Pariset expressed these wise 
thoughts:
> hello everyone.
>
> please consider the following example:
>
> /* imagine long_array filled with floats here */
> float long_array[MANY] __attribute__((aligned(16)));
>
> i often[1] see this kind of code:
> __m128 *reg = (__m128 *) long_array;
> for(i=0; i<MANY; i+=4)
> {
> 	do_stuff();
> 	reg++; /* skip to next 4 floats of long_array */
> }
>
> while i'd expect the following:
> __m128 reg;
> for(i=0; i<MANY; i+=4)
> {
> 	reg = _mm_load_ps( &long_array[i] );
> 	reg = do_stuff();
> 	_mm_store_ps( &long_array[i], reg );
> }
>
> if i compile and disassemble a simple example like the one above, i see
> the first doesn't actually use XMMn registers, while the second does:
>
>     reg = _mm_load_ps( &long_array[i] );
> 400548:    48 8d 7d 80             lea    0xffffffffffffff80(%rbp),%rdi
> 40054c:    e8 67 00 00 00          callq  4005b8 <_mm_load_ps>
> 400551:    0f 29 45 e0             movaps %xmm0,0xffffffffffffffe0(%rbp)
>
>     __m128 *reg = (__m128 *) long_array;
> 400555:    48 8d 85 70 ff ff ff    lea    0xffffffffffffff70(%rbp),%rax
> 40055c:    48 89 45 f0             mov    %rax,0xfffffffffffffff0(%rbp)
>
> so, basically, i am not sure if this is an error or not, as i am just a
> n00b with SSE. to me, it seems that the first syntax is not taking
> advantage of the SSE registers, so it would not make things faster. i might
> be wrong, of course. i just wanted to point it out, and would appreciate
> it very much if i could get some explanation, as i haven't found any on the
> web (all the code i have found uses either load/store or a pointer with no
> apparent difference, and none explains the motivation for the choice).
> thanks a lot,
> roberto
>
>
>
>
> [1] as in ffmpeg-0.4.9-pre1/libavcodec/i386/
>

-- 
---
What kills me, doesn't make me stronger.




