[Ffmpeg-devel] Re: about mmx instructions
Fri Sep 2 01:42:17 CEST 2005
Firstly, I rewrote that part of the code using SSE2 instructions and
got about a 20% performance gain for that part of the code on my
Pentium 4 PC.
Secondly, if you take a closer look at the original implementation, you
will find that the code can still be improved in terms of CPU resource
utilization. For example, if you change some of the movq
instructions into pshufw xxx, xxx, 0xe4 (which means using the shift unit
instead of the load unit), you can gain some improvement.
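To show why the immediate 0xe4 makes pshufw behave like a plain copy, here is a minimal sketch in portable C that models what the instruction does to the four 16-bit words. The names pshufw_model and check_identity_shuffle are made up for this illustration and are not from the FFmpeg source; the model only checks the result, it says nothing about which execution unit is used or how fast it runs.

```c
#include <assert.h>
#include <stdint.h>

/* Portable model of PSHUFW on a 64-bit MMX value (hypothetical helper,
 * not FFmpeg code): result word i is the source word selected by
 * bits [2i+1:2i] of the 8-bit immediate. */
static uint64_t pshufw_model(uint64_t src, uint8_t imm)
{
    uint64_t dst = 0;
    for (int i = 0; i < 4; i++) {
        unsigned sel = (imm >> (2 * i)) & 3u;          /* which source word */
        uint64_t word = (src >> (16 * sel)) & 0xFFFFu; /* extract it */
        dst |= word << (16 * i);                       /* place at position i */
    }
    return dst;
}

/* 0xE4 is 11 10 01 00 in binary, so word i selects source word i:
 * the shuffle leaves the value unchanged, like a movq copy would. */
static void check_identity_shuffle(void)
{
    const uint64_t x = 0x0123456789ABCDEFULL;
    assert(pshufw_model(x, 0xE4) == x);
    /* For contrast, 0x1B (00 01 10 11) reverses the word order. */
    assert(pshufw_model(x, 0x1B) == 0xCDEF89AB45670123ULL);
}
```

Calling check_identity_shuffle() from a test program should pass silently. Whether the swap actually helps depends on the CPU, so it is worth timing on the target machine.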
By the way, I had a hard time understanding that magic too. I finally
gave up (but I did write a small program to verify that formula :)
I will try again this weekend.