[FFmpeg-devel] [PATCH 1/2] swresample: Refactor resample asm and port it to yasm

James Almer jamrial at gmail.com
Wed Mar 19 19:00:30 CET 2014


On 19/03/14 2:34 PM, Michael Niedermayer wrote:
> On Wed, Mar 19, 2014 at 02:24:05PM -0300, James Almer wrote:
>> On 19/03/14 9:18 AM, Michael Niedermayer wrote:
>>> On Wed, Mar 19, 2014 at 02:49:33AM -0300, James Almer wrote:
>>>> This reduces code duplication and makes it easier to implement new asm 
>>>> functions in the future
>>>>
>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>> ---
>>>>  libswresample/resample.c            | 96 ++++++++++---------------------------
>>>>  libswresample/resample_template.c   | 49 +++++++------------
>>>>  libswresample/swresample_internal.h | 24 ++++++++++
>>>>  libswresample/x86/Makefile          |  1 +
>>>>  libswresample/x86/resample.asm      | 64 +++++++++++++++++++++++++
>>>>  libswresample/x86/resample_mmx.h    | 74 ----------------------------
>>>>  libswresample/x86/swresample_x86.c  | 16 +++++++
>>>>  7 files changed, 148 insertions(+), 176 deletions(-)
>>>>  create mode 100644 libswresample/x86/resample.asm
>>>>  delete mode 100644 libswresample/x86/resample_mmx.h
>>>
>>> What effect does this have on speed?
>>> You are adding a function call in an inner loop.
>>
>> At least on my end it seems to be a couple of cycles faster (measured with
>> timer.h macros surrounding the c->scalarproduct() function call, and the
>> COMMON_CORE inline asm in the pre-patch version).
> 
> IMO to measure the call overhead, the timer macros should be
> farther out, that is, outside that loop.
> We want to know how the extra function call interacts with the
> code before and after it.
> The timer code will affect this interaction if it's in between.

Pre patch
300068 decicycles in swri_resample_int16_sse2, 65438 runs, 98 skips

Post patch
291174 decicycles in swri_resample_int16, 65414 runs, 122 skips

This was converting a 44100 Hz 16-bit stereo stream to 22050 Hz.

