[FFmpeg-devel] [PATCH] ARM: remove useless stack push/pop

Måns Rullgård mans
Wed Jun 9 01:43:54 CEST 2010


Rafaël Carré <rafael.carre at gmail.com> writes:

> Hi,
>
> r12 doesn't need to be saved in called functions because it's a scratch
> register.
>
> While I'm here, did anyone try to build FFmpeg with -mthumb yet?

Yes, gcc generated invalid asm.  For that reason, and others, we force
-marm.  There is no gain from using thumb with ffmpeg.

> "grep -Er '(pop|ldm).*pc' libavcodec/arm" shows that there are a lot of
> functions which can't be called from thumb on armv4t: using ldm ...,pc
> will not perform the switch from arm to thumb on these CPUs.

So use interworking if you need to.  Any decent linker supports that.

> If you want to support both thumb code and armv4t this needs changing
> to use 1 more instruction (without speed cost on anything but arm7tdmi
> where it would take 1 more cycle to return).

Thumb doesn't work anyway, so there's no point.

See also a blog post I did some time ago on the topic.  Perhaps I
should revisit that.

BTW, many, if not most, Cortex-A8 chips in the field have hardware
bugs rendering any mixing of Thumb and ARM code unreliable.  Older
cores work, but most of those are pre-Thumb2 and the speed penalty
there is too great for FFmpeg.

> diff --git a/libavcodec/arm/jrevdct_arm.S b/libavcodec/arm/jrevdct_arm.S
> index 4fcf351..4ce37d0 100644
> --- a/libavcodec/arm/jrevdct_arm.S
> +++ b/libavcodec/arm/jrevdct_arm.S
> @@ -58,7 +58,7 @@
>          .align
>  
>  function ff_j_rev_dct_arm, export=1
> -        stmdb   sp!, { r4 - r12, lr }   @ all callee saved regs
> +        stmdb   sp!, { r4 - r11, lr }   @ all callee saved regs
>  
>          sub sp, sp, #4                  @ reserve some space on the stack
>          str r0, [ sp ]                  @ save the DCT pointer to the stack
> @@ -369,7 +369,7 @@ empty_odd_column:
>  the_end:
>          @ The end....
>          add sp, sp, #4
> -        ldmia   sp!, { r4 - r12, pc }   @ restore callee saved regs and return
> +        ldmia   sp!, { r4 - r11, pc }   @ restore callee saved regs and return

Does this function call any other functions?  If so, the stack must
maintain 8-byte alignment, and this is the easiest way to accomplish
that.  Not that you'd want to use that DCT implementation anyway.
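
The alignment question can be checked with a little arithmetic.  The
following is a minimal sketch, assuming 4-byte registers and the 8-byte
stack alignment that the ARM procedure call standard requires at public
interfaces.  It counts only the stmdb and the "sub sp, sp, #4" visible in
the quoted diff; any other stack adjustment inside the function is not
accounted for, and the helper names are hypothetical.

```python
WORD = 4         # bytes per 32-bit ARM core register
AAPCS_ALIGN = 8  # assumed stack alignment required at call boundaries


def frame_bytes(num_pushed_regs, extra_bytes=0):
    """Bytes the prologue moves sp down: pushed registers plus extra space."""
    return num_pushed_regs * WORD + extra_bytes


def is_aligned(frame, align=AAPCS_ALIGN):
    """True if sp keeps its alignment, given it was aligned on entry."""
    return frame % align == 0


# r4-r12 plus lr is 10 registers; r4-r11 plus lr is 9 registers.
for label, regs in (("r4-r12, lr", 10), ("r4-r11, lr", 9)):
    push_only = frame_bytes(regs)
    with_slot = frame_bytes(regs, extra_bytes=4)  # the "sub sp, sp, #4"
    print(label, push_only, is_aligned(push_only),
          with_slot, is_aligned(with_slot))
```

Pushing an even number of registers keeps the push itself 8-byte aligned,
which is why an otherwise unneeded r12 is often added to the list.  Whether
the stack is aligned at an actual call site depends on every adjustment
made before that call, so the extra 4-byte slot has to be counted too.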

-- 
Måns Rullgård
mans at mansr.com