[FFmpeg-devel] [PATCH] update doc/optimization.txt

Michael Niedermayer michaelni
Wed Sep 22 17:31:06 CEST 2010


On Wed, Sep 22, 2010 at 09:54:42AM -0400, Ronald S. Bultje wrote:
> Hi,
> 
> On Wed, Sep 22, 2010 at 9:47 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> > 2010/9/22 M?ns Rullg?rd <mans at mansr.com>:
> >> Michael Niedermayer <michaelni at gmx.at> writes:
> >>> On Wed, Sep 22, 2010 at 02:11:49PM +0100, M?ns Rullg?rd wrote:
> >>>> Michael Niedermayer <michaelni at gmx.at> writes:
> >>>> > On Wed, Sep 22, 2010 at 08:37:07AM -0400, Ronald S. Bultje wrote:
> >>>> >> Also, should I mention (in general tips) that functions that use huge
> >>>> >> structs (e.g. MpegEncContext) are A) "discouragable" in general and B)
> >>>> >> better written in inline asm than yasm because of the difficulty of
> >>>> >> predicting struct offsets?
> >>>> >
> >>>> > B is ok about A, who uses huge structs when its not needed ...
> >>>>
> >>>> If possible, the asm functions should be passed pointers into (or
> >>>> values from) the struct in preference over passing a pointer to the
> >>>> whole struct.
> >>>
> >>> if one needs just 1 pointer into a struct yes, if one needs 5, its not
> >>> efficient to use up 5 registers and waste the time to initialize them
> >>> besides that x86-32 has too few registers for this to work out
> >>
> >> In many such cases, those values could be put in a struct of their own
> >> contained within the larger struct. ?Manually managing offsets in a
> >> small struct is simple enough.

in which of the cases in svn where a pointer to struct is passed to asm
would this apply?
you said "many such cases" so you know of at least 1 i assume


> >
> > I think this is sort of the message I was hoping to put forward. I'm
> > not saying we should change MpegEncContext, that code exists and works
> > well, I'm just saying that for future (newly written) pieces of code
> > requiring new simd, we should look into alternatives as both of you
> > pointed out. Both of your alternatives could work in some situations,
> > and there's others where a big struct is really the best solution.
> > Hence no details in the doc, just "isn't always best".
> 
> And another updated patch. If particular chunks look OK, let me know
> and I'll apply them to prevent this patch from becoming monstrously
> big.
> 
> Ronald

>  optimization.txt |   35 +++++++++++++++++++++++++++++++++--
>  1 file changed, 33 insertions(+), 2 deletions(-)
> 2e585b6899d435edacfbfea80ba926b283de65ed  doc.diff
> Index: doc/optimization.txt
> ===================================================================
> --- doc/optimization.txt	(revision 25157)
> +++ doc/optimization.txt	(working copy)
> @@ -164,10 +164,41 @@
>          ...
>  }while()
>  
> -Use __asm__() instead of intrinsics. The latter requires a good optimizing compiler
> -which gcc is not.

> +Do not use multiple inline asm blocks in a single C function. The compiler is
> +not required to maintain register values between asm blocks, and depending on
> +this behaviour can break with any future version of gcc.

Using multiple asm blocks in a C function and having code that depends on
register values to be maintained across asm blocks are 2 different things.
Please use precisse language and make sure what you mean is also what is
written there


>  
> +For x86, mark registers that are clobbered in your asm. This means both
> +general x86 registers (e.g. eax) as well as XMM registers. This last one is
> +particularly important on Win64, where xmm6-15 are callee-save, and not
> +restoring their contents leads to undefined results. In external asm, you do
> +this by using: "cglobal functon_name, num_args, num_regs, num_xmm_regs". In
> +inline asm, you specify clobbered registers at the end of your asm:
> +__asm__(".." ::: "%eax").

This recommandition has to be cross checked with generated code, for example
we must make sure gcc does not emmit a *emms for mmx register clobbers.


>  
> +For x86, use yasm or __asm__(), do not use intrinsics. The latter requires a
> +good optimizing compiler which gcc is not.

non x86 intrinsics are none the better and the recommandition against them
should stay, also i see no reason to mention x86 anyway, if someone manages
to generate sparc asm with yasm cleanly, why not ...


> +
> +Inline asm vs. external asm
> +---------------------------
> +Both inline asm (__asm__("..") in a .c file, handled by a compiler such as gcc)
> +and external asm (.s or .asm files, handled by an assembler such as yasm/nasm)
> +are accepted in FFmpeg. Which one to use differs per specific case.
> +
> +- if your code is intended to be inlined in a C function, inline asm is always
> +   better, because external asm cannot be inlined
> +- if your code calls external functions, yasm is always better
> +- if your code takes huge and complex structs as function arguments (e.g.
> +   MpegEncContext; note that this is not ideal and is discouraged if there
> +   are alternatives), then inline asm is always better, because predicting
> +   member offsets in complex structs is almost impossible. It's safest to let
> +   the compiler take care of that
> +- in many cases, both can be used and it just depends on the preference of the
> +   person writing the asm. For new asm, the choice is up to you. For existing
> +   asm, you'll likely want to maintain whatever form it is currently in unless
> +   there is a good reason to change it.

also add that inline<->yasm changes must go in seperate patches please

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100922/4bad6330/attachment.pgp>



More information about the ffmpeg-devel mailing list