[FFmpeg-devel] [PATCH] move h264 loopfilter strength code to yasm

Michael Niedermayer michaelni
Fri Sep 24 21:53:12 CEST 2010


On Fri, Sep 24, 2010 at 08:36:48PM +0100, Måns Rullgård wrote:
> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
> 
> > Hi,
> >
> > On Fri, Sep 24, 2010 at 12:26 PM, Daniel Verkamp <daniel at drv.nu> wrote:
> >> On Fri, Sep 24, 2010 at 9:04 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> >>> So removing pand (which doesn't do anything in the one case, and can
> >>> be replaced by a pxor in the other). With the attached patch #2, I get
> >>> this:
> >>> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//cc8uAjPS.s:315:bad
> >>> register name `%%mm0'
> >>> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//cc8uAjPS.s:520:bad
> >>> register name `%%mm0'
> >>>
> >>> What does that mean?
> >>
> >> If you omit all of the optional colon-separated arguments to asm, the
> >> % symbols before register names in the asm no longer need to be
> >> escaped with a second % (I suppose since there can be no substitution
> >> when there are no operand constraints). You can add an empty : or
> >> just drop the doubled % to avoid this.
> >
> > OK, that fixes it. Oddly, it's the same speed, even though
> > #instructions is less. OK, so next then. Attached patch is supposed to
> > be part of a patch that decreases the insane amount of registers used
> > for temporary stuff that could be loaded directly (so instead of doing
> > (%0) where %0="m"(var[idx1]), use (%0,%1) with %0="r"(var) and
> > %1="r"(idx1)). This works and is not slower (eventually it will be
> > faster when it saves a few registers; this is work-in-progress).
> 
> Why are you spending time and effort trying to find a magical piece of
> C code that gcc does what you want with?  It would be much simpler to
> write the code as you want it (yasm or inline) directly.

We need reviewable patches (I'd like to review them, for example).
That can be done either by converting inline->yasm and then optimizing in
yasm, or by doing the optimizations in inline asm.

Ronald has just today started to seriously learn inline asm AFAIK, and so it
doesn't all work at the first try. That's not because we lack a magic bullet
but because Ronald is learning this stuff as he applies it.
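
For anyone following along, the two inline asm points quoted above can be
sketched roughly as below. This is only an illustration and not code from
the patch: the function names (zero_mm0_basic, zero_mm0_extended,
load_mem_operand, load_base_index) and the HAVE_X86_INLINE_SKETCH macro are
made up here, it assumes GCC-style inline asm on x86 with MMX available, and
it falls back to plain C on anything else.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard: only emit x86 asm where it can assemble. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_INLINE_SKETCH 1
#else
#define HAVE_X86_INLINE_SKETCH 0
#endif

/* Basic asm: no colon-separated operand sections, so there is no operand
 * substitution and register names take a single '%'. */
static void zero_mm0_basic(void)
{
#if HAVE_X86_INLINE_SKETCH
    __asm__ volatile ("pxor %mm0, %mm0\n\t"
                      "emms");
#endif
}

/* Extended asm: operands are present, so '%0' is substituted and a literal
 * register name has to be written with a doubled '%'. */
static uint32_t zero_mm0_extended(void)
{
    uint32_t out = 0;
#if HAVE_X86_INLINE_SKETCH
    __asm__ volatile ("pxor %%mm0, %%mm0\n\t"
                      "movd %%mm0, %0\n\t"
                      "emms"
                      : "=r"(out)
                      :
                      : "mm0");
#endif
    return out;
}

/* "m" operand: the compiler forms the whole address of var[idx] itself. */
static uint8_t load_mem_operand(const uint8_t *var, size_t idx)
{
    uint32_t out;
#if HAVE_X86_INLINE_SKETCH
    __asm__ ("movzbl %1, %0" : "=r"(out) : "m"(var[idx]));
#else
    out = var[idx];
#endif
    return (uint8_t)out;
}

/* "r" operands: base and index are passed in registers and the asm spells
 * out the (base,index) addressing mode. The "memory" clobber tells the
 * compiler that the asm reads memory it cannot see from the operands. */
static uint8_t load_base_index(const uint8_t *var, size_t idx)
{
    uint32_t out;
#if HAVE_X86_INLINE_SKETCH
    __asm__ ("movzbl (%1,%2), %0"
             : "=r"(out)
             : "r"(var), "r"(idx)
             : "memory");
#else
    out = var[idx];
#endif
    return (uint8_t)out;
}
```

Both load variants should return the same byte for the same var and idx; the
difference is only in who builds the address, the compiler or the asm
template.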

It would be great if you could keep the trolling off the ffmpeg lists and IRC
channels. I don't need a developer who says no every time I say yes and
yes every time I say no; I can write a script for that.


> The next gcc
> release will break this anyway.

Please refrain from unfounded FUD.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus