[FFmpeg-devel] [PATCH] intreadwrite: Altivec implementations of AV_COPY128 and AV_ZERO128.
jamrial at gmail.com
Sun Mar 8 17:35:26 CET 2015
On 08/03/15 1:25 PM, Reimar Döffinger wrote:
> On Sun, Mar 08, 2015 at 01:21:20PM -0300, James Almer wrote:
>> On 08/03/15 1:16 PM, Reimar Döffinger wrote:
>>> Slightly (ca. 4%?) faster and smaller ff_h264_decode_mb_cavlc
>>> in my tests on a G4 7450.
>>> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
>>> libavutil/ppc/intreadwrite.h | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>> diff --git a/libavutil/ppc/intreadwrite.h b/libavutil/ppc/intreadwrite.h
>>> index 7671676..65b346e 100644
>>> --- a/libavutil/ppc/intreadwrite.h
>>> +++ b/libavutil/ppc/intreadwrite.h
>>> @@ -24,6 +24,16 @@
>>> #include <stdint.h>
>>> #include "config.h"
>>> +#if HAVE_ALTIVEC
>>> +#include "util_altivec.h"
>>> +#if HAVE_BIGENDIAN
>>> +#define AV_COPY128(d, s) vec_st(vec_ld(0, (const unsigned char *)(s)), 0, (unsigned char *)(d))
>>> +#else
>>> +#define AV_COPY128(d, s) vec_vsx_st(vec_vsx_ld(0, (const unsigned char *)(s)), 0, (unsigned char *)(d))
>>> +#endif
>>> +#define AV_ZERO128(d) VEC_ST(vec_splat_u8(0), 0, (unsigned char *)(d))
>> Why not use static av_always_inline functions, like it's done on other arches (and
>> also for other defines in ppc)?
> Well, it would mean a define plus the function itself, around 4 lines
> of code overall, for what is a single line when written like this.
> I don't have much of an opinion, I did that at first, but it just seemed
> fairly bloated for what it does.
> Also, not using a function is consistent with what the implementations
> in the non-arch-specific intreadwrite.h do.
Ah, for some reason I thought I saw a semicolon in one of the defines, meaning more
than one line.
Fair enough then.
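
For readers following the macro-versus-function question, here is a rough, portable sketch of the two shapes being compared. It is not the Altivec code from the patch: memcpy/memset stand in for the vector load/store, and the names block128, COPY128_MACRO, ZERO128_MACRO, COPY128_FUNC, ZERO128_FUNC and forms_agree are all made up for illustration. The "define plus function" pairing is assumed from the description in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte block; stands in for a 128-bit vector register. */
typedef struct { uint8_t b[16]; } block128;

/* Macro form: each operation is one expression, so one line is enough. */
#define COPY128_MACRO(d, s) ((void)memcpy((d), (s), sizeof(block128)))
#define ZERO128_MACRO(d)    ((void)memset((d), 0, sizeof(block128)))

/* Function form: a define (so a generic header could detect the override
 * with #ifdef) plus the inline function itself, a few lines per operation. */
#define COPY128_FUNC COPY128_FUNC
static inline void COPY128_FUNC(void *d, const void *s)
{
    memcpy(d, s, sizeof(block128));
}

#define ZERO128_FUNC ZERO128_FUNC
static inline void ZERO128_FUNC(void *d)
{
    memset(d, 0, sizeof(block128));
}

/* Returns 1 if both forms leave the destination in the same state. */
static int forms_agree(void)
{
    block128 src, a, b;

    assert(sizeof(block128) == 16);
    for (int i = 0; i < 16; i++)
        src.b[i] = (uint8_t)(i + 1);

    COPY128_MACRO(&a, &src);
    COPY128_FUNC(&b, &src);
    if (memcmp(&a, &src, sizeof(a)) != 0 || memcmp(&b, &src, sizeof(b)) != 0)
        return 0;

    ZERO128_MACRO(&a);
    ZERO128_FUNC(&b);
    return memcmp(&a, &b, sizeof(a)) == 0 && a.b[0] == 0 && a.b[15] == 0;
}
```

Both forms behave the same here; the difference is mostly line count and type checking of the arguments. The semicolon concern only arises when a macro body needs more than one statement, which neither macro above does.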