[FFmpeg-devel] [PATCH] Optimized unscaled yuvp9/yuvp10 -> yuvp16 conversion.

Michael Niedermayer michaelni at gmx.at
Sat Aug 11 16:52:19 CEST 2012


On Sat, Aug 11, 2012 at 02:18:36PM +0200, Reimar Döffinger wrote:
> About 30% faster on 32 bit Atom, 120% faster on 64 bit Phenom2.
> This is interesting because supporting P16 is easier in e.g.
> OpenGL (can misuse support for any 2-component 8 bit format),
> whereas supporting p9/p10 without conversion needs a texture
> format with at least 14 bits actual precision.
> 
> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> ---
>  libswscale/swscale_unscaled.c |   26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
> index c391a07..6618966 100644
> --- a/libswscale/swscale_unscaled.c
> +++ b/libswscale/swscale_unscaled.c
> @@ -830,7 +830,33 @@ static int planarCopyWrapper(SwsContext *c, const uint8_t *src[],
>                          srcPtr  += srcStride[plane];
>                      }
>                  } else if (src_depth <= dst_depth) {
> +                    int orig_length = length;
>                      for (i = 0; i < height; i++) {
> +                        if(isBE(c->srcFormat) == HAVE_BIGENDIAN &&
> +                           isBE(c->dstFormat) == HAVE_BIGENDIAN) {
> +                             unsigned shift = dst_depth - src_depth;
> +                             length = orig_length;
> +#if HAVE_FAST_64BIT
> +#define FAST_COPY_UP(shift) \
> +    for (j = 0; j < length - 3; j += 4) { \
> +        uint64_t v = AV_RN64A(srcPtr2 + j); \
> +        AV_WN64A(dstPtr2 + j, v << shift); \
> +    } \
> +    length &= 3;
> +#else
> +#define FAST_COPY_UP(shift) \
> +    for (j = 0; j < length - 1; j += 2) { \
> +        uint32_t v = AV_RN32A(srcPtr2 + j); \
> +        AV_WN32A(dstPtr2 + j, v << shift); \
> +    } \
> +    length &= 1;
> +#endif

These look wrong for the shiftonly == 0 case: both FAST_COPY_UP variants only shift.
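
To make the concern concrete, here is a minimal standalone sketch, not code from the patch. It assumes that the scalar path for shiftonly == 0 fills the vacated low bits by replicating the top bits of the sample (my reading of the existing copy-up code, worth checking against the tree). The helper names shift_up_x4, shift_only and replicate_bits are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg API: shift four native-endian 16-bit
 * samples that are packed into one 64-bit word, as the patch's 64-bit
 * FAST_COPY_UP loop does. This is only safe while every sample fits in
 * src_depth bits, so that no bit crosses into the neighbouring lane. */
static uint64_t shift_up_x4(uint64_t packed, unsigned src_depth,
                            unsigned dst_depth)
{
    assert(src_depth <= dst_depth && dst_depth <= 16);
    return packed << (dst_depth - src_depth);
}

/* Scalar shift-only conversion: the low bits of the result stay zero,
 * so a full-scale input does not reach the full-scale output value. */
static uint16_t shift_only(uint16_t v, unsigned src_depth,
                           unsigned dst_depth)
{
    return (uint16_t)(v << (dst_depth - src_depth));
}

/* Scalar bit replication (assumed behaviour for shiftonly == 0): the
 * top bits of the sample are copied into the vacated low bits.
 * Requires dst_depth <= 2 * src_depth so the right shift is valid. */
static uint16_t replicate_bits(uint16_t v, unsigned src_depth,
                               unsigned dst_depth)
{
    assert(src_depth <= dst_depth && dst_depth <= 2 * src_depth);
    return (uint16_t)((v << (dst_depth - src_depth)) |
                      (v >> (2 * src_depth - dst_depth)));
}
```

For a full-scale 10-bit sample (1023) going to 16 bits, shift_only() gives 0xFFC0 while replicate_bits() gives 0xFFFF. The packed shift matches shift_only() lane by lane, so under the assumption above the fast path would need to be restricted to the shiftonly != 0 case or extended to OR in the replicated bits.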


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.