[FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

Clément Bœsch u at pkh.me
Tue Aug 8 22:41:23 EEST 2017


On Tue, Aug 08, 2017 at 09:09:44AM +0000, maxime taisant wrote:
> From: Maxime Taisant <maximetaisant at hotmail.fr>
> 
> Hi,
> 
> Here are some SSE optimisations for the DWT function used to decode JPEG2000.
> I tested this code by using the time command while reading a JPEG2000 encoded video with ffmpeg and, on average, I observed a 4.05% overall improvement, and a 12.67% improvement on the DWT decoding part alone.
> In the nasm code, you will notice that the SR1DFLOAT macro appears twice. One version is called in the nasm code by the HORSD macro and the other is called from the C code of the dwt function; I couldn't figure out a way to use a single macro for both.
> I also couldn't figure out a good way to optimize the VER_SD part, which is why I left it unchanged, with just an SSE-optimized version of the SR_1D_FLOAT function.
> 
> Regards.
> 
> ---
>  libavcodec/jpeg2000dwt.c          |  21 +-
>  libavcodec/jpeg2000dwt.h          |   6 +
>  libavcodec/x86/jpeg2000dsp.asm    | 794 ++++++++++++++++++++++++++++++++++++++
>  libavcodec/x86/jpeg2000dsp_init.c |  55 +++
>  4 files changed, 863 insertions(+), 13 deletions(-)
> 
> diff --git a/libavcodec/jpeg2000dwt.c b/libavcodec/jpeg2000dwt.c
> index 55dd5e89b5..69c935980d 100644
> --- a/libavcodec/jpeg2000dwt.c
> +++ b/libavcodec/jpeg2000dwt.c
> @@ -558,16 +558,19 @@ int ff_jpeg2000_dwt_init(DWTContext *s, int border[2][2],
>          }
>      switch (type) {
>      case FF_DWT97:
> +        dwt_decode = dwt_decode97_float;
>          s->f_linebuf = av_malloc_array((maxlen + 12), sizeof(*s->f_linebuf));
>          if (!s->f_linebuf)
>              return AVERROR(ENOMEM);
>          break;
>       case FF_DWT97_INT:
> +        dwt_decode = dwt_decode97_int;
>          s->i_linebuf = av_malloc_array((maxlen + 12), sizeof(*s->i_linebuf));
>          if (!s->i_linebuf)
>              return AVERROR(ENOMEM);
>          break;
>      case FF_DWT53:
> +        dwt_decode = dwt_decode53;
>          s->i_linebuf = av_malloc_array((maxlen +  6), sizeof(*s->i_linebuf));
>          if (!s->i_linebuf)
>              return AVERROR(ENOMEM);

Using globals (here, dwt_decode) is not acceptable; you need to fix that.
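
To illustrate the concern: a file-scope function pointer is shared by every
context, so initialising a second context with a different DWT type would
silently change which routine the first one calls. One common way around
this is to keep the pointer in the per-instance context. Below is a minimal
sketch of that pattern. All names in it (SketchDWTContext, sketch_dwt_init,
and so on) are hypothetical stand-ins and not the actual FFmpeg structures
or functions, and the "decode" bodies are placeholders rather than real
wavelet code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the DWT types; not the FFmpeg enum. */
enum sketch_dwt_type { SKETCH_DWT97, SKETCH_DWT53 };

/* Hypothetical, heavily simplified context. The point is only that the
 * function pointer lives in the context, not at file scope. */
typedef struct SketchDWTContext {
    enum sketch_dwt_type type;
    void (*decode)(struct SketchDWTContext *s, int *data, size_t len);
} SketchDWTContext;

/* Placeholder "decoders": they only tag the data so that the selected
 * routine is observable. They do not perform a wavelet transform. */
static void sketch_decode97(SketchDWTContext *s, int *data, size_t len)
{
    (void)s;
    for (size_t i = 0; i < len; i++)
        data[i] += 97;
}

static void sketch_decode53(SketchDWTContext *s, int *data, size_t len)
{
    (void)s;
    for (size_t i = 0; i < len; i++)
        data[i] += 53;
}

/* Select the routine once, at init time, and store it in the context.
 * Returns 0 on success, -1 on an unknown type. */
static int sketch_dwt_init(SketchDWTContext *s, enum sketch_dwt_type type)
{
    s->type = type;
    switch (type) {
    case SKETCH_DWT97:
        s->decode = sketch_decode97;
        break;
    case SKETCH_DWT53:
        s->decode = sketch_decode53;
        break;
    default:
        s->decode = NULL;
        return -1;
    }
    return 0;
}

/* Dispatch through the context; no shared mutable state is involved. */
static int sketch_dwt_decode(SketchDWTContext *s, int *data, size_t len)
{
    if (!s->decode)
        return -1;
    s->decode(s, data, len);
    return 0;
}
```

With this layout, two contexts initialised with different types keep their
own routine, and a CPU-specific init function could overwrite the pointer
in a given context without affecting any other.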

[...]

-- 
Clément B.