[FFmpeg-devel] avcodec/huffyuvenc : try to call dsp with aligned data, and remove code duplication
Michael Niedermayer
michael at niedermayer.cc
Sat Dec 2 00:04:57 EET 2017
On Sun, Nov 26, 2017 at 07:07:41PM +0100, Martin Vignali wrote:
> Hello,
>
> in the attached patches:
>
> 0001-avcodec-huffyuvenc-increase-scalar-loop-count
> and
> 0003-avcodec-huffyuvenc-sub_left_prediction_bgr32-call-ds
>
> since diff_bytes and diff_int16 now have AVX2 versions, increase the scalar
> loop count so that the aligned version is called in most cases
>
>
>
> 0002-avcodec-huffyuvenc-remove-code-duplication-in
> removes some code duplication between the width < 32 case and the initial scalar loop
>
>
> passes the FATE tests for me (x86_64, macOS 10.12)
>
> Martin
> huffyuvenc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 32eecc99e666808926e1dec4ff35c17a94f5f86e 0001-avcodec-huffyuvenc-increase-scalar-loop-count.patch
> From 9477be212247012ac386beeff009a2edb78abb31 Mon Sep 17 00:00:00 2001
> From: Martin Vignali <martin.vignali at gmail.com>
> Date: Sun, 26 Nov 2017 19:01:29 +0100
> Subject: [PATCH 1/3] avcodec/huffyuvenc : increase scalar loop count
>
> in order to try to call dsp in aligned mode
> (diff_int16 have AVX2 now)
> ---
> libavcodec/huffyuvenc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/huffyuvenc.c b/libavcodec/huffyuvenc.c
> index 89639b75df..4f3a28e033 100644
> --- a/libavcodec/huffyuvenc.c
> +++ b/libavcodec/huffyuvenc.c
> @@ -80,12 +80,12 @@ static inline int sub_left_prediction(HYuvContext *s, uint8_t *dst,
> }
> return left;
> } else {
> - for (i = 0; i < 16; i++) {
> + for (i = 0; i < 32; i++) {
> const int temp = src16[i];
> dst16[i] = temp - left;
> left = temp;
> }
> - s->hencdsp.diff_int16(dst16 + 16, src16 + 16, src16 + 15, s->n - 1, w - 16);
> + s->hencdsp.diff_int16(dst16 + 32, src16 + 32, src16 + 31, s->n - 1, w - 32);
> return src16[w-1];
> }
> }
> --
> 2.11.0 (Apple Git-81)
>
> huffyuvenc.c | 46 ++++++++++++++++------------------------------
> 1 file changed, 16 insertions(+), 30 deletions(-)
> ba80747db2582141ec0faefc5ccd04fba65c7d72 0002-avcodec-huffyuvenc-remove-code-duplication-in.patch
> From 7fa991ae72c97f4d1f74789e543cf01dcb93adb9 Mon Sep 17 00:00:00 2001
> From: Martin Vignali <martin.vignali at gmail.com>
> Date: Sun, 26 Nov 2017 19:02:10 +0100
> Subject: [PATCH 2/3] avcodec/huffyuvenc : remove code duplication in
> sub_left_prediction
>
> start of the line (before dsp call), can be merge with width < 32 part
> ---
> libavcodec/huffyuvenc.c | 46 ++++++++++++++++------------------------------
> 1 file changed, 16 insertions(+), 30 deletions(-)
>
> diff --git a/libavcodec/huffyuvenc.c b/libavcodec/huffyuvenc.c
> index 4f3a28e033..59da49212e 100644
> --- a/libavcodec/huffyuvenc.c
> +++ b/libavcodec/huffyuvenc.c
> @@ -53,41 +53,27 @@ static inline int sub_left_prediction(HYuvContext *s, uint8_t *dst,
> {
> int i;
> if (s->bps <= 8) {
> - if (w < 32) {
> - for (i = 0; i < w; i++) {
> - const int temp = src[i];
> - dst[i] = temp - left;
> - left = temp;
> - }
> - return left;
> - } else {
> - for (i = 0; i < 32; i++) {
> - const int temp = src[i];
> - dst[i] = temp - left;
> - left = temp;
> - }
> - s->llvidencdsp.diff_bytes(dst + 32, src + 32, src + 31, w - 32);
> - return src[w-1];
> + for (i = 0; i < FFMIN(w, 32); i++) { /* scalar loop before dsp call */
> + const int temp = src[i];
> + dst[i] = temp - left;
> + left = temp;
requiring FFMIN() to be evaluated on every iteration could be slower
if the compiler fails to hoist it out of the loop.
no other comments from me, the patches should be ok otherwise
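
To illustrate the point about FFMIN(), here is a minimal standalone sketch of the 8-bit path with the bound computed once before the loop. It is not the code from the patch: MIN_INT is a local stand-in for FFMIN(), diff_bytes_c is a hypothetical scalar substitute for s->llvidencdsp.diff_bytes, and the name sub_left_prediction_8 and the variable min_width are made up for this example.

```c
#include <stdint.h>

/* Local stand-in for FFmpeg's FFMIN() macro (assumption for this sketch). */
#define MIN_INT(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical scalar substitute for the diff_bytes dsp function:
 * dst[i] = src1[i] - src2[i] for i in [0, w). */
static void diff_bytes_c(uint8_t *dst, const uint8_t *src1,
                         const uint8_t *src2, int w)
{
    for (int i = 0; i < w; i++)
        dst[i] = src1[i] - src2[i];
}

/* 8-bit left prediction; the loop bound is evaluated once, not per iteration. */
static int sub_left_prediction_8(uint8_t *dst, const uint8_t *src,
                                 int w, int left)
{
    const int min_width = MIN_INT(w, 32); /* computed before the loop */
    int i;

    for (i = 0; i < min_width; i++) { /* scalar loop before dsp call */
        const int temp = src[i];
        dst[i] = temp - left;
        left = temp;
    }
    if (w < 32)
        return left;

    /* Remaining samples: each one is predicted from its left neighbour. */
    diff_bytes_c(dst + 32, src + 32, src + 31, w - 32);
    return src[w - 1];
}
```

With the bound held in a local const, the loop condition no longer depends on the compiler recognising that the min expression is loop-invariant.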
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicin you will get
a job for life at the pharma industry.