[FFmpeg-devel] [PATCH] yadif: restore speed of the C filtering code
Michael Niedermayer
michaelni at gmx.at
Sat Mar 2 13:02:07 CET 2013
On Fri, Mar 01, 2013 at 06:20:19PM +0100, James Darnley wrote:
> Always use the special filter for the first and last 3 columns (only).
>
> The changes made in 64ed397 slowed the filter to just under 3/4 of what
> it was. This commit restores almost all of that speed while maintaining
> identical output.
>
> For reference, on my Athlon64:
> 1733222 decicycles in old
> 2358563 decicycles in new
> 1740014 decicycles in this
> ---
> libavfilter/vf_yadif.c | 93 +++++++++++++++++++++++---------------
> libavfilter/x86/vf_yadif_init.c | 12 +----
> libavfilter/yadif.h | 4 +-
> 3 files changed, 60 insertions(+), 49 deletions(-)
>
> diff --git a/libavfilter/vf_yadif.c b/libavfilter/vf_yadif.c
> index b7c2d80..3bd0d17 100644
> --- a/libavfilter/vf_yadif.c
> +++ b/libavfilter/vf_yadif.c
> @@ -34,9 +34,9 @@
> #define PERM_RWP AV_PERM_WRITE | AV_PERM_PRESERVE | AV_PERM_REUSE
>
> #define CHECK(j)\
> - { int score = FFABS(cur[mrefs + off_left + (j)] - cur[prefs + off_left - (j)])\
> + { int score = FFABS(cur[mrefs - 1 + (j)] - cur[prefs - 1 - (j)])\
> + FFABS(cur[mrefs +(j)] - cur[prefs -(j)])\
> - + FFABS(cur[mrefs + off_right + (j)] - cur[prefs + off_right - (j)]);\
> + + FFABS(cur[mrefs + 1 + (j)] - cur[prefs + 1 - (j)]);\
> if (score < spatial_score) {\
> spatial_score= score;\
> spatial_pred= (cur[mrefs +(j)] + cur[prefs -(j)])>>1;\
> @@ -51,15 +51,46 @@
> int temporal_diff2 =(FFABS(next[mrefs] - c) + FFABS(next[prefs] - e) )>>1; \
> int diff = FFMAX3(temporal_diff0 >> 1, temporal_diff1, temporal_diff2); \
> int spatial_pred = (c+e) >> 1; \
> - int off_right = (x < w - 1) ? 1 : -1;\
> - int off_left = x ? -1 : 1;\
> - int spatial_score = FFABS(cur[mrefs + off_left] - cur[prefs + off_left]) + FFABS(c-e) \
> - + FFABS(cur[mrefs + off_right] - cur[prefs + off_right]) - 1; \
> + int spatial_score = FFABS(cur[mrefs - 1] - cur[prefs - 1]) + FFABS(c-e) \
> + + FFABS(cur[mrefs + 1] - cur[prefs + 1]) - 1; \
> \
> - if (x > 2 && x < w - 3) {\
> - CHECK(-1) CHECK(-2) }} }} \
> - CHECK( 1) CHECK( 2) }} }} \
> - }\
> + CHECK(-1) CHECK(-2) }} }} \
> + CHECK( 1) CHECK( 2) }} }} \
> + \
> + if (mode < 2) { \
> + int b = (prev2[2 * mrefs] + next2[2 * mrefs])>>1; \
> + int f = (prev2[2 * prefs] + next2[2 * prefs])>>1; \
> + int max = FFMAX3(d - e, d - c, FFMIN(b - c, f - e)); \
> + int min = FFMIN3(d - e, d - c, FFMAX(b - c, f - e)); \
> + \
> + diff = FFMAX3(diff, min, -max); \
> + } \
> + \
> + if (spatial_pred > d + diff) \
> + spatial_pred = d + diff; \
> + else if (spatial_pred < d - diff) \
> + spatial_pred = d - diff; \
> + \
> + dst[0] = spatial_pred; \
> + \
> + dst++; \
> + cur++; \
> + prev++; \
> + next++; \
> + prev2++; \
> + next2++; \
> + }
> +
> +#define FILTER_EDGES(start, end) \
> + for (x = start; x < end; x++) { \
> + int c = cur[mrefs]; \
> + int d = (prev2[0] + next2[0])>>1; \
> + int e = cur[prefs]; \
> + int temporal_diff0 = FFABS(prev2[0] - next2[0]); \
> + int temporal_diff1 =(FFABS(prev[mrefs] - c) + FFABS(prev[prefs] - e) )>>1; \
> + int temporal_diff2 =(FFABS(next[mrefs] - c) + FFABS(next[prefs] - e) )>>1; \
> + int diff = FFMAX3(temporal_diff0 >> 1, temporal_diff1, temporal_diff2); \
> + int spatial_pred = (c+e) >> 1; \
This duplicates the macro; I don't think that's necessary.
It should be enough to fix the implementation so the compiler can
optimize things to constants in the main case.
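
A rough sketch of that idea, not the actual yadif code: keep a single
implementation and pass it an edge flag that is a literal constant at each
call site. After inlining, a compiler will typically fold off_left and
off_right to -1 and +1 in the interior loop, so only the first and last 3
columns pay for the boundary checks. The names spatial_score_at, score_line
and score_line_ref are hypothetical, and the score is reduced to the three
absolute differences that the neighbour offsets affect.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the spatial score in yadif's
 * FILTER macro: sum of absolute differences between the line above and
 * the line below at x-1, x and x+1. When is_edge is set, a neighbour
 * that would fall outside the line is mirrored back inside. */
static inline int spatial_score_at(const uint8_t *above, const uint8_t *below,
                                   int x, int w, int is_edge)
{
    int off_left  = -1;
    int off_right =  1;

    if (is_edge) {
        off_left  = x ? -1 : 1;
        off_right = (x < w - 1) ? 1 : -1;
    }

    return abs(above[x + off_left]  - below[x + off_left])
         + abs(above[x]             - below[x])
         + abs(above[x + off_right] - below[x + off_right]);
}

/* One implementation, three loops. is_edge is a literal at every call
 * site, so the interior loop can be compiled with constant offsets. */
static void score_line(int *dst, const uint8_t *above, const uint8_t *below,
                       int w)
{
    int x;
    int left_end, right_start;

    assert(w >= 2); /* mirroring needs at least two columns */

    left_end    = w < 3 ? w : 3;
    right_start = w - 3 > left_end ? w - 3 : left_end;

    for (x = 0; x < left_end; x++)            /* first 3 columns */
        dst[x] = spatial_score_at(above, below, x, w, 1);
    for (x = left_end; x < right_start; x++)  /* main case */
        dst[x] = spatial_score_at(above, below, x, w, 0);
    for (x = right_start; x < w; x++)         /* last 3 columns */
        dst[x] = spatial_score_at(above, below, x, w, 1);
}

/* Reference: boundary checks on every pixel. It should give the same
 * output as score_line() and is only here for comparison. */
static void score_line_ref(int *dst, const uint8_t *above,
                           const uint8_t *below, int w)
{
    int x;

    assert(w >= 2);

    for (x = 0; x < w; x++)
        dst[x] = spatial_score_at(above, below, x, w, 1);
}
```

Whether the branch really disappears depends on the compiler and the
optimization level, so it is worth checking with the same timer that
produced the decicycle numbers above.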
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus