[FFmpeg-devel] [PATCH 2/2] Add hflip filter.

Michael Niedermayer michaelni
Mon Aug 16 20:21:37 CEST 2010


On Mon, Aug 16, 2010 at 02:25:59PM +0100, Måns Rullgård wrote:
> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
> 
> > Hi,
> >
> > On Thu, Aug 12, 2010 at 2:35 PM, Stefano Sabatini
> > <stefano.sabatini-lala at poste.it> wrote:
> >> On date Thursday 2010-08-12 12:49:25 -0400, Ronald S. Bultje encoded:
> >>> On Thu, Aug 12, 2010 at 12:39 PM, Stefano Sabatini
> >>> <stefano.sabatini-lala at poste.it> wrote:
> >>> > On date Wednesday 2010-08-04 14:23:49 +0200, Michael Niedermayer encoded:
> >>> >> On Sat, Jul 31, 2010 at 02:07:29AM +0200, Stefano Sabatini wrote:
> >>> > [...]
> >>> >> > +static void draw_slice(AVFilterLink *inlink, int y, int h, int slice_dir)
> >>> >> > +{
> >>> >> > +    FlipContext *flip = inlink->dst->priv;
> >>> >> > +    AVFilterPicRef *inpic  = inlink->cur_pic;
> >>> >> > +    AVFilterPicRef *outpic = inlink->dst->outputs[0]->outpic;
> >>> >> > +    uint8_t *inrow, *outrow;
> >>> >> > +    int i, j, plane, step, hsub, vsub;
> >>> >> > +
> >>> >> > +    for (plane = 0; plane < 4 && inpic->data[plane]; plane++) {
> >>> >> > +        step = flip->max_step[plane];
> >>> >> > +        hsub = (plane == 1 || plane == 2) ? flip->hsub : 0;
> >>> >> > +        vsub = (plane == 1 || plane == 2) ? flip->vsub : 0;
> >>> >> > +
> >>> >> > +        outrow = outpic->data[plane] + (y>>vsub) * outpic->linesize[plane];
> >>> >> > +        inrow  = inpic ->data[plane] + (y>>vsub) * inpic ->linesize[plane] + ((inlink->w >> hsub) - 1) * step;
> >>> >> > +        for (i = 0; i < h>>vsub; i++) {
> >>> >> > +            for (j = 0; j < (inlink->w >> hsub); j++)
> >>> >> > +                memcpy(outrow + j*step, inrow - j*step, step);
> >>> >>
> >>> >> variable length memcpy on a per pixel base is slow
> >>> >
> >>> > Updated.
> >>> >
> >>> > I didn't manage to understand how bswap/dsputils may be used, I don't
> >>> > know if that would improve it.
> >>>
> >>> You could create a VideoFilterDSPContext (or a
> >>> HFlipVideoFilterDSPContext), add a function hflip to it, and then any
> >>> one of us could optimize it. E.g. for RGBA32, where step is probably
> >>> 4, we would read it as 8/16-bytes-at-once, flip them using e.g. pshufw
> >>> or something, (do the same for the opposite pixels at the end of the
> >>> row, ) and then write them out again -> you just did 2x 2/4 pixels at
> >>> once. By using multiple registers and making sure there's enough
> >>> padding (which I think is always the case), this'd get even faster,
> >>> also because for at least the left read/write, we can use aligned r/w
> >>> which is faster.
> >>>
> >>> Not sure if that's what Michael meant, but I guess it's sort of in the
> >>> right direction.
> >>
> >> OK I see thanks, I suggest anyway to commit this simple variant, and
> >> then work on the optimizations.
> > [..]
> >> +            case 3:
> >> +            {
> >> +                uint8_t *in  =  inrow;
> >> +                uint8_t *out = outrow;
> >> +                for (j = 0; j < (inlink->w >> hsub); j++, out += 3, in -= 3) {
> >> +                    out[0] = in[0];
> >> +                    out[1] = in[1];
> >> +                    out[2] = in[2];
> >> +                }
> >> +            }
> >> +            break;
> >
> > You can use a uint16_t + uint8_t write here instead of 3 uint8_t writes.

don't forget alignment
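
To make the alignment concern concrete: with a step of 3, every other
pixel starts at an odd address, so a store through a cast uint16_t
pointer would be misaligned for those pixels. A minimal sketch, assuming
a hypothetical helper copy_pixel24() that is not part of the patch; it
routes the 16-bit part through memcpy() so the compiler can pick an
access that is safe on the target:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copy one packed 3-byte pixel
 * as a 16-bit access plus an 8-bit access, without assuming that either
 * pointer is 2-byte aligned. */
static void copy_pixel24(uint8_t *out, const uint8_t *in)
{
    uint16_t lo;

    memcpy(&lo, in, sizeof(lo));   /* 16-bit load, safe at any address  */
    memcpy(out, &lo, sizeof(lo));  /* 16-bit store, safe at any address */
    out[2] = in[2];                /* remaining byte */
}
```

Writing *(uint16_t *)out = *(const uint16_t *)in instead would only be
acceptable where unaligned accesses are known to work.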


> 
> Better still, use AV_[RW]B24() or the bytestream macros, which will in
> theory do the right thing.

in theory memcpy() should be fine too
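
As a rough sketch of the memcpy() variant, assuming packed 3-byte pixels
and a hypothetical hflip_row24() helper (not the code from the patch):
with a constant size of 3, compilers generally expand the call inline,
which is what separates it from the variable-length per-pixel memcpy
criticized earlier in the thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): mirror one row of packed
 * 3-byte pixels from src into dst. The two buffers must not overlap. */
static void hflip_row24(uint8_t *dst, const uint8_t *src, int width)
{
    int j;

    for (j = 0; j < width; j++) {
        /* Output pixel j comes from input pixel (width - 1 - j). */
        memcpy(dst + 3 * j, src + 3 * (width - 1 - j), 3);
    }
}
```

Whether this ends up as fast as hand-written 16-bit + 8-bit accesses
depends on the compiler and target, so it would need benchmarking.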


[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato