[FFmpeg-devel] Indeo3 replacement, take 3

Vitor Sessak vitor1001
Mon Nov 2 03:41:41 CET 2009


Maxim wrote:
> Vitor Sessak schrieb:
>> [...]
>>> Note that since plane->pixels[] is not aligned, dst is not aligned
>>> neither. So I'd suggest something on the lines of
>>>
>>>> typedef struct Plane {
>>>>     uint8_t         *buffers[2];
>>>>     DECLARE_ALIGNED_16(uint8_t, *pixels[2]); ///< pointer to the
>>>> actual pixel data of the buffers above
>>>>     uint32_t        width;
>>>>     uint32_t        height;
>>>>     uint32_t        pitch;
>>>> } Plane;
>> Err, scrap that, I see that pixels[] are pointers to av_malloc'ed
>> buffers, hence aligned. So no ideas here. Does anyone know the actual
>> alignment requirements of dsp.put_no_rnd_pixels_tab? It is documented
>> nowhere...
>>
> 
> 
> Two days with the GDB and I've found out what's up! 

Nice!

> The problem resides
> in the function "put_pixels16_altivec()" from dsputils_ppc.c and is
> caused actually by the wrong alignment

Please send a patch adding a doxy documentation explaining the alignment 
requirements of dsp.put_no_rnd_pixels_tab. It might save a few days of 
GDB work of the next person who has the same problem ;)

>     /* setup output and reference pointers */
>     offset  = (cell->ypos << 2) * plane->pitch + (cell->xpos << 2);
>     dst     = plane->pixels[ctx->buf_sel] + offset;
>     mv_y    = cell->mv_ptr[0];
>     mv_x    = cell->mv_ptr[1];
>     offset += mv_y * plane->pitch + mv_x;
>     src     = plane->pixels[ctx->buf_sel ^ 1] + offset;
> 
>     h = cell->height << 2;
> 
>     if (HAVE_ALTIVEC && ((int)dst & 15)) {
>         ctx->dsp.put_no_rnd_pixels_tab[1][0](dst, src, plane->pitch, h);
>         src += 8;
>         dst += 8;
>     }

This code goes a little against the dsputils philosophy. From what I 
understand, it is supposed to "just work", no matter what CPU, if the 
right alignment requirements are fulfilled. The problem with your 
approach is that it can fail when one implements 
dsp.put_no_rnd_pixels_tab[0][0] in NEON/SS4/AVR32/etc requiring 16-byte 
alignment. Also casting the integer to pointer is not very pretty. What 
I suggest is

int offset_dst  = (cell->ypos << 2) * plane->pitch + (cell->xpos << 2);

[...]

     if (offset_dst & 15) {
         ctx->dsp.put_no_rnd_pixels_tab[1][0](dst, src, plane->pitch, h);
         src += 8;
         dst += 8;
     }

but for this to work you have to do "pitch = FFALIGN(width, 16)" instead 
of "pitch = FFALIGN(width, 8)" so plane->pixels[ctx->buf_sel] is 16-byte 
aligned.

-Vitor



More information about the ffmpeg-devel mailing list