[FFmpeg-devel] Indeo3 replacement, take 3
Vitor Sessak
vitor1001
Mon Nov 2 03:41:41 CET 2009
Maxim wrote:
> Vitor Sessak schrieb:
>> [...]
>>> Note that since plane->pixels[] is not aligned, dst is not aligned
>>> neither. So I'd suggest something on the lines of
>>>
>>>> typedef struct Plane {
>>>> uint8_t *buffers[2];
>>>> DECLARE_ALIGNED_16(uint8_t, *pixels[2]); ///< pointer to the
>>>> actual pixel data of the buffers above
>>>> uint32_t width;
>>>> uint32_t height;
>>>> uint32_t pitch;
>>>> } Plane;
>> Err, scrap that, I see that pixels[] are pointers to av_malloc'ed
>> buffers, hence aligned. So no ideas here. Does anyone know the actual
>> alignment requirements of dsp.put_no_rnd_pixels_tab? It is documented
>> nowhere...
>>
>
>
> Two days with the GDB and I've found out what's up!
Nice!
> The problem resides
> in the function "put_pixels16_altivec()" from dsputils_ppc.c and is
> caused actually by the wrong alignment
Please send a patch adding a doxy documentation explaining the alignment
requirements of dsp.put_no_rnd_pixels_tab. It might save a few days of
GDB work of the next person who has the same problem ;)
> /* setup output and reference pointers */
> offset = (cell->ypos << 2) * plane->pitch + (cell->xpos << 2);
> dst = plane->pixels[ctx->buf_sel] + offset;
> mv_y = cell->mv_ptr[0];
> mv_x = cell->mv_ptr[1];
> offset += mv_y * plane->pitch + mv_x;
> src = plane->pixels[ctx->buf_sel ^ 1] + offset;
>
> h = cell->height << 2;
>
> if (HAVE_ALTIVEC && ((int)dst & 15)) {
> ctx->dsp.put_no_rnd_pixels_tab[1][0](dst, src, plane->pitch, h);
> src += 8;
> dst += 8;
> }
This code goes a little against the dsputils philosophy. From what I
understand, it is supposed to "just work", no matter what CPU, if the
right alignment requirements are fulfilled. The problem with your
approach is that it can fail when one implements
dsp.put_no_rnd_pixels_tab[0][0] in NEON/SS4/AVR32/etc requiring 16-byte
alignment. Also casting the integer to pointer is not very pretty. What
I suggest is
int offset_dst = (cell->ypos << 2) * plane->pitch + (cell->xpos << 2);
[...]
if (offset_dst & 15) {
ctx->dsp.put_no_rnd_pixels_tab[1][0](dst, src, plane->pitch, h);
src += 8;
dst += 8;
}
but for this to work you have to do "pitch = FFALIGN(width, 16)" instead
of "pitch = FFALIGN(width, 8)" so plane->pixels[ctx->buf_sel] is 16-byte
aligned.
-Vitor
More information about the ffmpeg-devel
mailing list