[FFmpeg-devel] [PATCH] improve yuv422p to rgb in libswscale

Baptiste Coudurier baptiste.coudurier
Wed Dec 1 04:05:05 CET 2010


On 11/30/2010 06:53 PM, Michael Niedermayer wrote:
> On Tue, Nov 30, 2010 at 06:29:32PM -0800, Baptiste Coudurier wrote:
>> On 11/30/2010 06:15 PM, Michael Niedermayer wrote:
>>> On Tue, Nov 30, 2010 at 05:20:42PM -0800, Baptiste Coudurier wrote:
>>>> On 11/30/2010 05:13 PM, Michael Niedermayer wrote:
>>>>> On Tue, Nov 30, 2010 at 04:07:20AM -0800, Baptiste Coudurier wrote:
>>>>>> Hi
>>>>>>
>>>>>> $subject: use full vertical data when converting 422p, which improves quality a lot.
>>>>>>
>>>>>> --
>>>>>> Baptiste COUDURIER
>>>>>> Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
>>>>>> FFmpeg maintainer                                  http://www.ffmpeg.org
>>>>>
>>>>>>     x86/yuv2rgb_template.c |   25 ++++-------
>>>>>>     yuv2rgb.c              |  109 ++++++-------------------------------------------
>>>>>>     2 files changed, 26 insertions(+), 108 deletions(-)
>>>>>> 16f384c9b114c76572a539511c42267ce2942c67  yuv422p_to_rgb.patch
>>>>>
>>>>> this looks like it would make 420p->rgb quite a bit slower.
>>>>
>>>> Do you think changing >>1 to >>vshift would make it quite a bit slower?
>>>
>>> no but that stuff:
>>> @@ -152,134 +144,102 @@
>>>    YUV2RGBFUNC(yuv2rgb_c_48, uint8_t, 0)
>>>        LOADCHROMA(0);
>>>        PUTRGB48(dst_1,py_1,0);
>>> -    PUTRGB48(dst_2,py_2,0);
>>>
>>>        LOADCHROMA(1);
>>> -    PUTRGB48(dst_2,py_2,1);
>>>        PUTRGB48(dst_1,py_1,1);
>>>
>>>        LOADCHROMA(2);
>>>        PUTRGB48(dst_1,py_1,2);
>>> -    PUTRGB48(dst_2,py_2,2);
>>>
>>>        LOADCHROMA(3);
>>> -    PUTRGB48(dst_2,py_2,3);
>>>        PUTRGB48(dst_1,py_1,3);
>>>    ENDYUV2RGBLINE(48)
>>>        LOADCHROMA(0);
>>>        PUTRGB48(dst_1,py_1,0);
>>> -    PUTRGB48(dst_2,py_2,0);
>>>
>>>        LOADCHROMA(1);
>>> -    PUTRGB48(dst_2,py_2,1);
>>>        PUTRGB48(dst_1,py_1,1);
>>>    ENDYUV2RGBFUNC()
>>>
>>> and then running the code twice
>>
>> This is the C code; the MMX routine for yuv420 to rgb24 is already present.
>>
>> I don't understand "running the code twice". Can you please clarify?
>
> yes, I am seeing this:
> -    for (y=0; y<srcSliceH; y+=2) {\
> +    for (y=0; y<srcSliceH; y++) {\
>
> and thus I am thinking it runs the code twice after the patch.
> Am I missing something?

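The old code processed 2 lines per loop iteration (see py_1 and py_2),
which is why so much code is removed from the macros. With y++ each
iteration handles a single line, so every line is still converted only once.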
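
To make the loop change concrete, here is a minimal sketch of the two loop shapes. It is not the patch code: SliceStats, convert_row, old_loop and new_loop are hypothetical names used only for illustration, and the sketch assumes vshift is 1 for 420p (chroma halved vertically) and 0 for 422p (full vertical chroma). It only counts how many luma lines are converted and how many distinct chroma lines are read.

```c
#include <assert.h>

/* Hypothetical bookkeeping, not part of libswscale. */
typedef struct {
    int rows_converted;   /* luma lines written to the destination */
    int chroma_rows_read; /* distinct chroma lines read from the source */
} SliceStats;

/* Stand-in for converting one luma line with the given chroma line. */
static void convert_row(SliceStats *st, int *last_chroma, int chroma_row)
{
    st->rows_converted++;
    if (chroma_row != *last_chroma) {
        st->chroma_rows_read++;
        *last_chroma = chroma_row;
    }
}

/* Old shape: two luma lines (py_1, py_2) per iteration share chroma line y>>1.
 * Assumes an even slice height, as the y += 2 stepping implies. */
static SliceStats old_loop(int src_slice_h)
{
    SliceStats st = {0, 0};
    int last = -1;
    for (int y = 0; y < src_slice_h; y += 2) {
        int c = y >> 1;
        convert_row(&st, &last, c); /* py_1 */
        convert_row(&st, &last, c); /* py_2 */
    }
    return st;
}

/* New shape: one luma line per iteration, chroma line chosen by y>>vshift. */
static SliceStats new_loop(int src_slice_h, int vshift)
{
    SliceStats st = {0, 0};
    int last = -1;
    assert(vshift == 0 || vshift == 1);
    for (int y = 0; y < src_slice_h; y++)
        convert_row(&st, &last, y >> vshift);
    return st;
}
```

For an 8-line slice, both shapes convert 8 lines. With vshift = 1 the new loop reads the same 4 chroma lines as the old one, and with vshift = 0 it reads all 8, which is where the 422p quality gain would come from.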

Do you agree that the MMX change looks trivial?

[...]

-- 
Baptiste COUDURIER
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
FFmpeg maintainer                                  http://www.ffmpeg.org


