[FFmpeg-devel] GSoC Weekly report (libswscale)

Michael Niedermayer michael at niedermayer.cc
Sat Aug 15 12:24:09 CEST 2015


On Sat, Aug 15, 2015 at 12:17:27AM -0300, Pedro Arthur wrote:
> Hi,
> Since the last patch I was trying to improve the performance regression.
> First I tried to process horizontal lines in batches, processing
> (horizontal_filter_size + n)
> lines at a time. I also tried to remove branch code from the processing
> function, for example:
> int process(...) {
>     if (c->hcscale_fast) {
>         do_x();
>     } else {
>         do_y();
>     }
> }
> changed to:
> int process_fast(...) { do_x(); }
> int process_(...) { do_y(); }
> 
> But these changes more or less didn't improve the performance at all.

yes, a single if() per line is unlikely to make much of a difference:
lines normally have hundreds of pixels, so per-line overhead has only a
comparably small impact next to the per-pixel work
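
A minimal sketch of the two variants being compared, to show why the gain is
small. All names here (scale_line_fast, scale_line_exact, process_branch,
process_hoisted) are hypothetical stand-ins, not libswscale functions, and the
per-pixel arithmetic is a placeholder. The point is only that the branch runs
once per line while the inner loop runs once per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two per-line code paths. */
static void scale_line_fast(uint8_t *dst, const uint8_t *src, size_t w)
{
    for (size_t x = 0; x < w; x++)
        dst[x] = src[x] >> 1;            /* placeholder per-pixel work */
}

static void scale_line_exact(uint8_t *dst, const uint8_t *src, size_t w)
{
    for (size_t x = 0; x < w; x++)
        dst[x] = (src[x] + 1) >> 1;      /* placeholder per-pixel work */
}

typedef void (*line_fn)(uint8_t *dst, const uint8_t *src, size_t w);

/* Variant 1: the flag is tested once per line (h branches in total). */
static void process_branch(int fast, uint8_t *dst, const uint8_t *src,
                           size_t w, size_t h)
{
    assert(dst && src);
    for (size_t y = 0; y < h; y++) {
        if (fast)
            scale_line_fast(dst + y * w, src + y * w, w);
        else
            scale_line_exact(dst + y * w, src + y * w, w);
    }
}

/* Variant 2: the flag is tested once, outside the loop. */
static void process_hoisted(int fast, uint8_t *dst, const uint8_t *src,
                            size_t w, size_t h)
{
    line_fn fn = fast ? scale_line_fast : scale_line_exact;

    assert(dst && src);
    for (size_t y = 0; y < h; y++)
        fn(dst + y * w, src + y * w, w);
}
```

Both variants do w * h pixel operations; the second only saves h predictable
branches, which is consistent with the change not showing up in the timings.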


> As the most significant difference between the old and new code is that
> the color conversion is separated from the horizontal scaling I merged
> back the color conversion with the horizontal scaling and the performance
> seemed to be on par with the original code again.
> 
> One point I would like to comment is the performance measurement method. I
> used 3 methods
> 1 - Using the scaling code, scale each line n times and measure the total
> scaling time. This method was the most reliable, as the measured time
> deviation between different runs was < 0.1%.
> 2 - Call the scaling function n times. This method was not very reliable,
> with a total time deviation of 0.1% to 20%.
> 3 - Run the program n times. The measured time was not reliable, with a
> deviation of 10%-30%.
> For all 3 methods the time measurement was done for only the horizontal
> scaling code.
> 
> I think methods 2 and 3 would be closer to real world usage, but their
> deviation is too high to draw any conclusion from their results.
> 
> 
> Using method 1 with merged color conversion + horizontal scaling,
> performance seems to be on par with the original code.
> 
> Some numbers. Performance penalty %. (< 0 means gain)
> 

these are not git patches

> A - New code

doesn't compile (but that doesn't matter, as you say this is slower anyway)
libswscale/swscale.c: In function ‘swscale’:
libswscale/swscale.c:529:18: error: ‘i’ undeclared (first use in this function)


> B - New code with merged color conversion and horizontal scaling

time ./ffmpeg -i matrixbench_mpeg2.mpg -an -vf scale=1920:1080,scale=720:480 -f null -
old code:
real    0m20.730s
real    0m20.763s
real    0m20.765s

new code:
real    0m20.929s
real    0m20.892s
real    0m20.893s


> C - B + line batches

new code:
real    0m20.730s
real    0m20.690s
real    0m20.683s
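
To put the three sets of "real" times above on the same scale as the
"performance penalty %" convention used in the report (< 0 means gain), here
is a small sketch of the arithmetic. The helper names (mean_seconds,
penalty_percent, spread_percent) are made up for this example; only the
timing values are taken from the runs above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n wall-clock samples, in seconds. */
static double mean_seconds(const double *t, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum / (double)n;
}

/* Change of the new mean relative to the old mean, in percent.
 * Positive means the new code is slower, negative means a gain. */
static double penalty_percent(const double *old_t, size_t n_old,
                              const double *new_t, size_t n_new)
{
    double m_old = mean_seconds(old_t, n_old);

    return 100.0 * (mean_seconds(new_t, n_new) - m_old) / m_old;
}

/* Run-to-run spread, (max - min) / mean, in percent. */
static double spread_percent(const double *t, size_t n)
{
    double lo, hi;

    assert(n > 0);
    lo = hi = t[0];
    for (size_t i = 1; i < n; i++) {
        if (t[i] < lo) lo = t[i];
        if (t[i] > hi) hi = t[i];
    }
    return 100.0 * (hi - lo) / mean_seconds(t, n);
}

/* "real" times from the runs quoted above, in seconds. */
static const double old_code[] = { 20.730, 20.763, 20.765 };
static const double new_b[]    = { 20.929, 20.892, 20.893 };
static const double new_c[]    = { 20.730, 20.690, 20.683 };
```

With these inputs, penalty_percent(old_code, 3, new_b, 3) comes out at roughly
+0.7% and penalty_percent(old_code, 3, new_c, 3) at roughly -0.25%, while the
spread within each set is about 0.2%. Three runs per variant is a small
sample, so the differences are only a little larger than the noise.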

also this seems to work well, except for swscale-test:
make -j4 libswscale/swscale-test
gdb --args libswscale/swscale-test
r
bt
#0  ff_rgbaToY_avx.loop () at libswscale/x86/input.asm:524
#1  0x000000000044cc17 in lum_h_scale1 (c=0x6d7100, desc=0x6e29a0, sliceY=6, sliceH=5) at libswscale/hscale.c:115
#2  0x00000000004059e9 in swscale (c=0x6d7100, src=0x7fffffffe120, srcStride=0x7fffffffe160, srcSliceY=0, srcSliceH=96, dst=0x7fffffffe140, dstStride=0x7fffffffe170) at libswscale/swscale.c:558
#3  0x00000000004082d0 in sws_scale (c=0x6d7100, srcSlice=0x7fffffffe330, srcStride=0x7fffffffe370, srcSliceY=0, srcSliceH=96, dst=0x7fffffffe350, dstStride=0x7fffffffe380) at libswscale/swscale.c:1205
#4  0x00000000004032c6 in main (argc=1, argv=0x7fffffffe4c8) at libswscale/swscale-test.c:402
(gdb) up
#1  0x000000000044cc17 in lum_h_scale1 (c=0x6d7100, desc=0x6e29a0, sliceY=6, sliceH=5) at libswscale/hscale.c:115
115                 c->lumToYV12(lBuf, src[0], src[1], src[2], srcW, pal);
(gdb) print lBuf
$1 = (uint8_t *) 0x6e0460 ""
(gdb) print src[0]
$2 = (const uint8_t *) 0x0

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data