[FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Dan Parrot dan.parrot at mail.com
Mon Jul 4 21:18:48 EEST 2016


On Mon, 2016-07-04 at 16:30 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot <dan.parrot <at> mail.com> writes:
> 
> > > Did you test if using ffmpeg -benchmark -f rawvideo -i /dev/zero... 
> > > showed different results?
> > > I believe this should be both easier and faster to test.
> >
> > Sorry, I don't understand what that command line just above 
> > is trying to achieve. Could you elaborate?
> 
> Instead of running the whole fate suite that takes long and 
> does not test libswscale for most commands, just test an 
> ffmpeg command line that only tests libswscale:
> $ ffmpeg -benchmark -f rawvideo -pix_fmt rgb24 
> -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -
> vs
> 
> $ ffmpeg -cpuflags 0 -benchmark -f rawvideo -pix_fmt rgb24 
> -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -
> 
Ok. Thanks for the explanation. I will run those commands and post the
reported results.

> [...]
> 
> > Surprisingly, gcc is producing some badly suboptimal assembly.
> 
> Just to make sure I don't misunderstand:
> Does this mean intrinsics are suboptimal to write assembly 
> code?
Here's what I mean. All variables below are of type "vector int":

1. v0 = v2 * v3
2. v0 = v4 * v5 + v6 * v7 + v8 * v9

The first statement compiles to 1 multiply, 1 multiply-sum and 1 addition
instruction.

The second compiles to 6 multiply, 6 multiply-sum and 10 addition
instructions! I expected 3 of each (three times the count for statement 1)
plus 2 extra additions to combine the three products, i.e. 3, 3 and 5.
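
For anyone who wants to reproduce this, the two statements can be isolated
roughly as below. This is only a sketch: it uses the GCC/Clang generic vector
extension as a stand-in for the AltiVec "vector int" type, and the function
names mul_one and mul_sum_three are made up for illustration, not taken from
the patch.

```c
#include <stdint.h>

/* Four 32-bit lanes. This is the GCC/Clang generic vector extension, used
 * as a portable stand-in for the AltiVec/VSX "vector int" type (assumption). */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Statement 1: a single element-wise product. */
v4si mul_one(v4si v2, v4si v3)
{
    return v2 * v3;
}

/* Statement 2: three element-wise products combined with two additions. */
v4si mul_sum_three(v4si v4, v4si v5, v4si v6, v4si v7, v4si v8, v4si v9)
{
    return v4 * v5 + v6 * v7 + v8 * v9;
}
```

Compiling with something like gcc -O2 -S (adding -mcpu=power8 on a POWER8
machine) and counting the multiply and add instructions in each function
should show whether the output matches the expected counts.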

> 
> > > Can you confirm with START_TIMER / STOP_TIMER that there is no 
> > > gain?
> >
> > SystemTap probes provide identical functionality by measuring 
> > deltas between function entry and function return.
> 
> Sorry, I don't understand:
> Did you test with both methods to verify that they provide 
> the same results?
> 
> Note that if it turns out that START_TIMER / STOP_TIMER 
> cannot be used on ppc64 (le) this would be important 
> information for us.
> 
I'll insert these macros and report the results, provided the code compiles
and runs.
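
Both approaches amount to taking a timestamp before and after the code under
test and reporting the difference. A minimal stand-alone sketch of that idea
follows. It does not use FFmpeg's START_TIMER / STOP_TIMER (those come from
libavutil/timer.h) but hypothetical SKETCH_* stand-ins built on the standard
clock() call, and timed_sum is an invented placeholder for the function being
measured.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-ins for illustration only; these are NOT FFmpeg's
 * macros. They record clock() before the measured code and print the
 * elapsed tick count to stderr afterwards. */
#define SKETCH_START_TIMER    clock_t sketch_t0 = clock();
#define SKETCH_STOP_TIMER(id) \
    fprintf(stderr, "%s: %ld clock ticks\n", (id), (long)(clock() - sketch_t0));

/* Placeholder workload: sums n bytes, with the loop wrapped in the timer. */
int64_t timed_sum(const uint8_t *src, int n)
{
    int64_t sum = 0;

    SKETCH_START_TIMER
    for (int i = 0; i < n; i++)
        sum += src[i];
    SKETCH_STOP_TIMER("timed_sum")

    return sum;
}
```

clock() has coarse resolution, so a real measurement would need many
iterations or a finer-grained counter; the point here is only the
entry/exit delta.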



