[FFmpeg-devel] transcoding on nvidia tesla

Reimar Döffinger Reimar.Doeffinger
Fri Feb 1 11:20:16 CET 2008


Hello,
On Thu, Jan 31, 2008 at 04:09:22PM -0800, Dan Khasis wrote:
> The performance enhancements are 100% real. For example,
> http://www.o0o.it/pro/ shows that a Mac Pro with two dual-core Xeon
> Woodcrests at 2.66 GHz cranks out no more than 60 gigaflops. So a Tesla
> at a minimum of 8 teraflops per PC is 130 times more powerful.

FLOPS are a useless measure. I can easily achieve TFLOPS if adding 1.0
is the only operation I allow.
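To illustrate (a toy sketch of my own, assuming nothing beyond a C
compiler, not anything from a real codec): the loop below performs
about a billion floating-point additions with no dependencies between
elements, so it vectorizes and parallelizes perfectly and posts
impressive FLOPS numbers while computing nothing useful.

    #include <stdio.h>

    #define N (1 << 20)

    int main(void)
    {
        static float a[N];
        /* 1000 * N ~= 10^9 additions, each element independent of all
         * others, so this vectorizes/parallelizes perfectly:
         * big FLOPS, zero useful work. */
        for (int pass = 0; pass < 1000; pass++)
            for (int i = 0; i < N; i++)
                a[i] += 1.0f;
        printf("%f\n", a[0]); /* keep the compiler from deleting the loop */
        return 0;
    }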
I guess you did not read the test results of the first H.264 decoders
using the GPU? They were _slower_ than a well-optimized CPU-only
decoder (CoreAVC), while sometimes even decoding incorrectly.
Things have improved a lot since then, but that is _in part_ due to
dedicated hardware for doing the bitstream decoding.

> If it [Tesla] wasn't that much higher in performance, I'm sure NVIDIA
> wouldn't have invested tens of millions or more into developing such a
> system, if it wasn't powerful. Also, ATI has announced it will release
> something similar to Tesla in '09.

Sure, GPUs are extremely powerful. For some things. But there are also
algorithms that will not even reach the speed of a (non-MMX) Pentium
on a GPU. If just 1% of your runtime is spent in such code, Amdahl's
law caps your overall speedup at 100x, even with a perfect
implementation of everything else.
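A minimal sketch of the Amdahl arithmetic (the 1% figure is the
hypothetical one from above, not a measurement):

    #include <stdio.h>

    /* Amdahl's law: with a fraction s of the runtime that cannot be
     * accelerated and a factor-p speedup on the rest, the overall
     * speedup is 1 / (s + (1 - s) / p), at most 1/s as p -> infinity. */
    static double amdahl(double s, double p)
    {
        return 1.0 / (s + (1.0 - s) / p);
    }

    int main(void)
    {
        printf("1%% serial, rest 1000x faster: %.1fx\n",
               amdahl(0.01, 1000.0));          /* ~91x  */
        printf("1%% serial, rest infinitely fast: %.1fx\n",
               1.0 / 0.01);                    /* 100x  */
        return 0;
    }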
I do not want to be mean, but I have the impression that you know
nothing at all about GPU processing besides the marketing numbers. In
that case, if you want to do some coding yourself, or just be able to
estimate these things yourself, you should try to port some typical
algorithms to the GPU; a toy example of the kind of loop that ports
badly follows.
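Here is a purely illustrative toy loop of my own, in the spirit of
bitstream/entropy decoding (not any real codec): each output depends
on state produced by the previous iteration, so the work cannot simply
be spread across thousands of GPU threads.

    #include <stdio.h>

    int main(void)
    {
        const unsigned char bits[16] = { 1,0,1,1,0,0,1,0,
                                         1,1,1,0,0,1,0,1 };
        unsigned state = 1;

        for (int i = 0; i < 16; i++) {
            unsigned out = bits[i] ^ (state & 1); /* needs current state */
            state = state * 2 + out;              /* state needs result  */
            printf("%u", out);
        }
        printf("\n");
        return 0;
    }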
Also, maybe read up on
http://developer.nvidia.com/object/gpu-gems-3.html;
e.g.
http://developer.download.nvidia.com/books/gpu_gems_3/samples/gems3_ch38.pdf
seems to also discuss some of the problems (though I only had a quick
look).

Greetings,
Reimar Döffinger



