[FFmpeg-devel] transcoding on nvidia tesla
Sun Feb 10 23:25:04 CET 2008
On Sun, Feb 10, 2008 at 11:12:23PM +0100, christophelorenz wrote:
> Having done some GPU dev, I can tell that there are some good and some
> very bad things to do...
> Easy ones, -huge- performance increase:
> Rescaling with various algos, colour space conversions, basic
> deblocking, denoising ...
> More tricky, probably faster by a factor of 10, but needing quite some
> optimisation and dev time:
> (i)Motion compensation, (i)DCT, wavelets ...
> Useless, same speed or 10x slower (because conditional branching
> cannot be avoided):
> Byte stream parsing, sorting ...
> Total loss of time and 100x slower on the GPU (the GPU probably has to
> emulate all the required bit functions, and the data imposes serial
> operation, so no parallelisation is possible):
> Bit stream parsing ...
Quite what I figured with only theory and some FX5200-level GPU
> CUDA has a much better memory transfer performance than DirectX /
> OpenGL; examples show 3 Gbytes/sec (up and down), but it depends
> greatly on the motherboard used.
> Anyhow, it is still a memory copy. If you need to do this often it will
> ruin performance.
Hmm... I thought that when using things like PixelBuffers the mapped
memory can (and if you are lucky, will) be graphics memory (or at the
very least directly DMA-capable), so no additional memcpy would be
necessary if you write/read directly into/from it.
There is still some additional latency though.
And admittedly I never got it to work with anything besides RGB32 data...
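What I had in mind is roughly the ARB_pixel_buffer_object readback
pattern. This is an untested sketch, not self-contained (it assumes a
valid GL context, a bound framebuffer, and the PBO extension), so treat
it as pseudocode:

```c
/* Sketch only: asynchronous readback through a pixel buffer object,
 * assuming a working GL context and ARB_pixel_buffer_object. */
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, w * h * 4, NULL, GL_STREAM_READ);

/* With a PBO bound, the last argument is an offset, not a pointer:
 * the transfer goes into the buffer, potentially by DMA. */
glReadPixels(0, 0, w, h, GL_BGRA, GL_UNSIGNED_BYTE, 0);

/* Map the buffer and read the pixels in place -- ideally no extra
 * memcpy, though the map itself can add latency. */
void *p = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
/* ... consume pixels at p ... */
glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
```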