[FFmpeg-devel] hardware aided video decoding
Sun Jul 8 18:00:17 CEST 2007
On Sun, 8 Jul 2007, Attila Kinali wrote:
> On Fri, 6 Jul 2007, Loren Merritt wrote:
>> common mc:
>> The primitive operation of mc is a fir filter. Implement a 2/4/6/8-tap
>> fir filter (applying to a block of pixels) with programmable coefficients
>> and rounding modes, and allow the firs to be chained in arbitrary ways.
>> A generic fir filter could be used for wavelets too.
>> mpeg4 qpel also has some weirdness whereby it mirrors the block edges
>> before sending them into the 8-tap.
> I don't understand how you can abstract MC to a FIR filter.
> From my understanding of MC (which might be wrong) MC uses
> a vector pointing into the previously decoded frame to predict
> the currently processed macro block. To me, that's an operation
> that resembles texture mapping more than a FIR filter.
I don't see anything incompatible about those statements. The FIR is
specifying exactly what algorithm the texture mapper uses to predict the
block.
When using textures in 3d rendering, you just need to return a sample
value at a given non-integer location in the texture. The interpolation
algorithm (bilinear, bicubic, lanczos, etc) is an implementation decision.
In video decoding the interpolation is exactly specified, and is different
for each compression standard.
e.g. in h264 the hpel samples are the convolution of the original samples
with the kernel (1 -5 20 20 -5 1)/32. The qpel samples are then the
average of two hpel samples. It keeps the full 14-bit precision between
horizontal and vertical hpel passes, but rounds to 8-bit between hpel and
qpel.
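
To make the h264 case concrete, here is a minimal one-dimensional sketch of the 6-tap hpel filter and the qpel average. The names hpel_row, qpel_avg and clip8 are made up for illustration, and the edge clamping is an assumption standing in for the padded reference frames a real decoder would use. It filters in one direction only and rounds straight to 8-bit, so it does not show the higher-precision intermediate that is kept between the horizontal and vertical passes.

```python
# 1-D sketch of h264 half-pel and quarter-pel interpolation.
# Helper names and edge handling are illustrative assumptions.

H264_HPEL_TAPS = (1, -5, 20, 20, -5, 1)  # divided by 32 after summing


def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def hpel_row(samples, x):
    """Half-pel sample halfway between samples[x] and samples[x + 1]."""
    last = len(samples) - 1
    acc = 0
    for k, tap in enumerate(H264_HPEL_TAPS):
        # Taps cover samples x-2 .. x+3; clamp indices at the edges
        # (assumption: stands in for reference frame padding).
        idx = min(max(x - 2 + k, 0), last)
        acc += tap * samples[idx]
    # Round to nearest, divide by 32, clip back to 8 bits.
    return clip8((acc + 16) >> 5)


def qpel_avg(a, b):
    """Quarter-pel sample as the rounded average of two neighbours."""
    return (a + b + 1) >> 1


row = [0, 0, 0, 0, 255, 255, 255, 255]
half = hpel_row(row, 3)           # halfway across the step edge
quarter = qpel_avg(row[3], half)  # between the integer and half-pel sample
```

A texture unit with plain bilinear filtering would return a different value for the same position on any non-flat input, which is why the filter has to be programmable per codec.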
In vc1 the hpel samples are the convolution of the original samples with
the kernel (-1 9 9 -1)/16. The qpel samples are the convolution of the
original samples with the kernel (-4 53 18 -3)/64 (or its reflection,
depending on which qpel). It rounds to 8-bit between horizontal and
vertical passes.
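
The vc1 kernels fit the same mould, which is the point of a filter with programmable coefficients. The sketch below is a generic one-dimensional FIR that takes the taps, shift and rounding constant as parameters. The name fir_sample, the edge clamping and the plain round-to-nearest constant are assumptions for illustration; the exact vc1 rounding rules are not reproduced here.

```python
# Generic 1-D FIR sample with programmable taps, shift and rounding.
# Rounding here is plain round-to-nearest (assumption), not the exact
# vc1 rounding behaviour.

VC1_HPEL = (-1, 9, 9, -1)                 # divided by 16
VC1_QPEL = (-4, 53, 18, -3)               # divided by 64
VC1_QPEL_REFLECTED = VC1_QPEL[::-1]       # the other quarter position


def fir_sample(samples, x, taps, shift, rounding):
    """Filtered sample between samples[x] and samples[x + 1]."""
    last = len(samples) - 1
    # Centre the kernel so its two middle taps sit on x and x + 1.
    first = x - (len(taps) // 2 - 1)
    acc = 0
    for k, tap in enumerate(taps):
        idx = min(max(first + k, 0), last)  # edge clamp (assumption)
        acc += tap * samples[idx]
    # Apply rounding, scale down, clip to 8 bits.
    return max(0, min(255, (acc + rounding) >> shift))


row = [0, 0, 0, 255, 255, 255]
vc1_half = fir_sample(row, 2, VC1_HPEL, 4, 8)
vc1_quarter = fir_sample(row, 2, VC1_QPEL, 6, 32)
vc1_three_quarter = fir_sample(row, 2, VC1_QPEL_REFLECTED, 6, 32)
```

Passing (1, -5, 20, 20, -5, 1) with shift 5 and rounding 16 to the same function gives the h264 hpel filter, so only the parameters change between codecs.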
>> Decoding a h264 intra block in a software codec:
>> idct the residual of this block.
>> Predict the pixels of this block, using the decoded pixels of the
>> neighboring blocks (all neighbors: left, top-left, top, top-right), using
>> 1 of 22 prediction modes.
>> Add residual to prediction.
>> Use these newly decoded samples to predict the next block...
>> If you want to do the prediction in hardware without the idct, that's
> This rather sounds like I would want to leave that completely
> in software, as the host cpu has better memory bandwidth and
> has less trouble handling large and random memory accesses.
If you do h264 intra prediction in software, you must read pixels back
from the gpu.
I was going to also say that the readback is latency sensitive and
must start after decoding one (inter) macroblock and finish before
decoding the next (intra) macroblock, but that can be avoided with
sufficient reorganization of the codec.