[FFmpeg-devel] [PATCH 2/5] truehd: break out part of rematrix_channels into platform-specific callback.

Ben Avison bavison at riscosopen.org
Thu Mar 20 15:55:29 CET 2014


On Thu, 20 Mar 2014 02:07:42 -0000, Michael Niedermayer <michaelni at gmx.at> wrote:
>> i think matrix_coeff is guaranteed to fit in int16_t, this would allow
>> simplifying the code
>
> that is using int16 though maybe it's not helpful on arm

Yes, Christophe pointed out the same thing earlier. I can't see any way
to take advantage on ARM (including NEON) of the multiplier only being 16
bits, and in fact packing the matrix_coeff array more tightly would
actually make things worse.

> also the matrix_coeffs are trivial values for the file i looked at
> like 0x3000 or 0x4000
> so optimizing for such special cases might be worth it

Well, the matrices I can see look more like

/ F880, 05C0, 0000, FE40, C000, 0000 \
| 08E0, F8E0, 00C0, FF80, 1040, C000 |
| D900, C600, C000, FD00, DB00, CF00 |
| 0000, C000, D2B0, 0000, 0000, C000 |
\ C000, 0CD4, DBC4, 0000, C000, 0CD4 /

Only the zeros there look worth considering. But there are 2^6 possible
patterns of zeros in one row of that matrix (even worse, 2^8 for 7.1
streams), and it doesn't look like any particular patterns are
especially common. I could imagine all 2^6 or 2^8 possibilities being
expanded out, but that would make the binary up to 256 times bigger,
which would hurt I-cache and branch predictor efficiency, and it would
also need a 1K branch table (a big chunk of the D-cache). Alternatively,
it could do some run-time assembly.
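
To make that arithmetic concrete, here is a small C sketch. The function
names are made up for illustration, and the 4-byte table entry is an
assumption (one 32-bit address or offset per entry, as on 32-bit ARM).

```c
#include <stdint.h>

/* Number of distinct zero/non-zero patterns for one matrix row with
 * `channels` coefficients: each coefficient is either zero or not. */
static unsigned zero_patterns(unsigned channels)
{
    return 1u << channels;
}

/* Size in bytes of a branch table with one entry per pattern.
 * Assumes 4-byte entries (hypothetical; 32-bit addresses or offsets). */
static unsigned branch_table_bytes(unsigned channels)
{
    return zero_patterns(channels) * 4u;
}
```

For 6 channels this gives 64 specialised variants, and for 8 channels 256
variants with a 1024-byte table.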

But since the matrix can change every frame (typically 40 samples for
TrueHD), my gut instinct is that we're better off sticking with a fixed
number of multiplies, even if some of the coefficients are zero.
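
For illustration, the fixed-multiply approach might look roughly like the
following in C. This is a hypothetical sketch rather than the code in the
patch: the function name, the `shift` parameter and the flat coefficient
array are assumptions, and it leaves out anything else the real routine
does to the result.

```c
#include <stdint.h>

/* Hypothetical sketch: compute one output sample as a dot product over
 * every input channel. Zero coefficients are multiplied like any other,
 * so the work per sample does not depend on the matrix contents. */
static int32_t rematrix_sample(const int32_t *samples,
                               const int32_t *coeffs,
                               int channels, int shift)
{
    int64_t accum = 0;

    for (int ch = 0; ch < channels; ch++)
        accum += (int64_t)samples[ch] * coeffs[ch];

    /* Scale the fixed-point product back down; `shift` is an assumed
     * parameter giving the number of fractional bits in the coefficients. */
    return (int32_t)(accum >> shift);
}
```

If the coefficients were 2.14 fixed point, which values like 0x4000
suggest but which is only assumed here, `shift` would be 14.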

Ben

