[Ffmpeg-devel] FLAC encoder

Michael Niedermayer michaelni
Fri May 26 15:14:42 CEST 2006


On Fri, May 26, 2006 at 03:24:35AM -0400, Justin Ruggles wrote:
> Hello,
> My FLAC encoder has progressed to a point that I am happy with.  Test it
> out, and see what you think.  If the developers show an interest in
> including this in FFmpeg, I will consolidate the files and adapt it for
> inclusion.

requirements for acceptance are at least that the existing crc, rice &
bitstream facilities of libavcodec are used

one suggestions:

instead of
    for all partition orders (calc_optimal_partition_order)
        for all partitions (calc_optimal_rice_params)
            do gradient descent search for k to minimize: (calc_optimal_rice_param)
                the number of bits for all symbols in the current partition (rice_encode_count)

which is obviously going to be slow 
O(n * partition_orders * gardient_descent_steps), you can

sum up all bits in O(n) time and then run the optimization based on the sums
the result is identical

1. convert everything to unsiged by tmp[i] = (2*data[i]) ^ (data[i]>>31)
   which is equivalent to tmp[i] = (data[i] < 0) ? -2*data[i]-1 : 2*data[i];
2. calculate the partial sums (the following is written for readability its
   dead slow if implemented like that:
        part[0][i][j] = tmp[i] & (1<<j)
        part[p][i][j]= part[p-1][2*i][j] + part[p-1][2*i+1][j]

the number of bits for a single symbol is (data>>k) + k + 1 for all symbols
n to m its obviously (m-n)*(k+1) + sum of (data[i]>>k)
and "sum of (data[i]>>k)" is just the sum of the corresponding partial sums
for that area

i hope you understand my suggestion ...

now, how to find the partial sums quickly:
first pass, we want to quickly find the 32 sums of the bits of 2 integers
sounds like it would need a loop with 32 iterations ... no it needs 2
instructions only :)
lsb[i]= d[2i] ^ d[2i+1]
msb[i]= d[2i] & d[2i+1]

pass 2:
sum[1][i][0] = lsb[2i] ^ lsb[2i+1]
sum[1][i][1] = msb[2i] ^ msb[2i+1] ^ (lsb[2i] & lsb[2i+1])
sum[1][i][2] = msb[2i] & msb[2i+1]

pass n:
L= sum[n-1][2i]
R= sum[n-1][2i+1]
sum[n][i][0] = L[0] ^ R[0]
carry[0]     = L[0] & R[0]
sum[n][i][j] = L[j] ^ R[j] ^ carry[j-1]
carry[j] = ((L[j] ^ R[j]) & carry[j-1]) | (L[j] & R[j])
sum[n][i][n] = L[n-1] & R[n-1]

each pass will work only with half as many elements as the previous so this
is fast even though each pass is somewhat more complex then the previous
if you dont understand the whole, keep in mind each variable holds 32 bits not
1 bit which are worked on in parallel

alternatively if this is too complex for you then you could at least
move the "(data[i] < 0) ? -2*data[i]-1 : 2*data[i];" and "k + 1" out
of the innermost loop
and cache the partial bit counts (a partition from n..m with param k
needs exactly as many bits as the sum of the bits needed by the 2 smaller



In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is

More information about the ffmpeg-devel mailing list