[FFmpeg-devel] [PATCH] WMA Voice decoder

Ronald S. Bultje rsbultje
Fri Jan 22 17:33:49 CET 2010


Hi,

2010/1/22 Måns Rullgård <mans at mansr.com>:
> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
>> On Fri, Jan 22, 2010 at 11:15 AM, Uoti Urpala <uoti.urpala at pp1.inet.fi> wrote:
>>> x*49995 / 41 = x*(1219*41 + 16) / 41 = 1219*x + x * 16 / 41
>>> In the last form x*16 is at most 1048544.
>>
>> In fastdiv form, this'd be 3 muls, an add and a shift. Is that still
>> faster than 1 mul + 1 div?
>
> Don't forget the table lookup.
>
> Multiplication takes typically 3-5 cycles.  Division takes 15-40
> cycles if the CPU has a hardware divider, 50-100 cycles if not.  The
> fastdiv should be faster, even when including the table lookup.  If
> this is called at all frequently, the table should be in L2 cache,
> which costs typically 10-20 cycles, still faster than a division.

If you read 3 numbers (you need the "41" in fastdiv-form, the 1219 and
the 16), does it make a difference? Or because you'd put them in the
same table [9][3] (3 numbers for 9 possible values of y), that's not
an issue?
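
To make the question concrete, here is a rough sketch of what I have in
mind. The struct, the function names and the reciprocal formula
ceil(2^32 / d) are my own assumptions for illustration, not the existing
fastdiv code in the tree; it assumes d >= 2 and x below 65536, and I have
only reasoned about exactness for d = 41.

```c
#include <stdint.h>

/* Hypothetical row of the [9][3] table: three constants per divisor d. */
typedef struct {
    uint32_t quot; /* 49995 / d, e.g. 1219 for d = 41 */
    uint32_t rem;  /* 49995 % d, e.g. 16 for d = 41   */
    uint32_t inv;  /* ceil(2^32 / d), assumed reciprocal for the fast division */
} DivConsts;

/* Build one row; requires d >= 2 so that inv fits in 32 bits. */
static DivConsts make_consts(uint32_t d)
{
    DivConsts c;
    c.quot = 49995 / d;
    c.rem  = 49995 % d;
    c.inv  = (uint32_t)(((UINT64_C(1) << 32) + d - 1) / d);
    return c;
}

/* x * 49995 / d computed as quot*x + (x*rem)/d, with the remaining
 * division replaced by a multiply and a shift: 3 muls, an add, a shift. */
static uint32_t scale(uint32_t x, const DivConsts *c)
{
    uint32_t t = x * c->rem; /* small: at most 16 * 65535 for d = 41 */
    return c->quot * x + (uint32_t)(((uint64_t)t * c->inv) >> 32);
}
```

With 32-bit entries one row is 12 bytes and the whole 9x3 table would be
108 bytes, so the three reads for a given y would sit next to each other.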

Ronald


