[FFmpeg-devel] [PATCH] avcodec/mpegaudio_tablegen: more dynamic initialization speedups

Ganesh Ajjanagadde gajjanagadde at gmail.com
Sat Nov 28 06:46:31 CET 2015


This further speeds up runtime initialization, with identical generated tables.

Sample benchmark (x86-64, Haswell, GNU/Linux):

old:
34441423 decicycles in mpegaudio_tableinit,    8192 runs,      0 skips

new:
10776291 decicycles in mpegaudio_tableinit,    8192 runs,      0 skips

Most of the low-hanging fruit is taken care of here. For a sense of
scale, note that 83,064 array elements totalling 233,722 bytes need to be
initialized. Thus, with this patch, we average ~ 13.0 cycles per element
or ~ 4.6 cycles per byte.

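I personally consider the net ~ 10x speedup (cumulative across the recent
patches; this one alone gives ~ 3.2x per the benchmark above) and the
overall perf numbers sufficient for using dynamic initialization all the
time here, especially since the tables are large.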

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
-------------------------------------------------------------------------------
The reason this is being posted before pushing in the other one is that if
people agree to do dynamic initialization here, the introduction of avutil/tablegen
can be deferred to a future date.

Note that with a ~8000 element static LUT for the pow_43 values, one could
bring the cost down slightly, to ~ 8-10 cycles per element but no lower,
so definitely not an order-of-magnitude improvement like the current patches.
I personally do not like this "middle path", as I find it too complex for a
slight speed gain, while still carrying a large ~ 64,000 byte size cost.
-------------------------------------------------------------------------------
 libavcodec/mpegaudio_tablegen.h | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/libavcodec/mpegaudio_tablegen.h b/libavcodec/mpegaudio_tablegen.h
index dd67a09..91b29cb 100644
--- a/libavcodec/mpegaudio_tablegen.h
+++ b/libavcodec/mpegaudio_tablegen.h
@@ -45,23 +45,28 @@ static float expval_table_float[512][16];
 static av_cold void mpegaudio_tableinit(void)
 {
     int i, value, exponent;
-    double exp2_lut[4] = {
+    static const double exp2_lut[4] = {
         1.00000000000000000000, /* 2 ^ (0 * 0.25) */
         1.18920711500272106672, /* 2 ^ (1 * 0.25) */
         M_SQRT2               , /* 2 ^ (2 * 0.25) */
         1.68179283050742908606, /* 2 ^ (3 * 0.25) */
     };
-    double cbrt_lut[16];
+    static double pow43_lut[16];
+    double exp2_base = 2.11758236813575084767080625169910490512847900390625e-22; // 2^(-72)
+    double exp2_val;
+    double pow43_val = 0;
     for (i = 0; i < 16; ++i)
-        cbrt_lut[i] = cbrt(i);
+        pow43_lut[i] = i * cbrt(i);
 
     for (i = 1; i < TABLE_4_3_SIZE; i++) {
-        double value = i / 4;
         double f, fm;
         int e, m;
-        f  = value / IMDCT_SCALAR * cbrt(value) * exp2_lut[i & 3];
+        double value = i / 4;
+        if (i % 4 == 0)
+            pow43_val = value / IMDCT_SCALAR * cbrt(value);
+        f  = pow43_val * exp2_lut[i & 3];
         fm = frexp(f, &e);
-        m  = (uint32_t)(fm * (1LL << 31) + 0.5);
+        m  = llrint(fm * (1LL << 31));
         e += FRAC_BITS - 31 + 5 - 100;
 
         /* normalized to FRAC_BITS */
@@ -69,8 +74,11 @@ static av_cold void mpegaudio_tableinit(void)
         table_4_3_exp[i]   = -e;
     }
     for (exponent = 0; exponent < 512; exponent++) {
+        if (exponent && exponent % 4 == 0)
+            exp2_base *= 2;
+        exp2_val = exp2_base * exp2_lut[exponent % 4] / IMDCT_SCALAR;
         for (value = 0; value < 16; value++) {
-            double f = value * cbrt_lut[value] * pow(2, (exponent - 400) * 0.25 + FRAC_BITS + 5) / IMDCT_SCALAR;
+            double f = pow43_lut[value] * exp2_val;
             expval_table_fixed[exponent][value] = (f < 0xFFFFFFFF ? llrint(f) : 0xFFFFFFFF);
             expval_table_float[exponent][value] = f;
         }
-- 
2.6.2


