[FFmpeg-devel] [PATCH] Optimize QTRLE encoding

Malcolm Bechard malcolm.bechard at gmail.com
Tue Feb 12 15:56:21 CET 2013


Attached is the base64-encoded patch file.
The goal is to remove the loop below, which runs up to 127 iterations
(j = 1..limit) for every pixel that is neither a skip nor a repeat:

for (j = 1; j <= limit; j++) {
    if (s->length_table[i + j] + temp_cost < total_bulk_cost) {
        /* We have found a better bulk copy ... */
        total_bulk_cost = s->length_table[i + j] + temp_cost;
        bulkcount = j;
    }
    temp_cost += s->pixel_size;
}

Output video files should be identical to the old algorithm in both size
and binary content.
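
For illustration only, here is a minimal sketch of why the inner loop can be
dropped. It is not the code in the attached patch, which keeps the lowest and
second-lowest bulk costs as it walks the line. The sketch swaps in a
different, standard technique: a sliding-window minimum kept in a monotonic
deque. The names tail_cost, MAX_WIDTH, bulk_search_naive and
bulk_search_window are hypothetical, and the cost table is assumed to be a
fixed input, whereas the real encoder fills s->length_table[i] from the
skip/repeat/bulk decision as it goes.

```c
#include <assert.h>
#include <limits.h>

#define MAX_RLE_BULK 127   /* same name as the constant used in the loop above */
#define MAX_WIDTH    4096  /* hypothetical cap so the sketch needs no malloc */

/* Reference version: the per-pixel search quoted above, with tail_cost[]
 * (width + 1 entries) standing in for s->length_table[]. */
static void bulk_search_naive(const int *tail_cost, int width, int pixel_size,
                              int *best_cost, int *best_count)
{
    for (int i = width - 1; i >= 0; i--) {
        int limit = width - i < MAX_RLE_BULK ? width - i : MAX_RLE_BULK;
        int temp_cost = 1 + pixel_size + !i;
        int total = INT_MAX, count = 0;

        for (int j = 1; j <= limit; j++) {
            if (tail_cost[i + j] + temp_cost < total) {
                total = tail_cost[i + j] + temp_cost;
                count = j;
            }
            temp_cost += pixel_size;
        }
        best_cost[i]  = total;
        best_count[i] = count;
    }
}

/* Sliding-window version. With k = i + j the bulk cost is
 *   tail_cost[k] + (k - i) * pixel_size + 1 + !i,
 * so the best k minimises key(k) = tail_cost[k] + k * pixel_size over the
 * window i < k <= i + MAX_RLE_BULK. The deque holds candidate k values with
 * strictly increasing keys from head to tail, so the minimum is at the head. */
static void bulk_search_window(const int *tail_cost, int width, int pixel_size,
                               int *best_cost, int *best_count)
{
    int dq[MAX_WIDTH + 1];
    int head = 0, tail = 0;            /* live entries are dq[head..tail-1] */

    assert(width >= 0 && width <= MAX_WIDTH);

    for (int i = width - 1; i >= 0; i--) {
        int k = i + 1;                 /* candidate that enters the window */
        int key = tail_cost[k] + k * pixel_size;

        /* Drop candidates that are no cheaper than the new one. On a tie the
         * new one wins because it is the shorter run, which matches the
         * strict '<' comparison in the reference loop. */
        while (tail > head &&
               tail_cost[dq[tail - 1]] + dq[tail - 1] * pixel_size >= key)
            tail--;
        dq[tail++] = k;

        /* Drop candidates that are now more than MAX_RLE_BULK pixels away.
         * The entry just pushed never expires, so head cannot pass tail. */
        while (dq[head] > i + MAX_RLE_BULK)
            head++;

        k = dq[head];
        best_cost[i]  = tail_cost[k] + (k - i) * pixel_size + 1 + !i;
        best_count[i] = k - i;
    }
}
```

Each index enters and leaves the deque at most once, so the work per line is
linear in the width instead of up to 127 comparisons per pixel.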

Performance gains may be smaller with gcc than what I measured, since my
initial comparison was the old code built with gcc against the new code built
with VS2010. I expect a 2-4x speedup with gcc.

Feedback is appreciated since this is my first patch.
-------------- next part --------------
RnJvbSA4MzAzMTkwMTgwOGFhZDcyZDFkOTM2ZTRhOGJiNDRiMzg0MWJiNmJlIE1vbiBTZXAgMTcg
MDA6MDA6MDAgMjAwMQpGcm9tOiBNYWxjb2xtIEJlY2hhcmQgPG1hbGNvbG0uYmVjaGFyZEBnbWFp
bC5jb20+CkRhdGU6IFR1ZSwgMTIgRmViIDIwMTMgMDk6MDM6MzMgLTA1MDAKU3ViamVjdDogW1BB
VENIXSBJbXByb3ZlIFFUUkxFIGVuY29kaW5nIHBlcmZvcm1hbmNlLCBubyBjaGFuZ2UgdG8gb3V0
cHV0IGZpbGUKIHNpemUvY29udGVudAoKQXZvaWQgc2VhcmNoaW5nIGZvciB0aGUgbG93ZXN0IGJ1
bGsgY29zdCBmb3IgZWFjaCBwaXhlbCB0aGF0IGlzbid0IGEgcmVwZWF0L3NraXAuIEluc3RlYWQg
c3RvcmUgdGhlIGxvd2VzdCBjb3N0IGFzIHdlIGdvIGFsb25nIGVhY2ggcGl4ZWwsIGFuZCB1c2Ug
aXQgYXMgbmVlZGVkLgoKU2lnbmVkLW9mZi1ieTogTWFsY29sbSBCZWNoYXJkIDxtYWxjb2xtLmJl
Y2hhcmRAZ21haWwuY29tPgotLS0KIGxpYmF2Y29kZWMvcXRybGVlbmMuYyB8ICAxMTUgKysrKysr
KysrKysrKysrKysrKysrKysrKysrKysrKysrKystLS0tLS0tLS0tLS0tCiAxIGZpbGVzIGNoYW5n
ZWQsIDg0IGluc2VydGlvbnMoKyksIDMxIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL2xpYmF2
Y29kZWMvcXRybGVlbmMuYyBiL2xpYmF2Y29kZWMvcXRybGVlbmMuYwppbmRleCBlMTUxYzllLi45
NTI1ZmJmIDEwMDY0NAotLS0gYS9saWJhdmNvZGVjL3F0cmxlZW5jLmMKKysrIGIvbGliYXZjb2Rl
Yy9xdHJsZWVuYy5jCkBAIC0xMjEsOCArMTIxLDYgQEAgc3RhdGljIHZvaWQgcXRybGVfZW5jb2Rl
X2xpbmUoUXRybGVFbmNDb250ZXh0ICpzLCBjb25zdCBBVkZyYW1lICpwLCBpbnQgbGluZSwgdWkK
ICAgICBpbnQgaTsKICAgICBzaWduZWQgY2hhciBybGVjb2RlOwogCi0gICAgLyogV2Ugd2lsbCB1
c2UgaXQgdG8gY29tcHV0ZSB0aGUgYmVzdCBidWxrIGNvcHkgc2VxdWVuY2UgKi8KLSAgICB1bnNp
Z25lZCBpbnQgYXZfdW5pbml0KGJ1bGtjb3VudCk7CiAgICAgLyogVGhpcyB3aWxsIGJlIHRoZSBu
dW1iZXIgb2YgcGl4ZWxzIGVxdWFsIHRvIHRoZSBwcmVpdm91cyBmcmFtZSBvbmUncwogICAgICAq
IHN0YXJ0aW5nIGZyb20gdGhlIGl0aCBwaXhlbCAqLwogICAgIHVuc2lnbmVkIGludCBza2lwY291
bnQ7CkBAIC0xMzEsMTIgKzEyOSwxNSBAQCBzdGF0aWMgdm9pZCBxdHJsZV9lbmNvZGVfbGluZShR
dHJsZUVuY0NvbnRleHQgKnMsIGNvbnN0IEFWRnJhbWUgKnAsIGludCBsaW5lLCB1aQogICAgIHVu
c2lnbmVkIGludCBhdl91bmluaXQocmVwZWF0Y291bnQpOwogCiAgICAgLyogVGhlIGNvc3Qgb2Yg
dGhlIHRocmVlIGRpZmZlcmVudCBwb3NzaWJpbGl0aWVzICovCi0gICAgaW50IHRvdGFsX2J1bGtf
Y29zdDsKICAgICBpbnQgdG90YWxfc2tpcF9jb3N0OwogICAgIGludCB0b3RhbF9yZXBlYXRfY29z
dDsKKyAgICBpbnQgcGl4ZWxfc2l6ZTsKIAotICAgIGludCB0ZW1wX2Nvc3Q7Ci0gICAgaW50IGo7
CisgICAgLy8gVGhlc2Ugd2lsbCBzdG9yZSB0aGUgbG93ZXN0IGFuZCBzZWNvbmQgbG93ZXN0IGJ1
bGsgY29zdCB2YWx1ZS9wb3N0aW9ucworICAgIGludCBsb3dlc3RfY29zdDsKKyAgICBpbnQgbG93
ZXN0X2Nvc3RfaW5kZXg7CisgICAgaW50IHNlY19sb3dlc3RfY29zdDsKKyAgICBpbnQgc2VjX2xv
d2VzdF9jb3N0X2luZGV4OwogCiAgICAgdWludDhfdCAqdGhpc19saW5lID0gcC0+ICAgICAgICAg
ICAgICAgZGF0YVswXSArIGxpbmUqcC0+ICAgICAgICAgICAgICAgbGluZXNpemVbMF0gKwogICAg
ICAgICAod2lkdGggLSAxKSpzLT5waXhlbF9zaXplOwpAQCAtMTQ2LDkgKzE0Nyw2NiBAQCBzdGF0
aWMgdm9pZCBxdHJsZV9lbmNvZGVfbGluZShRdHJsZUVuY0NvbnRleHQgKnMsIGNvbnN0IEFWRnJh
bWUgKnAsIGludCBsaW5lLCB1aQogICAgIHMtPmxlbmd0aF90YWJsZVt3aWR0aF0gPSAwOwogICAg
IHNraXBjb3VudCA9IDA7CiAKKyAgICAvLyBJbml0aWFsIHZhbHVlcworICAgIGxvd2VzdF9jb3N0
ID0gSU5UX01BWDsKKyAgICBsb3dlc3RfY29zdF9pbmRleCA9IHdpZHRoOworICAgIHNlY19sb3dl
c3RfY29zdCA9IElOVF9NQVg7CisgICAgc2VjX2xvd2VzdF9jb3N0X2luZGV4ID0gd2lkdGg7CisK
KyAgICBwaXhlbF9zaXplID0gcy0+cGl4ZWxfc2l6ZTsKKwogICAgIGZvciAoaSA9IHdpZHRoIC0g
MTsgaSA+PSAwOyBpLS0pIHsKIAotICAgICAgICBpZiAoIXMtPmZyYW1lLmtleV9mcmFtZSAmJiAh
bWVtY21wKHRoaXNfbGluZSwgcHJldl9saW5lLCBzLT5waXhlbF9zaXplKSkKKyAgICAgICAgaW50
IGJhc2VfY29zdCwgcHJldl9jb3N0OworICAgICAgICBpbnQgbGltaXQ7CisgICAgICAgIAorICAg
ICAgICBpbnQgcHJldiA9IGkgKyAxOworCisgICAgICAgIGJhc2VfY29zdCA9IDEgKyBwaXhlbF9z
aXplOworICAgICAgICBsaW1pdCA9IEZGTUlOKHdpZHRoIC0gaSwgTUFYX1JMRV9CVUxLKTsKKwor
ICAgICAgICAvLyBJZiBvdXIgbG93ZXN0IGNvc3QgaW5kZXggaXMgdG9vIGZhciBhd2F5LCByZXBs
YWNlIGl0CisgICAgICAgIC8vIHdpdGggdGhlIG5leHQgbG93ZXN0IGNvc3QKKyAgICAgICAgaWYg
KGkgKyBsaW1pdCA8IGxvd2VzdF9jb3N0X2luZGV4KQorICAgICAgICB7CisgICAgICAgICAgICBs
b3dlc3RfY29zdCA9IHNlY19sb3dlc3RfY29zdDsKKyAgICAgICAgICAgIGxvd2VzdF9jb3N0X2lu
ZGV4ID0gc2VjX2xvd2VzdF9jb3N0X2luZGV4OworCisgICAgICAgICAgICBzZWNfbG93ZXN0X2Nv
c3QgPSBJTlRfTUFYOworICAgICAgICAgICAgc2VjX2xvd2VzdF9jb3N0X2luZGV4ID0gd2lkdGg7
CisgICAgICAgIH0KKworICAgICAgICAvLyBEZWFsIHdpdGggdGhlIGZpcnN0IHBpeGVsJ3MgY29z
dAorICAgICAgICBpZiAoaSA9PSAwKQorICAgICAgICB7CisgICAgICAgICAgICBiYXNlX2Nvc3Qr
KzsKKyAgICAgICAgICAgIGxvd2VzdF9jb3N0Kys7CisgICAgICAgICAgICBzZWNfbG93ZXN0X2Nv
c3QrKzsKKyAgICAgICAgfQorCisgICAgICAgIC8vIExvb2sgYXQgdGhlIGNvc3Qgb2YgdGhlIHBy
ZXZpb3VzIGxvb3AgYW5kIHNlZSBpZiBpdCBpcworICAgICAgICAvLyBhIG5ldyBsb3dlciBjb3N0
CisgICAgICAgIHByZXZfY29zdCA9IHMtPmxlbmd0aF90YWJsZVtwcmV2XSArIGJhc2VfY29zdDsK
KyAgICAgICAgaWYgKHByZXZfY29zdCA8PSBzZWNfbG93ZXN0X2Nvc3QpIHsKKyAgICAgICAgICAg
IC8vIElmIGl0J3MgbG93ZXIgdGhhbiB0aGUgMm5kIGxvd2VzdCwgdGhlbiBpdCBtYXkgYmUgbG93
ZXIKKyAgICAgICAgICAgIC8vIHRoYW4gdGhlIGxvd2VzdCAKKyAgICAgICAgICAgIGlmIChwcmV2
X2Nvc3QgPD0gbG93ZXN0X2Nvc3QpIHsKKyAgICAgICAgICAgIAorICAgICAgICAgICAgICAgIC8v
IElmIHdlIGhhdmUgZm91bmQgYSBuZXcgbG93ZXN0IGNvc3QsCisgICAgICAgICAgICAgICAgLy8g
dGhlbiB0aGUgMm5kIGxvd2VzdCBjb3N0IGlzIG5vdyBmYXJ0aGVyIHRoYW4gdGhlCisgICAgICAg
ICAgICAgICAgLy8gbG93ZXN0IGNvc3QsIGFuZCB3aWxsIG5ldmVyIGJlIHVzZWQKKyAgICAgICAg
ICAgICAgICBzZWNfbG93ZXN0X2Nvc3QgPSBJTlRfTUFYOworCisgICAgICAgICAgICAgICAgbG93
ZXN0X2Nvc3QgPSBwcmV2X2Nvc3Q7CisgICAgICAgICAgICAgICAgbG93ZXN0X2Nvc3RfaW5kZXgg
PSBwcmV2OworICAgICAgICAgICAgfSBlbHNlIHsKKyAgICAgICAgICAgICAgICAvLyBUaGVuIGl0
IG11c3QgYmUgdGhlIDJuZCBsb3dlc3QgY29zdAorICAgICAgICAgICAgICAgIHNlY19sb3dlc3Rf
Y29zdCA9IHByZXZfY29zdDsKKyAgICAgICAgICAgICAgICBzZWNfbG93ZXN0X2Nvc3RfaW5kZXgg
PSBwcmV2OworICAgICAgICAgICAgfQorICAgICAgICB9CisgICAgICAgIAorICAgICAgICBpZiAo
IXMtPmZyYW1lLmtleV9mcmFtZSAmJiAhbWVtY21wKHRoaXNfbGluZSwgcHJldl9saW5lLCBwaXhl
bF9zaXplKSkKICAgICAgICAgICAgIHNraXBjb3VudCA9IEZGTUlOKHNraXBjb3VudCArIDEsIE1B
WF9STEVfU0tJUCk7CiAgICAgICAgIGVsc2UKICAgICAgICAgICAgIHNraXBjb3VudCA9IDA7CkBA
IC0xNTcsMTIgKzIxNSwxMyBAQCBzdGF0aWMgdm9pZCBxdHJsZV9lbmNvZGVfbGluZShRdHJsZUVu
Y0NvbnRleHQgKnMsIGNvbnN0IEFWRnJhbWUgKnAsIGludCBsaW5lLCB1aQogICAgICAgICBzLT5z
a2lwX3RhYmxlW2ldID0gc2tpcGNvdW50OwogCiAKLSAgICAgICAgaWYgKGkgPCB3aWR0aCAtIDEg
JiYgIW1lbWNtcCh0aGlzX2xpbmUsIHRoaXNfbGluZSArIHMtPnBpeGVsX3NpemUsIHMtPnBpeGVs
X3NpemUpKQorICAgICAgICBpZiAoaSA8IHdpZHRoIC0gMSAmJiAhbWVtY21wKHRoaXNfbGluZSwg
dGhpc19saW5lICsgcGl4ZWxfc2l6ZSwgcGl4ZWxfc2l6ZSkpCiAgICAgICAgICAgICByZXBlYXRj
b3VudCA9IEZGTUlOKHJlcGVhdGNvdW50ICsgMSwgTUFYX1JMRV9SRVBFQVQpOwogICAgICAgICBl
bHNlCiAgICAgICAgICAgICByZXBlYXRjb3VudCA9IDE7CiAKLSAgICAgICAgdG90YWxfcmVwZWF0
X2Nvc3QgPSBzLT5sZW5ndGhfdGFibGVbaSArIHJlcGVhdGNvdW50XSArIDEgKyBzLT5waXhlbF9z
aXplOworICAgICAgICB0b3RhbF9yZXBlYXRfY29zdCA9IHMtPmxlbmd0aF90YWJsZVtpICsgcmVw
ZWF0Y291bnRdICsgMSArIHBpeGVsX3NpemU7CisKIAogICAgICAgICAvKiBza2lwIGNvZGUgaXMg
ZnJlZSBmb3IgdGhlIGZpcnN0IHBpeGVsLCBpdCBjb3N0cyBvbmUgYnl0ZSBmb3IgcmVwZWF0IGFu
ZCBidWxrIGNvcHkKICAgICAgICAgICogc28gbGV0J3MgbWFrZSBpdCBhd2FyZSAqLwpAQCAtMTgz
LDI4ICsyNDIsMjIgQEAgc3RhdGljIHZvaWQgcXRybGVfZW5jb2RlX2xpbmUoUXRybGVFbmNDb250
ZXh0ICpzLCBjb25zdCBBVkZyYW1lICpwLCBpbnQgbGluZSwgdWkKICAgICAgICAgfQogICAgICAg
ICBlbHNlIHsKICAgICAgICAgICAgIC8qIFdlIGNhbm5vdCBkbyBuZWl0aGVyIHNraXAgbm9yIHJl
cGVhdAotICAgICAgICAgICAgICogdGh1cyB3ZSBzZWFyY2ggZm9yIHRoZSBiZXN0IGJ1bGsgY29w
eSB0byBkbyAqLwotCi0gICAgICAgICAgICBpbnQgbGltaXQgPSBGRk1JTih3aWR0aCAtIGksIE1B
WF9STEVfQlVMSyk7CisgICAgICAgICAgICAgKiB0aHVzIHdlIHVzZSB0aGUgYmVzdCBidWxrIGNv
cHkgICovCiAKLSAgICAgICAgICAgIHRlbXBfY29zdCA9IDEgKyBzLT5waXhlbF9zaXplICsgIWk7
Ci0gICAgICAgICAgICB0b3RhbF9idWxrX2Nvc3QgPSBJTlRfTUFYOworICAgICAgICAgICAgcy0+
bGVuZ3RoX3RhYmxlW2ldICA9IGxvd2VzdF9jb3N0OworICAgICAgICAgICAgcy0+cmxlY29kZV90
YWJsZVtpXSA9IGxvd2VzdF9jb3N0X2luZGV4IC0gaTsKIAotICAgICAgICAgICAgZm9yIChqID0g
MTsgaiA8PSBsaW1pdDsgaisrKSB7Ci0gICAgICAgICAgICAgICAgaWYgKHMtPmxlbmd0aF90YWJs
ZVtpICsgal0gKyB0ZW1wX2Nvc3QgPCB0b3RhbF9idWxrX2Nvc3QpIHsKLSAgICAgICAgICAgICAg
ICAgICAgLyogV2UgaGF2ZSBmb3VuZCBhIGJldHRlciBidWxrIGNvcHkgLi4uICovCi0gICAgICAg
ICAgICAgICAgICAgIHRvdGFsX2J1bGtfY29zdCA9IHMtPmxlbmd0aF90YWJsZVtpICsgal0gKyB0
ZW1wX2Nvc3Q7Ci0gICAgICAgICAgICAgICAgICAgIGJ1bGtjb3VudCA9IGo7Ci0gICAgICAgICAg
ICAgICAgfQotICAgICAgICAgICAgICAgIHRlbXBfY29zdCArPSBzLT5waXhlbF9zaXplOwotICAg
ICAgICAgICAgfQorICAgICAgICB9CiAKLSAgICAgICAgICAgIHMtPmxlbmd0aF90YWJsZVtpXSAg
PSB0b3RhbF9idWxrX2Nvc3Q7Ci0gICAgICAgICAgICBzLT5ybGVjb2RlX3RhYmxlW2ldID0gYnVs
a2NvdW50OworICAgICAgICAvLyBUaGVzZSBjb3N0cyBpbmNyZWFzZSBldmVyeSBpdGVyYXRpb24K
KyAgICAgICAgbG93ZXN0X2Nvc3QgKz0gcGl4ZWxfc2l6ZTsKKyAgICAgICAgaWYgKHNlY19sb3dl
c3RfY29zdCA8IElOVF9NQVgpCisgICAgICAgIHsKKyAgICAgICAgICAgIHNlY19sb3dlc3RfY29z
dCArPSBwaXhlbF9zaXplOwogICAgICAgICB9CiAKLSAgICAgICAgdGhpc19saW5lIC09IHMtPnBp
eGVsX3NpemU7Ci0gICAgICAgIHByZXZfbGluZSAtPSBzLT5waXhlbF9zaXplOworICAgICAgICB0
aGlzX2xpbmUgLT0gcGl4ZWxfc2l6ZTsKKyAgICAgICAgcHJldl9saW5lIC09IHBpeGVsX3NpemU7
CiAgICAgfQogCiAgICAgLyogR29vZCAhIE5vdyB3ZSBoYXZlIHRoZSBiZXN0IHNlcXVlbmNlIGZv
ciB0aGlzIGxpbmUsIGxldCdzIG91dHB1dCBpdCAqLwpAQCAtMjM3LDEwICsyOTAsMTAgQEAgc3Rh
dGljIHZvaWQgcXRybGVfZW5jb2RlX2xpbmUoUXRybGVFbmNDb250ZXh0ICpzLCBjb25zdCBBVkZy
YW1lICpwLCBpbnQgbGluZSwgdWkKICAgICAgICAgICAgICAgICAvLyBRVCBncmF5c2NhbGUgY29s
b3JzcGFjZSBoYXMgMD13aGl0ZSBhbmQgMjU1PWJsYWNrLCB3ZSB3aWxsCiAgICAgICAgICAgICAg
ICAgLy8gaWdub3JlIHRoZSBwYWxldHRlIHRoYXQgaXMgaW5jbHVkZWQgaW4gdGhlIEFWRnJhbWUg
YmVjYXVzZQogICAgICAgICAgICAgICAgIC8vIEFWX1BJWF9GTVRfR1JBWTggaGFzIGRlZmluZWQg
Y29sb3IgbWFwcGluZwotICAgICAgICAgICAgICAgIGZvciAoaiA9IDA7IGogPCBybGVjb2RlKnMt
PnBpeGVsX3NpemU7ICsraikKLSAgICAgICAgICAgICAgICAgICAgYnl0ZXN0cmVhbV9wdXRfYnl0
ZShidWYsICoodGhpc19saW5lICsgaSpzLT5waXhlbF9zaXplICsgaikgXiAweGZmKTsKKyAgICAg
ICAgICAgICAgICBmb3IgKGogPSAwOyBqIDwgcmxlY29kZSpwaXhlbF9zaXplOyArK2opCisgICAg
ICAgICAgICAgICAgICAgIGJ5dGVzdHJlYW1fcHV0X2J5dGUoYnVmLCAqKHRoaXNfbGluZSArIGkq
cGl4ZWxfc2l6ZSArIGopIF4gMHhmZik7CiAgICAgICAgICAgICB9IGVsc2UgewotICAgICAgICAg
ICAgICAgIGJ5dGVzdHJlYW1fcHV0X2J1ZmZlcihidWYsIHRoaXNfbGluZSArIGkqcy0+cGl4ZWxf
c2l6ZSwgcmxlY29kZSpzLT5waXhlbF9zaXplKTsKKyAgICAgICAgICAgICAgICBieXRlc3RyZWFt
X3B1dF9idWZmZXIoYnVmLCB0aGlzX2xpbmUgKyBpKnBpeGVsX3NpemUsIHJsZWNvZGUqcGl4ZWxf
c2l6ZSk7CiAgICAgICAgICAgICB9CiAgICAgICAgICAgICBpICs9IHJsZWNvZGU7CiAgICAgICAg
IH0KQEAgLTI0OSwxMCArMzAyLDEwIEBAIHN0YXRpYyB2b2lkIHF0cmxlX2VuY29kZV9saW5lKFF0
cmxlRW5jQ29udGV4dCAqcywgY29uc3QgQVZGcmFtZSAqcCwgaW50IGxpbmUsIHVpCiAgICAgICAg
ICAgICBpZiAocy0+YXZjdHgtPnBpeF9mbXQgPT0gQVZfUElYX0ZNVF9HUkFZOCkgewogICAgICAg
ICAgICAgICAgIGludCBqOwogICAgICAgICAgICAgICAgIC8vIFFUIGdyYXlzY2FsZSBjb2xvcnNw
YWNlIGhhcyAwPXdoaXRlIGFuZCAyNTU9YmxhY2ssIC4uLgotICAgICAgICAgICAgICAgIGZvciAo
aiA9IDA7IGogPCBzLT5waXhlbF9zaXplOyArK2opCi0gICAgICAgICAgICAgICAgICAgIGJ5dGVz
dHJlYW1fcHV0X2J5dGUoYnVmLCAqKHRoaXNfbGluZSArIGkqcy0+cGl4ZWxfc2l6ZSArIGopIF4g
MHhmZik7CisgICAgICAgICAgICAgICAgZm9yIChqID0gMDsgaiA8IHBpeGVsX3NpemU7ICsraikK
KyAgICAgICAgICAgICAgICAgICAgYnl0ZXN0cmVhbV9wdXRfYnl0ZShidWYsICoodGhpc19saW5l
ICsgaSpwaXhlbF9zaXplICsgaikgXiAweGZmKTsKICAgICAgICAgICAgIH0gZWxzZSB7Ci0gICAg
ICAgICAgICAgICAgYnl0ZXN0cmVhbV9wdXRfYnVmZmVyKGJ1ZiwgdGhpc19saW5lICsgaSpzLT5w
aXhlbF9zaXplLCBzLT5waXhlbF9zaXplKTsKKyAgICAgICAgICAgICAgICBieXRlc3RyZWFtX3B1
dF9idWZmZXIoYnVmLCB0aGlzX2xpbmUgKyBpKnBpeGVsX3NpemUsIHBpeGVsX3NpemUpOwogICAg
ICAgICAgICAgfQogICAgICAgICAgICAgaSAtPSBybGVjb2RlOwogICAgICAgICB9Ci0tIAoxLjcu
OQoK

