[FFmpeg-devel] [PATCH 2/5] fate: avoid framemd5, use framecrc its faster

Reimar Döffinger Reimar.Doeffinger at gmx.de
Wed May 8 20:18:49 CEST 2013


On Wed, May 08, 2013 at 07:05:08PM +0200, Michael Niedermayer wrote:
> On Wed, May 08, 2013 at 06:24:25PM +0200, Reimar Döffinger wrote:
> > Michael Niedermayer <michaelni at gmx.at> wrote:
> > 
> > >Signed-off-by: Michael Niedermayer <michaelni at gmx.at>
> > 
> > I disagree with this, CRC can easily miss structural changes,
> 
> Can you give an example of a structural change that has occurred over
> the lifetime of ffmpeg in one of its output files that would have
> failed to be detected with CRC (or rather adler32)?

Certainly no actual examples; even if they existed, finding
one would be more effort than it is worth.
But for CRC a major issue would be that any changes in parts protected
by a CRC with the same polynomial would be undetectable.
I have never researched adler32 so I can't say whether it might
have similar issues.
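
A minimal sketch of that concern, assuming a made-up file layout (the
"HDR0"/"END0" markers and both helper functions are hypothetical, not
anything FFmpeg writes) and using zlib's CRC-32 for both the inner and
the outer checksum. When a block carries its own CRC with the same
polynomial, a same-length change inside that block leaves the outer CRC
over the whole file unchanged:

```python
import struct
import zlib


def protected_block(payload: bytes) -> bytes:
    """Hypothetical block: payload followed by its own CRC-32 (little-endian)."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def build_file(payload: bytes) -> bytes:
    # Made-up layout: fixed header, one self-checksummed block, fixed trailer.
    return b"HDR0" + protected_block(payload) + b"END0"


file_a = build_file(b"frame data, version A")
file_b = build_file(b"frame data, version B")  # same length, different content

outer_a = zlib.crc32(file_a)
outer_b = zlib.crc32(file_b)

print(file_a != file_b)    # True: the files differ
print(outer_a == outer_b)  # True: the outer CRC-32 cannot tell them apart
```

This relies on the CRC register ending in a fixed residue after a
message followed by its own CRC, so it only applies when the inner and
outer CRC use the same parameters.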

> The question is not if you can construct a pattern that gives a
> crc equal to some other because no one sits there and tries to generate
> such collisions, the checks are there to detect unintended bugs.

The real question is if there are structures used in multimedia
formats or content that have a good chance of having collisions.
For CRC my answer would be "at the very least use an unusual
polynomial and preferably 64 bit", but I can't answer that for
adler32, however this paper:
http://www.zlib.net/maxino06_fletcher-adler.pdf seems to claim
it is pretty bad even on random data while also being slower than
Fletcher-32.
It also suggests that they may fail to detect e.g. padding changing
from 0x00 to 0xFF. This might be relevant to the valgrind memory
fill tests if we use 0x00 and 0xff respectively as the test values
(I don't know if we do).
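
A small sketch of the padding case, under the assumption that the
checksum in question is a Fletcher sum with one's-complement (mod 255)
arithmetic; `fletcher16` below is a textbook-style helper written for
this illustration, not FFmpeg code. With mod 255 arithmetic a 0xFF byte
adds nothing, just like 0x00. For comparison, zlib's adler32 (mod 65521)
does distinguish these two particular inputs, which says nothing about
its other weaknesses:

```python
import zlib


def fletcher16(data: bytes) -> int:
    """Fletcher-16 with one's-complement style (mod 255) sums."""
    s1 = 0
    s2 = 0
    for byte in data:
        s1 = (s1 + byte) % 255  # 0xFF is congruent to 0 here
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1


zero_padded = b"payload" + b"\x00" * 8
ff_padded = b"payload" + b"\xff" * 8

print(fletcher16(zero_padded) == fletcher16(ff_padded))      # True: change missed
print(zlib.adler32(zero_padded) == zlib.adler32(ff_padded))  # False: change seen
```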

> > particularly with "only" 32 bit, so I have doubts it is really suitable for this purpose.
> > Also the speed difference seems not all that relevant to me.
> 
> Sure, that's because you run fate maybe once a day; I run it maybe 50
> times or more a day. The time it takes is definitely relevant to me.

According to your numbers even 100 times would still be 10 minutes per
day. Sorry if I'm inconsiderate, but I just don't consider that a
good risk/benefit ratio.
An interesting question (though to be honest, unless you want to,
I think I'd rather have you work on other stuff) would
be if there is something more hashlike than adler32 with similar speed,
RC4 maybe, I think it can be used for hashing?
Anyway I leave it up to you, it's mostly a gut feeling from my side so
far and not really worth wasting much of your time discussing it.

Reimar