[FFmpeg-devel] [PATCH] add signature filter for MPEG7 video signature

Thilo Borgmann thilo.borgmann at mail.de
Mon Mar 21 15:00:52 CET 2016

On 21.03.16 at 14:15, Gerion Entrup wrote:
> On Monday, 21 March 2016 11:53:27 CET Thilo Borgmann wrote:
>> On 21.03.16 at 00:14, Gerion Entrup wrote:
>>> On Sunday, 20 March 2016 17:01:17 CET Thilo Borgmann wrote:
>>>>> On Sun, Mar 20, 2016 at 12:00:13PM +0100, Gerion Entrup wrote:
>>>> [...]
>>>>> This filter does not implement all features of MPEG7. Missing features:
>>>>> - binary output
>>>>> - compression of signature files
>>>> I assume these features are optional?
>>> Compression is optional (it can be set as a flag in the binary
>>> representation). I have not found out whether binary output is optional.
>>> It is definitely possible to work only with the XML files.
>> Of course, but having an unspecified XML output is almost useless if binary
>> output is not optional. So I think it is crucial to know what the spec says
>> about output.
> The spec defines the XML output that my filter produces at the moment and
> additionally specifies a binary output.
>>>>> - work only on (cropped) parts of the video
>>>> How useful is this then? Does a fingerprint computed only on (cropped)
>>>> parts of the video have any value outside of FFmpeg itself? Does this
>>>> comply with the spec, so that it can be compared with the output of any
>>>> other software generating it?
>>> To clarify, the filter does not crop anything. The standard defines an
>>> optional cropping to, I guess, concentrate on specific video parts (this
>>> is not implemented). Assuming someone is recording a monitor, then e.g.
>>> the unrelated part of the video could be cropped out. Besides that, the
>>> signature itself is invariant to cropping up to a certain limit.
>>> The cropping values (upper left and bottom right position) are specified
>>> in the XML, so other software could either crop the same way or compare
>>> only with the cropped input.
>>> (The fact that FFmpeg has a cropping filter would make such a feature
>>> somewhat redundant.)
>> If I understand it correctly, the filter should not crop the image but only
>> use the pixel information within the specified area or the whole image.
>> Making it a filter option is useful, because the fingerprint of a part of
>> the image can be used in a filter chain continuing with the entire image
>> (no actual crop is required).
> Of course it would be great if the filter supported it. It needs some
> modifications to the summed area table. Once the binary output is ready, I
> will try to do it.
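
To illustrate the kind of summed area table change discussed here, below is a
minimal sketch of summing pixels only inside a given rectangle without cropping
the frame. The names build_sat and region_sum and the 8-bit single-plane
layout are assumptions for illustration, not the filter's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build an inclusive summed area table, where
 * sat[y * w + x] holds the sum of all pixels in [0..x] x [0..y]. */
static void build_sat(const uint8_t *pix, int w, int h, uint64_t *sat)
{
    assert(pix && sat && w > 0 && h > 0);
    for (int y = 0; y < h; y++) {
        uint64_t row = 0; /* running sum of the current row */
        for (int x = 0; x < w; x++) {
            row += pix[(size_t)y * w + x];
            sat[(size_t)y * w + x] =
                row + (y > 0 ? sat[(size_t)(y - 1) * w + x] : 0);
        }
    }
}

/* Hypothetical helper: sum of the pixels in the inclusive rectangle
 * [x0..x1] x [y0..y1], using four table lookups. The frame itself is
 * left untouched, so later filters still see the whole image. */
static uint64_t region_sum(const uint64_t *sat, int w,
                           int x0, int y0, int x1, int y1)
{
    assert(sat && x0 >= 0 && y0 >= 0 && x0 <= x1 && y0 <= y1 && x1 < w);
    uint64_t d = sat[(size_t)y1 * w + x1];
    uint64_t b = y0 > 0 ? sat[(size_t)(y0 - 1) * w + x1] : 0;
    uint64_t c = x0 > 0 ? sat[(size_t)y1 * w + (x0 - 1)] : 0;
    uint64_t a = (x0 > 0 && y0 > 0)
               ? sat[(size_t)(y0 - 1) * w + (x0 - 1)] : 0;
    return d - b - c + a;
}
```

With such a lookup, restricting the fingerprint to an area would only change
which rectangle corners are passed in, not the frame that continues down the
filter chain.
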
>>> The XML is standard compliant.
>> So XML output is compliant with the spec? Or is the XML itself just valid XML?
> The XML output is compliant with the spec. The whole format is specified there.

>>> The signature is not bitexact. 3-4 (ternary)
>>> values in the frame signature differ from the signature of the sample
>>> files, but the conformance tests [1] allow up to 15 ternary errors.
>> Bitexact compared to what?
> The institute where I am writing the filter owns the sample files mentioned
> in the doc, together with the corresponding binary and XML signatures (so I
> could compare them).
>> Does it allow up to 15 ternary errors when deciding that two inputs are
>> similar enough to be the same image, or does it state that the fingerprint
>> itself may differ by 15 ternary errors for the very same image?
> The 15 ternary errors are valid between the sample signature and the
> reimplementation for the same sample.
> But ok, you seem to be right, the binary representation seems to be necessary.
> I quote (from the linked doc):

> "The number of dimensions whose ternary values differ between the test and the 
> reference video signatures shall be less than or equal to 15 out of 380, if 
> the FrameConfidence values of both the test and the reference video signatures 
> are greater than or equal to 4. The ternary values of the frame signature 
> shall be decoded from the binary representation according to Table E.1."

This part of the spec seems to deal with comparing a reference image with a
test image. These two images differ, and as I read it, the spec says they are
identified as the same image if (FrameConfidence >= 4 && ndiff(sig_ref,
sig_test) <= 15).
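
A minimal sketch of that reading, to make the distinction concrete. The
constants 380, 15 and 4 come from the quoted passage; the FrameSig struct, the
function names and the one-ternary-value-per-byte storage are assumptions for
illustration (the binary packing of Table E.1 is not reproduced here), and
this is neither the reference software nor the filter's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIG_DIMS         380 /* ternary dimensions per frame signature */
#define MAX_TERNARY_ERR   15 /* allowed differing dimensions */
#define MIN_CONFIDENCE     4 /* FrameConfidence threshold */

/* Hypothetical container: one decoded ternary value (0, 1 or 2) per byte. */
typedef struct FrameSig {
    uint8_t ternary[SIG_DIMS];
    int     confidence;
} FrameSig;

/* Number of dimensions whose ternary values differ. */
static int ndiff(const FrameSig *a, const FrameSig *b)
{
    assert(a && b);
    int n = 0;
    for (size_t i = 0; i < SIG_DIMS; i++)
        n += a->ternary[i] != b->ternary[i];
    return n;
}

/* Tolerance check as read above: both confidences >= 4 and at most 15
 * differing dimensions. Returns 0 when a confidence is too low, which is
 * an assumption; the quoted rule states no bound for that case. */
static int sig_within_tolerance(const FrameSig *ref, const FrameSig *test)
{
    assert(ref && test);
    if (ref->confidence < MIN_CONFIDENCE || test->confidence < MIN_CONFIDENCE)
        return 0;
    return ndiff(ref, test) <= MAX_TERNARY_ERR;
}

/* The stricter question asked in this mail: identical signature or not. */
static int sig_bitexact(const FrameSig *ref, const FrameSig *test)
{
    return ndiff(ref, test) == 0;
}
```

A signature with 3-4 differing values passes sig_within_tolerance() but fails
sig_bitexact(); these are two different checks.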

That is about identifying images.
That is not about calculation of the signature.

So again: Does your fingerprint filter produce the exact same fingerprint for
_the exact same image_ as the reference software?
(Input reference image -> filter -> exact same fingerprint as in reference xml?)

If not, the difference has to be understood and your filter has to be updated to
match the reference fingerprint.

