[FFmpeg-devel] [PATCH] Metadata

Michael Niedermayer michaelni
Tue Jan 6 20:30:09 CET 2009


On Mon, Jan 05, 2009 at 03:40:06PM -0800, Baptiste Coudurier wrote:
> Michael Niedermayer wrote:
> > On Mon, Jan 05, 2009 at 11:40:12AM -0800, Baptiste Coudurier wrote:
> >> Hi Michael,
> >>
> >> Michael Niedermayer wrote:
> >>> On Sat, Jan 03, 2009 at 03:26:05PM -0800, Baptiste Coudurier wrote:
> >>>> Hi Michael,
> >>>>
> >>>> Michael Niedermayer wrote:
> >>>>> [...]
> >>>>  >
> >>>>> + * 3. A tag whichs value is translated has the ISO 639 3-letter language code
> >>>>> + *    with a '-' between appended. So for example Author-ger=Michael, Author-eng=Mike
> >>>>> + *    the original/default language is in the unqualified "Author"
> >>>>> + *    A demuxer should set a default if it sets any translated tag.
> >>>>  >
> >>>>> [...]
> >>>>  >
> >>>>> +typedef struct {
> >>>>> +    char *key;
> >>>>> +    char *value;
> >>>>> +}AVMetaDataTag;
> >>>> Maybe it would be simpler and more extensible to have a "const char 
> >>>> **attributes" field where to store language, or anything else related to 
> >>>> the AVMetaDataTag entry. This would avoid parsing the '-'.
> >>>>
> >>>> What do people think ?
> >>> I am against it, let me explain why
> >>>
> >>> First, currently metadata support in svn is "too little" that is nothing
> >>> is really supported, no preserving of arbitrary tags, no way for users to
> >>> add anything but 5 standard tags ...
> >> I definitely agree.
> >>
> >>> Aurel's variant, which had a language field and used a tree-based metadata
> >>> system allowing metadata about metadata, is IMHO "too much". It's not something
> >>> anyone should need, nor is it really needed for language & metadata about
> >>> metadata, and still it wouldn't be able to handle all metadata about other
> >>> metadata like "the email address of the child of the author and producer"
> >>>
> >>> my suggestion of a simple key-value based system
> >>> can be stored in any container that supports key-value string based
> >>> metadata, and still can represent language and metadata about other metadata.
> >>> Also it can very easily be implemented efficiently, currently all operations
> >>> are O(n) thus it would become slow if there are many tags. But if we would
> >>> use tree.c/h it would all just be O(log n) and it's very easy to use tree.c/h
> >>> with it ...
> >>>
> >>> Now if we do add attributes
> >>> * The api to search for tags becomes more complex
> >>> * It is more difficult to use tree.c/h (it needs, like qsort, a sanely
> >>>   behaving comparison function, which is trivial for char*, less
> >>>   so with an additional attribute list, and even a lot less if we want
> >>>   to actually search for specific attributes)
> >>>
> >>> * No container I know of supports arbitrary attributes, thus muxers would
> >>>   either have to convert the attribute list into a string or extract the
> >>>   2 or 3 they support.
> >> Well, these are good points.
> >> To be clear, I'm not suggesting a tree metadata scheme, but a way to
> >> easily specify these key/value metadata details.
> >>
> >> Like language, type (comes from .mov so expect '\r' as line separator,
> >> encoding is UTF8, etc...)
> >>
> > 
> >> Parsing for '-' is not convenient, 
> > 
> > either there's a single string, in which case some muxers have to parse for -
> > or
> > there are many fields, in which case some other muxers have to combine them
> > in a single string.
> 
> Which muxers ?

from grepping:
ape, asf, nut, ogg
(though, note, i did not check the specs so maybe some of them do have
 explicit language fields)

Also, there may be more, considering our current spotty metadata support.


> How does .mkv store lang metadata info, if it does so ?
> 

> All I see is that for .mov you would have to concatenate key name and
> lang, and muxer would have to split lang from metadata.
> Combining is easier than splitting when dealing with strings in my
> experience.

I agree, but an extra field, and especially an unordered extensible list of
attributes, would make handling them a lot more complex.
For example, if we add tree.c/h support, tags as they are now would just need a
strcasecmp() call to compare ...
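
A minimal sketch of that point, not code from the patch: with plain string
keys, one strcasecmp() call is a complete ordering function. MetaTag,
tag_cmp(), tags_sort() and tags_get() are hypothetical names, and the
standard qsort()/bsearch() stand in for tree.c/h only to show that the same
comparator gives O(log n) lookups.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>  /* strcasecmp() is POSIX, not ISO C */

/* Hypothetical stand-in for the key/value tag struct of the patch. */
typedef struct {
    const char *key;
    const char *value;
} MetaTag;

/* Total order on tags: case-insensitive comparison of the key only. */
int tag_cmp(const void *a, const void *b)
{
    const MetaTag *ta = a, *tb = b;
    return strcasecmp(ta->key, tb->key);
}

/* Sort the tag array once, O(n log n). */
void tags_sort(MetaTag *tags, size_t n)
{
    qsort(tags, n, sizeof(*tags), tag_cmp);
}

/* Look up a key in a sorted array, O(log n); returns NULL if absent. */
const char *tags_get(const MetaTag *tags, size_t n, const char *key)
{
    MetaTag probe = { key, NULL };
    const MetaTag *hit = bsearch(&probe, tags, n, sizeof(*tags), tag_cmp);
    return hit ? hit->value : NULL;
}
```

With an attribute list attached to each tag, tag_cmp() would also have to
define an order over those lists, which is where the extra complexity
would come from.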


> 
> I don't know of any container that uses "key"-"lang" as metadata scheme
> (nut maybe ?).

Well, any format allowing generic user-chosen key strings allows key-lang.
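
For muxers that do have a separate language field, the split discussed
above could look roughly like this. It is a sketch under the assumption
that a language suffix is always '-' followed by exactly three letters, as
in the Author-ger example; split_key_lang() is a hypothetical helper, not
part of the patch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a key such as "Author-ger" into base key and language code.
 * Returns the length of the base key ("Author" -> 6). lang[] receives the
 * 3-letter code, or an empty string if the key has no language suffix.
 */
size_t split_key_lang(const char *key, char lang[4])
{
    size_t len = strlen(key);

    lang[0] = '\0';
    if (len > 4 && key[len - 4] == '-' &&
        isalpha((unsigned char)key[len - 3]) &&
        isalpha((unsigned char)key[len - 2]) &&
        isalpha((unsigned char)key[len - 1])) {
        memcpy(lang, key + len - 3, 3);
        lang[3] = '\0';
        return len - 4;  /* base key ends before the '-' */
    }
    return len;          /* unqualified key: the original/default language */
}
```

Note that this cannot tell a real language suffix from an unqualified key
that merely happens to end in '-' plus three letters; checking the code
against a list of known ISO 639 codes would narrow that down.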


> 
> > The conversion doesn't disappear and because of this IMHO I would prefer the
> > simpler internal representation.
> 
> How would you specify to the user that data stored in value is raw data
>  (like jpeg cover),

This first leads to the question of whether a CD cover is metadata at all,
rather than actual data.
That said, I am not against adding an
int or char* type if it is really useful and needed in practice. I am just
slightly afraid that we could overshoot the goal quite a bit and end up with an
overcomplicated API where 98% is not used by anyone or anything ...



> encoded in UTF8/16, special like '\r' line ended ?

IMHO we should only support UTF-8 and a single type of line ending;
(de)muxers using a different formatting should convert to/from it.
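
As an illustration of the demuxer side, assuming '\n' is picked as the
single line ending (the mail does not name one): a .mov-style value using
'\r' could be converted in place as below. normalize_newlines() is a
hypothetical helper, and the conversion from other encodings to UTF-8 is
not shown.

```c
#include <stddef.h>
#include <string.h>

/* Convert "\r\n" and lone '\r' to '\n' in place; returns the new length. */
size_t normalize_newlines(char *s)
{
    char *src = s, *dst = s;

    while (*src) {
        if (*src == '\r') {
            *dst++ = '\n';
            src++;
            if (*src == '\n')  /* "\r\n" collapses to a single '\n' */
                src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

The output is never longer than the input, so no reallocation is needed; a
muxer going the other way to "\r\n" would have to allocate.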

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Observe your enemies, for they first find out your faults. -- Antisthenes