[FFmpeg-devel] GSoC with FFMpeg waht a combination!

Aurelien Jacobs aurel
Sun Mar 23 04:48:44 CET 2008


On Sun, 23 Mar 2008 00:19:39 +0100
Michael Niedermayer <michaelni at gmx.at> wrote:

> On Sun, Mar 23, 2008 at 12:33:04AM +0200, Uoti Urpala wrote:
> > 
> > Also your claim that using
> > strings as keys would necessarily require O(log n) lookups is not true.
> > Hash tables require O(1) on average, and your own suggested method needs
> > an equal lookup.
> 
> Yes, but if you do use a hash table why calculate the hash values at runtime?

I doubt this would make any measurable difference, especially for ffmpeg,
which doesn't really make intensive use of text output.

> Why store the english strings twice instead of corresponding hash values?

Because you have no guarantee that the hash values won't clash.

> I don't believe you consider this good design. It is a plain waste of space.

If you don't care about potential hash clashes, that's indeed a bad
design. It would be nice if the English strings could optionally be
left out of the gmo file.
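
To make the clash problem concrete, here is a minimal sketch of a catalog
lookup. Everything in it is an assumption for illustration: the entry
layout, the names, and the use of FNV-1a as a stand-in hash. With the
English string stored, a clash is detected and the lookup falls back to
the untranslated text; with only the hash stored, the wrong translation
comes back silently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical catalog entry; not the layout of any real file format. */
struct entry {
    uint32_t    hash;   /* hash of the English string */
    const char *msgid;  /* English string, or NULL if the catalog omits it */
    const char *msgstr; /* translation */
};

/* FNV-1a, used here only as a stand-in hash function. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Returns the translation of msgid, or msgid itself if none is found. */
static const char *lookup(const struct entry *tab, size_t n, const char *msgid)
{
    uint32_t h = fnv1a(msgid);

    for (size_t i = 0; i < n; i++) {
        if (tab[i].hash != h)
            continue;
        /* Same hash but different text: a clash, caught only if the
         * English string was stored alongside the hash. */
        if (tab[i].msgid && strcmp(tab[i].msgid, msgid) != 0)
            continue;
        return tab[i].msgstr;
    }
    return msgid;
}
```

A linear scan keeps the sketch short; the verification step would be the
same with a real hash table.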

> > The only calculation your suggestion can save is
> > calculating the hash at runtime, which is O(length of string) and thus
> > cannot affect O() behavior (assuming the result is of similar length and
> > has to be output).
> 
> The .gmo files do not contain hashes, they contain 2 lists of pointers
> to arrays of sorted strings.

The documentation disagrees with you. They contain hashes:
http://www.gnu.org/software/gettext/manual/html_node/MO-Files.html
So gettext is not so different from your proposal.
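
For reference, a rough sketch of reading the header fields that page
describes: the magic number, the revision, the number of strings, the
offsets of the two string tables, and the size and offset of the hashing
table. The struct and function names are mine, and the sketch only
handles little-endian files. Note that the format allows a hash table
size of zero, in which case the file carries no hash table.

```c
#include <stddef.h>
#include <stdint.h>

#define MO_MAGIC 0x950412deu

/* Field names are illustrative; the order follows the MO file layout. */
struct mo_header {
    uint32_t magic;      /* byte  0: magic number */
    uint32_t revision;   /* byte  4: file format revision */
    uint32_t nstrings;   /* byte  8: number of strings */
    uint32_t orig_off;   /* byte 12: offset of original strings table */
    uint32_t trans_off;  /* byte 16: offset of translated strings table */
    uint32_t hash_size;  /* byte 20: size of hashing table, may be 0 */
    uint32_t hash_off;   /* byte 24: offset of hashing table */
};

/* Read a 32-bit little-endian value. */
static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 0 on success, -1 if the buffer is too short or the magic does
 * not match. Byte-swapped (big-endian) files are not handled here. */
static int mo_parse_header(const unsigned char *buf, size_t len,
                           struct mo_header *h)
{
    if (len < 28 || rd32le(buf) != MO_MAGIC)
        return -1;
    h->magic     = MO_MAGIC;
    h->revision  = rd32le(buf + 4);
    h->nstrings  = rd32le(buf + 8);
    h->orig_off  = rd32le(buf + 12);
    h->trans_off = rd32le(buf + 16);
    h->hash_size = rd32le(buf + 20);
    h->hash_off  = rd32le(buf + 24);
    return 0;
}
```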

If I try to summarize the significant differences between gettext and
your proposal:
1) gettext uses more disk space/memory by storing the English strings
   twice, but your system can't guarantee that there is no hash clash.
2) gettext allows your program to run without any additional file,
   while your system requires a "translation" file even for the default
   language.

Also, how is your proposal supposed to work with a string like this:
  printf (_("The amount is %0" PRId64 "\n"), number);
This is something quite common in ffmpeg, and gettext knows how to
handle this.
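
To spell out why this case is awkward, here is a small sketch (the
function name is made up). The format string only exists after the
preprocessor has expanded PRId64 and the compiler has concatenated the
three literals, and the expansion differs between platforms (for
instance "ld" on some and "lld" on others). A hash computed from the
final string would therefore not be the same everywhere. As far as I
know, xgettext special-cases these <inttypes.h> macros and records a
platform-independent form in the catalog.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The literal below is one string at run time, but its exact bytes
 * depend on how this platform defines PRId64. */
static int format_amount(char *buf, size_t size, int64_t number)
{
    return snprintf(buf, size, "The amount is %0" PRId64 "\n", number);
}
```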

And here is another example which couldn't be translated with
your proposed way of calculating hashes at compile time:

static const char *messages[] = {
    "some very meaningful message",
    "and another one"
};
printf (index > 1 ? "a default message" : messages[index]);
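
For comparison, this is roughly how the gettext approach copes with
strings chosen at run time: the literals are only marked for extraction
with a no-op macro (the usual N_ idiom), and the lookup happens later on
whatever string was selected. In this sketch translate() is a
hypothetical stand-in for gettext() with a tiny hard-coded catalog, and
pick_message() is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/* Marks a string for extraction; expands to the string itself. */
#define N_(s) s

/* Hypothetical stand-in for gettext(): a fixed English -> Spanish table.
 * Strings missing from the table are returned untranslated. */
static const char *translate(const char *msgid)
{
    static const char *const catalog[][2] = {
        { "some very meaningful message", "un mensaje muy significativo" },
        { "a default message",            "un mensaje por defecto" },
    };

    for (size_t i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
        if (strcmp(catalog[i][0], msgid) == 0)
            return catalog[i][1];
    return msgid;
}

static const char *const messages[] = {
    N_("some very meaningful message"),
    N_("and another one"),
};

/* The string is selected first and translated afterwards, which needs a
 * lookup keyed on the run-time string (or a hash computed at run time). */
static const char *pick_message(int index)
{
    const char *msg = (index < 0 || index > 1) ? N_("a default message")
                                               : messages[index];
    return translate(msg);
}
```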

Well, reinventing gettext is not so trivial. I think the main
disadvantage of gettext is that it forces you to have a copy
of the English strings in every translation file. But I'm pretty
sure this can be fixed in gettext (as an optional feature).

Aurel



