[Ffmpeg-devel] retrieving asf textual info in other languages

Michael Niedermayer michaelni
Tue May 17 09:59:03 CEST 2005


Hi

On Tuesday 17 May 2005 05:12, ?? wrote:
> Hi guys:
>
>
> What I am trying to do is extract textual information from ASF
> containers, namely the fields title, author, copyright and comments. I'm
> using libavformat to accomplish this. The problem I encountered is that the
> extracted ASF information within the AVFormatContext is corrupted. As far as
> I know, the textual information stored within ASF containers should be in
> Unicode (UCS2), so I compiled a debug version of libavformat and dug
> deeper. This is what I found:
>
> the code to extract these textual fields:
> asf.c, around line 312:
>     get_str16_nolen(pb, len1, s->title, sizeof(s->title));
>
> and in get_str16_nolen( )
> {
>     int c;
>     char *q;
>
>     q = buf;
>     while (len > 0) {
>         c = get_le16(pb);              /* read one 16-bit little-endian unit */
>         if ((q - buf) < buf_size - 1)
>             *q++ = c;                  /* stored into a char: upper 8 bits are lost */
>         len-=2;
>     }
>     *q = '\0';
> }
>
> What I can see here is a 16-bit int (c) being copied to an 8-bit char (*q). If
> I'm not mistaken, this cuts off the upper 8 bits. That is fine for ASCII
> characters, which just leave the lower ASCII byte, but it corrupts any
> other language that also uses the upper byte. To confirm this I also
> tried some ffmpeg-based players such as VideoLAN, and the result is the
> same: nothing other than ASCII is shown in the text fields. So this problem
> definitely affects all Asian languages.
>
> So here are my questions: is cutting off the upper byte deliberate,
> to avoid the EOS '\0' character, or is it a bug? And would you consider
> a patch or fix for this soon?

send a patch; if it's clean and working, it will be considered
note: the strings must be converted to UTF-8
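
For illustration, a minimal sketch of what such a conversion could look like.
This is not code from libavformat: utf16le_to_utf8() is a hypothetical helper
name, and it reads from a plain byte array so that it is self-contained. In
asf.c the 16-bit units would presumably come from get_le16(pb) instead. The
original mail speaks of UCS2; the sketch also combines surrogate pairs, which
is an assumption beyond what was asked, and it stops at a character boundary
when the output buffer is full.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libavformat): convert `len` bytes of
 * little-endian UTF-16 in `src` to a NUL-terminated UTF-8 string in `buf`.
 * Output is truncated at a character boundary if it does not fit.
 * Returns the number of bytes written, excluding the terminator. */
static size_t utf16le_to_utf8(const uint8_t *src, size_t len,
                              char *buf, size_t buf_size)
{
    char *q = buf;
    char *end;
    size_t i = 0;

    if (buf_size == 0)
        return 0;
    end = buf + buf_size - 1;            /* reserve room for '\0' */

    while (i + 1 < len) {
        uint32_t c = src[i] | ((uint32_t)src[i + 1] << 8);
        size_t n;
        i += 2;

        if (c == 0)                      /* embedded terminator */
            break;

        /* Combine a high/low surrogate pair into one code point. */
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < len) {
            uint32_t lo = src[i] | ((uint32_t)src[i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            }
        }
        /* Anything still in the surrogate range is unpaired: use U+FFFD. */
        if (c >= 0xD800 && c <= 0xDFFF)
            c = 0xFFFD;

        /* Number of UTF-8 bytes needed for this code point. */
        n = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
        if ((size_t)(end - q) < n)
            break;                       /* does not fit: stop cleanly */

        if (n == 1) {
            *q++ = (char)c;
        } else if (n == 2) {
            *q++ = (char)(0xC0 | (c >> 6));
            *q++ = (char)(0x80 | (c & 0x3F));
        } else if (n == 3) {
            *q++ = (char)(0xE0 | (c >> 12));
            *q++ = (char)(0x80 | ((c >> 6) & 0x3F));
            *q++ = (char)(0x80 | (c & 0x3F));
        } else {
            *q++ = (char)(0xF0 | (c >> 18));
            *q++ = (char)(0x80 | ((c >> 12) & 0x3F));
            *q++ = (char)(0x80 | ((c >> 6) & 0x3F));
            *q++ = (char)(0x80 | (c & 0x3F));
        }
    }
    *q = '\0';
    return (size_t)(q - buf);
}
```

Since one 16-bit unit can expand to up to three UTF-8 bytes, fixed-size
fields such as s->title would hold fewer characters than before for
non-ASCII text, so truncating on a character boundary matters.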

[...]
-- 
Michael




