[Ffmpeg-devel] retrieving asf textual info in other languages

廖拯 c_liao
Tue May 17 05:12:51 CEST 2005


Hi guys:


What I am trying to do is extract textual information from ASF containers, namely the fields title, author, copyright and comments. I'm using libavformat to accomplish this. The problem I ran into is that the ASF information that ends up in the AVFormatContext is corrupted. As far as I know, the textual information stored in ASF containers should be in Unicode (UCS2), so I compiled a debug version of libavformat and dug deeper. This is what I found:

the code to extract these textual fields:
asf.c, around line 312:
    get_str16_nolen(pb, len1, s->title, sizeof(s->title));

and in get_str16_nolen( )
{
    int c;
    char *q;

    q = buf;
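    while (len > 0) {
        c = get_le16(pb);              /* one 16-bit little-endian unit */
        if ((q - buf) < buf_size - 1)
            *q++ = c;                  /* stored into a char: upper 8 bits are lost */
        len-=2;                        /* len counts bytes, not characters */
    }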
    *q = '\0';
}

What I see here is a 16-bit int (c) being copied into an 8-bit char (*q). If I'm not mistaken, this drops the upper 8 bits. That is fine for ASCII characters, where only the low byte is meaningful, but it corrupts text in any language whose characters also use the upper byte. To confirm this I also tried some ffmpeg-based players such as VideoLAN, and the result is the same: nothing other than ASCII is shown correctly in the text fields. So this problem definitely affects all Asian languages.
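To show the effect outside of libavformat, here is a minimal sketch of the same loop. It is only an illustration: copy_low_bytes() is a hypothetical name, and it reads from a plain memory buffer instead of a ByteIOContext, so it is not the library code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the loop in get_str16_nolen(): reads
 * little-endian 16-bit units from memory and stores each one in a char.
 * Returns the number of bytes written, not counting the terminator. */
static size_t copy_low_bytes(const uint8_t *src, size_t len,
                             char *buf, size_t buf_size)
{
    char *q = buf;

    assert(buf_size > 0);
    while (len >= 2) {
        int c = src[0] | (src[1] << 8);   /* same value get_le16() would give */
        if ((size_t)(q - buf) < buf_size - 1)
            *q++ = (char)c;               /* only the low 8 bits survive */
        src += 2;
        len -= 2;
    }
    *q = '\0';
    return (size_t)(q - buf);
}
```

For example, the two characters U+4E2D U+6587 are stored as the bytes 2D 4E 87 65. After the loop only 0x2D and 0x87 remain in the output buffer, and the upper bytes 0x4E and 0x65 are gone, so the original text cannot be recovered.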

So here are my questions: is dropping the upper byte deliberate, to avoid the EOS '\0' character, or is it a bug? And would you guys consider a patch or a fix for this soon?
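If it turns out to be a bug, one possible direction (just a sketch, and only my assumption about what a fix could look like) would be to convert each 16-bit unit to UTF-8 instead of truncating it. UTF-8 never produces a zero byte for a non-zero character, so the '\0' terminator would keep working. The helper name ucs2le_to_utf8() below is hypothetical and not part of libavformat; it reads from a memory buffer and does not handle surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts little-endian UCS-2 to a NUL-terminated
 * UTF-8 string. Returns the number of bytes written, not counting the
 * terminator. Surrogate pairs are not combined (assumption: plain UCS-2). */
static size_t ucs2le_to_utf8(const uint8_t *src, size_t len,
                             char *buf, size_t buf_size)
{
    size_t n = 0;

    assert(buf_size > 0);
    while (len >= 2) {
        unsigned c = src[0] | ((unsigned)src[1] << 8);
        size_t need = c < 0x80 ? 1 : c < 0x800 ? 2 : 3;

        src += 2;
        len -= 2;
        if (c == 0)
            break;                        /* stop at an embedded terminator */
        if (n + need > buf_size - 1)
            break;                        /* never write a partial sequence */
        if (need == 1) {
            buf[n++] = (char)c;
        } else if (need == 2) {
            buf[n++] = (char)(0xC0 | (c >> 6));
            buf[n++] = (char)(0x80 | (c & 0x3F));
        } else {
            buf[n++] = (char)(0xE0 | (c >> 12));
            buf[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            buf[n++] = (char)(0x80 | (c & 0x3F));
        }
    }
    buf[n] = '\0';
    return n;
}
```

With this, the bytes 2D 4E 87 65 (U+4E2D U+6587) become the six UTF-8 bytes E4 B8 AD E6 96 87. The catch is that a title can then need up to three bytes per character, so the fixed-size buffers would hold fewer characters, and callers would have to expect UTF-8 rather than ASCII.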

Thanks for your time.

Cheng Liao

P.S. You guys did a wonderful job on the ffmpeg lib ;)


