[FFmpeg-devel] [PATCH] lavc: make invalid UTF-8 in subtitle output a non-fatal error
nicolas.george at normalesup.org
Sat Aug 10 11:58:15 CEST 2013
On duodi 22 Thermidor, year CCXXI, Reimar Döffinger wrote:
> It would help far more if you could try to figure out what
> the main objections are and to come up with a compromise
My main objection here is that wm4's requests are unnecessary: the API
already allows doing this more elegantly, just not in the way wm4 had in
mind. It does not require a patch on ffmpeg's part.
> For example we should ideally make a better job of providing
> useful output for those applications that do _not_ want to
> reimplement charset detection/conversion.
That is what my recent patch series is about. With it, srtdec (and, easily
enough, the other text subtitle demuxers) can read UTF-8 and ISO-8859-1,
as well as UTF-16 and UCS-4 if a BOM is present, and they would be able to
read other legacy encodings as soon as we decide how the user must select
them.
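
To make the BOM part concrete, here is a minimal sketch of what such a check
amounts to. It is illustrative only and is not the code from the patch series:
the names detect_bom and enum text_enc are made up for the example. With no
BOM, the caller still has to choose between UTF-8 and ISO-8859-1 by other
means, which is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only (not libavformat API). */
enum text_enc {
    ENC_UNKNOWN,   /* no BOM found: caller must decide (e.g. UTF-8 or ISO-8859-1) */
    ENC_UTF8,
    ENC_UTF16LE,
    ENC_UTF16BE,
    ENC_UCS4LE,
    ENC_UCS4BE
};

/* Inspect the first bytes of a buffer for a byte order mark.
 * On return, *bom_len holds the number of bytes to skip (0 if none). */
static enum text_enc detect_bom(const uint8_t *buf, size_t len, size_t *bom_len)
{
    *bom_len = 0;

    /* Test the 4-byte UCS-4 marks first: FF FE 00 00 begins with the
     * UTF-16LE mark, so the order matters. (A UTF-16LE file whose first
     * character is U+0000 would be misdetected; this sketch accepts that.) */
    if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE && buf[2] == 0x00 && buf[3] == 0x00) {
        *bom_len = 4;
        return ENC_UCS4LE;
    }
    if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00 && buf[2] == 0xFE && buf[3] == 0xFF) {
        *bom_len = 4;
        return ENC_UCS4BE;
    }
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return ENC_UTF8;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return ENC_UTF16LE;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return ENC_UTF16BE;
    }
    return ENC_UNKNOWN;
}
```
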
> And I think this will help a lot of users, though I also
> think there is a good argument for giving our users options
> beyond that (and nicer than "have FFmpeg wrap it to UTF-8 and
> then unwrap it back" preferably, though I realize that is a
> bit of an issue if the input is UTF-16 since that one contains
> 0s, so I think there is no way to just pass that through unchanged
> with the current API).
Why do you think it is not a nice solution? Nowadays, when dealing with
text, using a single Unicode representation internally and converting
immediately after input and immediately before output is more or less the
only sane way to go. And for ffmpeg's uses, char strings in UTF-8 are more
practical than arrays of int.
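
As an illustration of "convert immediately after input", here is a minimal
sketch of the simplest case, ISO-8859-1 to UTF-8. The helper name
latin1_to_utf8 is hypothetical and is not an ffmpeg function; a real
implementation would more likely go through a general conversion library.
The result is an ordinary NUL-terminated char string, which is the point of
using UTF-8 internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Converts src_len bytes of ISO-8859-1 into NUL-terminated UTF-8 in dst.
 * Returns the number of bytes written, not counting the terminating NUL,
 * or (size_t)-1 if dst is too small. */
static size_t latin1_to_utf8(const uint8_t *src, size_t src_len,
                             char *dst, size_t dst_size)
{
    size_t o = 0;

    for (size_t i = 0; i < src_len; i++) {
        uint8_t c = src[i];
        size_t need = c < 0x80 ? 1 : 2;

        /* Always keep one byte free for the terminating NUL. */
        if (o + need + 1 > dst_size)
            return (size_t)-1;

        if (c < 0x80) {
            /* ASCII is identical in both encodings. */
            dst[o++] = (char)c;
        } else {
            /* U+0080..U+00FF become two bytes: 110xxxxx 10xxxxxx. */
            dst[o++] = (char)(0xC0 | (c >> 6));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    if (o >= dst_size)
        return (size_t)-1;
    dst[o] = '\0';
    return o;
}
```

Doing the same for UTF-16 input is what removes the embedded 0 bytes that
make it impossible to pass through a char string unchanged.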