[FFmpeg-devel] [PATCH] lavc: make invalid UTF-8 in subtitle output a non-fatal error
nicolas.george at normalesup.org
Fri Jun 28 11:00:12 CEST 2013
On Décadi 10 Messidor, year CCXXI, wm4 wrote:
> Such as?
Depends on your input and what you want to do with it.
> I get them from libavformat demuxers, but also elsewhere. I actually
> can perform codepage auto-detection on subs read by libavformat
> demuxers (it's really awkward: read a number of subtitle packets,
> concatenate their contents, then run the charset detector on it). But
> it's disabled by default
Then enable it.
> and doesn't guarantee success anyway.
Success is not guaranteed in the sense that you cannot be sure of getting the
right encoding, but you will always find at least one encoding that works,
since some common encodings, including plain ISO-8859-1, accept any byte
sequence.
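To illustrate that last point, here is a minimal Python sketch, not FFmpeg
code. The function name decode_with_fallback and the candidate list are made
up for the example. It tries strict UTF-8 first and falls back to ISO-8859-1,
which maps each of the 256 byte values to a character and therefore never
rejects input.

```python
def decode_with_fallback(data: bytes, candidates=("utf-8", "iso-8859-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper for illustration only; the candidate order is an
    assumption, not something any demuxer or decoder prescribes.
    """
    for enc in candidates:
        try:
            # bytes.decode() is strict by default and raises on invalid input.
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding accepted the data")


# All 256 byte values in a row: invalid as UTF-8, but valid ISO-8859-1.
all_bytes = bytes(range(256))
text, enc = decode_with_fallback(all_bytes)
```

The fallback always yields some text, but nothing says it is the right text:
bytes written in another 8-bit codepage will decode without error and still
come out wrong.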
> In some
> cases, subtitles might be demuxed from interleaved files, in which
> auto-detection can't be reasonably performed.
Do you have any such file where conversion fails? If so, share it.
Also, you have only answered half the question: what do you intend to do
with the decoded subtitles? If garbage output suits you, do not bother
decoding the subtitles; read them directly from /dev/urandom.
> I have the impression that you still believe the charset problem can
> be solved perfectly. This is not the case. Such problems are very common
> even today, and just showing an error message (or even dropping broken
> text) won't help.
Please provide a realistic scenario where you believe the encoding problem
cannot be "solved perfectly" and where your proposal would have helped.