[FFmpeg-trac] #7203(undetermined:new): Problem with encoding type "Cyrillic (DOS)" with metadata delivery.
FFmpeg
trac at avcodec.org
Mon May 14 04:00:06 EEST 2018
#7203: Problem with encoding type "Cyrillic (DOS)" with metadata delivery.
-------------------------------------+-------------------------------------
Reporter: max79 | Owner:
Type: defect | Status: new
Priority: normal | Component: undetermined
Version: unspecified |
Keywords: | Resolution:
Blocking: | Blocked By:
Analyzed by developer: 0 | Reproduced by developer: 0
-------------------------------------+-------------------------------------
Comment (by mkver):
This file uses ID3v2.3 tags. The TIT2 tag (the tag containing the title)
is as follows in hex: 0x54 49 54 32 00 00 00 0A 00 00 00 C4 EE F0 EE E6 EA
E0 20 31. According to [http://id3.org/id3v2.3.0#ID3v2_frame_overview the
standard] the 0x00 that follows the four-byte length field and the two
flag bytes indicates that the tag uses ISO-8859-1 as encoding, an encoding
that does not contain Cyrillic characters. For such purposes Unicode could
(and should) be used, but isn't. This is a bug in the tool that created
said file, not in FFmpeg.
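
The layout of the frame quoted above can be checked with a short Python sketch. It assumes only the ID3v2.3 frame header layout (four-byte ID, four-byte big-endian size, two flag bytes) and the two text encodings that version defines; the names `parse_text_frame` and `ID3V23_ENCODINGS` are made up for illustration and are not FFmpeg code.

```python
import struct

# Frame bytes exactly as quoted in the comment above.
FRAME = bytes.fromhex("54495432 0000000A 0000 00 C4EEF0EEE6EAE02031")

# Text-encoding byte values defined by ID3v2.3 for text frames.
ID3V23_ENCODINGS = {0x00: "ISO-8859-1", 0x01: "UTF-16 with BOM"}


def parse_text_frame(frame: bytes):
    """Split an ID3v2.3 text frame into (id, size, flags, encoding, text bytes)."""
    # 10-byte header: 4-byte ID, 4-byte big-endian size, 2 flag bytes.
    frame_id, size, flags = struct.unpack(">4sIH", frame[:10])
    body = frame[10:10 + size]
    # The first body byte selects the text encoding; the rest is the text.
    encoding = ID3V23_ENCODINGS[body[0]]
    return frame_id.decode("ascii"), size, flags, encoding, body[1:]


frame_id, size, flags, encoding, text_bytes = parse_text_frame(FRAME)
```

For the bytes above this gives the ID `TIT2`, a size of 10 (one encoding byte plus nine text bytes), flags of 0 and the encoding ISO-8859-1.
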
Btw: The last nine bytes are the actual title; in Windows-1251 they would
be read as "Дорожка 1"; in the Cyrillic DOS code page 866 that you are
referring to they read "─юЁюцър 1". In ISO-8859-1 they read "Äîðîæêà 1".
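
The three readings can be reproduced with Python's built-in codecs. This is only a sketch; `decode_all` is a hypothetical helper name, and `cp1251`, `cp866` and `latin-1` are Python's names for Windows-1251, code page 866 and ISO-8859-1.

```python
# The nine title bytes from the TIT2 tag quoted above.
TITLE = bytes.fromhex("C4 EE F0 EE E6 EA E0 20 31")


def decode_all(raw: bytes) -> dict:
    """Decode the same bytes under each candidate single-byte code page."""
    return {codec: raw.decode(codec) for codec in ("cp1251", "cp866", "latin-1")}


readings = decode_all(TITLE)
```

Only the `cp1251` entry is readable Russian, which suggests that the tagging tool wrote Windows-1251 bytes while marking them as ISO-8859-1.
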
FFmpeg's output to the console is encoded as UTF-8, but cmd.exe (that you
seem to be using) expects applications to use the native legacy codepage
of the system (for Russian Windows versions, this is usually code page
866; cmd.exe is by the way Unicode compatible). The UTF-8 that FFmpeg
writes to the console is 0xC3 84 C3 AE C3 B0 C3 AE C3 A6 C3 AA C3 A0 20
31. In CP 866 0xC3 is "├" whereas 0x84 is "Д". That six of the seven
characters of the word (seem to) have been preserved does not really have
a deeper meaning. It is accidental.
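
The chain described above (tag bytes read as ISO-8859-1, written out as UTF-8, displayed by a CP 866 console) can be simulated in Python. This is a sketch under those assumptions; `console_view` is a hypothetical helper and does not reflect how FFmpeg or cmd.exe are implemented.

```python
# The nine title bytes from the TIT2 tag quoted above.
TITLE = bytes.fromhex("C4 EE F0 EE E6 EA E0 20 31")


def console_view(stored: bytes, tag_codec="latin-1",
                 out_codec="utf-8", console_codec="cp866"):
    """Return (bytes written to the console, text a CP 866 console shows)."""
    text = stored.decode(tag_codec)      # what the tag claims to contain
    written = text.encode(out_codec)     # UTF-8 bytes sent to the console
    return written, written.decode(console_codec)


written, shown = console_view(TITLE)

# Drop the box-drawing character produced by each 0xC3 lead byte, then
# count how many letters match the title as read in Windows-1251.
intended = TITLE.decode("cp1251")
visible = shown.replace("\u251c", "")
matches = sum(a == b for a, b in zip(visible[:7], intended[:7]))
```

Here `written` equals the UTF-8 byte sequence quoted above, and `matches` counts the letters of the word that survive the round trip.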
--
Ticket URL: <https://trac.ffmpeg.org/ticket/7203#comment:3>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker