[FFmpeg-trac] #10291(ffmpeg:new): FFmpeg removes IETF BCP-47 language tags from MKV files during remuxing or encoding

FFmpeg trac at avcodec.org
Thu Mar 30 05:26:00 EEST 2023

#10291: FFmpeg removes IETF BCP-47 language tags from MKV files during remuxing or encoding
             Reporter:  ptr727  |                     Type:  defect
               Status:  new     |                 Priority:  normal
            Component:  ffmpeg  |                  Version:  git-master
             Keywords:  mkv     |               Blocked By:
             Blocking:          |  Reproduced by developer:  0
Analyzed by developer:  0       |
 When FFmpeg creates MKV files from MKV files, the LanguageIETF tags from
 the original file are not written, and the language granularity is lost.

 For reference see:
 - https://datatracker.ietf.org/doc/draft-ietf-cellar-matroska/
 - https://gitlab.com/mbunkus/mkvtoolnix/-/wikis/Languages-in-Matroska-and-
 - https://github.com/ietf-wg-cellar/matroska-
 - https://en.wikipedia.org/wiki/IETF_language_tag
 - https://r12a.github.io/app-subtags/

 Create a short media file snippet from an MKV file that contains IETF
 BCP-47 tags:

 mkvmerge --split parts:00:00:00-00:01:00 --output MKV-IETF-Snippet.mkv

 Use mkvmerge to produce JSON describing the MKV contents:

 mkvmerge --identify MKV-IETF-Snippet.mkv --identification-format json

 Note the presence of language and language_ietf tags in the file:

 "language": "srp"
 "language_ietf": "sr-Latn-RS"

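 To check these fields programmatically rather than by eye, here is a
 minimal Python sketch. It assumes the mkvmerge JSON lists tracks under
 "tracks", with the language fields inside each track's "properties"
 object. The helper name track_languages and the trimmed sample document
 are illustrative only and are not taken from the original file.

```python
import json


def track_languages(identify_json):
    """Return (track id, language, language_ietf) for every track.

    Assumes the layout of `mkvmerge --identify ... --identification-format
    json`: a top-level "tracks" list whose items carry a "properties" dict.
    """
    data = json.loads(identify_json)
    rows = []
    for track in data.get("tracks", []):
        props = track.get("properties", {})
        rows.append(
            (track.get("id"), props.get("language"), props.get("language_ietf"))
        )
    return rows


# Hypothetical, heavily trimmed sample of the identification output.
sample = """
{"tracks": [{"id": 0, "type": "audio",
             "properties": {"language": "srp",
                            "language_ietf": "sr-Latn-RS"}}]}
"""

for track_id, language, language_ietf in track_languages(sample):
    print(track_id, language, language_ietf)
```

 A track without a language_ietf property is reported as None, which makes
 a missing tag easy to spot.
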
 Similar output can be produced using MediaInfo and ffprobe:

 mediainfo --Output=XML MKV-IETF-Snippet.mkv

 ffprobe -loglevel quiet -show_streams -show_format -print_format json MKV-
 "language": "srp"

 Note that ffprobe only reports the ISO 639-3 tags, and ignores the IETF
 BCP-47 tags.
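
 The language that ffprobe reports can be read from its JSON output with a
 few lines of Python. This is a sketch only, assuming the -show_streams
 output lists streams under "streams" with the language stored in each
 stream's "tags" object. The helper name stream_languages and the sample
 document are hypothetical.

```python
import json


def stream_languages(ffprobe_json):
    """Return (stream index, language tag) for every stream.

    Assumes the layout of `ffprobe -show_streams -print_format json`:
    a top-level "streams" list whose items may carry a "tags" dict.
    """
    data = json.loads(ffprobe_json)
    return [
        (stream.get("index"), stream.get("tags", {}).get("language"))
        for stream in data.get("streams", [])
    ]


# Hypothetical, heavily trimmed sample of the ffprobe output.
sample = """
{"streams": [{"index": 0, "codec_type": "audio",
              "tags": {"language": "srp"}}]}
"""

print(stream_languages(sample))
```
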

 Remux the file using FFmpeg:

 ffmpeg -i MKV-IETF-Snippet.mkv -map 0 -codec copy -f matroska MKV-IETF-

 Repeat the steps above to get the MKV tag information, and note that the
 IETF language tags have been stripped from the output file.

 "language": "srp"

 The detailed "sr-Latn-RS" language has been reduced to "srp", losing the
 script and regional specifics.
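
 The before/after comparison can also be automated. The following Python
 sketch takes the parsed mkvmerge identification JSON of the source and of
 the remuxed file and lists the IETF tags that did not survive. It assumes
 tracks keep the same "id" in both files (plausible with -map 0, but not
 guaranteed). The function name lost_ietf_tags and both sample documents
 are hypothetical.

```python
def lost_ietf_tags(source, output):
    """Return (track id, IETF tag) pairs present in source but not in output.

    Both arguments are dicts parsed from mkvmerge identification JSON.
    Tracks are matched by "id", which is an assumption of this sketch.
    """
    def ietf_by_id(doc):
        return {
            track.get("id"): track.get("properties", {}).get("language_ietf")
            for track in doc.get("tracks", [])
        }

    source_tags = ietf_by_id(source)
    output_tags = ietf_by_id(output)
    return [
        (track_id, tag)
        for track_id, tag in sorted(source_tags.items())
        if tag and output_tags.get(track_id) != tag
    ]


# Hypothetical, trimmed identification data for the source and remuxed files.
source_doc = {"tracks": [{"id": 0, "properties": {
    "language": "srp", "language_ietf": "sr-Latn-RS"}}]}
output_doc = {"tracks": [{"id": 0, "properties": {"language": "srp"}}]}

print(lost_ietf_tags(source_doc, output_doc))
```

 An empty result would mean every IETF tag in the source is also present,
 unchanged, in the output.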

 Observed behavior: ffmpeg strips IETF language tags from files.
 Expected behavior: ffmpeg retains IETF tags (or all Matroska tags even if
 not interpreted) from the source file.
 Nice to have behavior: ffprobe emits IETF language tags.
Ticket URL: <https://trac.ffmpeg.org/ticket/10291>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker
