[FFmpeg-trac] #4915(avcodec:new): WebVTT decoder doesn't handle html escapes

FFmpeg trac at avcodec.org
Thu Oct 8 00:50:21 CEST 2015

#4915: WebVTT decoder doesn't handle html escapes
             Reporter:  RiCON    |                     Type:  enhancement
               Status:  new      |                 Priority:  minor
            Component:  avcodec  |                  Version:  git-master
             Keywords:  webvtt   |               Blocked By:
             Blocking:           |  Reproduced by developer:  0
Analyzed by developer:  0        |

 The WebVTT spec specifies a dozen HTML escapes that should be handled,
 including '&gt;', '&lt;' and '&amp;'. The decoder doesn't convert these
 back to the proper characters.
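
 A minimal sketch of the requested conversion, for illustration only. This
 is not FFmpeg's code: the names vtt_escape, vtt_escapes and vtt_unescape
 are hypothetical, and the table assumes the six named escapes listed in
 the WebVTT cue text syntax (&amp;, &lt;, &gt;, &nbsp;, &lrm;, &rlm;).
 Numeric character references are not handled here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical escape table; an assumption, not taken from FFmpeg. */
struct vtt_escape {
    const char *name;  /* escape as written in the cue text */
    const char *text;  /* UTF-8 replacement */
};

static const struct vtt_escape vtt_escapes[] = {
    { "&amp;",  "&" },
    { "&lt;",   "<" },
    { "&gt;",   ">" },
    { "&nbsp;", "\xC2\xA0" },      /* U+00A0 NO-BREAK SPACE */
    { "&lrm;",  "\xE2\x80\x8E" },  /* U+200E LEFT-TO-RIGHT MARK */
    { "&rlm;",  "\xE2\x80\x8F" },  /* U+200F RIGHT-TO-LEFT MARK */
};

/*
 * Copy src into dst, replacing known escapes in a single pass.
 * dst_size is the capacity of dst including the terminating NUL.
 * Returns 0 on success, -1 if dst is too small (dst is still
 * NUL-terminated when dst_size > 0).
 */
int vtt_unescape(char *dst, size_t dst_size, const char *src)
{
    const size_t n = sizeof(vtt_escapes) / sizeof(vtt_escapes[0]);
    size_t out = 0;

    if (dst_size == 0)
        return -1;

    while (*src) {
        const char *rep = src;  /* default: copy one byte verbatim */
        size_t in_len = 1, rep_len = 1;

        if (*src == '&') {
            for (size_t i = 0; i < n; i++) {
                size_t len = strlen(vtt_escapes[i].name);
                if (strncmp(src, vtt_escapes[i].name, len) == 0) {
                    rep     = vtt_escapes[i].text;
                    rep_len = strlen(rep);
                    in_len  = len;
                    break;
                }
            }
        }

        /* Keep one byte free for the terminating NUL. */
        if (out + rep_len >= dst_size) {
            dst[out] = '\0';
            return -1;
        }
        memcpy(dst + out, rep, rep_len);
        out += rep_len;
        src += in_len;
    }

    dst[out] = '\0';
    return 0;
}
```

 Unknown sequences such as "&bogus;" and a lone "&" are copied unchanged,
 and because the input is scanned only once, "&amp;lt;" becomes "&lt;"
 rather than "<".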

 FFmpeg version:
 % ffmpeg -i htmlescapes.vtt out.srt
 ffmpeg version N-75818-g8135b1e Copyright (c) 2000-2015 the FFmpeg
   built with gcc 5.2.0 (Rev4, Built by MSYS2 project)

 Attached are an example vtt file, the result with this build, and the proper result.
 Examples of where these html escapes are used can be found by getting the
 subtitles from any video in Comedy Central's site using something like
 youtube-dl. Example:
 % youtube-dl --all-subs "http://www.cc.com/video-clips/52dpzm/the-daily-

Ticket URL: <https://trac.ffmpeg.org/ticket/4915>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker
