[FFmpeg-devel] [PATCH 1/1] Fixing 3GPP Timed Text (TTXT / tx3g / mov_text) encoding for UTF-8 (ticket 6021)

Michael Niedermayer michael at niedermayer.cc
Fri Dec 16 14:02:24 EET 2016


On Thu, Dec 15, 2016 at 10:43:15PM +0000, Erik Bråthen Solem wrote:
> According to the format specification (3GPP TS 26.245, section 5.2) "storage
> lengths are specified as byte-counts, whereas highlighting is specified using
> character offsets." This patch replaces byte counting with character counting
> for highlighting. See the following page for a link to the specification:
> https://gpac.wp.mines-telecom.fr/mp4box/ttxt-format-documentation/
> ---
>  libavcodec/movtextenc.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/libavcodec/movtextenc.c b/libavcodec/movtextenc.c
> index 20e01e2..3ae015a 100644
> --- a/libavcodec/movtextenc.c
> +++ b/libavcodec/movtextenc.c
> @@ -70,6 +70,7 @@ typedef struct {
>      uint8_t style_fontsize;
>      uint32_t style_color;
>      uint16_t text_pos;
> +    uint16_t text_pos_chars;
>  } MovTextContext;
[...]
> @@ -302,7 +303,10 @@ static void mov_text_text_cb(void *priv, const char *text, int len)
>  {
>      MovTextContext *s = priv;
>      av_bprint_append_data(&s->buffer, text, len);
> -    s->text_pos += len;
> +    s->text_pos += len;             // length of text in bytes
> +    for (int i = 0; i < len; i++)   // length of text in UTF-8 characters
> +        if ((text[i] & 0xC0) != 0x80)
> +            s->text_pos_chars++;
>  }
>  
>  static void mov_text_new_line_cb(void *priv, int forced)
> @@ -310,6 +314,7 @@ static void mov_text_new_line_cb(void *priv, int forced)
>      MovTextContext *s = priv;
>      av_bprint_append_data(&s->buffer, "\n", 1);
>      s->text_pos += 1;
> +    s->text_pos_chars += 1;
>  }

The code isn't really my area, but is there a check to prevent
text_pos and text_pos_chars from overflowing the 16-bit range?
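
To make the question concrete, here is a minimal standalone sketch of what
such a check could look like. It is not taken from movtextenc.c: the helper
names utf8_char_count() and pos_add_saturating() are hypothetical, and
clamping at UINT16_MAX is only one possible policy (returning an error or
truncating the text would be others). The character count uses the same
continuation-byte test as the loop in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Count UTF-8 characters in text[0..len) by skipping continuation bytes
 * (bytes of the form 10xxxxxx), i.e. the same test the patch applies. */
static size_t utf8_char_count(const char *text, size_t len)
{
    size_t chars = 0;
    for (size_t i = 0; i < len; i++)
        if (((unsigned char)text[i] & 0xC0) != 0x80)
            chars++;
    return chars;
}

/* Hypothetical helper (an assumption, not existing FFmpeg code):
 * add inc to a 16-bit position, clamping at UINT16_MAX instead of
 * wrapping around. Returns 1 if the value was clamped, 0 otherwise. */
static int pos_add_saturating(uint16_t *pos, size_t inc)
{
    if (inc > (size_t)(UINT16_MAX - *pos)) {
        *pos = UINT16_MAX;
        return 1;
    }
    *pos = (uint16_t)(*pos + inc);
    return 0;
}
```

In the text callback, both counters could then be advanced through the
helper, with the byte length for text_pos and the result of
utf8_char_count() for text_pos_chars, so that neither field silently wraps
on very long subtitle events.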

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle