[FFmpeg-devel] [PATCH 1/4] avcodec/strdec: factor out HTML parsing code
Yayoi Ukai
yayoi.ukai at gmail.com
Fri Aug 28 08:01:04 CEST 2015
On Sat, Aug 15, 2015 at 11:08 AM, Clément Bœsch <u at pkh.me> wrote:
> On Sat, Aug 08, 2015 at 12:52:04PM -0700, Yayoi Ukai wrote:
> [...]
>> >> - while (dst->len >= 2 && !strncmp(&dst->str[dst->len - 2], "\\N", 2))
>> >> - dst->len -= 2;
>> >> - dst->str[dst->len] = 0;
>> >> - rstrip_spaces_buf(dst);
>> >
>> > why did you completely remove this chunk?
>>
>> It appeared to me that it didn't do anything even in original code.
>> But I can put it back.
>> (It didn't make any difference in fate test or other simple test
>> whether I commented it out or not.)
>>
>
> This commit is supposed to be factoring out the code, not do any
> functional change. If you want to do such changes it belongs in a separate
> commit (but you probably don't want to because it's likely wrong).
>
> Anyway, I had to have a look to the diff again to check if you hadn't
> added more unwanted changes. And unfortunately, you did. So first of all,
> this following diff I extracted belongs in a separated commit (typically
> squashed in the one where you make SAMI use it, or eventually just
> before).
>
> Here is a review of that part assuming it is extracted/moved:
>
> diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c
> index a138955..b2f2273 100644
> --- a/libavcodec/htmlsubtitles.c
> +++ b/libavcodec/htmlsubtitles.c
> @@ -71,6 +71,21 @@ void ff_htmlmarkup_to_ass(AVCodecContext *avctx, AVBPrint *dst, const char *in)
> end = 1;
> break;
> }
> +
> + /* check if it is end of the paragraph or not*/
> + in++;
> + count = 1;
> + while(*in == ' ') {
> + in++;
> + count++;
> + }
> + if (*in == '\0' || *in == '\n'){
> + in = in - count;
> + break;
> + }
> + in = in - count;
> +
> + /*if not the end of the paragraph, add line break */
>
> Why? What are you trying to do here that isn't already handled by the code
> already?
Yes removed.
>
> There is no concept of "paragraph" in SubRip markup, so it probably
> doesn't belong here, assuming it's necessary.
>
> rstrip_spaces_buf(dst);
> av_bprintf(dst, "\\N");
> line_start = 1;
> @@ -90,6 +105,15 @@ void ff_htmlmarkup_to_ass(AVCodecContext *avctx, AVBPrint *dst, const char *in)
> av_bprint_chars(dst, *in, 1);
> break;
> case '<':
> + if (!av_strncasecmp(in, "<BR", 3)){
>
> here and below, check your style
>
> + av_bprintf(dst, "\\N");
> + len = 3;
> + while (in[len] != '>' && (av_isspace(in[len]) || in[len] == '/')){
>
> if in[len] is a space or a '/', it's obviously different than '>', so the
> condition is redundant.
>
> + len++;
>
> wrong indent
>
> + }
>
> + in += len + 1;
>
> this +1 is very dangerous, there is a risk of overread.
Well, I was not sure what it could be replaced for checking the closed
'>' for the <br > tag..
So this one stayed... Please let me know if you have any suggestion.
Thank you!
>
> + }
> +
> tag_close = in[1] == '/';
> len = 0;
> if (sscanf(in+tag_close+1, "%127[^>]>%n", buffer, &len) >= 1 && len > 0) {
>
> [...]
>
> --
> Clément B.
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
More information about the ffmpeg-devel
mailing list