[FFmpeg-trac] #8849(undetermined:new): sub2video does not work with overlay_cuda

Fri Aug 14 05:45:57 EEST 2020

#8849: sub2video does not work with overlay_cuda
-------------------------------------+-------------------------------------
             Reporter:  Znuff        |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:               |                  Version:
  undetermined                       |  unspecified
             Keywords:               |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 Summary of the bug:

 sub2video appears to fail when overlaying (burning in) a DVB subtitle
 stream onto a video with overlay_cuda.

 The same overlay_cuda chain works when a transparent PNG is used as the
 second input instead of the subtitle stream. A similar chain (with the
 same source) also works with the normal overlay filter when
 hwdownload/hwupload are used, while still keeping hardware decoding and
 encoding.
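
 For reference, a minimal sketch of what such a software-overlay fallback
 could look like. This is an assumption, not the exact command that was
 run: the labels [base] and [v] and the helper name run_fallback are
 made up for illustration, and the remaining options are copied from the
 failing command. Frames are downloaded from the GPU, overlaid on the CPU
 with the normal overlay filter, then uploaded again for h264_nvenc.

```shell
# Assumed fallback: download decoded frames, overlay on the CPU, upload again.
# The labels [base] and [v] are illustrative; flags mirror the failing command.
FILTER='[0:v] hwdownload,format=nv12 [base]; [base][0:s] overlay,hwupload_cuda [v]'

run_fallback() {
    # Pass "echo" as $1 to print the command instead of executing it.
    $1 ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
        -hwaccel cuvid -c:v h264_cuvid -i in.ts \
        -filter_complex "$FILTER" -map "[v]" -map 0:a \
        -c:v h264_nvenc -b:v 5M -acodec copy -f mpegts -y out.ts
}

run_fallback echo   # dry run; call `run_fallback` with no argument to encode
```

 The only difference from the failing chain is that the overlay itself
 happens on software frames, so sub2video frames never reach
 overlay_cuda.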

 Sample file: https://0x0.st/iYuU.ts

 I based the filter logic on the examples in this patch:
 https://patchwork.ffmpeg.org/project/ffmpeg/patch/20200318071955.2329-1-yyyaroslav@gmail.com/

 How to reproduce:

 {{{
 # ./ffmpeg_npp -v verbose -report -dump_filtergraph
 fmt=dot:filename=./graph.dot -nostats -vsync 0 -init_hw_device cuda=cuda
 -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -i in.ts
 -filter_complex "[0:s] format=yuva420p,hwupload [0s]; [0:v]
 scale_npp=format=yuv420p [0v]; [0v][0s] overlay_cuda [v]" -map "[v]" -map
 0:a -c:v h264_nvenc -preset medium -b:v 5M -bufsize 10M -profile:v main
 -temporal-aq 1 -acodec copy -copy_unknown -f mpegts -y out.ts
 ffmpeg started on 2020-08-14 at 02:34:38
 Report written to "ffmpeg-20200814-023438.log"
 Log level: 48
 ffmpeg version N-98725-gcfc6552032 Copyright (c) 2000-2020 the FFmpeg
 developers
   built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
   configuration: --pkg-config=pkg-config --pkg-config-flags=--static
 --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid
 --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include
 --extra-ldflags=-L/usr/local/cuda/lib64 --extra-cflags='-mtune=generic'
 --extra-cflags=-O3 --enable-static --disable-shared --prefix=/home/ibm86
 /ffmpeg-windows-build-helpers/sandbox/cross_compilers/native --enable-
 nonfree --enable-libfdk-aac
   libavutil      56. 58.100 / 56. 58.100
   libavcodec     58.100.100 / 58.100.100
   libavformat    58. 50.100 / 58. 50.100
   libavdevice    58. 11.101 / 58. 11.101
   libavfilter     7. 87.100 /  7. 87.100
   libswscale      5.  8.100 /  5.  8.100
   libswresample   3.  8.100 /  3.  8.100
 [h264 @ 0x564b2f71bb00] Reinit context to 1920x1088, pix_fmt: yuv420p
 [h264 @ 0x564b2f71bb00] Increasing reorder buffer to 2
 [mpegts @ 0x564b2f7156c0] max_analyze_duration 5000000 reached at 5016000
 microseconds st:1
 WARNING: defaulting hwaccel_output_format to cuda for compatibility with
 old commandlines. This behaviour is DEPRECATED and will be removed in the
 future. Please explicitly set "-hwaccel_output_format cuda".
 Input #0, mpegts, from 'in.ts':
   Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
   Program 1
     Metadata:
       service_name    : Service01
       service_provider: FFmpeg
     Stream #0:0[0x100]: Video: h264 (High), 1 reference frame
 ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080
 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
     Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz,
 stereo, fltp, 256 kb/s
     Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000
 Hz, 5.1(side), fltp, 640 kb/s
     Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] /
 0x0006)
 [h264_mp4toannexb @ 0x564b2f7eef40] The input looks like it is Annex B
 already
 [h264_cuvid @ 0x564b315982c0] CUVID capabilities for h264_cuvid:
 [h264_cuvid @ 0x564b315982c0] 8 bit: supported: 1, min_width: 48,
 max_width: 4096, min_height: 16, max_height: 4096
 [h264_cuvid @ 0x564b315982c0] 10 bit: supported: 0, min_width: 0,
 max_width: 0, min_height: 0, max_height: 0
 [h264_cuvid @ 0x564b315982c0] 12 bit: supported: 0, min_width: 0,
 max_width: 0, min_height: 0, max_height: 0
 Stream mapping:
   Stream #0:0 (h264_cuvid) -> scale_npp
   Stream #0:3 (dvbsub) -> format
   overlay_cuda -> Stream #0:0 (h264_nvenc)
   Stream #0:1 -> #0:1 (copy)
   Stream #0:2 -> #0:2 (copy)
 Press [q] to stop, [?] for help
 [h264_cuvid @ 0x564b315982c0] Formats: Original: cuda | HW: cuda | SW:
 nv12
 [mpegts @ 0x564b2f7156c0] sub2video: using 1920x1080 canvas
 [graph 0 input from stream 0:3 @ 0x564b2f902ac0] w:1920 h:1080 pixfmt:bgra
 tb:1/90000 fr:0/1 sar:0/1
 [graph 0 input from stream 0:0 @ 0x564b2f903740] w:1920 h:1080 pixfmt:cuda
 tb:1/90000 fr:25/1 sar:1/1
 [auto_scaler_0 @ 0x564b2f906a00] w:iw h:ih flags:'bilinear' interl:0
 [Parsed_format_0 @ 0x564b30f98540] auto-inserting filter 'auto_scaler_0'
 between the filter 'graph 0 input from stream 0:3' and the filter
 'Parsed_format_0'
 [Parsed_scale_npp_2 @ 0x564b30f99600] w:1920 h:1080 -> w:1920 h:1080
 [auto_scaler_0 @ 0x564b2f906a00] w:1920 h:1080 fmt:bgra sar:0/1 -> w:1920
 h:1080 fmt:yuva420p sar:0/1 flags:0x2
 [Parsed_overlay_cuda_3 @ 0x564b2f901ac0] [framesync @ 0x564b2f901bf8] Sync
 level 2
 [h264_nvenc @ 0x564b2f8125c0] Using input frames context (format cuda)
 with h264_nvenc encoder.
 [h264_nvenc @ 0x564b2f8125c0] Loaded Nvenc version 10.0
 [h264_nvenc @ 0x564b2f8125c0] Nvenc initialized successfully
 [h264_nvenc @ 0x564b2f8125c0] Temporal AQ enabled.
 [mpegts @ 0x564b2f8c8d80] service 1 using PCR in pid=256, pcr_period=80ms
 [mpegts @ 0x564b2f8c8d80] muxrate VBR, sdt every 500 ms, pat/pmt every 100
 ms
 Output #0, mpegts, to 'out.ts':
   Metadata:
     encoder         : Lavf58.50.100
     Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda,
 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 25 fps, 90k tbn, 25 tbc
 (default)
     Metadata:
       encoder         : Lavc58.100.100 h264_nvenc
     Side data:
       cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000
 vbv_delay: N/A
     Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz,
 stereo, fltp, 256 kb/s
     Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz,
 5.1(side), fltp, 640 kb/s
 Error while add the frame to buffer source(Internal bug, should not have
 happened).
 Error while filtering: Internal bug, should not have happened
 Failed to inject frame into filter network: Internal bug, should not have
 happened
 Error while processing the decoded data for stream #0:0
 [AVIOContext @ 0x564b2f7fc100] Statistics: 0 seeks, 0 writeouts
 [h264_nvenc @ 0x564b2f8125c0] Nvenc unloaded
 [AVIOContext @ 0x564b2f71e580] Statistics: 5525648 bytes read, 2 seeks
 Conversion failed!
 }}}

 For comparison, the same chain works with a transparent PNG as the second
 input:

 {{{
 # ./ffmpeg_npp -v verbose -nostats -vsync 0 -init_hw_device cuda=cuda
 -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -i in.ts -i t.png
 -filter_complex "[1:v] format=yuva420p,hwupload [0s]; [0:v]
 scale_npp=format=yuv420p [0v]; [0v][0s] overlay_cuda=shortest=false [v]"
 -map "[v]" -map 0:a -c:v h264_nvenc -preset medium -b:v 5M -bufsize 10M
 -profile:v main -temporal-aq 1 -acodec copy -copy_unknown -f mpegts -y
 out.ts
 ffmpeg version N-98725-gcfc6552032 Copyright (c) 2000-2020 the FFmpeg
 developers
   built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
   configuration: --pkg-config=pkg-config --pkg-config-flags=--static
 --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid
 --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include
 --extra-ldflags=-L/usr/local/cuda/lib64 --extra-cflags='-mtune=generic'
 --extra-cflags=-O3 --enable-static --disable-shared --prefix=/home/ibm86
 /ffmpeg-windows-build-helpers/sandbox/cross_compilers/native --enable-
 nonfree --enable-libfdk-aac
   libavutil      56. 58.100 / 56. 58.100
   libavcodec     58.100.100 / 58.100.100
   libavformat    58. 50.100 / 58. 50.100
   libavdevice    58. 11.101 / 58. 11.101
   libavfilter     7. 87.100 /  7. 87.100
   libswscale      5.  8.100 /  5.  8.100
   libswresample   3.  8.100 /  3.  8.100
 [h264 @ 0x555fe4920b00] Reinit context to 1920x1088, pix_fmt: yuv420p
 [h264 @ 0x555fe4920b00] Increasing reorder buffer to 2
 [mpegts @ 0x555fe491a640] max_analyze_duration 5000000 reached at 5016000
 microseconds st:1
 WARNING: defaulting hwaccel_output_format to cuda for compatibility with
 old commandlines. This behaviour is DEPRECATED and will be removed in the
 future. Please explicitly set "-hwaccel_output_format cuda".
 Input #0, mpegts, from 'in.ts':
   Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
   Program 1
     Metadata:
       service_name    : Service01
       service_provider: FFmpeg
     Stream #0:0[0x100]: Video: h264 (High), 1 reference frame
 ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080
 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
     Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz,
 stereo, fltp, 256 kb/s
     Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000
 Hz, 5.1(side), fltp, 640 kb/s
     Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] /
 0x0006)
 Input #1, png_pipe, from 't.png':
   Duration: N/A, bitrate: N/A
     Stream #1:0: Video: png, 1 reference frame, rgba(pc), 1024x721, 25
 tbr, 25 tbn, 25 tbc
 [h264_mp4toannexb @ 0x555fe4a37400] The input looks like it is Annex B
 already
 [h264_cuvid @ 0x555fe4a3d580] CUVID capabilities for h264_cuvid:
 [h264_cuvid @ 0x555fe4a3d580] 8 bit: supported: 1, min_width: 48,
 max_width: 4096, min_height: 16, max_height: 4096
 [h264_cuvid @ 0x555fe4a3d580] 10 bit: supported: 0, min_width: 0,
 max_width: 0, min_height: 0, max_height: 0
 [h264_cuvid @ 0x555fe4a3d580] 12 bit: supported: 0, min_width: 0,
 max_width: 0, min_height: 0, max_height: 0
 Stream mapping:
   Stream #0:0 (h264_cuvid) -> scale_npp
   Stream #1:0 (png) -> format
   overlay_cuda -> Stream #0:0 (h264_nvenc)
   Stream #0:1 -> #0:1 (copy)
   Stream #0:2 -> #0:2 (copy)
 Press [q] to stop, [?] for help
 [h264_cuvid @ 0x555fe4a3d580] Formats: Original: cuda | HW: cuda | SW:
 nv12
 [graph 0 input from stream 1:0 @ 0x555fe674b980] w:1024 h:721 pixfmt:rgba
 tb:1/25 fr:25/1 sar:0/1
 [graph 0 input from stream 0:0 @ 0x555fe674c740] w:1920 h:1080 pixfmt:cuda
 tb:1/90000 fr:25/1 sar:1/1
 [auto_scaler_0 @ 0x555fe4b22440] w:iw h:ih flags:'bilinear' interl:0
 [Parsed_format_0 @ 0x555fe4a32540] auto-inserting filter 'auto_scaler_0'
 between the filter 'graph 0 input from stream 1:0' and the filter
 'Parsed_format_0'
 [Parsed_scale_npp_2 @ 0x555fe4a0bf40] w:1920 h:1080 -> w:1920 h:1080
 [auto_scaler_0 @ 0x555fe4b22440] w:1024 h:721 fmt:rgba sar:0/1 -> w:1024
 h:721 fmt:yuva420p sar:0/1 flags:0x2
 [Parsed_overlay_cuda_3 @ 0x555fe674a9c0] [framesync @ 0x555fe674aaf8] Sync
 level 2
 [h264_nvenc @ 0x555fe69e3e40] Using input frames context (format cuda)
 with h264_nvenc encoder.
 [h264_nvenc @ 0x555fe69e3e40] Loaded Nvenc version 10.0
 [h264_nvenc @ 0x555fe69e3e40] Nvenc initialized successfully
 [h264_nvenc @ 0x555fe69e3e40] Temporal AQ enabled.
 [mpegts @ 0x555fe4acd9c0] service 1 using PCR in pid=256, pcr_period=80ms
 [mpegts @ 0x555fe4acd9c0] muxrate VBR, sdt every 500 ms, pat/pmt every 100
 ms
 Output #0, mpegts, to 'out.ts':
   Metadata:
     encoder         : Lavf58.50.100
     Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda,
 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 25 fps, 90k tbn, 25 tbc
 (default)
     Metadata:
       encoder         : Lavc58.100.100 h264_nvenc
     Side data:
       cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000
 vbv_delay: N/A
     Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz,
 stereo, fltp, 256 kb/s
     Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz,
 5.1(side), fltp, 640 kb/s
 [Parsed_overlay_cuda_3 @ 0x555fe674a9c0] [framesync @ 0x555fe674aaf8] Sync
 level 0
 No more output streams to write to, finishing.
 frame=  354 fps=0.0 q=15.0 Lsize=   10349kB time=00:00:15.84
 bitrate=5352.1kbits/s speed=16.8x
 video:8383kB audio:1628kB subtitle:0kB other streams:0kB global
 headers:0kB muxing overhead: 3.378023%
 Input file #0 (in.ts):
   Input stream #0:0 (video): 714 packets read (14014289 bytes); 354 frames
 decoded;
   Input stream #0:1 (audio): 620 packets read (476160 bytes);
   Input stream #0:2 (audio): 465 packets read (1190400 bytes);
   Input stream #0:3 (subtitle): 0 packets read (0 bytes);
   Total: 1799 packets (15680849 bytes) demuxed
 Input file #1 (t.png):
   Input stream #1:0 (video): 1 packets read (10935 bytes); 1 frames
 decoded;
   Total: 1 packets (10935 bytes) demuxed
 Output file #0 (out.ts):
   Output stream #0:0 (video): 354 frames encoded; 354 packets muxed
 (8584346 bytes);
   Output stream #0:1 (audio): 620 packets muxed (476160 bytes);
   Output stream #0:2 (audio): 465 packets muxed (1190400 bytes);
   Total: 1439 packets (10250906 bytes) muxed
 [AVIOContext @ 0x555fe4a011c0] Statistics: 0 seeks, 41 writeouts
 [h264_nvenc @ 0x555fe69e3e40] Nvenc unloaded
 [AVIOContext @ 0x555fe4923580] Statistics: 21886488 bytes read, 2 seeks
 [AVIOContext @ 0x555fe49f3880] Statistics: 10935 bytes read, 0 seeks
 }}}

 Filter graph of the failing sub2video command (dumped to ./graph.dot):
 https://bit.ly/33ZjUE8
 {{{
 digraph G {
 node [shape=box]
 rankdir=LR
 "Parsed_format_0\n(format)" -> "Parsed_hwupload_1\n(hwupload)" [ label=
 "inpad:default -> outpad:default\nfmt:yuva420p w:1920 h:1080 tb:1/90000"
 ];
 "Parsed_hwupload_1\n(hwupload)" -> "Parsed_overlay_cuda_3\n(overlay_cuda)"
 [ label= "inpad:default -> outpad:overlay\nfmt:cuda w:1920 h:1080
 tb:1/90000" ];
 "Parsed_scale_npp_2\n(scale_npp)" ->
 "Parsed_overlay_cuda_3\n(overlay_cuda)" [ label= "inpad:default ->
 outpad:main\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
 "Parsed_overlay_cuda_3\n(overlay_cuda)" -> "format\n(format)" [ label=
 "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
 "graph 0 input from stream 0:3\n(buffer)" -> "auto_scaler_0\n(scale)" [
 label= "inpad:default -> outpad:default\nfmt:bgra w:1920 h:1080
 tb:1/90000" ];
 "graph 0 input from stream 0:0\n(buffer)" ->
 "Parsed_scale_npp_2\n(scale_npp)" [ label= "inpad:default ->
 outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
 "format\n(format)" -> "out_0_0\n(buffersink)" [ label= "inpad:default ->
 outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
 "auto_scaler_0\n(scale)" -> "Parsed_format_0\n(format)" [ label=
 "inpad:default -> outpad:default\nfmt:yuva420p w:1920 h:1080 tb:1/90000"
 ];
 }
 }}}

--
Ticket URL: <https://trac.ffmpeg.org/ticket/8849>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker