[FFmpeg-trac] #397(swscale:new): swscale crashes when upscaling pictures using MMX2

FFmpeg trac at avcodec.org
Sun Aug 14 19:44:51 CEST 2011


#397: swscale crashes when upscaling pictures using MMX2
-------------------------+---------------------
  Reporter:  TobiX       |      Owner:  michael
      Type:  defect      |     Status:  new
  Priority:  normal      |  Component:  swscale
   Version:  git-master  |   Keywords:
Blocked By:              |   Blocking:
Reproduced:  0           |   Analyzed:  0
-------------------------+---------------------
 This crash was found using XBMC. XBMC would crash while generating
 thumbnails for videos which are smaller than the configured thumbnail
 size. If you want to reproduce this with XBMC, you need to compile it with
 "--enable-external-libraries", since the internal copy of FFmpeg is very
 old and does not exhibit this bug.

 As far as I can see, this crash only happens when upscaling from a YUV to
 an RGB color space. Sample run (make sure the video is smaller than the
 size given in the scale filter options):

 {{{
 $ ./ffmpeg_g -v 9 -loglevel 99 -i in.wmv -sws_flags fast_bilinear -vf
 "scale=640:480" -vframes 1 -vcodec png output.png
 ffmpeg version N-31884-gb854c2a, Copyright (c) 2000-2011 the FFmpeg
 developers
   built on Aug 14 2011 19:32:25 with gcc 4.6.1
   configuration: --disable-doc --disable-stripping
   libavutil    51. 12. 0 / 51. 12. 0
   libavcodec   53. 10. 0 / 53. 10. 0
   libavformat  53.  7. 0 / 53.  7. 0
   libavdevice  53.  3. 0 / 53.  3. 0
   libavfilter   2. 31. 1 /  2. 31. 1
   libswscale    2.  0. 0 /  2.  0. 0
 [asf @ 0x1fc3420] Format asf probed with size=2048 and score=100
 [asf @ 0x1fc3420] gpos mismatch our pos=24, end=26
 [asf @ 0x1fc3420] gpos mismatch our pos=24, end=1168
 [asf @ 0x1fc3420] gpos mismatch our pos=24, end=38
 [asf @ 0x1fc3420] gpos mismatch our pos=24, end=238
 [asf @ 0x1fc3420] gpos mismatch our pos=24, end=38
 [wmav2 @ 0x1fc58c0] Unsupported bit depth: 0
 [wmv2 @ 0x1fc7c20] Unsupported bit depth: 0
 [asf @ 0x1fc3420] All info found

 Seems stream 1 codec frame rate differs from container frame rate: 1000.00
 (1000/1) -> 30.00 (30/1)
 Input #0, asf, from 'in.wmv':
   Metadata:
     WMFSDKVersion   : 9.00.00.2980
     WMFSDKNeeded    : 0.0.0.0000
     IsVBR           : 0
   Duration: 00:00:10.23, start: 0.000000, bitrate: 302 kb/s
     Stream #0.0(eng), 5, 1/1000: Audio: wmav2, 44100 Hz, 1 channels, s16,
 32 kb/s
     Stream #0.1(eng), 41, 1/1000: Video: wmv2, yuv420p, 320x240, 1/1000,
 618 kb/s, 30 tbr, 1k tbn, 1k tbc
 Incompatible pixel format 'yuv420p' for codec 'png', auto-selecting format
 'rgb24'
 [buffer @ 0x1fc5840] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1
 sws_param:
 [scale @ 0x1fc7720] w:320 h:240 fmt:yuv420p -> w:640 h:480 fmt:rgb24
 flags:0x1
 [png @ 0x1fbd740] Unsupported bit depth: 0
 [wmv2 @ 0x1fc7c20] Unsupported bit depth: 0
 Output #0, image2, to 'output.png':
   Metadata:
     WMFSDKVersion   : 9.00.00.2980
     WMFSDKNeeded    : 0.0.0.0000
     IsVBR           : 0
     encoder         : Lavf53.7.0
     Stream #0.0(eng), 0, 1/90000: Video: png, rgb24, 640x480, 1/30,
 q=2-31, 200 kb/s, 90k tbn, 30 tbc
 Stream mapping:
   Stream #0.1 -> #0.0
 Press [q] to stop, [?] for help
 [wmv2 @ 0x1fc7c20] I7:0/
 zsh: segmentation fault  ./ffmpeg_g -v 9 -loglevel 99 -i in.wmv -sws_flags
 fast_bilinear -vf  -vframes
 }}}
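
 For reference, a rough sketch of the horizontal step this run implies. The
 log shows a 320 pixel wide source scaled to 640, and the backtrace further
 down reports xInc=32788. The rounded 16.16 fixed-point formula used here is
 an assumption about how the step is derived, not something stated in this
 report, and the function name is made up for illustration.

```python
FRACTION_BITS = 16  # assumed 16.16 fixed-point format


def fixed_point_step(src_width, dst_width):
    """Source pixels advanced per output pixel, as a rounded 16.16 value."""
    return ((src_width << FRACTION_BITS) + (dst_width >> 1)) // dst_width


# Widths taken from the "[scale @ ...]" log line in the sample run.
step = fixed_point_step(320, 640)

# xInc as printed by gdb for hyscale_fast_MMX2 in the backtrace.
reported_x_inc = 32788

print("plain 2x upscale step:", step)
print("difference to reported xInc:", reported_x_inc - step)
```

 The reported value is slightly larger than the plain ratio. That gap may be
 a deliberate adjustment made for the fast_bilinear MMX2 path, but this is a
 guess and would need checking against the swscale source.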

 GDB session:

 {{{
 (gdb) bt
 #0  0x00000000012894e0 in ?? ()
 #1  0x0000000000944eef in hyscale_fast_MMX2 (c=0x12a1700, dst=0x0,
 dstWidth=19436768,
     src=0x13122c8 '^' <repeats 40 times>, ']' <repeats 48 times>,
 "\\\\\\\\\\\\\\\\]]]]]]]]", '\\' <repeats 40 times>, '[' <repeats 16
 times>, 'Z' <repeats 40 times>..., srcW=320, xInc=32788) at
 libswscale/x86/swscale_template.c:2289
 #2  0x0000000001295dc0 in ?? ()
 #3  0x00000000009346c4 in hyscale (c=0x1295dc0, src=<value optimized out>,
 srcStride=0x7fffffffc650, srcSliceY=0, srcSliceH=240, dst=0x7fffffffc630,
     dstStride=0x7fffffffc660) at libswscale/swscale.c:2243
 #4  swScale (c=0x1295dc0, src=<value optimized out>,
 srcStride=0x7fffffffc650, srcSliceY=0, srcSliceH=240, dst=0x7fffffffc630,
 dstStride=0x7fffffffc660)
     at libswscale/swscale.c:2688
 #5  0x000000000091a79b in sws_scale (c=0x1295dc0, srcSlice=<value
 optimized out>, srcStride=0x7fffffffc700, srcSliceY=0, srcSliceH=240,
     dst=<value optimized out>, dstStride=0x7fffffffc710) at
 libswscale/swscale_unscaled.c:794
 #6  0x00000000004503cc in scale_slice (sws=<value optimized out>, y=<value
 optimized out>, h=<value optimized out>, mul=<value optimized out>,
 field=0,
     link=<value optimized out>) at libavfilter/vf_scale.c:298
 #7  0x0000000000450540 in draw_slice (link=0x1295ba0, y=0, h=240,
 slice_dir=1) at libavfilter/vf_scale.c:315
 #8  0x0000000000447df4 in avfilter_draw_slice (link=0x1295ba0, y=0, h=240,
 slice_dir=1) at libavfilter/avfilter.c:616
 #9  0x0000000000452740 in request_frame (link=0x1295ba0) at
 libavfilter/vsrc_buffer.c:196
 #10 0x0000000000446f05 in avfilter_request_frame (link=<value optimized
 out>) at libavfilter/avfilter.c:505
 #11 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #12 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #13 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #14 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #15 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #16 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #17 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #18 avfilter_request_frame (link=<value optimized out>) at
 libavfilter/avfilter.c:507
 #19 0x00000000004525c4 in av_vsink_buffer_get_video_buffer_ref (ctx=<value
 optimized out>, picref=0x12899d0, flags=0) at
 libavfilter/vsink_buffer.c:109
 #20 0x0000000000405fe1 in output_packet (ist=<value optimized out>,
 ist_index=1, ost_table=0x12eefe0, nb_ostreams=1, pkt=<value optimized
 out>)
     at ffmpeg.c:1735
 #21 0x000000000043ce58 in transcode (nb_output_files=1,
 input_files=0x127fcb0, nb_input_files=1, stream_maps=0x0,
 nb_stream_maps=<value optimized out>,
     output_files=0xd22960) at ffmpeg.c:2821
 #22 0x0000000000438bcb in main (argc=<value optimized out>,
 argv=0x7fffffffdfc8) at ffmpeg.c:4578
 (gdb) disass $pc-32,$pc+32
 Dump of assembler code from 0x12894c0 to 0x1289500:
    0x00000000012894c0:  data16
    0x00000000012894c1:  insb   (%dx),%es:(%rdi)
    0x00000000012894c2:  (bad)
    0x00000000012894c3:  addr32 jae 0x1289503
    0x00000000012894c6:  xor    %bh,0x31(%rax)
    0x00000000012894c9:  add    %dl,0x2c(%rdi)
    0x00000000012894cc:  xor    %eax,(%rax)
    0x00000000012894ce:  add    %al,(%rax)
    0x00000000012894d0:  add    %al,(%rax)
    0x00000000012894d2:  add    %al,(%rax)
    0x00000000012894d4:  add    %al,(%rax)
    0x00000000012894d6:  add    %al,(%rax)
    0x00000000012894d8:  roll   $0x0,(%rax)
    0x00000000012894db:  add    %al,(%rax)
    0x00000000012894dd:  add    %al,(%rax)
    0x00000000012894df:  add    %bh,0x0(%rdi)
    0x00000000012894e2:  (bad)
    0x00000000012894e3:  add    %bh,0x0(%rdi)
    0x00000000012894e6:  (bad)
    0x00000000012894e7:  add    %bh,0x0(%rdi)
    0x00000000012894ea:  (bad)
    0x00000000012894eb:  add    %bh,0x0(%rdi)
    0x00000000012894ee:  (bad)
    0x00000000012894ef:  add    %bh,0x0(%rdi)
    0x00000000012894f2:  (bad)
    0x00000000012894f3:  add    %bh,0x0(%rdi)
    0x00000000012894f6:  (bad)
    0x00000000012894f7:  add    %bh,0x0(%rdi)
    0x00000000012894fa:  (bad)
    0x00000000012894fb:  add    %bh,0x0(%rdi)
    0x00000000012894fe:  (bad)
    0x00000000012894ff:  add    %bh,0x0(%rdi)
 End of assembler dump.
 (gdb) info all-registers
 rax            0x0      0
 rbx            0x129bf60        19513184
 rcx            0x13122c8        19997384
 rdx            0x12894e0        19436768
 rsi            0x0      0
 rdi            0x12a1700        19535616
 rbp            0x0      0x0
 rsp            0x7fffffffc370   0x7fffffffc370
 r8             0x140    320
 r9             0x8014   32788
 r10            0x280    640
 r11            0x2      2
 r12            0x0      0
 r13            0x0      0
 r14            0x0      0
 r15            0x12a1660        19535456
 rip            0x12894e0        0x12894e0
 eflags         0x10246  [ PF ZF IF RF ]
 cs             0x33     51
 ss             0x2b     43
 ds             0x0      0
 es             0x0      0
 fs             0x0      0
 gs             0x0      0
 st0            -nan(0x2f3c2f7c2f802f80) (raw 0xffff2f3c2f7c2f802f80)
 st1            -nan(0x2f002f002f802f80) (raw 0xffff2f002f002f802f80)
 st2            -inf     (raw 0xffff0000000000000000)
 st3            -nan(0x3c007c003c007d)   (raw 0xffff003c007c003c007d)
 st4            -nan(0x7676767676767676) (raw 0xffff7676767676767676)
 st5            -nan(0x7676767676767676) (raw 0xffff7676767676767676)
 st6            -nan(0x7676767676767676) (raw 0xffff7676767676767676)
 st7            -inf     (raw 0xffff0000000000000000)
 fctrl          0x37f    895
 fstat          0x20     32
 ftag           0xaaaa   43690
 fiseg          0x0      0
 fioff          0x0      0
 foseg          0x0      0
 fooff          0x0      0
 fop            0x0      0
 xmm0           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
 uint128 = 0x00000000000000000000000000000000}
 xmm1           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
 uint128 = 0x00000000000000000000000000000000}
 xmm2           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
 uint128 = 0x00000000000000000000000000000000}
 xmm3           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
 uint128 = 0x00000000000000000000000000000000}
 xmm4           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double =
 {0x8000000000000000, 0x0}, v16_int8 = {0x73, 0x74, 0x69, 0x6e, 0x61, 0x74,
 0x69, 0x6f, 0x6e,
     0x20, 0x72, 0x61, 0x6e, 0x67, 0x65, 0x0}, v8_int16 = {0x7473, 0x6e69,
 0x7461, 0x6f69, 0x206e, 0x6172, 0x676e, 0x65}, v4_int32 = {0x6e697473,
     0x6f697461, 0x6172206e, 0x65676e}, v2_int64 = {0x6f6974616e697473,
 0x65676e6172206e}, uint128 = 0x0065676e6172206e6f6974616e697473}
 xmm5           {v4_float = {0x3, 0x3, 0x3, 0x3}, v2_double = {0x20, 0x20},
 v16_int8 = {0x40 <repeats 16 times>}, v8_int16 = {0x4040, 0x4040, 0x4040,
     0x4040, 0x4040, 0x4040, 0x4040, 0x4040}, v4_int32 = {0x40404040,
 0x40404040, 0x40404040, 0x40404040}, v2_int64 = {0x4040404040404040,
     0x4040404040404040}, uint128 = 0x40404040404040404040404040404040}
 xmm6           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double =
 {0x8000000000000000, 0x8000000000000000}, v16_int8 = {0x5b <repeats 16
 times>}, v8_int16 = {
     0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b},
 v4_int32 = {0x5b5b5b5b, 0x5b5b5b5b, 0x5b5b5b5b, 0x5b5b5b5b}, v2_int64 = {
     0x5b5b5b5b5b5b5b5b, 0x5b5b5b5b5b5b5b5b}, uint128 =
 0x5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b}
 xmm7           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x20 <repeats 16 times>}, v8_int16 = {0x2020, 0x2020, 0x2020,
 0x2020,
     0x2020, 0x2020, 0x2020, 0x2020}, v4_int32 = {0x20202020, 0x20202020,
 0x20202020, 0x20202020}, v2_int64 = {0x2020202020202020,
 0x2020202020202020},
   uint128 = 0x20202020202020202020202020202020}
 xmm8           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
 uint128 = 0x00000000000000000000000000000000}
 xmm9           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double =
 {0x8000000000000000, 0x8000000000000000}, v16_int8 = {0x0, 0x0, 0x0, 0xff
 <repeats 13 times>},
   v8_int16 = {0x0, 0xff00, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff,
 0xffff}, v4_int32 = {0xff000000, 0xffffffff, 0xffffffff, 0xffffffff},
 v2_int64 = {
     0xffffffffff000000, 0xffffffffffffffff}, uint128 =
 0xffffffffffffffffffffffffff000000}
 xmm10          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0, 0x0, 0x20, 0x20, 0x0, 0x20, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
 0x0,
     0x20, 0x0, 0x0}, v8_int16 = {0x0, 0x2020, 0x2000, 0x0, 0x0, 0x0,
 0x2000, 0x0}, v4_int32 = {0x20200000, 0x2000, 0x0, 0x2000}, v2_int64 = {
     0x200020200000, 0x200000000000}, uint128 =
 0x00002000000000000000200020200000}
 xmm11          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x0, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0,
 0x0, 0x0,
     0xff, 0x0, 0x0}, v8_int16 = {0xff00, 0xffff, 0xffff, 0x0, 0x0, 0x0,
 0xff00, 0x0}, v4_int32 = {0xffffff00, 0xffff, 0x0, 0xff00}, v2_int64 = {
     0xffffffffff00, 0xff0000000000}, uint128 =
 0x0000ff00000000000000ffffffffff00}
 xmm12          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0x29, 0xf2, 0x88, 0x6c, 0xa6, 0x49, 0xde, 0x3e, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0, 0x0, 0x0}, v8_int16 = {0xf229, 0x6c88, 0x49a6, 0x3ede, 0x0,
 0x0, 0x0, 0x0}, v4_int32 = {0x6c88f229, 0x3ede49a6, 0x0, 0x0}, v2_int64 =
 {
     0x3ede49a66c88f229, 0x0}, uint128 =
 0x00000000000000003ede49a66c88f229}
 xmm13          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0xb3, 0x12, 0x58, 0x17, 0x64, 0x46, 0xe6, 0x3b, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x12b3, 0x1758, 0x4664, 0x3be6, 0x0,
 0x0, 0x0, 0x0}, v4_int32 = {0x175812b3, 0x3be64664, 0x0, 0x0}, v2_int64 =
 {
     0x3be64664175812b3, 0x0}, uint128 =
 0x00000000000000003be64664175812b3}
 xmm14          {v4_float = {0x0, 0x3, 0x0, 0x0}, v2_double = {0x2d, 0x0},
 v16_int8 = {0xc0, 0x9, 0xf2, 0x16, 0xb5, 0xdf, 0x46, 0x40, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x9c0, 0x16f2, 0xdfb5, 0x4046, 0x0,
 0x0, 0x0, 0x0}, v4_int32 = {0x16f209c0, 0x4046dfb5, 0x0, 0x0}, v2_int64 =
 {
     0x4046dfb516f209c0, 0x0}, uint128 =
 0x00000000000000004046dfb516f209c0}
 xmm15          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
 v16_int8 = {0xa0, 0x83, 0x47, 0x3, 0x1d, 0x3c, 0x8a, 0xb5, 0x0, 0x0, 0x0,
 0x0,
     0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x83a0, 0x347, 0x3c1d, 0xb58a, 0x0,
 0x0, 0x0, 0x0}, v4_int32 = {0x34783a0, 0xb58a3c1d, 0x0, 0x0}, v2_int64 = {
     0xb58a3c1d034783a0, 0x0}, uint128 =
 0x0000000000000000b58a3c1d034783a0}
 mxcsr          0x1fa0   [ PE IM DM ZM OM UM PM ]
 }}}
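
 One cross-check on the session above, as a rough sketch. Assuming the build
 follows the System V AMD64 calling convention (an assumption; the report
 does not name the ABI), the first six integer arguments are passed in rdi,
 rsi, rdx, rcx, r8 and r9. The snippet compares the arguments gdb printed
 for frame #1 with those registers. All values are copied from the dump; the
 helper name is hypothetical.

```python
# Values copied from the "info all-registers" output.
registers = {
    "rdi": 0x12a1700,
    "rsi": 0x0,
    "rdx": 0x12894e0,
    "rcx": 0x13122c8,
    "r8": 0x140,
    "r9": 0x8014,
    "rip": 0x12894e0,
}

# Arguments gdb printed for frame #1 (hyscale_fast_MMX2), in the order shown.
frame1_args = {
    "c": 0x12a1700,
    "dst": 0x0,
    "dstWidth": 19436768,
    "src": 0x13122c8,
    "srcW": 320,
    "xInc": 32788,
}

# Assumed System V AMD64 order for the first six integer arguments.
SYSV_ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def compare_args_to_registers(args, regs):
    """Pair each argument with its assumed register and report equality."""
    return {
        name: (reg, regs[reg] == value)
        for (name, value), reg in zip(args.items(), SYSV_ARG_REGISTERS)
    }


result = compare_args_to_registers(frame1_args, registers)
for name, (reg, same) in result.items():
    print(f"{name:9s} {reg:4s} {'match' if same else 'differs'}")

print("dstWidth equals rip:", frame1_args["dstWidth"] == registers["rip"])
```

 If every argument matches its register, the odd-looking dst=0x0 and
 dstWidth=19436768 in frame #1 are probably just the current register
 contents at the time of the crash rather than the values the caller
 passed, so they should be read with caution.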

 Maybe some assembler god can take a look at this, my brain melted while
 trying to understand this asm-and-macro jungle ;)

 I did a git bisect to find the place where it broke, but only ended up at
 commit e66149e714006d099d1ebfcca3f22ca74fc7dcf4 - I suspect it was already
 broken before that commit, but the CPU detection somehow chose another
 code path for my CPU before that point. Here is my CPU info:

 {{{
 vendor_id       : AuthenticAMD
 cpu family      : 16
 model           : 2
 model name      : AMD Phenom(tm) 9950 Quad-Core Processor
 stepping        : 3
 cpu MHz         : 2600.000
 cache size      : 512 KB
 physical id     : 0
 siblings        : 4
 core id         : 3
 cpu cores       : 4
 apicid          : 3
 initial apicid  : 3
 fpu             : yes
 fpu_exception   : yes
 cpuid level     : 5
 wp              : yes
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
 cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
 pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc
 extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic
 cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
 bogomips        : 5217.61
 TLB size        : 1024 4K pages
 clflush size    : 64
 cache_alignment : 64
 address sizes   : 48 bits physical, 48 bits virtual
 power management: ts ttp tm stc 100mhzsteps hwpstate
 }}}
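
 A small sketch for checking whether a flags line like the one above
 advertises the MMX extensions. The flag string is copied from this report.
 Treating a plain "sse" flag as implying those extensions on CPUs that do
 not list "mmxext" is an assumption, and the function name is made up for
 illustration.

```python
# Flags copied from the CPU info in this report.
CPUINFO_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca "
    "cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt "
    "pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc "
    "extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic "
    "cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock"
)


def advertises_mmx_extensions(flags_line):
    """Return True if the flags suggest the MMX extensions are available.

    Assumption: either an explicit "mmxext" flag or a plain "sse" flag
    is taken as sufficient.
    """
    flags = set(flags_line.split())
    return "mmxext" in flags or "sse" in flags


flags = set(CPUINFO_FLAGS.split())
print("mmxext listed:", "mmxext" in flags)
print("MMX extensions advertised:", advertises_mmx_extensions(CPUINFO_FLAGS))
```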

 But I first observed the problem on an Intel Atom D510.

-- 
Ticket URL: <https://ffmpeg.org/trac/ffmpeg/ticket/397>
FFmpeg <http://ffmpeg.org>
FFmpeg issue tracker

