[FFmpeg-trac] #397(swscale:new): swscale crashes when upscaling pictures using MMX2
FFmpeg
trac at avcodec.org
Sun Aug 14 19:44:51 CEST 2011
#397: swscale crashes when upscaling pictures using MMX2
-------------------------+---------------------
  Reporter:  TobiX       |      Owner:  michael
      Type:  defect      |     Status:  new
  Priority:  normal      |  Component:  swscale
   Version:  git-master  |   Keywords:
Blocked By:              |   Blocking:
Reproduced:  0           |   Analyzed:  0
-------------------------+---------------------
This crash was found using XBMC, which crashes while generating
thumbnails for videos that are smaller than the configured thumbnail
size. To reproduce this with XBMC, you need to compile it with
"--enable-external-libraries", since the internal copy is very old and
does not exhibit this bug.
As far as I can see, the crash only happens when upscaling from a YUV to
an RGB color space. Sample run (make sure the video is smaller than the
size given in the scale filter options):
{{{
$ ./ffmpeg_g -v 9 -loglevel 99 -i in.wmv -sws_flags fast_bilinear -vf
"scale=640:480" -vframes 1 -vcodec png output.png
ffmpeg version N-31884-gb854c2a, Copyright (c) 2000-2011 the FFmpeg
developers
built on Aug 14 2011 19:32:25 with gcc 4.6.1
configuration: --disable-doc --disable-stripping
libavutil 51. 12. 0 / 51. 12. 0
libavcodec 53. 10. 0 / 53. 10. 0
libavformat 53. 7. 0 / 53. 7. 0
libavdevice 53. 3. 0 / 53. 3. 0
libavfilter 2. 31. 1 / 2. 31. 1
libswscale 2. 0. 0 / 2. 0. 0
[asf @ 0x1fc3420] Format asf probed with size=2048 and score=100
[asf @ 0x1fc3420] gpos mismatch our pos=24, end=26
[asf @ 0x1fc3420] gpos mismatch our pos=24, end=1168
[asf @ 0x1fc3420] gpos mismatch our pos=24, end=38
[asf @ 0x1fc3420] gpos mismatch our pos=24, end=238
[asf @ 0x1fc3420] gpos mismatch our pos=24, end=38
[wmav2 @ 0x1fc58c0] Unsupported bit depth: 0
[wmv2 @ 0x1fc7c20] Unsupported bit depth: 0
[asf @ 0x1fc3420] All info found
Seems stream 1 codec frame rate differs from container frame rate: 1000.00
(1000/1) -> 30.00 (30/1)
Input #0, asf, from 'in.wmv':
Metadata:
WMFSDKVersion : 9.00.00.2980
WMFSDKNeeded : 0.0.0.0000
IsVBR : 0
Duration: 00:00:10.23, start: 0.000000, bitrate: 302 kb/s
Stream #0.0(eng), 5, 1/1000: Audio: wmav2, 44100 Hz, 1 channels, s16,
32 kb/s
Stream #0.1(eng), 41, 1/1000: Video: wmv2, yuv420p, 320x240, 1/1000,
618 kb/s, 30 tbr, 1k tbn, 1k tbc
Incompatible pixel format 'yuv420p' for codec 'png', auto-selecting format
'rgb24'
[buffer @ 0x1fc5840] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1
sws_param:
[scale @ 0x1fc7720] w:320 h:240 fmt:yuv420p -> w:640 h:480 fmt:rgb24
flags:0x1
[png @ 0x1fbd740] Unsupported bit depth: 0
[wmv2 @ 0x1fc7c20] Unsupported bit depth: 0
Output #0, image2, to 'output.png':
Metadata:
WMFSDKVersion : 9.00.00.2980
WMFSDKNeeded : 0.0.0.0000
IsVBR : 0
encoder : Lavf53.7.0
Stream #0.0(eng), 0, 1/90000: Video: png, rgb24, 640x480, 1/30,
q=2-31, 200 kb/s, 90k tbn, 30 tbc
Stream mapping:
Stream #0.1 -> #0.0
Press [q] to stop, [?] for help
[wmv2 @ 0x1fc7c20] I7:0/
zsh: segmentation fault ./ffmpeg_g -v 9 -loglevel 99 -i in.wmv -sws_flags
fast_bilinear -vf -vframes
}}}
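The trigger conditions described above can be read off the `[scale @ ...]` line in the log. The following is a minimal sketch, not part of FFmpeg: `parse_scale_line` is a hypothetical helper written for this ticket, and the only outside fact it relies on is that `SWS_FAST_BILINEAR` has the value 1 in libswscale's public header.

```python
import re

SWS_FAST_BILINEAR = 1  # flag value from libswscale/swscale.h

# The scale filter line from the sample run, with the wrapped
# "flags:0x1" rejoined onto the same line.
LOG_LINE = ("[scale @ 0x1fc7720] w:320 h:240 fmt:yuv420p -> "
            "w:640 h:480 fmt:rgb24 flags:0x1")

SCALE_RE = re.compile(
    r"w:(\d+) h:(\d+) fmt:(\w+) -> "
    r"w:(\d+) h:(\d+) fmt:(\w+) flags:0x([0-9a-fA-F]+)"
)

def parse_scale_line(line):
    """Hypothetical helper: summarize what the scale filter was asked to do."""
    m = SCALE_RE.search(line)
    if m is None:
        raise ValueError("not a scale filter log line")
    src_w, src_h, dst_w, dst_h = (int(m.group(i)) for i in (1, 2, 4, 5))
    src_fmt, dst_fmt = m.group(3), m.group(6)
    flags = int(m.group(7), 16)
    return {
        "upscale": dst_w > src_w and dst_h > src_h,
        "yuv_to_rgb": src_fmt.startswith("yuv") and dst_fmt.startswith("rgb"),
        "fast_bilinear": bool(flags & SWS_FAST_BILINEAR),
    }

print(parse_scale_line(LOG_LINE))
```

For the run above, all three conditions hold: the picture is upscaled, converted from YUV to RGB, and scaled with the fast bilinear scaler.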
GDB session:
{{{
(gdb) bt
#0 0x00000000012894e0 in ?? ()
#1 0x0000000000944eef in hyscale_fast_MMX2 (c=0x12a1700, dst=0x0,
dstWidth=19436768,
src=0x13122c8 '^' <repeats 40 times>, ']' <repeats 48 times>,
"\\\\\\\\\\\\\\\\]]]]]]]]", '\\' <repeats 40 times>, '[' <repeats 16
times>, 'Z' <repeats 40 times>..., srcW=320, xInc=32788) at
libswscale/x86/swscale_template.c:2289
#2 0x0000000001295dc0 in ?? ()
#3 0x00000000009346c4 in hyscale (c=0x1295dc0, src=<value optimized out>,
srcStride=0x7fffffffc650, srcSliceY=0, srcSliceH=240, dst=0x7fffffffc630,
dstStride=0x7fffffffc660) at libswscale/swscale.c:2243
#4 swScale (c=0x1295dc0, src=<value optimized out>,
srcStride=0x7fffffffc650, srcSliceY=0, srcSliceH=240, dst=0x7fffffffc630,
dstStride=0x7fffffffc660)
at libswscale/swscale.c:2688
#5 0x000000000091a79b in sws_scale (c=0x1295dc0, srcSlice=<value
optimized out>, srcStride=0x7fffffffc700, srcSliceY=0, srcSliceH=240,
dst=<value optimized out>, dstStride=0x7fffffffc710) at
libswscale/swscale_unscaled.c:794
#6 0x00000000004503cc in scale_slice (sws=<value optimized out>, y=<value
optimized out>, h=<value optimized out>, mul=<value optimized out>,
field=0,
link=<value optimized out>) at libavfilter/vf_scale.c:298
#7 0x0000000000450540 in draw_slice (link=0x1295ba0, y=0, h=240,
slice_dir=1) at libavfilter/vf_scale.c:315
#8 0x0000000000447df4 in avfilter_draw_slice (link=0x1295ba0, y=0, h=240,
slice_dir=1) at libavfilter/avfilter.c:616
#9 0x0000000000452740 in request_frame (link=0x1295ba0) at
libavfilter/vsrc_buffer.c:196
#10 0x0000000000446f05 in avfilter_request_frame (link=<value optimized
out>) at libavfilter/avfilter.c:505
#11 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#12 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#13 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#14 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#15 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#16 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#17 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#18 avfilter_request_frame (link=<value optimized out>) at
libavfilter/avfilter.c:507
#19 0x00000000004525c4 in av_vsink_buffer_get_video_buffer_ref (ctx=<value
optimized out>, picref=0x12899d0, flags=0) at
libavfilter/vsink_buffer.c:109
#20 0x0000000000405fe1 in output_packet (ist=<value optimized out>,
ist_index=1, ost_table=0x12eefe0, nb_ostreams=1, pkt=<value optimized
out>)
at ffmpeg.c:1735
#21 0x000000000043ce58 in transcode (nb_output_files=1,
input_files=0x127fcb0, nb_input_files=1, stream_maps=0x0,
nb_stream_maps=<value optimized out>,
output_files=0xd22960) at ffmpeg.c:2821
#22 0x0000000000438bcb in main (argc=<value optimized out>,
argv=0x7fffffffdfc8) at ffmpeg.c:4578
(gdb) disass $pc-32,$pc+32
Dump of assembler code from 0x12894c0 to 0x1289500:
0x00000000012894c0: data16
0x00000000012894c1: insb (%dx),%es:(%rdi)
0x00000000012894c2: (bad)
0x00000000012894c3: addr32 jae 0x1289503
0x00000000012894c6: xor %bh,0x31(%rax)
0x00000000012894c9: add %dl,0x2c(%rdi)
0x00000000012894cc: xor %eax,(%rax)
0x00000000012894ce: add %al,(%rax)
0x00000000012894d0: add %al,(%rax)
0x00000000012894d2: add %al,(%rax)
0x00000000012894d4: add %al,(%rax)
0x00000000012894d6: add %al,(%rax)
0x00000000012894d8: roll $0x0,(%rax)
0x00000000012894db: add %al,(%rax)
0x00000000012894dd: add %al,(%rax)
0x00000000012894df: add %bh,0x0(%rdi)
0x00000000012894e2: (bad)
0x00000000012894e3: add %bh,0x0(%rdi)
0x00000000012894e6: (bad)
0x00000000012894e7: add %bh,0x0(%rdi)
0x00000000012894ea: (bad)
0x00000000012894eb: add %bh,0x0(%rdi)
0x00000000012894ee: (bad)
0x00000000012894ef: add %bh,0x0(%rdi)
0x00000000012894f2: (bad)
0x00000000012894f3: add %bh,0x0(%rdi)
0x00000000012894f6: (bad)
0x00000000012894f7: add %bh,0x0(%rdi)
0x00000000012894fa: (bad)
0x00000000012894fb: add %bh,0x0(%rdi)
0x00000000012894fe: (bad)
0x00000000012894ff: add %bh,0x0(%rdi)
End of assembler dump.
(gdb) info all-registers
rax 0x0 0
rbx 0x129bf60 19513184
rcx 0x13122c8 19997384
rdx 0x12894e0 19436768
rsi 0x0 0
rdi 0x12a1700 19535616
rbp 0x0 0x0
rsp 0x7fffffffc370 0x7fffffffc370
r8 0x140 320
r9 0x8014 32788
r10 0x280 640
r11 0x2 2
r12 0x0 0
r13 0x0 0
r14 0x0 0
r15 0x12a1660 19535456
rip 0x12894e0 0x12894e0
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
st0 -nan(0x2f3c2f7c2f802f80) (raw 0xffff2f3c2f7c2f802f80)
st1 -nan(0x2f002f002f802f80) (raw 0xffff2f002f002f802f80)
st2 -inf (raw 0xffff0000000000000000)
st3 -nan(0x3c007c003c007d) (raw 0xffff003c007c003c007d)
st4 -nan(0x7676767676767676) (raw 0xffff7676767676767676)
st5 -nan(0x7676767676767676) (raw 0xffff7676767676767676)
st6 -nan(0x7676767676767676) (raw 0xffff7676767676767676)
st7 -inf (raw 0xffff0000000000000000)
fctrl 0x37f 895
fstat 0x20 32
ftag 0xaaaa 43690
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
xmm0 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
uint128 = 0x00000000000000000000000000000000}
xmm1 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
uint128 = 0x00000000000000000000000000000000}
xmm2 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
uint128 = 0x00000000000000000000000000000000}
xmm3 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
uint128 = 0x00000000000000000000000000000000}
xmm4 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double =
{0x8000000000000000, 0x0}, v16_int8 = {0x73, 0x74, 0x69, 0x6e, 0x61, 0x74,
0x69, 0x6f, 0x6e,
0x20, 0x72, 0x61, 0x6e, 0x67, 0x65, 0x0}, v8_int16 = {0x7473, 0x6e69,
0x7461, 0x6f69, 0x206e, 0x6172, 0x676e, 0x65}, v4_int32 = {0x6e697473,
0x6f697461, 0x6172206e, 0x65676e}, v2_int64 = {0x6f6974616e697473,
0x65676e6172206e}, uint128 = 0x0065676e6172206e6f6974616e697473}
xmm5 {v4_float = {0x3, 0x3, 0x3, 0x3}, v2_double = {0x20, 0x20},
v16_int8 = {0x40 <repeats 16 times>}, v8_int16 = {0x4040, 0x4040, 0x4040,
0x4040, 0x4040, 0x4040, 0x4040, 0x4040}, v4_int32 = {0x40404040,
0x40404040, 0x40404040, 0x40404040}, v2_int64 = {0x4040404040404040,
0x4040404040404040}, uint128 = 0x40404040404040404040404040404040}
xmm6 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double =
{0x8000000000000000, 0x8000000000000000}, v16_int8 = {0x5b <repeats 16
times>}, v8_int16 = {
0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b, 0x5b5b},
v4_int32 = {0x5b5b5b5b, 0x5b5b5b5b, 0x5b5b5b5b, 0x5b5b5b5b}, v2_int64 = {
0x5b5b5b5b5b5b5b5b, 0x5b5b5b5b5b5b5b5b}, uint128 =
0x5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b}
xmm7 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x20 <repeats 16 times>}, v8_int16 = {0x2020, 0x2020, 0x2020,
0x2020,
0x2020, 0x2020, 0x2020, 0x2020}, v4_int32 = {0x20202020, 0x20202020,
0x20202020, 0x20202020}, v2_int64 = {0x2020202020202020,
0x2020202020202020},
uint128 = 0x20202020202020202020202020202020}
xmm8 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
uint128 = 0x00000000000000000000000000000000}
xmm9 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double =
{0x8000000000000000, 0x8000000000000000}, v16_int8 = {0x0, 0x0, 0x0, 0xff
<repeats 13 times>},
v8_int16 = {0x0, 0xff00, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff,
0xffff}, v4_int32 = {0xff000000, 0xffffffff, 0xffffffff, 0xffffffff},
v2_int64 = {
0xffffffffff000000, 0xffffffffffffffff}, uint128 =
0xffffffffffffffffffffffffff000000}
xmm10 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0, 0x0, 0x20, 0x20, 0x0, 0x20, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0,
0x20, 0x0, 0x0}, v8_int16 = {0x0, 0x2020, 0x2000, 0x0, 0x0, 0x0,
0x2000, 0x0}, v4_int32 = {0x20200000, 0x2000, 0x0, 0x2000}, v2_int64 = {
0x200020200000, 0x200000000000}, uint128 =
0x00002000000000000000200020200000}
xmm11 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x0, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0,
0xff, 0x0, 0x0}, v8_int16 = {0xff00, 0xffff, 0xffff, 0x0, 0x0, 0x0,
0xff00, 0x0}, v4_int32 = {0xffffff00, 0xffff, 0x0, 0xff00}, v2_int64 = {
0xffffffffff00, 0xff0000000000}, uint128 =
0x0000ff00000000000000ffffffffff00}
xmm12 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0x29, 0xf2, 0x88, 0x6c, 0xa6, 0x49, 0xde, 0x3e, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0, 0x0, 0x0}, v8_int16 = {0xf229, 0x6c88, 0x49a6, 0x3ede, 0x0,
0x0, 0x0, 0x0}, v4_int32 = {0x6c88f229, 0x3ede49a6, 0x0, 0x0}, v2_int64 =
{
0x3ede49a66c88f229, 0x0}, uint128 =
0x00000000000000003ede49a66c88f229}
xmm13 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0xb3, 0x12, 0x58, 0x17, 0x64, 0x46, 0xe6, 0x3b, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x12b3, 0x1758, 0x4664, 0x3be6, 0x0,
0x0, 0x0, 0x0}, v4_int32 = {0x175812b3, 0x3be64664, 0x0, 0x0}, v2_int64 =
{
0x3be64664175812b3, 0x0}, uint128 =
0x00000000000000003be64664175812b3}
xmm14 {v4_float = {0x0, 0x3, 0x0, 0x0}, v2_double = {0x2d, 0x0},
v16_int8 = {0xc0, 0x9, 0xf2, 0x16, 0xb5, 0xdf, 0x46, 0x40, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x9c0, 0x16f2, 0xdfb5, 0x4046, 0x0,
0x0, 0x0, 0x0}, v4_int32 = {0x16f209c0, 0x4046dfb5, 0x0, 0x0}, v2_int64 =
{
0x4046dfb516f209c0, 0x0}, uint128 =
0x00000000000000004046dfb516f209c0}
xmm15 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0},
v16_int8 = {0xa0, 0x83, 0x47, 0x3, 0x1d, 0x3c, 0x8a, 0xb5, 0x0, 0x0, 0x0,
0x0,
0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x83a0, 0x347, 0x3c1d, 0xb58a, 0x0,
0x0, 0x0, 0x0}, v4_int32 = {0x34783a0, 0xb58a3c1d, 0x0, 0x0}, v2_int64 = {
0xb58a3c1d034783a0, 0x0}, uint128 =
0x0000000000000000b58a3c1d034783a0}
mxcsr 0x1fa0 [ PE IM DM ZM OM UM PM ]
}}}
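A few things can be cross-checked from the register dump without reading the asm. The sketch below only uses numbers copied from the session above. It assumes the System V AMD64 argument register order and that xInc is a 16.16 fixed-point step, and the variable names are made up for illustration.

```python
# Values copied from "info all-registers" above.
regs = {
    "rdi": 0x12a1700, "rsi": 0x0, "rdx": 0x12894e0, "rcx": 0x13122c8,
    "r8": 0x140, "r9": 0x8014, "r10": 0x280, "rip": 0x12894e0,
}

# Arguments gdb printed for frame #1 (hyscale_fast_MMX2), in order.
frame1_args = [
    ("c", 0x12a1700), ("dst", 0x0), ("dstWidth", 19436768),
    ("src", 0x13122c8), ("srcW", 320), ("xInc", 32788),
]

# System V AMD64 passes the first six integer arguments in these registers.
SYSV_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# Does every printed argument simply mirror the live register at the crash?
args_mirror_registers = all(
    value == regs[reg]
    for (_name, value), reg in zip(frame1_args, SYSV_ARG_REGS)
)

# Does the faulting address equal rdx (and thus the printed "dstWidth")?
jump_target_in_rdx = regs["rip"] == regs["rdx"]

# xInc read as 16.16 fixed point, compared with the exact 320 -> 640 step.
exact_step = (regs["r8"] << 16) // regs["r10"]   # srcW / dstW in 16.16
xinc_ratio = regs["r9"] / (1 << 16)
xinc_offset = regs["r9"] - exact_step

print(args_mirror_registers, jump_target_in_rdx, xinc_ratio, xinc_offset)
```

If these checks hold, the odd-looking `dst=0x0` and `dstWidth=19436768` in frame #1 are probably just the register contents at the time of the crash rather than the values originally passed. The fact that rip equals rdx suggests the crash followed an indirect call or jump through rdx into memory that does not contain code. xInc is within 20 of the exact 0.5 step for 320 to 640. That offset may be a deliberate bias on this code path, but this is an assumption and not something the trace shows.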
Maybe some assembler god can take a look at this; my brain melted while
trying to understand this asm-and-macro jungle ;)
I did a git bisect to find where it broke, but only ended up at commit
e66149e714006d099d1ebfcca3f22ca74fc7dcf4. I suspect it was already broken
before that commit, and that the CPU detection simply chose another code
path for my CPU until then. Here is my CPU info:
{{{
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9950 Quad-Core Processor
stepping : 3
cpu MHz : 2600.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc
extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic
cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
bogomips : 5217.61
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
}}}
But I first observed the problem on an Intel Atom D510.
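Both CPUs could plausibly end up on the MMX2 code path. Linux reports AMD's MMX extensions as `mmxext`. Intel CPUs do not list that flag, but SSE includes the same integer instructions. The sketch below assumes that FFmpeg's CPU detection enables MMX2 whenever SSE is present. `may_use_mmx2` is a hypothetical helper, and the Intel-style flags string is an illustrative placeholder, not the D510's real cpuinfo.

```python
# Flags line abbreviated from the Phenom cpuinfo above.
PHENOM_FLAGS = "fpu vme de pse tsc msr mmx fxsr sse sse2 ht syscall nx mmxext 3dnowext 3dnow"

# Hypothetical Intel-style flags line (placeholder, not taken from the ticket).
INTEL_STYLE_FLAGS = "fpu vme de pse tsc msr mmx fxsr sse sse2 ht"

def may_use_mmx2(flags_line):
    """Guess whether the MMX2 path is a candidate, given a cpuinfo flags line.

    Assumption: MMX2 is considered available if the kernel reports
    'mmxext' (AMD) or 'sse' (which includes the MMX2 integer instructions).
    """
    flags = set(flags_line.split())
    return "mmxext" in flags or "sse" in flags

print(may_use_mmx2(PHENOM_FLAGS), may_use_mmx2(INTEL_STYLE_FLAGS))
```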
--
Ticket URL: <https://ffmpeg.org/trac/ffmpeg/ticket/397>
FFmpeg <http://ffmpeg.org>
FFmpeg issue tracker