[FFmpeg-trac] #3048(undetermined:new): ICL and HAVE_INLINE_ASM
FFmpeg
trac at avcodec.org
Mon Oct 14 09:47:44 CEST 2013
#3048: ICL and HAVE_INLINE_ASM
-------------------------------------+-------------------------------------
Reporter: nikolaynnov | Type: defect
Status: new | Priority: normal
Component: undetermined | Version: unspecified
Keywords: icl | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
-------------------------------------+-------------------------------------
I have an old version of ffmpeg (0.8) that was compiled manually with the
ICL compiler. "Manually" means that only the required C files (for the 2-3
codecs I need) were included in my icproj. Now I see that ffmpeg has native
support for icl (through --toolchain=icl), so I compiled the latest ffmpeg
(2.0.1) following the standard instructions. I then noticed that encoding to
MPEG4/AV_CODEC_ID_MSMPEG4V2 with the new ffmpeg takes too much time. I ran
some tests: the old ffmpeg encodes the test video in 0:55 and the new ffmpeg
encodes the same video in 1:19, which is 43% slower! That is unacceptable
in real applications.
I did some investigation and found that the problem is that some
optimizations are disabled for icl by default. The main cause of the
slowdown is that some highly optimized asm code is not built with icl:
HAVE_INLINE_ASM is 0 by default. I defined HAVE_INLINE_ASM to 1, made the
code compile and link with some minor changes, and now the test video is
encoded in 0:55 as expected.
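The effect of such a switch can be pictured with a small sketch. It assumes only what is described above, namely a configuration macro called HAVE_INLINE_ASM that is either 0 or 1. The helper name demo_bswap32 and the default of 0 are hypothetical and are not taken from the ffmpeg sources. When the macro is 0 the compiler never sees the asm and the plain C path is built instead.

```c
#include <stdint.h>

/* Assumption: a real build gets this value from its generated
 * configuration header; defaulting to 0 mirrors the icl default
 * described in this report. */
#ifndef HAVE_INLINE_ASM
#define HAVE_INLINE_ASM 0
#endif

/* Hypothetical helper: reverse the byte order of a 32-bit value. */
static uint32_t demo_bswap32(uint32_t x)
{
#if HAVE_INLINE_ASM && (defined(__i386__) || defined(__x86_64__))
    /* Hand-written path, compiled only when inline asm is enabled. */
    __asm__ ("bswap %0" : "+r"(x));
    return x;
#else
    /* Portable fallback that every compiler accepts. */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}
```

Both paths return the same value, so building with -DHAVE_INLINE_ASM=1 changes only which implementation is compiled.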
I believe that an icl build of ffmpeg should be fully optimized by default.
I have attached some examples of how to change the code to make it compile
with ICL and HAVE_INLINE_ASM. Please note that I have disabled some asm
code as I don't need it for my set of codecs.
Typically, there are 2 kinds of changes:
1) icl doesn't know the cltd instruction, which is not an ia32 mnemonic (it
is the AT&T name for cdq). It has to be replaced with the cdq instruction.
Example:
#ifdef __ICL
#define MASK_ABS(mask, level) \
__asm__ ("cdq \n\t" \
"xorl %1, %0 \n\t" \
"subl %1, %0 \n\t" \
: "+a"(level), "=&d"(mask))
#else
#define MASK_ABS(mask, level) \
__asm__ ("cltd \n\t" \
"xorl %1, %0 \n\t" \
"subl %1, %0 \n\t" \
: "+a"(level), "=&d"(mask))
#endif
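2) References to variables through MANGLE() inside the asm string have to
be replaced with "m" operands (%N stands for the operand number).
Example: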
#ifdef __ICL
__asm__ volatile(
...
"movq %N, %%mm5 \n\t"
...
: "m"(var)
);
#else
__asm__ volatile(
...
"movq "MANGLE(var)", %%mm5 \n\t"
...
);
#endif
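For the first kind of change, it may help to see what the cdq/cltd sequence in MASK_ABS computes. The following is a minimal sketch in portable C. The helper name mask_abs_c is hypothetical, and the INT_MIN precondition is an addition made here because the C arithmetic would overflow where the asm simply wraps.

```c
#include <assert.h>
#include <limits.h>

/* Portable C equivalent of the cdq/xorl/subl sequence (hypothetical
 * helper). cdq (AT&T name: cltd) fills edx with the sign of eax,
 * giving 0 for non-negative values and -1 (all ones) for negative ones. */
static void mask_abs_c(int *mask, int *level)
{
    /* INT_MIN has no positive counterpart in int. */
    assert(*level != INT_MIN);

    *mask  = -(*level < 0);             /* cdq: sign mask of level       */
    *level = (*level ^ *mask) - *mask;  /* xorl, then subl: |level|      */
}
```

For level = -5 the mask becomes -1, the XOR gives 4, and subtracting -1 yields 5. For non-negative input the mask is 0 and level is left unchanged.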
I believe that the icl syntax will work with gcc as well, so maybe the best
solution would be to rewrite the code so that it doesn't use MANGLE in movq.
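As a rough check of that idea, here is a minimal sketch of the operand form. It assumes GCC or Clang on x86-64. The names demo_const and load_via_mm5 are hypothetical and stand in for a real table and a real MMX routine. The variable is passed as an "m" input operand and referenced as %1, so no symbol name appears in the asm string.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical constant standing in for a table that MMX code loads. */
static const uint64_t demo_const = 0x0102030405060708ULL;

/* Copy a 64-bit value through %mm5, naming the source as an "m"
 * operand instead of pasting its symbol name into the asm string. */
static uint64_t load_via_mm5(const uint64_t *src)
{
    uint64_t out;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile(
        "movq %1, %%mm5 \n\t"   /* memory operand -> MMX register   */
        "movq %%mm5, %0 \n\t"   /* MMX register -> output variable  */
        "emms           \n\t"   /* leave the x87/MMX state clean    */
        : "=m"(out)
        : "m"(*src)
        : "mm5");
#else
    /* Fallback for other targets, same result without asm. */
    memcpy(&out, src, sizeof(out));
#endif
    return out;
}
```

Calling load_via_mm5(&demo_const) returns the constant unchanged. The compiler picks the addressing mode for the operand itself, which is the part MANGLE otherwise handles by hand.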
--
Ticket URL: <https://ffmpeg.org/trac/ffmpeg/ticket/3048>
FFmpeg <http://ffmpeg.org>
FFmpeg issue tracker