[FFmpeg-devel] Nasm/yasm support and x264 asm

Jason Garrett-Glaser darkshikari
Tue Jul 29 03:38:34 CEST 2008

I'm curious what the status of this is: has anyone proposed a plan
for porting any of the x264 asm yet?  I suspect porting it would give
a considerable speed boost, and I remember this being one of the
primary arguments for allowing nasm/yasm-syntax assembly in the first
place.  Most of it would have to be modified to handle variable
stride, though.  From what I remember, the useful asm is as follows:


1.  Intra prediction (I'd be happy to LGPL the ones I wrote; the rest are by Loren)
2.  SSE2 iDCT and iDCT8 (written by Loren)
3.  SSE2 deblocking (written by Loren)
4.  Cacheline-optimized pixel_average/bilinear MC (I'd be happy to
LGPL these); written for H.264, but I assume they could be used in any
format that uses bilinear halfpel interpolation.
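
To make the variable-stride point concrete, here is a rough plain-C sketch of what a pixel-average routine has to look like on the lavc side.  This is not the x264 asm or any existing lavc function; the name pixel_avg_c and its argument order are made up for illustration.  The point is only that every buffer carries its own stride as a runtime argument, rather than the routine assuming a fixed one.

```c
#include <stdint.h>

/* Hypothetical reference routine (name and signature are assumptions,
 * not taken from x264 or lavc).  Writes the rounded average of two
 * pixel blocks to dst, which is the basic operation behind bilinear
 * halfpel interpolation.  Each buffer has its own runtime stride. */
void pixel_avg_c(uint8_t *dst, int dst_stride,
                 const uint8_t *src1, int src1_stride,
                 const uint8_t *src2, int src2_stride,
                 int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (uint8_t)((src1[x] + src2[x] + 1) >> 1);
        /* Advance each pointer by its own stride. */
        dst  += dst_stride;
        src1 += src1_stride;
        src2 += src2_stride;
    }
}
```

An asm version written against a compile-time constant stride would need those strides turned into register arguments, which is presumably the bulk of the porting work for the MC functions.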


1.  Ultra-fast cacheline-optimized SAD functions (~2x faster on Intel
CPUs) (written by Loren)
2.  SSE2 and SSSE3 SATD functions; not sure what lavc already has for
this (written by Loren)
3.  SSE2 SSD functions; not sure what lavc already has for this
(written by Loren)
4.  SIMD-optimized SSIM measurement (written by Loren)
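
For reference, the SAD and SSD metrics in this list boil down to the following in plain C.  Again, this is only a sketch: sad_c and ssd_c are hypothetical names, not the x264 or lavc entry points, and the block size is passed in at runtime here instead of being specialized per function as optimized versions usually are.  Something like this could serve as the baseline that any ported asm is checked against.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference: sum of absolute differences between two
 * w x h pixel blocks, each with its own stride. */
int sad_c(const uint8_t *a, int a_stride,
          const uint8_t *b, int b_stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(a[x] - b[x]);
        a += a_stride;
        b += b_stride;
    }
    return sum;
}

/* Hypothetical reference: sum of squared differences over the same
 * kind of block. */
int ssd_c(const uint8_t *a, int a_stride,
          const uint8_t *b, int b_stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int d = a[x] - b[x];
            sum += d * d;
        }
        a += a_stride;
        b += b_stride;
    }
    return sum;
}
```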


1.  Fast memcpy/memset functions (I'd be happy to LGPL these)

We got this approved weeks ago; it'd be nice to get some sort of plan
going for how to get this all done, so that the decision to support
nasm/yasm actually pays off.

Dark Shikari
