[FFmpeg-devel] The SSEPlus Project?
Thu Jun 5 16:12:04 CEST 2008
Hi, I just read about this for the first time and was wondering
whether the SSEPlus Project is something that FFmpeg developers could
learn from, take advantage of, or use as is. (If there is a license
issue, then maybe AMD would be willing to consider dual-licensing it
under the LGPL or GPL for FFmpeg.)
Here is a quote from the AMD webpage linked above:
In March 2008, AMD initiated SSEPlus, an open-source project to help
developers write high performing SSE code. The SSEPlus library
simplifies SIMD development through optimized emulation of SSE
instructions, CPUID wrappers, and fast versions of key SIMD
algorithms. SSEPlus is available under the Apache v2.0 license.
Originally created as a core technology in the Framewave open-source
library, SSEPlus greatly enhances developer productivity. It provides
known-good versions of common SIMD operations with focused platform
optimizations. By taking advantage of the optimized emulation, a
developer can write algorithms once and compile for multiple target
architectures. This feature also allows developers to use future SSE
instructions before the actual target hardware is available.
* C/C++ APIs similar to SSE compiler intrinsics
* CPUID management functions
* Optimized emulation of SSE3, SSSE3, SSE4A, SSE4.1, and SSE5 instructions
* Implementations optimized for multiple target architectures
* Hundreds of additional high performance SIMD functions
* New SIMD operations include arithmetic and logical functions, fixed
  accuracy math, sophisticated packing and unpacking operations,
  trigonometry, and more
* Macros and include files to help developer productivity while
  managing multiple target architectures
* Active development and community participation
* Developers no longer have to redevelop their algorithms to write for
  multiple SSE revisions
* Simplified CPUID checking
* Simplified maintenance of code that targets different SSE instruction mixes
* SSEPlus provides containers to hold instructions that are desirable
  in hardware (e.g., 32 bit integer divide)
* Helps developers use and implement instructions that match their own
* Optimize code once for target hardware while at the same time
  ensuring that generated code conforms to the target hardware.
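
To make the "optimized emulation" idea above a bit more concrete, here
is a minimal sketch of my own (it is NOT taken from SSEPlus, and the
names blendv_epi8_sse2 and selftest_blendv are hypothetical). It
assumes an x86 compiler that provides <emmintrin.h>, and it mimics the
per-byte select of the SSE4.1 PBLENDVB instruction using only SSE2
intrinsics, so the same source could run on CPUs without SSE4.1:

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper, not an SSEPlus function: for each byte, pick b
 * when the top bit of the mask byte is set, otherwise pick a. This is
 * the behaviour of SSE4.1 PBLENDVB, built from SSE2 operations only. */
static __m128i blendv_epi8_sse2(__m128i a, __m128i b, __m128i mask)
{
    /* A signed "less than zero" compare turns "top bit set" into 0xFF. */
    __m128i m = _mm_cmplt_epi8(mask, _mm_setzero_si128());
    /* (m & b) | (~m & a) */
    return _mm_or_si128(_mm_and_si128(m, b), _mm_andnot_si128(m, a));
}

/* Compares the helper against a scalar reference on one test vector.
 * Returns the number of bytes checked. */
static int selftest_blendv(void)
{
    uint8_t a[16], b[16], mask[16], out[16];
    __m128i r;
    int i;

    for (i = 0; i < 16; i++) {
        a[i] = (uint8_t)i;
        b[i] = (uint8_t)(100 + i);
        mask[i] = (i & 1) ? 0x80 : 0x7f;  /* odd bytes select b */
    }

    r = blendv_epi8_sse2(_mm_loadu_si128((const __m128i *)a),
                         _mm_loadu_si128((const __m128i *)b),
                         _mm_loadu_si128((const __m128i *)mask));
    _mm_storeu_si128((__m128i *)out, r);

    for (i = 0; i < 16; i++) {
        uint8_t expected = (mask[i] & 0x80) ? b[i] : a[i];
        assert(out[i] == expected);
    }
    return 16;
}
```

A library along the lines described in the quote would presumably pick
between a fallback like this and the native instruction depending on
the target, but how SSEPlus actually does that is something I have not
checked.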