[FFmpeg-devel] [PATCH 1/4] lavc/flacenc: add sse4 version of the 16-bit lpc encoder
james.darnley at gmail.com
Fri Aug 8 16:30:57 CEST 2014
On 2014-07-21 01:48, Michael Niedermayer wrote:
> On Mon, Jul 21, 2014 at 12:32:23AM +0200, James Darnley wrote:
>> On 2014-03-15 00:01, Michael Niedermayer wrote:
>>> On Wed, Mar 12, 2014 at 01:03:03PM +0100, James Darnley wrote:
>>>> +; Is it worth looping correctly over the first samples? The most that ever need
>>>> +; to be copied is 32 so we might as well just unroll the loop and do all 32.
>>> implementations should not make assumptions on their use except
>>> what is documented in the API
>>> or the other way around
>>> if some limitation is always true and you want to write an
>>> implementation that takes advantage of the limitation for optimization
>>> then this limitation should be documented in the API first
>>> (in this case of FLACDSPContext / lpc_encode)
>> So... I've been bored lately and thought I'd come back to this. I've
>> got a changed version which copies these samples in a loop. You can see
>> the changes in these two links:
>> These two are just about the same but apply to 16 and 32-bit.
>> Should I try to measure the difference between the two? Or should I
>> just submit one version, possibly with suitable documentation?
> fastest is best, and docs must match implementation but docs can be
> changed for internal API
Testing showed no reliable difference.
To be specific: with the new code, overall runtime decreased by about
0.2% (+/- 0.2%), yet the function itself took slightly longer to run
(also with a similarly large error).
Having done this, I will resubmit the patches using the old code, with
some small changes and documentation of its limits. I will also add
documentation to the C code, because its unrolled function (used when
CONFIG_SMALL is not set) also assumes a maximum order of 32.
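
For anyone following along, here is a rough C sketch of the two ways of
copying the first samples that were compared above. It is not the code
from the patch or from flacdsp; the names copy_warmup_loop,
copy_warmup_fixed, lpc_encode_sketch and MAX_LPC_ORDER are made up for
illustration, and the residual loop is a plain reference version. The
assert spells out the limit that the fixed-size copy depends on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name for the limit discussed in this thread. */
#define MAX_LPC_ORDER 32

/* Variant A: copy exactly `order` warm-up samples in a loop. */
static void copy_warmup_loop(int32_t *res, const int32_t *smp, int order)
{
    for (int i = 0; i < order; i++)
        res[i] = smp[i];
}

/* Variant B: always copy MAX_LPC_ORDER samples, whatever the order is.
 * Assumes both buffers hold at least MAX_LPC_ORDER entries; the entries
 * from `order` onwards are overwritten by the residual loop afterwards. */
static void copy_warmup_fixed(int32_t *res, const int32_t *smp)
{
    memcpy(res, smp, MAX_LPC_ORDER * sizeof(*res));
}

/* Plain reference residual computation (illustrative only).
 * fixed_copy selects variant B when non-zero. */
static void lpc_encode_sketch(int32_t *res, const int32_t *smp, int len,
                              int order, const int32_t *coefs, int shift,
                              int fixed_copy)
{
    /* The documented limit: order must never exceed MAX_LPC_ORDER. */
    assert(order >= 0 && order <= MAX_LPC_ORDER && order <= len);

    if (fixed_copy) {
        assert(len >= MAX_LPC_ORDER);
        copy_warmup_fixed(res, smp);
    } else {
        copy_warmup_loop(res, smp, order);
    }

    for (int i = order; i < len; i++) {
        int64_t p = 0;
        for (int j = 0; j < order; j++)
            p += (int64_t)coefs[j] * smp[i - j - 1];
        /* Relies on arithmetic right shift for negative p. */
        res[i] = smp[i] - (int32_t)(p >> shift);
    }
}
```

Both variants should give the same output as long as the order stays at
or below 32 and the buffers hold at least 32 samples, which is the kind
of limit the documentation would need to state.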