[FFmpeg-devel] [PATCH 1/2] lavc/pcm_tablegen: slight speedup of table generation

Ganesh Ajjanagadde gajjanag at mit.edu
Tue Jan 5 02:26:01 CET 2016


On Mon, Jan 4, 2016 at 2:45 AM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Sun, Jan 03, 2016 at 09:11:28PM -0800, Ganesh Ajjanagadde wrote:
>> On Sun, Jan 3, 2016 at 7:32 PM, Michael Niedermayer
>> <michael at niedermayer.cc> wrote:
>> > On Mon, Jan 04, 2016 at 04:04:02AM +0100, Michael Niedermayer wrote:
>> >> On Wed, Dec 30, 2015 at 08:34:55PM -0800, Ganesh Ajjanagadde wrote:
>> >> > This gets rid of some branches to speed up table generation slightly
>> >> > (impact higher on mulaw than alaw). Tables are identical to before,
>> >> > tested with FATE.
>> >> >
>> >> > Sample benchmark (Haswell, GNU/Linux+gcc):
>> >> > old:
>> >> >  313494 decicycles in build_alaw_table,    4094 runs,      2 skips
>> >> >  315959 decicycles in build_alaw_table,    8190 runs,      2 skips
>> >> >
>> >> >  323599 decicycles in build_ulaw_table,    4095 runs,      1 skips
>> >> >  318849 decicycles in build_ulaw_table,    8188 runs,      4 skips
>> >> >
>> >> > new:
>> >> >  261902 decicycles in build_alaw_table,    4096 runs,      0 skips
>> >> >  266519 decicycles in build_alaw_table,    8192 runs,      0 skips
>> >> >
>> >> >  209657 decicycles in build_ulaw_table,    4096 runs,      0 skips
>> >> >  232656 decicycles in build_ulaw_table,    8192 runs,      0 skips
>> >> >
>> >> > Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>> >> > ---
>> >> >  libavcodec/pcm_tablegen.h | 24 ++++++++++++------------
>> >> >  1 file changed, 12 insertions(+), 12 deletions(-)
>> >> >
>> >> > diff --git a/libavcodec/pcm_tablegen.h b/libavcodec/pcm_tablegen.h
>> >> > index 1387210..7269977 100644
>> >> > --- a/libavcodec/pcm_tablegen.h
>> >> > +++ b/libavcodec/pcm_tablegen.h
>> >> > @@ -87,21 +87,21 @@ static av_cold void build_xlaw_table(uint8_t *linear_to_xlaw,
>> >> >  {
>> >> >      int i, j, v, v1, v2;
>> >> >
>> >> > -    j = 0;
>> >> > -    for(i=0;i<128;i++) {
>> >> > -        if (i != 127) {
>> >> > -            v1 = xlaw2linear(i ^ mask);
>> >> > -            v2 = xlaw2linear((i + 1) ^ mask);
>> >> > -            v = (v1 + v2 + 4) >> 3;
>> >> > -        } else {
>> >> > -            v = 8192;
>> >> > -        }
>> >> > -        for(;j<v;j++) {
>> >> > +    j = 1;
>> >> > +    linear_to_xlaw[8192] = mask;
>> >> > +    for(i=0;i<127;i++) {
>> >> > +        v1 = xlaw2linear(i ^ mask);
>> >> > +        v2 = xlaw2linear((i + 1) ^ mask);
>> >> > +        v = (v1 + v2 + 4) >> 3;
>> >> > +        for(;j<v;j+=1) {
>> >> > +            linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
>> >> >              linear_to_xlaw[8192 + j] = (i ^ mask);
>> >> > -            if (j > 0)
>> >> > -                linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
>> >> >          }
>> >> >      }
>> >> > +    for(;j<8192;j++) {
>> >> > +        linear_to_xlaw[8192 - j] = (127 ^ (mask ^ 0x80));
>> >> > +        linear_to_xlaw[8192 + j] = (127 ^ mask);
>> >> > +    }
>> >> >      linear_to_xlaw[0] = linear_to_xlaw[1];
>> >>
>> >> i think you can make the tables 8 times smaller
>> >
>> > forget this, i should have checked the whole table or looked when i
>> > am awake ...
>>
>> ha ha. By the way, both changes are needed to get this level of
>> speedup; with only the j change, which you acked, the speedup is much
>> smaller. Note, though, that the other parts of the patch
>> increase the binary size more.
>
> hmm, ok if it's needed to get the speedup then LGTM
>
> thanks

pushed, thanks
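
For anyone who wants to check the "tables are identical" claim outside of
FATE, here is a minimal standalone sketch. It copies the old and new loops
from the diff above into two functions and compares the results byte for
byte. The names build_old, build_new, tables_match and the *_sketch decoders
are hypothetical and written only for this illustration; the decoders are
ordinary G.711-style expansions and are assumed, not guaranteed, to match the
ones in pcm_tablegen.h. The masks 0xff (mu-law) and 0xd5 (A-law) are likewise
assumptions about how the function is called.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 16384

typedef int (*xlaw2linear_fn)(unsigned char);

/* Assumed G.711-style mu-law expansion (illustration only). */
int ulaw2linear_sketch(unsigned char u_val)
{
    int t;
    u_val = (unsigned char)~u_val;
    t = ((u_val & 0x0f) << 3) + 0x84;
    t <<= (u_val & 0x70) >> 4;
    return (u_val & 0x80) ? (0x84 - t) : (t - 0x84);
}

/* Assumed G.711-style A-law expansion (illustration only). */
int alaw2linear_sketch(unsigned char a_val)
{
    int t, seg;
    a_val ^= 0x55;
    t   = a_val & 0x0f;
    seg = (a_val & 0x70) >> 4;
    if (seg) t = (t + t + 1 + 32) << (seg + 2);
    else     t = (t + t + 1) << 3;
    return (a_val & 0x80) ? t : -t;
}

/* Loop as it was before the patch: two branches inside the loops. */
void build_old(uint8_t *tab, xlaw2linear_fn xlaw2linear, uint8_t mask)
{
    int i, j = 0, v, v1, v2;
    for (i = 0; i < 128; i++) {
        if (i != 127) {
            v1 = xlaw2linear((unsigned char)(i ^ mask));
            v2 = xlaw2linear((unsigned char)((i + 1) ^ mask));
            v  = (v1 + v2 + 4) >> 3;
        } else {
            v = 8192;
        }
        assert(v <= 8192); /* keep 8192 + j inside the table */
        for (; j < v; j++) {
            tab[8192 + j] = (uint8_t)(i ^ mask);
            if (j > 0)
                tab[8192 - j] = (uint8_t)(i ^ (mask ^ 0x80));
        }
    }
    tab[0] = tab[1];
}

/* Loop as in the patch: j == 0 and i == 127 are peeled out of the loop. */
void build_new(uint8_t *tab, xlaw2linear_fn xlaw2linear, uint8_t mask)
{
    int i, j = 1, v, v1, v2;
    tab[8192] = mask;
    for (i = 0; i < 127; i++) {
        v1 = xlaw2linear((unsigned char)(i ^ mask));
        v2 = xlaw2linear((unsigned char)((i + 1) ^ mask));
        v  = (v1 + v2 + 4) >> 3;
        assert(v <= 8192);
        for (; j < v; j++) {
            tab[8192 - j] = (uint8_t)(i ^ (mask ^ 0x80));
            tab[8192 + j] = (uint8_t)(i ^ mask);
        }
    }
    for (; j < 8192; j++) {
        tab[8192 - j] = (uint8_t)(127 ^ (mask ^ 0x80));
        tab[8192 + j] = (uint8_t)(127 ^ mask);
    }
    tab[0] = tab[1];
}

/* Returns 1 if both variants produce byte-identical tables. */
int tables_match(xlaw2linear_fn xlaw2linear, uint8_t mask)
{
    static uint8_t a[TABLE_SIZE], b[TABLE_SIZE];
    build_old(a, xlaw2linear, mask);
    build_new(b, xlaw2linear, mask);
    return memcmp(a, b, TABLE_SIZE) == 0;
}
```

Calling tables_match(ulaw2linear_sketch, 0xff) and
tables_match(alaw2linear_sketch, 0xd5) from a small main() exercises both
cases. Note that the unconditional tab[8192] = mask store only matches the
old behaviour when the first threshold v (for i = 0) is at least 1, which
holds for both decoders sketched here.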
