[FFmpeg-devel] [PATCH] Optimization of original IFF codec

Fri Apr 23 12:58:45 CEST 2010

Hi a 38573583th time!

M?ns Rullg?rd a ?crit :
> I prefer to write just unsigned, saving a few keystrokes, but others
> may disagree.  Do as you please.
>   
Fixed!
>>      if (avctx->bits_per_coded_sample > 8) {
>>          av_log(avctx, AV_LOG_ERROR, "bit_per_coded_sample > 8 not supported\n");
>> @@ -52,8 +53,8 @@
>>          av_log(avctx, AV_LOG_ERROR, "palette data underflow\n");
>>          return AVERROR_INVALIDDATA;
>>      }
>> -    for (i=0; i < count; i++) {
>> -        pal[i] = 0xFF000000 | AV_RB24( avctx->extradata + i*3 );
>> +    for (i=0, j=0; i < count; i++, j += 3) {
>> +        pal[i] = 0xFF000000 | AV_RB24( avctx->extradata + j);
>>      }
>>      return 0;
>>  }
>>     
>
> Did you check if this makes any difference in the generated code?  The
> original is more readable.
>   
No, and I doubt that many compilers can do such stuff automatically.
>> @@ -71,7 +72,7 @@
>>          return AVERROR_INVALIDDATA;
>>      }
>>  
>> -    s->planesize = avctx->width / 8;
>> +    s->planesize = avctx->width >> 3;
>>      s->planebuf = av_malloc(s->planesize + FF_INPUT_BUFFER_PADDING_SIZE);
>>      if (!s->planebuf)
>>          return AVERROR(ENOMEM);
>>     
>
> This is most definitely a number being divided.
>   
Fixed.
>> @@ -95,12 +96,16 @@
>>   * @param plane plane number to decode as
>>   */
>>  #define DECLARE_DECODEPLANE(suffix, type) \
>> -static void decodeplane##suffix(void *dst, const uint8_t *const buf, int buf_size, int bps, int plane) \
>> +static inline void decodeplane##suffix(void *dst, \
>> +                                       const uint8_t *const buf, \
>> +                                       const unsigned int buf_size, \
>> +                                       const unsigned int bps, \
>> +                                       const unsigned int plane) \
>>  { \
>>      GetBitContext gb; \
>> -    int i, b; \
>> +    unsigned int i, b; \
>>      init_get_bits(&gb, buf, buf_size * 8); \
>> -    for(i = 0; i < (buf_size * 8 + bps - 1) / bps; i++) { \
>> +    for(i = 0; i < ((buf_size << 3) + bps - 1) / bps; i++) { \
>>          for (b = 0; b < bps; b++) { \
>>              ((type *)dst)[ i*bps + b ] |= get_bits1(&gb) << plane; \
>>          } \
>>     
>
> Once again, did you check the generated code?  If it makes no
> difference, the original code is better.
>   
I don't have to check the generated code to know that changing signed to
unsigned will change signed multiplication to unsigned multiplication
assembly instruction ;)

Also there are many architectures out there, which do unsigned
multiplication faster than signed. This even is more true for divisions
and if you look at the for body...
> And again, and again, and again...
>   
Fixed.
> NEVER make changes like this without looking at the output
Just have a knot in my brain, what was the way to extract disassembly
from object files?

I'll post the updated patch, when I compared the actual output.

-- 

Best regards,
                   :-) Basty/CDGS (-: