[FFmpeg-devel] [PATCH] GSM 6.10 audio decoder

Reimar Döffinger Reimar.Doeffinger
Mon Jul 5 19:13:03 CEST 2010


On Mon, Jul 05, 2010 at 09:47:02AM -0400, Ronald S. Bultje wrote:
> > +static int16_t dequant_tab[64][8] = {
> [cut]
> 
> gsmdata.h, missing const.

A gsmdata.c might make sense if someone would be interested in
doing an encoder, but why a .h?
It's not that much that it would disturb much there.

> > +static int get_rrp(int filtered)
> > +{
> > +    int abs = filtered;
> > +    if (abs < 0) abs = -abs;
> > +    if (abs < 11059) abs <<= 1;
> > +    else if (abs < 20070) abs += 11059;
> > +    else abs = (abs >> 2) + 26112;
> > +    return filtered < 0 ? -abs : abs;
> > +}
> 
> FFABS() as already pointed out, vertical align would help a lot also.
> What's "rrp"? Note also that you're doing "filtered<0" twice here, you
> could do both of that at once (if gcc doesn't already, check
> disassembly), as in:
> 
> if (filtered < 0) {
>     abs  = -filtered;
>     sign = -1;
> } else {
>     abs  =  filtered;
>     sign =  1;
> }
> 
> if (abs < 11059)      abs <<= 1;
> else if (abs < 20070) abs += 11059;
> else                  abs = (abs >> 2) + 26112
> 
> return sign * abs;

Does not gain you anything, you then have a maybe slow multiply
in addition and you still have one jump (FFABS is branchless).
If anything, storing the mask from FFABS and using it to negate once
more might make sense, but I'm not sure if it's too ugly and worth it.
Either way I'm in favour of doing more optimization if desired
after applying (so that regression testing is possible properly).

> > +static int filter_value(int in, int rrp[8], int v[9])
> > +{
> > +    int i;
> > +    for (i = 7; i >= 0; i--) {
> > +        in -= gsm_mult(rrp[i], v[i]);
> > +        v[i + 1] = v[i] + gsm_mult(rrp[i], in);
> > +    }
> > +    v[0] = in;
> > +    return in;
> > +}
> 
> Is this similar to the inner loop of ff_celp_lp_synthesis_filter()?
> Can we generalize that to decrease binary size without significantly
> hurting performance?

The principle is the same, but rounding and shifting is done after every
multiplication, so I see no chance if merging those on an object-code level.

> > +static void short_term_synth(GSMContext *ctx, int16_t *dst, const int16_t *src)
> > +{
> > +    int i;
> > +    int rrp[8];
> > +    int *lar = ctx->lar[ctx->lar_idx];
> > +    int *lar_prev = ctx->lar[ctx->lar_idx ^ 1];
> > +    for (i = 0; i < 8; i++)
> > +        rrp[i] = get_rrp((lar_prev[i] >> 2) + (lar_prev[i] >> 1) + (lar[i] >> 2));
> > +    for (i = 0; i < 13; i++)
> > +        dst[i] = filter_value(src[i], rrp, ctx->v);
> [cut]
> 
> Note how ff_celp_lp_synthesis_filter() does this loop in the actual
> function itself. You might want to follow a similar design, this is
> what other audio codecs do also (e.g. ra144.c).

Well, other codecs seem to use the same filter over the whole block,
not with 4 different values.
I though the compiler might be able to do more clever things like
an approprate decision between loop unrolling and inlining.
But it might be better to change it since it probably won't anyway...

> > +static int gsm_decode_block(AVCodecContext *avctx, int16_t *samples,
> > +                            GetBitContext *gb)
> [..]
> > +    for (i = 0; i < 4; i++) {
> > +        int lag      = get_bits(gb, 7);
> > +        int gain_idx = get_bits(gb, 2);
> > +        int offset   = get_bits(gb, 2);
> > +        lag = av_clip(lag, 40, 120);
> 
> lag = av_clip(get_bits(), .., ..);

I very much don't want to merge them.
This case only covers an error case, and ideally this should
be replaced by some kind of error concealment code...

> > +    memcpy(ctx->ref_buf, ctx->ref_buf + 160, 120 * sizeof(*ctx->ref_buf));
> > +    short_term_synth(ctx, samples, ctx->ref_buf + 120);
> > +    ctx->msr = postprocess(samples, ctx->msr);
> 
> memcpy() is slow, can you change postprocess to take ref_buf as input
> and output into samples? That's what all other voice codecs do.

Huh? Synth already does that (and for speed postprocess probably could/should be
merged with synth, but it makes the code rather ugly so I decided not to).
The memcpy only updates the reference buffers which also means that "only"
3/4 of all samples are copied.

> > +        *data_size = 2*2*GSM_FRAME_SIZE;
> > +    }
> > +    *data_size = frame_bytes;
> 
> Isn't that a little pointless?

I forgot to remove the others when I added that one together
with the output buffer size check.
Slightly updated patch attached (waiting for Diego's patch for docs).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ffgsm.diff
Type: text/x-diff
Size: 13273 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100705/6ebc71e6/attachment.diff>



More information about the ffmpeg-devel mailing list