[FFmpeg-devel] [PATCH] WMA Voice decoder

Reimar Döffinger Reimar.Doeffinger
Mon Feb 1 20:35:05 CET 2010


On Sun, Jan 31, 2010 at 09:46:43PM -0500, Ronald S. Bultje wrote:
> On Sat, Jan 30, 2010 at 6:48 PM, Reimar D?ffinger
> <Reimar.Doeffinger at gmx.de> wrote:
> > "short"? Also specifying the type only once is ugly.
> > Also if you use frame_size with a shift they all can be uint8_t
> > (though probably it's better to just "waste" those 34 bytes and
> > leave frame_size 16 bits, but the others don't need to be.
> 
> OK, so I changed this to be holdable in 32bits per frame type, maybe
> that's a bit too much just let me know how far I should go.

I'm generally in favour of the most readable, at least as long as we
are talking about saving a few _bytes_, if it was a kB that'd be different.

> >> + ? ?memset(cntr, ? ? ?0, sizeof(cntr));
> >> + ? ?for (n = 0; n < 17; n++) {
> >> + ? ? ? ?res = get_bits(gb, 3);
> >> + ? ? ? ?if (cntr[res] >= 3 + (res == 7))
> >> + ? ? ? ? ? ?return -1;
> >
> > What is the reason this can't just be cntr[res] > 3?
> > (assuming performance matters, documenting can't hurt
> > either way).
> 
> It might invalidly mark invalid bitstreams as valid. I guess we don't
> care, so changed.

Unless you intend to do something useful with that information there's
no point.
If you do intend, I'd say it depends on the speed difference it makes.

> >> + ? ?if (flags & 0x1000) {
> >
> > A name for that wouldn't hurt.
> 
> I wasn't sure what you meant here. I added a comment indicating what
> the flag means, but I might have misunderstood you.

Well, I usually consider an appropriately named variable name better
than a comment, i.e.
stupid_flag = flags & 0x1000;
or
flags & STUPID_FLAG;
instead of
flags & 0x1000 // stupid flag

> >> + ? ? ? ?for (m = 0; m < num; m++)
> >> + ? ? ? ? ? ?lsps[m] += base_q[n] + mul_q[n] * table[m + values[n] * num];
> >
> > I consider it better to extract constant stuff like values[n]*num outside the loop.
> > Particularly since due to aliasing rules the compiler might not be able to
> > to pull e.g. the base_q[n] out without that (hm, I admit I don't know for sure
> > if the const might be enough for that).
> 
> I did this for the values[n] * num. For the base_q[n], it seems
> overkill, I mean, it's marked const so the compiler should know
> better. I think expanding base_q[n] outside the loop is too little
> honour for gcc (maybe deserved, but still).

Um, the thing is that lsps and base_q is the same type, if you don't
add a restrict keyword the compiler usually simply _can't_ know that
lsps[m] is not the same as base_q[n] and thus _has_ to reload
base_q[n] each time.
I agree though that possibly the better solution is to add restrict (even
though I consider gcc stupid enough that I expect at least some version
to still get it wrong).

> >> + ? ?dequant_lsps(&lsps[5], ?5, &v[2], &vec_sizes[2], 2,
> >> + ? ? ? ? ? ? ? ? ff_wmavoice_dq_lsp16i2, &mul_lsf[2], &base_lsf[2]);
> >
> > I think lsps + 5, v + 2 etc. is the more common style in FFmpeg.
> 
> Uhm, so that will be a huge change, I use &..[off] inistead of ..+off
> everywhere in my code, not just here but also in rdt.c etc. Is that a
> big deal?

Not for me, just saying that it doesn't fit so nicely in overall FFmpeg style
I think.

> >> + ? ?static const uint8_t start_offset[94] = {
> >> + ? ? ? ? ?0, ? 2, ? 4, ? 6, ? 8, ?10, ?12, ?14, ?16, ?18, ?20, ?22,
> >> + ? ? ? ? 24, ?26, ?29, ?28, ?30, ?31, ?32, ?33, ?34, ?35, ?36, ?37,
> >> + ? ? ? ? 38, ?39, ?40, ?41, ?42, ?43, ?44, ?46, ?48, ?50, ?52, ?54,
> >> + ? ? ? ? 56, ?58, ?60, ?62, ?64, ?66, ?68, ?70, ?72, ?74, ?76, ?78,
> >> + ? ? ? ? 80, ?82, ?84, ?86, ?88, ?90, ?92, ?94, ?96, ?98, 100, 102,
> >> + ? ? ? ?104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126,
> >> + ? ? ? ?128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150,
> >> + ? ? ? ?152, 154, 156, 158, 160, 162, 164, 166, 168, 170
> [..]
> > And is that one non-monotonous value really correct?
> 
> Yes, it's really correct. I was surprised by that also.

Too bad, that makes it rather unreasonable to optimize that table away.

> >> + ? ? ? ? ? ?for (idx = 0; idx < MAX_FRAMESIZE / 2; idx++)
> >> + ? ? ? ? ? ? ? ?if (BIT_IS_SET(idx)) break;
> >
> > But that looks about like the most inefficient way to scan 80 bits for
> > a non-zero one ever conceived.
> 
> OK, so what do I use then? (I really don't know. :-).)

What exactly is best I can't say, and that sample code probably has quite a few bugs,
but something along the lines of
uint64_t tmp = AV_RN64(buffer);
if (tmp) {
#ifndef HAVE_BIGENDIAN
bswap_64(tmp);
#endif
if (tmp >> 32)
return 32 - av_log2(tmp >> 32);
else
return 64 - av_log2(tmp);
}
return 80 - av_log2_16bit(AV_RB16(buffer + 8));
(take care to handle the case where the return values is 80, i.e. everything is 0)

> >> + ? ? ? ?if (s->aw_pitch_range == 24) { ///< 3 pulses, 1:sign + 3:index each
> >> + ? ? ? ?} else { ///< 4 pulses, 1:sign + 2:index each
> >
> > I guess you can find all of those on your own.
> 
> ???

Doxy comments that make no sense because they will only confuse doxygen

> >> + ? ? ? ? ? ?while (fcb->x[fcb->n] < 0)
> >> + ? ? ? ? ? ? ? ?fcb->x[fcb->n] += fcb->pitch_lag;
> >
> > What are the values here? This might be a particularly slow way to calculate
> > if (fcb->x[fcb->n] < 0)
> > ? ?fcb->x[fcb->n] = (fcb->x[fcb->n] % fcb->pitch_lag) + fcb->pitch_lag;
> 
> They could be 1-2 pitch values below zero, so -10, -20 or so.

Then it might make sense (speed wise, it might be too ugly to do it if the speed
does not make a difference) to do
if (fcb->x[fcb->n] < 0) {
 ? ?fcb->x[fcb->n] += fcb->pitch_lag;
    if (fcb->x[fcb->n] < 0) {
 ? ?    fcb->x[fcb->n] += fcb->pitch_lag;
}

> >> + ? ? ? ? ? ?float pulse = get_bits1(gb) ? 1.0 : -1.0;
> >
> > I've seen that a lot. Maybe worth optimizing?
> 
> Same as above (also goes for AMR, SIPR).

Well, adding a function for it instead of duplicating code might still make sense
already now, even if you optimize only later. Not important though.

> >> + ? ? ? ? ? ?ff_acelp_interpolatef(&excitation[n], &excitation[n - pitch],
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?ff_wmavoice_ipol1_coeffs, 17,
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?(idx >= 0) | (abs(idx) << 1), 9, 1);
> >
> > Maybe FFABS? Also that kind of thing is quite common, there may be a function for it.
> 
> it's hard to test the speed implifcation of that because it's so
> insignificant. I think at some point I said we should test abs() vs.
> FFABS(), there's no clear guideline for that right now.
> 
> I couldn't find a function, but I may just be inexperienced in finding
> them. :-).

There might be only to decode a value encoded that way.



More information about the ffmpeg-devel mailing list