[FFmpeg-devel] [PATCH 5/9] sbc: implement SBC encoder (low-complexity subband codec)

Rostislav Pehlivanov atomnuker at gmail.com
Thu Feb 22 20:18:48 EET 2018


On 21 February 2018 at 22:37, Aurelien Jacobs <aurel at gnuage.org> wrote:

> This was originally based on libsbc, and was fully integrated into ffmpeg.
> ---
>  doc/general.texi         |   2 +-
>  libavcodec/Makefile      |   1 +
>  libavcodec/allcodecs.c   |   1 +
>  libavcodec/sbcdsp.c      | 382 +++++++++++++++++++++++++++++++++++++++
>  libavcodec/sbcdsp.h      |  83 ++++++++++
>  libavcodec/sbcdsp_data.c | 329 +++++++++++++++++++++++++++++++++++++
>  libavcodec/sbcdsp_data.h |  55 +++++++
>  libavcodec/sbcenc.c      | 411 +++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 1263 insertions(+), 1 deletion(-)
>  create mode 100644 libavcodec/sbcdsp.c
>  create mode 100644 libavcodec/sbcdsp.h
>  create mode 100644 libavcodec/sbcdsp_data.c
>  create mode 100644 libavcodec/sbcdsp_data.h
>  create mode 100644 libavcodec/sbcenc.c
>
> +
> +#define OFFSET(x) offsetof(SBCEncContext, x)
> +#define AE AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM
> +static const AVOption options[] = {
> +    { "joint_stereo", "use joint stereo",
> +      OFFSET(joint_stereo), AV_OPT_TYPE_BOOL, { .i64 =  0 }, 0,   1, AE },
> +    { "dual_channel", "use dual channel",
> +      OFFSET(dual_channel), AV_OPT_TYPE_BOOL, { .i64 =  0 }, 0,   1, AE },
>

Erm, those two things should be decided by the encoder, not exposed to the
user. The encoder should decide which mode has lower distortion for a given
signal.
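As an illustration of what such a decision could look like (a hypothetical
heuristic, not code from this patch): the encoder could compare mid/side
energy per frame and switch to joint stereo when the side signal carries
much less energy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical heuristic (not from this patch): treat the frame as
 * mid/side and prefer joint stereo when the side channel carries far
 * less energy than the mid channel, since mid/side coding then spends
 * the bitpool where the signal actually is. */
static int prefer_joint_stereo(const int16_t *l, const int16_t *r, int n)
{
    int64_t mid = 0, side = 0;
    for (int i = 0; i < n; i++) {
        int m = (l[i] + r[i]) / 2;
        int s = (l[i] - r[i]) / 2;
        mid  += (int64_t)m * m;
        side += (int64_t)s * s;
    }
    /* arbitrary threshold: joint stereo once side energy < 1/4 of mid */
    return side * 4 < mid;
}
```

A real implementation could instead trial-encode both modes and keep the
one with lower distortion, which is what the comment above is asking for.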



> +    { "subbands",     "number of subbands (4 or 8)",
> +      OFFSET(subbands),     AV_OPT_TYPE_INT,  { .i64 =  8 }, 4,   8, AE },
>

The encoder doesn't check that the value is actually 4 or 8, so 5, 6 and 7
are all accepted. It also has the same issue as the previous options.
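The missing range check could look something like this (a sketch; the
function name and error convention are placeholders):

```c
#include <assert.h>

/* Sketch of the missing check: SBC only defines 4 or 8 subbands, so the
 * in-between values the AVOption range allows must be rejected at init
 * time (in FFmpeg this would return AVERROR(EINVAL)). */
static int validate_subbands(int subbands)
{
    if (subbands != 4 && subbands != 8)
        return -1;
    return 0;
}
```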



> +    { "bitpool",      "bitpool value",
> +      OFFSET(bitpool),      AV_OPT_TYPE_INT,  { .i64 = 32 }, 0, 255, AE },
>

This should be controlled by the bitrate setting. Either have a function
that translates a bitrate into a bitpool value, or a table which
approximately maps supplied bitrates to bitpools. You could also expose it
directly, in addition to the bitrate mapping, through the global_quality
setting, so it shouldn't be a custom encoder option.
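For illustration, such a translation could invert the SBC frame-size
formula. This is a rough sketch under the assumption of mono or plain
stereo coding (joint stereo and dual channel differ slightly); all names
are placeholders.

```c
#include <assert.h>
#include <stdint.h>

/* Rough inversion of the SBC frame-size formula for mono/stereo:
 *   frame_len = 4 + 4 * subbands * channels / 8 + ceil(blocks * bitpool / 8)
 *   bit_rate  = 8 * frame_len * sample_rate / (subbands * blocks)
 * so a requested bitrate can be mapped back to an approximate bitpool. */
static int bitpool_from_bitrate(int64_t bit_rate, int sample_rate,
                                int subbands, int blocks, int channels)
{
    int frame_len = (int)(bit_rate * subbands * blocks / (8 * sample_rate));
    int header    = 4 + 4 * subbands * channels / 8;
    int bitpool   = (frame_len - header) * 8 / blocks;

    if (bitpool < 2)   bitpool = 2;   /* clamp to the legal bitpool range */
    if (bitpool > 250) bitpool = 250;
    return bitpool;
}
```

For example, mSBC's 60.8 kb/s at 16 kHz mono (8 subbands, 15 blocks) maps
back to bitpool 26 with this sketch.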



> +    { "blocks",       "number of blocks (4, 8, 12 or 16)",
> +      OFFSET(blocks),       AV_OPT_TYPE_INT,  { .i64 = 16 }, 4,  16, AE },
> +    { "snr",          "use SNR mode (instead of loudness)",
> +      OFFSET(allocation),   AV_OPT_TYPE_BOOL, { .i64 =  0 }, 0,   1, AE },
>

SNR mode, too, needs to be decided by the encoder rather than being exposed
as a setting.



> +    { "msbc",         "use mSBC mode (wideband speech mono SBC)",
>

Add a profile fallback setting for this as well, like in aac, where -aac_ltp
turns LTP mode on and -profile:a aac_ltp does the same.
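The fallback could be a few lines in init(). The sketch below stubs out the
relevant avcodec.h pieces; the profile constant name FF_PROFILE_SBC_MSBC is
an assumption, not an existing symbol.

```c
#include <assert.h>

/* Stand-ins for the avcodec.h pieces involved; FF_PROFILE_SBC_MSBC is a
 * hypothetical constant that would need to be added alongside the codec. */
#define FF_PROFILE_SBC_MSBC 1
#define FF_PROFILE_UNKNOWN  (-99)

typedef struct SBCEncContext  { int msbc; } SBCEncContext;
typedef struct AVCodecContext { int profile; SBCEncContext *priv_data; } AVCodecContext;

/* -profile:a msbc enables the mode even when -msbc wasn't given,
 * mirroring how -profile:a aac_ltp implies -aac_ltp 1. */
static void apply_profile(AVCodecContext *avctx)
{
    SBCEncContext *s = avctx->priv_data;
    if (avctx->profile == FF_PROFILE_SBC_MSBC)
        s->msbc = 1;
}
```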


You don't have to make the encoder decide which stereo coupling mode or
snr/loudness setting to use now; you can implement that in a later patch.

I think you should also remove the "blocks" and "subbands" settings and
instead replace them with a single "latency" setting in milliseconds, like
the native Opus encoder has, which would adjust both of them on init to set
the frame size. This would also allow the encoder to change them. Again,
you don't have to do this now; you can send a patch which adds a "latency"
option later.

So in total, only 2 options would be needed: "msbc", as an additional way
to enable mSBC, and "latency", which can be added later. For now you should
set all unexposed options to something safe by default.
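A "latency" option along those lines could pick blocks and subbands on init
roughly like this (a sketch; the selection policy is made up for
illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: one SBC frame covers blocks * subbands samples, so pick the
 * largest frame whose duration fits under the requested latency,
 * preferring 8 subbands for better frequency resolution. */
static void choose_frame_params(int64_t latency_us, int sample_rate,
                                int *blocks, int *subbands)
{
    static const int blk[] = { 16, 12, 8, 4 };

    /* fallback: shortest frame SBC allows */
    *subbands = 4;
    *blocks   = 4;

    for (int s = 8; s >= 4; s -= 4) {
        for (int i = 0; i < 4; i++) {
            int64_t dur_us = (int64_t)blk[i] * s * 1000000 / sample_rate;
            if (dur_us <= latency_us) {
                *subbands = s;
                *blocks   = blk[i];
                return;
            }
        }
    }
}
```

For example, a 10 ms budget at 16 kHz would select 16 blocks of 8 subbands
(8 ms frames) with this policy.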

Apart from that, I tested the encoder: valgrind looks clean, the SIMD is
bit-exact and all advertised sample rates are supported.

