[FFmpeg-devel] [PATCH v2 1/2] checkasm/rv40dsp: cover more cases
Ronald S. Bultje
rsbultje at gmail.com
Thu Dec 5 14:38:21 EET 2024
Hi,
Christophe asked me to chime in.
On Wed, Dec 4, 2024 at 4:14 AM <uk7b-at-foxmail.com at ffmpeg.org> wrote:
> --- a/tests/checkasm/rv40dsp.c
> +++ b/tests/checkasm/rv40dsp.c
> @@ -27,7 +27,7 @@
> #define randomize_buffers() \
> do { \
> for (int i = 0; i < 16*18*2; i++) \
> - src[i] = rnd() & 0x3; \
> + src[i] = rnd() & 0xff; \
> } while (0)
>
> static void check_chroma_mc(void)
>
This is correct.
> @@ -47,8 +47,8 @@ static void check_chroma_mc(void)
> #define CHECK_CHROMA_MC(name)
> \
> do {
> \
> if (check_func(h.name## <http://h.name#%23>
> _pixels_tab[size], #name "_mc%d", 1 << (3 - size))) { \
> - for (int x = 0; x < 2; x++) {
> \
> - for (int y = 0; y < 2; y++) {
> \
> + for (int x = 0; x < 8; x++) {
> \
> + for (int y = 0; y < 8; y++) {
> \
> memcpy(dst0, src, 16 * 18);
> \
> memcpy(dst1, src, 16 * 18);
> \
> call_ref(dst0, src, 16, 16, x, y);
> \
> --
> 2.47.1
>
This is theoretically correct, but it's very inefficient. If you look at
the x86 mc8 mmx, for example, you'll notice that there's 3 codepaths: one
for x==0&&y==0 (no subpel), one for 1D filtering (mx!=0 ^ my!=0) and one
for 2D filtering. It can also be implemented branchless (I believe that's
what mc4 does). But it's highly unusual to have special codepaths for mx=3
vs. mx=4, for example.
So the optimal way to test this is to keep the original loop, but add the
following:
for (int x = 0, mx = 0; x < 2; x++, mx = 1 + (rnd() % 7)) {
for (int y = 0, my = 0; y < 2; y++, my = 1 + (rnd() % 7)) {
[..]
cal_ref(dst0, src, 16, 16, mx, my);
[..]
}
}
This limits the number of tests to 4, while still covering all cases over
multiple checkasm runs, which is the correct way to test this.
Ronald
More information about the ffmpeg-devel
mailing list