[FFmpeg-cvslog] vulkan_decode: halve execution pool size
Lynne
git at videolan.org
Thu Jun 8 00:59:53 EEST 2023
ffmpeg | branch: master | Lynne <dev at lynne.ee> | Wed Jun 7 02:59:55 2023 +0200| [24c4307b80785244d3def38a3787d6e76375f7b5] | committer: Lynne
vulkan_decode: halve execution pool size
Determined experimentally, on various videos and hardware.
On Intel, using fewer resources in flight is around 15% faster,
with similar results on Nvidia hardware.
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=24c4307b80785244d3def38a3787d6e76375f7b5
---
libavcodec/vulkan_decode.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/libavcodec/vulkan_decode.c b/libavcodec/vulkan_decode.c
index 889c67a15f..35e265a5b1 100644
--- a/libavcodec/vulkan_decode.c
+++ b/libavcodec/vulkan_decode.c
@@ -1105,8 +1105,9 @@ int ff_vk_decode_init(AVCodecContext *avctx)
     session_create.pVideoProfile = &prof->profile_list.pProfiles[0];
 
     /* Create decode exec context.
-     * 4 async contexts per thread seems like a good number. */
-    err = ff_vk_exec_pool_init(s, &qf_dec, &ctx->exec_pool, 4*avctx->thread_count,
+     * 2 async contexts per thread was experimentally determined to be optimal
+     * for a majority of streams. */
+    err = ff_vk_exec_pool_init(s, &qf_dec, &ctx->exec_pool, 2*avctx->thread_count,
                                nb_q, VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR, 0,
                                session_create.pVideoProfile);
     if (err < 0)
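
To make the effect of the change concrete, the pool-size argument above can be sketched as a standalone function. This is only an illustration: `exec_pool_size` and its parameter names are hypothetical and do not exist in FFmpeg, and the sketch does not touch Vulkan or call `ff_vk_exec_pool_init`. It only reproduces the arithmetic `N*avctx->thread_count` from the diff.

```c
#include <assert.h>

/* Hypothetical helper, not part of FFmpeg: returns how many execution
 * contexts would be requested for a given per-thread multiplier and a
 * given decoder thread count (the role avctx->thread_count plays in the
 * diff above). */
static int exec_pool_size(int contexts_per_thread, int thread_count)
{
    /* Both inputs are assumed to be positive in this sketch. */
    assert(contexts_per_thread > 0 && thread_count > 0);
    return contexts_per_thread * thread_count;
}
```

Under these assumptions, a decoder running 8 threads would have requested `exec_pool_size(4, 8)`, that is 32 contexts, before this commit and `exec_pool_size(2, 8)`, that is 16 contexts, after it.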