whisper.cpp

Jeff Bolz a753a82462 vulkan: get the first command buffer submitted sooner (llama/10499)	2024-12-08 20:14:35 +02:00

This is an incremental improvement over #9118 to get work to the GPU a bit sooner. The first part is to start with a smaller number of nodes before the first submit, and ramp it up to the current 100 nodes/submit. The second part is to reduce the dryrun overhead for all the nodes that just need to request descriptor space.

With these changes I get around 1-2% speedup on RTX 4070 combined with my old Haswell-era CPU.
include	ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)	2024-12-08 20:14:35 +02:00
src	vulkan: get the first command buffer submitted sooner (llama/10499)	2024-12-08 20:14:35 +02:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	ggml : add support for dynamic loading of backends (llama/10469)	2024-12-08 20:14:35 +02:00