whisper.cpp/ggml/src
Eve 374e9e0c5e ggml : IQ4_NL sgemm + Q4_0 AVX optimization (llama/9422)
* squashed

Re-add my IQ4_NL sgemm PR https://github.com/ggerganov/llama.cpp/pull/8049

Have ggml_vec_dot_q4_0 process two blocks per loop iteration on AVX.

Tried out an F16C ggml_vec_dot_iq4_nl, but it's not really faster; as per https://github.com/ggerganov/llama.cpp/pull/8549 we can calculate several blocks at a time with no issue.

* shuffle

* remove the F16C iq4_nl path, as I can't make it faster than before
2024-09-24 19:45:08 +03:00
Name                | Last commit                                                                                                              | Date
ggml-cann           | cann : fix doxy (ggml/0)                                                                                                 | 2024-09-02 15:24:50 +03:00
ggml-cuda           | CUDA: fix --split-mode row race condition (llama/9413)                                                                   | 2024-09-24 19:45:08 +03:00
ggml-sycl           | Fix DMMV dequantization (llama/9279)                                                                                     | 2024-09-24 19:45:08 +03:00
kompute-shaders     | whisper : reorganize source code + improve CMake (#2256)                                                                 | 2024-06-26 19:34:09 +03:00
vulkan-shaders      | Improve Vulkan shader build system (llama/9239)                                                                          | 2024-09-24 19:45:08 +03:00
CMakeLists.txt      | cmake : correct order of sycl flags (llama/9497)                                                                         | 2024-09-24 19:45:08 +03:00
ggml-aarch64.c      | ggml : AVX2 support for Q4_0_8_8 (llama/8713)                                                                            | 2024-09-24 19:45:08 +03:00
ggml-aarch64.h      | ggml : add ggml-aarch64 (ggml/0)                                                                                         | 2024-08-08 22:48:46 +03:00
ggml-alloc.c        | ggml : reduce hash table reset cost (llama/8698)                                                                         | 2024-08-08 22:48:46 +03:00
ggml-backend-impl.h | ggml/examples: add backend support for numerical optimization (ggml/949)                                                 | 2024-09-24 19:45:08 +03:00
ggml-backend.c      | ggml/examples: add backend support for numerical optimization (ggml/949)                                                 | 2024-09-24 19:45:08 +03:00
ggml-blas.cpp       | ggml : reduce hash table reset cost (llama/8698)                                                                         | 2024-08-08 22:48:46 +03:00
ggml-cann.cpp       | cann: Add host buffer type for Ascend NPU (llama/9406)                                                                   | 2024-09-24 19:45:08 +03:00
ggml-common.h       | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151)                                                   | 2024-09-24 19:45:08 +03:00
ggml-cuda.cu        | rpc : fix segfault with nkvo (llama/9389)                                                                                | 2024-09-24 19:45:08 +03:00
ggml-impl.h         | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151)                                                   | 2024-09-24 19:45:08 +03:00
ggml-kompute.cpp    | ggml/examples: add backend support for numerical optimization (ggml/949)                                                 | 2024-09-24 19:45:08 +03:00
ggml-metal.m        | metal : handle zero-sized allocs (llama/9466)                                                                            | 2024-09-24 19:45:08 +03:00
ggml-metal.metal    | metal : separate scale and mask from QKT in FA kernel (llama/9189)                                                       | 2024-08-28 13:22:20 +03:00
ggml-quants.c       | ggml : IQ4_NL sgemm + Q4_0 AVX optimization (llama/9422)                                                                 | 2024-09-24 19:45:08 +03:00
ggml-quants.h       | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151)                                                   | 2024-09-24 19:45:08 +03:00
ggml-rpc.cpp        | rpc : fix segfault with nkvo (llama/9389)                                                                                | 2024-09-24 19:45:08 +03:00
ggml-sycl.cpp       | sycl : update support conditions (llama/9394)                                                                            | 2024-09-24 19:45:08 +03:00
ggml-vulkan.cpp     | Overlap cmdbuffer creation and cmdbuffer execution in Vulkan backend by submitting smaller cmdbuffers early. (llama/9118)| 2024-09-24 19:45:08 +03:00
ggml.c              | ggml : ggml_type_name return "NONE" for invalid values (llama/9458)                                                      | 2024-09-24 19:45:08 +03:00
sgemm.cpp           | whisper : reorganize source code + improve CMake (#2256)                                                                 | 2024-06-26 19:34:09 +03:00
sgemm.h             | whisper : reorganize source code + improve CMake (#2256)                                                                 | 2024-06-26 19:34:09 +03:00