whisper.cpp

Author	SHA1	Message	Date
Diego Devesa	1acfadb721	ggml-backend : add device and backend reg interfaces (llama/9707) Co-authored-by: Johannes Gäßler <johannesg@5d6.de>	2024-10-05 15:23:51 +03:00
Akarshan Biswas	c36ddc43c6	Revert "[SYCL] fallback mmvq (ggml/9088)" (llama/9579) This reverts commit 50addec9a532a6518146ab837a85504850627316.	2024-09-24 19:45:08 +03:00
Georgi Gerganov	34291099fb	ggml : refactoring (llama/#0) - d6a04f87 - 23e0d70b	2024-09-24 19:45:08 +03:00
Georgi Gerganov	d245d7aec7	ggml : fix builds (llama/0) ggml-ci	2024-09-24 19:45:08 +03:00
Alberto Cabrera Pérez	32f659861a	sycl : update support conditions (llama/9394) * sycl : update support condition to im2col Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com> * Added TODO to remind supporting FP32 im2col --------- Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>	2024-09-24 19:45:08 +03:00
Neo Zhang Jianyu	3468983315	add check malloc result on device (llama/9346) * add check malloc result on device * update for review comments, check all malloc_device() result --------- Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>	2024-09-24 19:45:08 +03:00
Johannes Gäßler	c7515b0995	ggml/examples: add backend support for numerical optimization (ggml/949) * CUDA eval works * stochastic gradient descent op * Adam except decay * CUDA CROSS_ENTROPY_LOSS_BACK * CUDA mnist-fc training works * backend CLI arg * refactor gguf load * remove sched from opt_step_adam * implement l1 regularization (weight decay) * extra call to add optimizer * initialize gradients with ggml_graph_reset * gradient accumulation * increment iter per eval instead of epoch * adjust backend interfaces * fix ggml_graph_reset without backend * fix ggml graph export/import * fixup * rename * revert ggml_opt changes * more general CUDA repeat_back * update documentation, fix CNN * validation split * add clarifying comment * optimize PyTorch training * adjust buffer size, thread count * fix 0.0f validation split * Update examples/mnist/mnist-common.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * fix gradient accumulation * tensor flag for accumulators -> tensor hash set * Update include/ggml.h Co-authored-by: slaren <slarengh@gmail.com> * Update tests/test-backend-ops.cpp Co-authored-by: slaren <slarengh@gmail.com> * Update tests/test-backend-ops.cpp Co-authored-by: slaren <slarengh@gmail.com> * fix test prints * Update src/ggml-backend.c Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * better CUDA support for noncontiguous out_prod * add comment --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: slaren <slarengh@gmail.com>	2024-09-24 19:45:08 +03:00
luoyu-intel	32f88af17b	Add oneDNN primitive support (llama/9091) * add onednn * add sycl_f16 * add dnnl stream * add engine map * use dnnl for intel only * use fp16fp16fp16 * update doc	2024-08-28 13:22:20 +03:00
Meng, Hengyu	17e49d3ab2	fallback mmvq (llama/9088) * fallback mmvq to mul_mat * mmvq in cuda path * Update ggml/src/ggml-sycl.cpp Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@codeplay.com> --------- Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@codeplay.com>	2024-08-28 13:22:20 +03:00
zhentaoyu	58b725282a	Fix SYCL `im2col` and `convert` Overflow with Large Dims (llama/9052) * sycl: fix im2col overflow and sync with cuda Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * sycl: fix convert overflow Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * sycl: fix convert and dequantize Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * sycl: fix ib in dmmv Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * sycl:refine convert Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * sycl: move downsample global_range into common Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * test: add im2col and convert test cases Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * test: make new cases only in sycl Signed-off-by: zhentaoyu <zhentao.yu@intel.com> * test: comment new test_cases for only local testing Signed-off-by: zhentaoyu <zhentao.yu@intel.com> --------- Signed-off-by: zhentaoyu <zhentao.yu@intel.com>	2024-08-28 13:22:20 +03:00
zhentaoyu	3f190addda	Add `TIMESTEP_EMBEDDING` OP (llama/8707) Signed-off-by: zhentaoyu <zhentao.yu@intel.com>	2024-08-08 22:48:46 +03:00
Meng, Hengyu	8ef98ae7e3	add conv support (llama/8688)	2024-08-08 22:48:46 +03:00
slaren	dd916a2852	ggml : reduce hash table reset cost (llama/8698) * ggml : reduce hash table reset cost * fix unreachable code warnings after GGML_ASSERT(false) * GGML_ASSERT(false) -> GGML_ABORT("fatal error") * GGML_ABORT use format string	2024-08-08 22:48:46 +03:00
Meng, Hengyu	3e94c7a81d	add concat through dim 1/2 (llama/8483) * add concat through dim 1/2	2024-08-08 22:48:46 +03:00
Chen Xi	68d609a12c	fix the mul_mat_id ut issues (llama/8427) * fix part of mul_mat_id * skip the bfloat 16 sycl ut Signed-off-by: Chen Xi <xi2chen@intel.com> --------- Signed-off-by: Chen Xi <xi2chen@intel.com> Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com> Co-authored-by: Chen Xi <xi2chen@intel.com>	2024-08-08 22:48:46 +03:00
Alberto Cabrera Pérez	2af4a52c39	sycl : Reenabled mmvq path for the SYCL Nvidia Backend (llama/8372) * SYCL : Reenabled mmvq path for the SYCL Nvidia Backend * Reduced verbosity of comment	2024-08-08 22:48:46 +03:00
Ouadie EL FAROUKI	c5b05321e9	Enabled more data types for oneMKL gemm_batch (llama/8236)	2024-07-08 14:53:55 +03:00
luoyu-intel	29a2739d27	Fix WARP_SIZE=16 bug of Intel GPU (llama/8266) * fix group_norm ut * split softmax * fix softmax * add concat support condition * revert debug code * move QK_WARP_SIZE to presets.hpp	2024-07-08 14:53:55 +03:00
Neo Zhang Jianyu	ee6d17f6b4	rm get_work_group_size() by local cache for performance (llama/8286) Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>	2024-07-08 14:53:55 +03:00
luoyu-intel	f096cc6807	Fix the sub group size of Intel (llama/8106) * use warp_size macro for all sycl kernels * fix mask of permute_sub_group_by_xor * fix rms_norm with correct warp number * fix rms_norm_f32/group_norm_f32 * move norm to norm.cpp file * fix quantize bug * fix mmvq's batch size	2024-07-08 14:53:55 +03:00
zhentaoyu	db7e0dbe6e	Update SYCL-Rope op and Refactor (llama/8157) * align with rope.cu and move sycl-op to a single file	2024-07-08 14:53:55 +03:00
Georgi Gerganov	e30c679928	whisper : reorganize source code + improve CMake (#2256) * scripts : update sync [no ci] * files : reorganize [no ci] * sync : llama.cpp * cmake : link math library * cmake : build normal ggml library * files : move headers to include * objc : fix path to ggml-metal.h * ci : fix WHISPER_CUDA -> GGML_CUDA * scripts : sync LICENSE [no ci]	2024-06-26 19:34:09 +03:00

22 Commits