Vulkan Mixture of Experts (MoE) support (llama/7628)

* Finish Vulkan mul_mat_id implementation

* Add Vulkan sum_rows and div ops

* Fix MUL_MAT_ID matrix-matrix shader

* Fix MUL_MAT_ID matrix-vector shader dispatch size

* Fix MUL_MAT_ID matrix-vector shader and dispatch code

* Update Vulkan CPU offload for MUL_MAT_ID

* Fix crash when using split mode none and setting a main GPU
0cc4m 2024-06-03 10:59:14 +02:00 committed by Georgi Gerganov
parent 8c01c9b85c
commit 2a6bab5655