Commit Graph

9 Commits

Author SHA1 Message Date
Dan Johansson
c245168ba3 ggml : add run-time detection of neon, i8mm and sve (llama/9331)
* ggml: Added run-time detection of neon, i8mm and sve

Adds run-time detection of the Arm instructions set features
neon, i8mm and sve for Linux and Apple build targets.

* ggml: Extend feature detection to include non aarch64 Arm arch

* ggml: Move definition of ggml_arm_arch_features to the global data section
2024-10-03 12:22:17 +03:00
Charles Xu
1edea2eb4b ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels (llama/9217)
* ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels

* added fallback mechanism when the offline re-quantized model is not
optimized for the underlying target.

* fix for build errors

* remove prints from the low-level code

* Rebase to the latest upstream
2024-10-03 12:22:17 +03:00
Srihari-mcw
05c6139625 ggml : AVX512 gemm for Q4_0_8_8 (llama/9532)
* AVX512 version of ggml_gemm_q4_0_8x8_q8_0

* Remove zero vector parameter passing

* Rename functions and rearrange order of macros

* Edit commments

* style : minor adjustments

* Update x to start from 0

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-09-24 19:45:08 +03:00
Georgi Gerganov
34291099fb ggml : refactoring (llama/#0)
- d6a04f87
- 23e0d70b
2024-09-24 19:45:08 +03:00
Srihari-mcw
01e214a1d7 ggml : AVX2 support for Q4_0_8_8 (llama/8713)
* Add AVX2 based implementations for quantize_q8_0_4x8, ggml_gemv_q4_0_8x8_q8_0 and ggml_gemm_q4_0_8x8_q8_0 functions

* Update code to fix issues occuring due to non alignment of elements to be processed as multiple of 16 in MSVC

* Update comments and indentation

* Make updates to reduce number of load instructions
2024-09-24 19:45:08 +03:00
slaren
b2ad484c89 ggml : do not crash when quantizing q4_x_x with an imatrix (llama/9192) 2024-08-28 13:22:20 +03:00
jdomke
a902fb4ab2 ggml : reading the runtime sve config of the cpu (llama/8709)
* ggml : reading the runtime sve config of the cpu

* change to one time init to prevent performance drop

* prefix variable to avoid possible conflicts

* revert xxhash fix and add brackets

---------

Co-authored-by: domke <673751-domke@users.noreply.gitlab.com>
2024-08-08 22:48:46 +03:00
Borislav Stanimirov
aa816c922c ggml : ignore more msvc warnings (ggml/906) 2024-08-08 22:48:46 +03:00
Dibakar Gope
525f190917 ggml : add ggml-aarch64 (ggml/0) 2024-08-08 22:48:46 +03:00