Commit Graph

10 Commits

Author SHA1 Message Date
Georgi Gerganov
896c41ef30 metal : use F32 prec for K*Q in vec FA (llama/9595)
ggml-ci
2024-09-24 19:45:08 +03:00
Georgi Gerganov
d96a17848f metal : separate scale and mask from QKT in FA kernel (llama/9189)
* metal : separate scale and mask from QKT in FA kernel

* metal : ne01 check no longer necessary

* metal : keep data in local memory
2024-08-28 13:22:20 +03:00
Georgi Gerganov
0e7798677a ggml : add SSM Metal kernels (llama/8546)
* ggml : add ggml_ssm_conv metal impl

* ggml : add ssm_scan metal impl

ggml-ci
2024-08-28 13:22:20 +03:00
slaren
58a36d2e3b metal : gemma2 flash attention support (llama/9159) 2024-08-28 13:22:20 +03:00
Radoslav Gerganov
b6c05ce82f yolo : add backend support (ggml/924)
* yolo : add backend support

* metal : add sub and sqrt kernels

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-08-21 11:07:13 +03:00
Ronsor
3643120690 feat: add new sin and cos operators (ggml/919)
* ggml : add sin/cos operators

* ggml-cuda : add sin/cos operators

* ggml : add corresponding tests for sin/cos

* ggml : add backward computation for sin/cos operators

* ggml-vulkan : add sin/cos operators

* ggml-vulkan : add sin/cos shader source

* metal : add sin, cos

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-08-21 11:07:13 +03:00
slaren
be9a16fd3f ggml : fix quant dot product with odd number of blocks (llama/8549)
* ggml : fix iq4_nl dot product with odd number of blocks

* ggml : fix odd blocks for ARM_NEON (llama/8556)

* ggml : fix iq4_nl dot product with odd number of blocks

* ggml : fix q4_1

* ggml : fix q5_0

* ggml : fix q5_1

* ggml : fix iq4_nl metal

ggml-ci

* ggml : fix q4_0

* ggml : fix q8_0

ggml-ci

* ggml : remove special Q4_0 code for first 2 blocks

* ggml : fix sumf redefinition

---------

Co-authored-by: slaren <slarengh@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-08-08 22:48:46 +03:00
Georgi Gerganov
b852a4c5ca metal : template-ify some of the kernels (llama/8447)
ggml-ci
2024-08-08 22:48:46 +03:00
Clint Herron
c2c60dc9ba Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (llama/8258) 2024-07-08 14:53:55 +03:00
Georgi Gerganov
e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00