Commit Graph

  • 709a22b92d cuda : fix defrag with quantized KV (llama/9319) slaren 2024-09-05 11:13:11 +0200
  • 01e214a1d7 ggml : AVX2 support for Q4_0_8_8 (llama/8713) Srihari-mcw 2024-09-04 22:21:22 +0530
  • 1cecfe6a02 Fix DMMV dequantization (llama/9279) Ouadie EL FAROUKI 2024-09-04 16:26:33 +0100
  • 3764bc974c ggml : add pthread includes on FreeBSD (llama/9258) yuri@FreeBSD 2024-09-02 08:25:30 -0700
  • fcffc912a9 llama : support RWKV v6 models (llama/8980) Molly Sophia 2024-09-01 22:38:17 +0800
  • 38d40b9972 Threadpool: take 2 (llama/8672) Faisal Zaghloul 2024-08-29 19:20:53 -0400
  • 09149ee0ae vulkan: fix compilation with GGML_VULKAN_DEBUG=ON (ggml/948) Salvatore Mesoraca 2024-09-06 14:34:33 +0200
  • 6b7f37dd5c vulkan: add dryrun support to sin and cos ops (ggml/947) Salvatore Mesoraca 2024-09-06 14:34:25 +0200
  • 791812fb54 vulkan: correctly report support for OP_CONT (ggml/946) Salvatore Mesoraca 2024-09-06 14:34:07 +0200
  • 5d6dc19f04 tests: add gradient tests for all backends (ggml/932) Johannes Gäßler 2024-09-03 17:21:46 +0200
  • 34972dbe22 go : add temperature options (#2417) Binozo 2024-09-20 14:45:36 +0200
  • bea43e0c64 docker : add libsdl2-dev for container builds (#2424) JohnnyB 2024-09-20 13:36:43 +0100
  • 3853d83d73 go : add tests and update bindings (#2425) Stavros Panakakis 2024-09-20 15:36:12 +0300
  • 5b1ce40fa8 server : use OS-generated temp file name for converted files (#2419) Toliver 2024-09-17 21:56:32 +0900
  • 049b3a0e53 go : fix CUDA build (#2416) Binozo 2024-09-15 11:23:56 +0200
  • a551933542 cann : add Ascend NPU instructions (#2410) Mengqing Cao 2024-09-11 20:59:24 +0800
  • 5caa19240d cmake: Fix libdir value in pkgconfig file (#2407) Philippe Normand 2024-09-07 09:18:17 +0100
  • 5236f02784 revert : cmake : set MSVC to use UTF-8 on source files (#2346) Georgi Gerganov 2024-09-02 11:16:30 +0300
  • 2abaf19e0d sync : ggml Georgi Gerganov 2024-09-02 10:39:59 +0300
  • 6eb7a0ffbd ggml: fix ggml_graph_cpy undefined behavior (ggml/943) Johannes Gäßler 2024-08-31 14:35:42 +0200
  • e8f0f9b5f0 cann : fix doxy (ggml/0) Georgi Gerganov 2024-08-28 18:45:01 +0300
  • d8e24b877d vulkan : fix build (llama/0) Georgi Gerganov 2024-08-27 22:10:58 +0300
  • cc68f31577 cuda : mark BF16 CONT as unsupported Georgi Gerganov 2024-08-28 17:08:03 +0300
  • 4a4a52bf98 ggml : fix cont with transposed tensors when one dimension is 1 (ggml/934) Salvatore Mesoraca 2024-08-28 10:23:02 +0200
  • c96906d84d cmake : set MSVC to use UTF-8 on source files (#2346) Tim Miller 2024-08-30 20:04:04 +0900
  • 9600fc3eb1 readme : remove invalid flag from Python example (#2396) UsernamesLame 2024-08-30 07:00:38 -0400
  • e2e55a6fed readme : fix link (#2394) Georgi Gerganov 2024-08-30 13:58:22 +0300
  • c4e1861d2c go : add beamsize/entropythold/maxcontext to context interface (#2350) hsinhoyeh 2024-08-28 22:09:01 +0800
  • da9809f243 talk-llama : sync llama.cpp Georgi Gerganov 2024-08-28 11:04:02 +0300
  • 9d754a56cf whisper : update FA call Georgi Gerganov 2024-08-28 11:02:54 +0300
  • 8cc90a0e80 sync : ggml Georgi Gerganov 2024-08-28 11:02:42 +0300
  • 82b5c56f63 sync : vulkan (skip) (llama/0) Georgi Gerganov 2024-08-27 21:48:22 +0300
  • b2ad484c89 ggml : do not crash when quantizing q4_x_x with an imatrix (llama/9192) slaren 2024-08-26 19:44:43 +0200
  • d96a17848f metal : separate scale and mask from QKT in FA kernel (llama/9189) Georgi Gerganov 2024-08-26 18:31:02 +0300
  • 0e7798677a ggml : add SSM Metal kernels (llama/8546) Georgi Gerganov 2024-08-26 17:55:36 +0300
  • 58a36d2e3b metal : gemma2 flash attention support (llama/9159) slaren 2024-08-26 11:08:59 +0200
  • 24d8534bd8 CPU/CUDA: Gemma 2 FlashAttention support (llama/8542) Johannes Gäßler 2024-08-24 21:34:59 +0200
  • 9b16ddd3a5 Add a space to supress a cmake warning (llama/9133) Akarshan Biswas 2024-08-22 19:39:47 +0530
  • 32f88af17b Add oneDNN primitive support (llama/9091) luoyu-intel 2024-08-22 12:50:10 +0800
  • 9bf7250bf9 llama : simplify Mamba with advanced batch splits (llama/8526) compilade 2024-08-21 17:58:11 -0400
  • 17e49d3ab2 fallback mmvq (llama/9088) Meng, Hengyu 2024-08-20 23:50:17 +0800
  • 58b725282a Fix SYCL im2col and convert Overflow with Large Dims (llama/9052) zhentaoyu 2024-08-20 23:06:51 +0800
  • 7e59afa1e0 rpc : print error message when failed to connect endpoint (llama/9042) Radoslav Gerganov 2024-08-19 10:11:45 +0300
  • 5ac022140e rpc : prevent crashes on invalid input (llama/9040) Radoslav Gerganov 2024-08-19 10:10:21 +0300
  • 0eaa67280c ggml : dynamic ggml_sched_max_splits based on graph_size (llama/9047) Nico Bosshard 2024-08-16 04:22:55 +0200
  • 5a62fdb735 cmake : remove unused option GGML_CURL (llama/9011) Georgi Gerganov 2024-08-14 09:14:49 +0300
  • 60098d6204 ggml : move rope type enum to ggml.h (llama/8949) Daniel Bevenius 2024-08-13 21:13:15 +0200
  • 317293e6a7 ggml: fix div-by-zero (llama/9003) DavidKorczynski 2024-08-12 13:21:41 +0100
  • 488a966c07 Optimize Vulkan backend for better CPU performance and less GPU synchronization overhead. (llama/8943) Markus Tavenrath 2024-08-11 10:09:09 +0200
  • 8954769aa2 feat: ref. cross entropy, add CUDA, fix grad test (ggml/929) Johannes Gäßler 2024-08-27 20:39:30 +0200
  • df06468d9e ggml: remove bad assert (ggml/928) Johannes Gäßler 2024-08-24 19:27:02 +0200
  • 1fbd828a5d examples: add MNIST training + missing ops Johannes Gäßler 2024-07-30 15:56:35 +0200
  • d2986f8b07 models : add support for wget2 for fedora (#2387) Brad Murray 2024-08-28 04:46:01 -0400
  • 8bfa8574e2 readme : update the path to bench.py (#2386) Peng 2024-08-28 16:45:05 +0800
  • 376567bf4f readme : fix typo (#2383) Ivo von Putzer Reibegg 2024-08-28 10:42:18 +0200
  • c0fd64a9c0 readme : fix broken links in implementation details section (#2382) stormofice 2024-08-28 10:41:51 +0200
  • 6e9596f6de whisper : fix compile warning for unused params Georgi Gerganov 2024-08-28 11:40:11 +0300
  • 9e3c5345cd sync : ggml vulkan (ggml/0) Georgi Gerganov 2024-08-20 11:27:12 +0300
  • b6c05ce82f yolo : add backend support (ggml/924) Radoslav Gerganov 2024-08-19 10:09:33 +0300
  • 52c80cac00 ggml : fix typo in ggml-quants.c comment (ggml/922) Daniel Bevenius 2024-08-15 09:42:38 +0200
  • 3643120690 feat: add new sin and cos operators (ggml/919) Ronsor 2024-08-12 06:02:08 -0700
  • d65786ea54 readme : fix broken links (#2358) Eric Curtin 2024-08-20 03:57:45 -0400
  • 7f78675008 examples : use colorblind friendly TTY color scheme (#2360) Justine Tunney 2024-08-20 00:49:10 -0700
  • 22fcd5fd11 sync : ggml Georgi Gerganov 2024-08-12 11:59:15 +0300
  • 993f0df419 ggml : support forward pass broadcasting in ggml_sub (ggml/914) Salvatore Mesoraca 2024-08-11 10:08:53 +0200
  • 9b1788483c metal : fix uninitialized abort_callback (llama/8968) slaren 2024-08-10 15:42:10 +0200
  • ad37d26983 rpc : sanitize tensor data + warnings (llama/0) Georgi Gerganov 2024-08-09 23:03:21 +0300
  • 81c999fe0a cann : add Ascend NPU support (#2336) Mengqing Cao 2024-08-09 20:21:56 +0800
  • 4b7de08bfd whisper : fix compile warning (#0) Georgi Gerganov 2024-08-08 22:59:59 +0300
  • 4b9c4de1ad sync : ggml Georgi Gerganov 2024-08-08 22:59:19 +0300
  • be88ee1d75 ggml : add CANN backend (llama/0) hipudding 2024-08-08 14:48:06 +0300
  • 3ab19c744e scripts : sync cann Georgi Gerganov 2024-08-08 22:58:13 +0300
  • 6eac06759b ci : disable ruby workflow (#0) Georgi Gerganov 2024-08-08 20:35:21 +0300
  • 2e9a5bd2c4 ci : try to fix FreeBSD (#0) Georgi Gerganov 2024-08-08 20:32:19 +0300
  • 58323bf8ed build : fix aarch64 (#0) Georgi Gerganov 2024-08-08 14:27:16 +0300
  • 22058f2dbc talk-llama : sync llama.cpp Georgi Gerganov 2024-08-08 14:16:50 +0300
  • 5b7979a1e6 sync : ggml Georgi Gerganov 2024-08-08 14:10:06 +0300
  • ee14c02365 ggml-backend : fix async copy from CPU (llama/8897) slaren 2024-08-07 13:29:02 +0200
  • ab39dd34e1 Updated SYCL device filtering (llama/8901) Ouadie EL FAROUKI 2024-08-07 11:25:36 +0100
  • b1348d3530 CUDA/HIP: fix tests/test-backend-ops (llama/8896) Johannes Gäßler 2024-08-07 09:07:52 +0200
  • 90641b5cf4 CUDA: fix padding logic for FP16/FP32 (llama/8884) Johannes Gäßler 2024-08-06 17:13:55 +0200
  • 4160b930f1 ggml : add epsilon as a parameter for group_norm (llama/8818) Molly Sophia 2024-08-06 15:26:46 +0800
  • 7a96e661e4 ggml : fix overflows in elu function (llama/8866) Justine Tunney 2024-08-05 05:43:40 -0700
  • a902fb4ab2 ggml : reading the runtime sve config of the cpu (llama/8709) jdomke 2024-08-04 01:34:41 +0900
  • 6cb38c3673 Fix conversion of unnormalized BF16->BF16 weights (llama/7843) Sigbjørn Skjæret 2024-08-02 21:11:39 +0200
  • 9cf14ebcbc Fixing wrong VDR iq4nl value (llama/8812) Ouadie EL FAROUKI 2024-08-02 01:55:17 +0100
  • 8e39ee171f ggml-cuda: Adding support for unified memory (llama/8035) matteo 2024-08-01 23:28:28 +0200
  • d26250f78c Build: Only include execinfo.h on linux systems that support it (llama/8783) Alex O'Connell 2024-08-01 12:53:46 -0400
  • 5218ea21b8 cuda : fix dmmv cols requirement to 2*GGML_CUDA_DMMV_X (llama/8800) slaren 2024-08-01 15:26:22 +0200
  • e60be821ce added android implementation of ggml_print_backtrace_symbols (llama/8751) l3utterfly 2024-07-30 23:40:18 +0900
  • 19708df884 cann: update cmake (llama/8765) wangshuai09 2024-07-30 18:37:35 +0800
  • 3f190addda Add TIMESTEP_EMBEDDING OP (llama/8707) zhentaoyu 2024-07-30 14:56:51 +0800
  • b355ee7cfa ggml: bugfix: fix the inactive elements is agnostic for risc-v vector (llama/8748) CarterLi999 2024-07-30 00:38:34 +0800
  • 49ac8872b4 cuda : organize vendor-specific headers into vendors directory (llama/8746) R0CKSTAR 2024-07-29 20:56:12 +0800
  • 8ef98ae7e3 add conv support (llama/8688) Meng, Hengyu 2024-07-29 10:50:27 +0800
  • e471adcfa5 feat: Support Moore Threads GPU (llama/8383) R0CKSTAR 2024-07-28 07:41:25 +0800
  • aa816c922c ggml : ignore more msvc warnings (ggml/906) Borislav Stanimirov 2024-08-07 10:00:56 +0300
  • b3264eb266 metal : fix struct name (ggml/912) Georgi Gerganov 2024-08-07 09:57:00 +0300
  • eb2eb87a58 metal : add abort callback (ggml/905) Conrad Kramer 2024-08-07 02:55:49 -0400
  • 83fcb0e486 vulkan : implement Stable Diffusion operators (ggml/904) 0cc4m 2024-08-04 17:28:08 +0200