Branches · mirrored_repos / MachineLearning / Llama.Cpp · GitLab

This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.

sycl_q3s_q1s

989e15b3 · Merge branch 'master' into sycl_q3s_q1s · Mar 11, 2024
gg/try-fix-sycl-iq1_s

76be02ae · sycl : fix grid type · Mar 11, 2024
ik/even_better_iq1s

5440a127 · iq1_s: fix dequantize on the CPU · Mar 11, 2024
ik/try_fix_iq1s_sycl

9f805264 · Attempt 2 · Mar 12, 2024
gg/metal-embed

abf0afd0 · ci : fix iOS builds to use embedded library · Mar 14, 2024
gg/repeng

0a9bc301 · control-vectors : minor code style updates · Mar 14, 2024
jg/flash-attn

7fca4586 · pragma unroll, use_mask template parameter · Mar 19, 2024
compilade/fix-server-tests-penalty

9a424a38 · server : fix tests expecting old repeat penalty · Mar 19, 2024
jg/flash-attn-4

82ae7f33 · fused attention kernel for batch size 1 · Mar 20, 2024
sl/blas-backend

fa012a95 · move BLAS to a separate backend · Mar 20, 2024
compilade/smaller-output-buffer

5f33a675 · perplexity : make hellaswag and multiple-choice outputs identical to master · Mar 20, 2024
ik/fix_k_cache_backend_tests

68e4fed4 · Now fix test-quantize-fns · Mar 21, 2024
sl/cuda-f16-fix2

4f7e57a2 · cuda : fix LLAMA_CUDA_F16 build · Mar 21, 2024
0cc4m/vulkan-improvements

1fceeb90 · Fix Intel dequant issue · Mar 21, 2024
ik/try_fix_rocm_k_cache

a710d58d · Try fix quantized k-cache on ROCm · Mar 21, 2024
gg/metal-dequant-align

072c56fc · metal : fix the fix · Mar 22, 2024
gg/enable-cb-default

31f2d03f · server : enable continuous batching by default · Mar 22, 2024
patch-1

12aa74ba · minor : spacing · Mar 22, 2024
gg/hf-args

8c3d5b5a · common : remove defaults · Mar 22, 2024
ik/quantize_not_repeating

0e826d12 · quantize: be able to specify the token embedding tensor type · Mar 22, 2024

