Tags
Tags mark specific points in the repository's history as important.
This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
b3425 · 69b9945b · llama.swiftui: fix end of generation bug (#8268) · Jul 20, 2024
b3423 · 87e397d0 · ggml : fix quant dot product with odd number of blocks (#8549) · Jul 19, 2024
b3421 · d1975455 · llama : bump max layers from 256 to 512 (#8530) · Jul 19, 2024
b3419 · b57eb9ca · ggml : add friendlier error message to fopen errors (#8575) · Jul 19, 2024
b3418 · f299aa98 · fix: typo of chatglm4 chat tmpl (#8586) · Jul 19, 2024
b3416 · a15ef8f8 · CUDA: fix partial offloading for ne0 % 256 != 0 (#8572) · Jul 18, 2024
b3412 · 3807c3de · server : respect `--special` cli arg (#8553) · Jul 18, 2024
b3408 · 1bdd8ae1 · [CANN] Add Ascend NPU backend (#6035) · Jul 17, 2024
b3407 · da3913d8 · batched: fix n_predict parameter (#8527) · Jul 17, 2024
b3406 · d65a8361 · llama : disable context-shift for DeepSeek v2 (#8501) · Jul 17, 2024
b3405 · 5e116e8d · make/cmake: add missing force MMQ/cuBLAS for HIP (#8515) · Jul 16, 2024
b3403 · 37b12f92 · export-lora : handle help argument (#8497) · Jul 16, 2024
b3402 · 0efec577 · llama : valign + remove unused ftype (#8502) · Jul 16, 2024
b3400 · 97bdd26e · Refactor lora adapter support (#8332) · Jul 15, 2024
b3398 · 8fac431b · ggml : suppress unknown pragma 'GCC' on windows (#8460) · Jul 15, 2024
b3396 · 9104bc20 · common : add --no-cont-batching arg (#6358) · Jul 15, 2024
b3394 · 16bdfa42 · [SYCL] add concat through dim 1/2 (#8483) · Jul 15, 2024
b3393 · 3dfda059 · llama : de-duplicate deepseek2 norm · Jul 15, 2024
b3392 · bda62d79 · Vulkan MMQ Fix (#8479) · Jul 15, 2024
b3389 · 73cf442e · llama : fix Gemma-2 Query scaling factors (#8473) · Jul 14, 2024