This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 20, 2024.
alloc-mmap-fix · b5b8ff9f · ggml-alloc : correctly check mmap return value for errors · Sep 08, 2023
ik/fix_kernel_norm · 7d6fac3f · Merge branch 'master' into ik/fix_kernel_norm · Sep 07, 2023
bench-warmup · 08c799a4 · llama-bench : use two tokens in the warmup run for prompt evals · Sep 07, 2023
metal-fix-norm · 2f689dee · metal : minor · Sep 07, 2023
ik/metal_rope · 8a4b97e5 · Parallel RoPE on metal · Sep 05, 2023
build-metal-default · 30ac7a41 · gitignore : metal · Sep 04, 2023
metal-cont-bug · f3a84b2e · llama : better express the KV cache dependencies in the graph · Sep 04, 2023
speculative-grammar · c79d130f · make : fix speculative build · Sep 04, 2023
ik/issue_2982 · 2d5f5d74 · Also guard against extremely small weights · Sep 04, 2023
ik/metal_q3k · 2cab21c3 · Another small improvement for Q3_K on metal · Sep 03, 2023
opencl-extra · 2d631443 · ggml-opencl : store GPU buffer in ggml_tensor::extra · Sep 03, 2023
speculative · 847896ab · speculative : add --draft CLI arg · Sep 03, 2023
ik/fix_metal_bug · 6731796b · Fix bug introduced in PR #2959 · Sep 03, 2023
ik/more_metal_optimizations · 6af0bab3 · ~4-5% improvement for Q8_0 TG on metal · Sep 03, 2023
alloc-vmem · eb3877a8 · ggml-alloc : use virtual memory for measurement · Sep 02, 2023
ik/metal_faster_mm_f16_f32 · 9bdfd09a · Merge branch 'master' into ik/metal_faster_mm_f16_f32 · Sep 02, 2023
metal-add-mul · c7cc7568 · metal : slight speed-up for add and mul kernels · Aug 30, 2023
norm-quants · 8c2b8812 · cuda : poc for norm quants (only -b 1 works) · Aug 30, 2023
norm-quants-rebase · b4e70822 · metal : add poc for normalized Q4_0 and Q4_1 · Aug 30, 2023
remove-old-convert · 63bd2607 · remove convert-llama-7b-pth-to-gguf.py and convert-llama-hf-to-gguf.py · Aug 30, 2023