This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
fix-falcon · cc924c57 · cuda : add assert to guard from non-cont ropes · Aug 27, 2023
llama-bench-utf8 · e000ff7b · llama-bench : set locale to utf8 · Aug 27, 2023
ik/speedup_tokenization · 86e35115 · Fixit: it was missing the piece after the last found occurrence · Aug 27, 2023
add-abort-callback · f2770b8c · Add abort callback · Aug 27, 2023
fix-gguf-str · 74999e08 · gguf : fix strings to not be null-terminated · Aug 27, 2023
ik/faster_bpe_tokenizer · 849a31f1 · Remove comment that no longer applies · Aug 29, 2023
view-src · 9fca82be · formatting · Aug 29, 2023
llama2-readme · 4a4051b8 · remove outdated references to -eps and -gqa from README · Aug 29, 2023
ik/issue_2858 · c6b8bdbc · Tell users attempting to run perplexity with too few tokens to use more · Aug 29, 2023
fix-docker-tools-sh · 72177120 · remove `exec` · Aug 29, 2023
gguf-publish-ci · 488e0320 · Merge branch 'master' into gguf-publish-ci · Aug 30, 2023
remove-old-convert · 63bd2607 · remove convert-llama-7b-pth-to-gguf.py and convert-llama-hf-to-gguf.py · Aug 30, 2023
norm-quants-rebase · b4e70822 · metal : add poc for normalized Q4_0 and Q4_1 · Aug 30, 2023
norm-quants · 8c2b8812 · cuda : poc for norm quants (only -b 1 works) · Aug 30, 2023
metal-add-mul · c7cc7568 · metal : slight speed-up for add and mul kernels · Aug 30, 2023
ik/metal_faster_mm_f16_f32 · 9bdfd09a · Merge branch 'master' into ik/metal_faster_mm_f16_f32 · Sep 02, 2023
alloc-vmem · eb3877a8 · ggml-alloc : use virtual memory for measurement · Sep 02, 2023
ik/more_metal_optimizations · 6af0bab3 · ~4-5% improvement for Q8_0 TG on metal · Sep 03, 2023
ik/fix_metal_bug · 6731796b · Fix bug introduced in PR #2959 · Sep 03, 2023
speculative · 847896ab · speculative : add --draft CLI arg · Sep 03, 2023