This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
gg/flash-attn-wip2 · 06c2d0d1 · wip · Jan 23, 2024
pydantic-fixups · bdf770b3 · examples : make pydantic scripts pass mypy and support py3.8 · Jan 23, 2024
gg/minor · ea88e2a4 · minor : clean-up some warnings and style · Jan 23, 2024
ik/kl-divergence-2 · 0b59931c · perplexity: a better organized KL-divergence statistics output · Jan 23, 2024
sl/qwen-fix · f0bb1052 · llama : fix not enough space in buffer with Qwen · Jan 22, 2024
gg/flash-attn-wip (diverged from upstream) · 0f018b7e · wip · Jan 22, 2024
ik/keep_imatrix · c53a10ca · Be able to keep intermediate imatrix results · Jan 22, 2024
ik/kl-divergence · c0e9d27b · Add ability to compute KL-divergence · Jan 22, 2024
ik/q3_k_xs · 29c41d49 · Q3_K_XS: quantize first 1/8 of ffn_down layers with Q4_K · Jan 21, 2024
ik/faster_imatrix · 3aa56562 · imatrix: add --no-ppl option to skip PPL calculations altogether · Jan 20, 2024
gg/flash-attn-online · a9681feb · ggml : online attention (CPU) · Jan 20, 2024
sl/nkvo-fix · 16b7e83c · llama : run all KQV ops on the CPU with no KV offload · Jan 20, 2024
ik/truthfull_qa · 109dada9 · TruthfulQA: prepare tasks in parallel for large test datasets · Jan 20, 2024
ceb/fix-msvc-build · 32a392fe · try a different fix · Jan 19, 2024
ceb/restore-convert · 4a3bc152 · py : linting with mypy and isort · Jan 19, 2024
ik/winogrande_parallel_eval · e54fcbcb · winogrande: evaluate log-probs in parallel · Jan 19, 2024
ceb/nomic-vulkan-fix-add · 14532151 · kompute : fix ggml_add kernel · Jan 19, 2024
gg/winogrande-batched · bb58b0e7 · perplexity : remove unused function · Jan 18, 2024
sl/fix-mlock · abbf1a6d · llama : fix mlock with no-mmap with Metal · Jan 18, 2024
ik/faster_hellaswag · ccc78a20 · hellaswag: speed up even more by parallelizing log-prob evaluation · Jan 18, 2024