This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
ik/imatrix_legacy_quants · bb9abb5c · imatrix: guard Q4_0/Q5_0 against ffn_down craziness · Jan 16, 2024
gg/hellaswag-clear-kv-cache · 27f5fc6d · perplexity : fix kv cache handling for hellaswag · Jan 16, 2024
cd/test-ggml-ci-run · 29927a60 · ggml-ci · Jan 17, 2024
gg/iq2-refactor-and-tests · 49bafe09 · tests : avoid creating RNGs for each tensor · Jan 17, 2024
ik/better_q2_k_s · 9fd1e83f · Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 · Jan 17, 2024
gg/fix-spm-added-tokens-dict-4958 · 23742deb · py : fix padded dummy tokens (I hope) · Jan 17, 2024
gg/fix-autorelease-4952 · 06b49791 · test : simplify · Jan 17, 2024
gg/imatrix-gpu-4931 · 2917e6b5 · Merge branch 'master' into gg/imatrix-gpu-4931 · Jan 17, 2024
ceb/nomic-vulkan-fixes · 681f6a1f · kompute : fix rope_f32 and scale ops · Jan 17, 2024
ik/winogrande · e3a17dcb · winogrande: add dataset instructions · Jan 18, 2024
gg/hellaswag-batched · 9df62c25 · perplexity : remove HellaSwag restruction for n_batch · Jan 18, 2024
ik/faster_hellaswag · ccc78a20 · hellaswag: speed up even more by parallelizing log-prob evaluation · Jan 18, 2024
sl/fix-mlock · abbf1a6d · llama : fix mlock with no-mmap with Metal · Jan 18, 2024
gg/winogrande-batched · bb58b0e7 · perplexity : remove unused function · Jan 18, 2024
ceb/nomic-vulkan-fix-add · 14532151 · kompute : fix ggml_add kernel · Jan 19, 2024
ik/winogrande_parallel_eval · e54fcbcb · winogrande: evaluate log-probs in parallel · Jan 19, 2024
ceb/restore-convert · 4a3bc152 · py : linting with mypy and isort · Jan 19, 2024
ceb/fix-msvc-build · 32a392fe · try a differerent fix · Jan 19, 2024
ik/truthfull_qa · 109dada9 · TruthfulQA: prepare tasks in parallel for large test datasets · Jan 20, 2024
sl/nkvo-fix · 16b7e83c · llama : run all KQV ops on the CPU with no KV offload · Jan 20, 2024