This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
gg/hf · e856bfed · hf : add support for --repo and --file · Feb 15, 2024
gg/fix-cmake · 34b7d5fb · cmake : minor · Feb 16, 2024
gg/compare-commits · 64dcb283 · fix make flags · Feb 16, 2024
gg/fix-android · 974e3cad · ggml : try another fix · Feb 17, 2024
gg/refactor-alibi · aaa20e1f · test-backend-ops : add null pos test to soft_max · Feb 17, 2024
gg/fix-ci · d03f66f2 · ci : fix wikitext url + compile warnings · Feb 18, 2024
gg/rename-n_ctx · 47c662b0 · fix some spaces added by IDE in math op · Feb 18, 2024
fix-nvcc-wall · 92fb52d4 · build: pass all warning flags to nvcc via -Xcompiler · Feb 18, 2024
sl/fix-cuda-soft-max · 558d5d4b · cuda : fix nans in soft_max · Feb 18, 2024
gg/fix-werr-cuda · 9382df9c · cmake : pass -Werror through -Xcompiler · Feb 19, 2024
gg/metal-batched · 412735ec · Merge branch 'master' into gg/metal-batched · Feb 19, 2024
gg/flash-attn-sync · f249c997 · llama : adapt to F16 KQ_pos · Feb 19, 2024
sl/fix-cuda-peer-access · 62d3263f · fix hip · Feb 19, 2024
ik/iq4_nl_no_superblock · daacf6ca · It was the ggml_vdotq thing missed inside the brackets · Feb 20, 2024
fix-convert-modelname · 941de117 · convert : get general.name from model dir, not its parent · Feb 20, 2024
ceb/fix-n-keep · f921fc3e · examples : do not assume BOS when shifting context · Feb 20, 2024
sl/gemma-offload-output · 22ca4ddb · gemma : allow offloading the output tensor · Feb 21, 2024
sl/fix-quant-kv-shift · 5271c756 · llama : fix K-shift with quantized K (wip) · Feb 22, 2024
gg/add-gemma-conversion · 7ad7da6a · Update convert-hf-to-gguf.py · Feb 22, 2024
gg/improve-gemma-quants · 488bd973 · llama : quantize token_embd.weight using output type · Feb 22, 2024