This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
ik/iq2xxs_tune · f3798f77 · iq2_xxs: tune quantization · Feb 04, 2024
gg/flash-attn-32x8 · a647257b · cuda : express strides with helper constants · Feb 04, 2024
gg/flash-attn-interleave-cc · 49a483e0 · wip · Feb 04, 2024
ik/iq3xxs_noimatrix_guard · 7278b0e5 · iq3_xxs: quards for the no-imatrix situation · Feb 05, 2024
ik/ggml-quants-cpp · 91c453fb · One cannot possibly be defining static_assert in a C++ compilation · Feb 05, 2024
gg/convert-fix-byte-tokens · adcf16fd · py : fix empty bytes arg · Feb 05, 2024
ik/q4k_tuning · d3cc1533 · Q5_K: slightly better quantization · Feb 06, 2024
ik/update_readme · 238af6e4 · Update README.md · Feb 06, 2024
0cc4m/vulkan-multigpu · b22e9250 · Rename cpu assist free function · Feb 06, 2024
ceb/bert · 7286b83d · BERT WIP · Feb 06, 2024
sl/offload-msg · c4ca3017 · llama : do not print "offloading layers" message in CPU-only builds · Feb 08, 2024
moe-cpu-thread-cap · d5a6e865 · Whitespace · Feb 08, 2024
0cc4m/fix-apu · c9e28a42 · Fix debug output function names · Feb 08, 2024
gg/revert-swift-package-dep · b8a91c5a · Revert "swift : update Package.swift to use ggml as dependency (#4691)" · Feb 12, 2024
ik/fix_warnings · 4246b71a · Fix compiler warnings (shadow variable) · Feb 13, 2024
gg/tests-disable-moe · dc66c6ac · tests : disable moe test · Feb 13, 2024
gg/mt-tokenizer-tests · f01f187b · unicode : fix data race · Feb 13, 2024
gg/ci-add-bert · 1ab4f152 · ci : do not do BERT tests on low-perf nodes · Feb 13, 2024
ik/iq1_s · 5c977221 · iq1_s: slightly faster dot product · Feb 13, 2024
ceb/nomic-bert · ccd757a1 · convert : fix mistakes from refactoring · Feb 13, 2024