This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
dev · a169bb88 · Gate signal support on being on a unixoid system. (#74) · Mar 13, 2023
tcp_server · 3a0dcb39 · Implement server mode. · Mar 22, 2023
q4_1_more_accel_loopsplit · 71122470 · Break up loop for numeric stability · Mar 23, 2023
q4_1_more_accel_kahan · 66ea164e · Kahan summation on Q4_1 · Mar 23, 2023
q4_1_more_accel · 4aeee216 · Regroup q4_1 dot addition for better numerics. · Mar 24, 2023
new-quant · 72e9190e · wip · Mar 26, 2023
stale-feat-instruct-cpp · 9e03cba6 · Merge branch 'master' into feat-instruct-cpp · Mar 28, 2023
mmap · c9c820ff · Added support for _POSIX_MAPPED_FILES if defined in source (#564) · Mar 28, 2023
revert-mmap · 1ca5102d · Revert "Add mmap support for model files" · Apr 02, 2023
flash-attn · 36ddd129 · llama : add flash attention (demo) · Apr 05, 2023
quantize_experiments · 97d7ac75 · POC: Measure rmse of 8 bit quantization · Apr 13, 2023
mmap-pages-stats · 15067374 · Add mmap pages stats (disabled by default) · Apr 16, 2023
quant-attn · 4b8d5e38 · llama : quantize attention results · Apr 22, 2023
gg/rmse_quantization · a0242a83 · Minor, plus rebase on master · Apr 22, 2023
q4_0-q4_2-range-fix · 71e6ae37 · ggml : continue from #729 (wip) · Apr 22, 2023
q4_3-range-fix · 102cd980 · ggml : Q4_3c using 2x "Full range" approach · Apr 23, 2023
ci_cublas · 31ff9e2e · ci : add cublas to windows release · May 03, 2023
jed/spm-clblast · 4baa8563 · Fix build · May 06, 2023
remove-vzip · e116eb63 · ggml : speed-up Q5_0 + Q5_1 at 4 threads · May 11, 2023
dequantize-matmul-3-gg · a3e6d622 · cuda : alternative q4_q8 kernel · May 12, 2023