This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
gguf-export-objs · 058fbdd8 · gguf : bump version · Aug 25, 2023
gguf-pip · 0248ca81 · gguf : add notes for tests · Aug 25, 2023
codellama-hf-freq-base · 06f79259 · convert.py : add freq_base when converting CodeLlama from an HF model · Aug 24, 2023
codellama-ctx · 75945403 · convert.py : try to determine n_ctx automatically for CodeLlama · Aug 24, 2023
metal-fix-memory-leak · 53dea117 · metal : fix encoders memory leak · Aug 24, 2023
codellama-gguf-rope-base · 21dcd944 · gguf : add rope_freq_base parameter for CodeLlama · Aug 24, 2023
metal-add-q8_0 · 1202e06c · metal : add Q8_0 mul_mm kernel · Aug 24, 2023
fix-falcon-cuda · ac4bb6ba · cuda : add RoPE kernel for mode == 2 (NeoX) · Aug 24, 2023
fix-whitespace · d8beb85c · Merge branch 'master' into fix-whitespace · Aug 23, 2023
fix-eos · 977629a3 · Merge branch 'master' into fix-eos · Aug 23, 2023
convert-lora-fix · 7935986f · fix convert-lora-to-ggml.py · Aug 23, 2023
ik/quantize_help_fix · 436c68c3 · Fix values shown in the quantize tool help · Aug 23, 2023
ik/fix_quantize_help · 3ddec9a5 · Adjusted the size/PPL values printed in the quantize help · Aug 23, 2023
llama2-chat-example · 2f5d1908 · better n_threads · Aug 22, 2023
embedding-batches · 5cb4658c · embedding : evaluate prompt in batches · Aug 22, 2023
add-ftype · 34151c9d · convert.py : fix Enum to IntEnum · Aug 22, 2023
ik/better_perplexity · efdfc41e · Alternative way to output PPL results · Aug 22, 2023
ik/strided_perplexity · 208a781d · Implementing strided computation of perplexity · Aug 22, 2023
falcon · 76cad091 · Merge branch 'master' into falcon · Aug 22, 2023
ik/better_q234_k · fdf73db5 · Fix for changed tensor names · Aug 22, 2023