This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 20, 2024.
gg/reuse-n_tasks · 45331b6a · ggml : reuse ggml_get_n_tasks() in ggml_graph_plan() · Dec 03, 2023
gg/quantum-k-cache · af99c6fb · llama : remove memory_f16 and kv_f16 flags · Dec 05, 2023
gg/server-oai-cache-prompt · ef455cb1 · server : recognize cache_prompt parameter in OAI API · Dec 06, 2023
gg/per-layer-kv · fc5f3346 · readme : add API change notice · Dec 07, 2023
ceb/fix-mingw-target · ae2517f7 · make : fix missing console.o deps · Dec 10, 2023
mixtral · e1241d9b · metal : switch to execution barriers + fix one of the barriers · Dec 13, 2023
ceb/fix-cuda-warning-flags · c8554b80 · Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/fix-cuda-warning-flags · Dec 13, 2023
sl/n-dims · 6dcdb57b · ggml : remove n_dims from ggml_tensor · Dec 14, 2023
sl/row-size · 00405240 · ggml : move ggml_nbytes_split to ggml-cuda.cu · Dec 14, 2023
sl/mmid-cpu-grouping · 7afb69b8 · store row groups in wdata and calculate only once in GGML_TASK_INIT · Dec 15, 2023
sl/finetune-alloc-fix · 4d607da9 · finetune : keep allocs alive until all allocations are done · Dec 15, 2023
ceb/fix-badspecial-silentfail · b0547d21 · gguf-py : fail fast on nonsensical special token IDs · Dec 15, 2023
lora-falcon · 07838c91 · fix style · Dec 16, 2023
gg/phi-2 · d2f1e0da · Merge branch 'cuda-cublas-opts' into gg/phi-2 · Dec 17, 2023
pr/4484 · f86b9d15 · lookup : minor · Dec 17, 2023
gg/swiftui-bench · 86506662 · llama.swiftui : improve bench · Dec 17, 2023
ceb/fix-logit-check · 1b058171 · decode : fix logits_valid for old API · Dec 17, 2023
sl/ggml-backend-int · afdec97e · wip · Dec 18, 2023
gg/phi-2-2 · a462159c · cuda : ggml_cuda_op_mul_mat_cublas support F32 precision · Dec 18, 2023
gg/plamo-test · 3c734f49 · plamo : testing · Dec 18, 2023