This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
fix-op-params             7ae100c6 · fix n_tasks · Jul 23, 2023
cuda-70b-2                f7bb5e91 · CUDA: GQA implementation · Jul 22, 2023
llama-v2-70b              f7bb5e91 · CUDA: GQA implementation · Jul 22, 2023
mem-opt                   c530051c · llama : optimize memory buffers (wip) · Jul 22, 2023
ggml-backends             d273bfd2 · allocator: cleanup, more comments · Jul 22, 2023
ik/cuda-q4k               91317f7b · Speed up Q4_K · Jul 22, 2023
ci-cuda                   785a36ae · ci : add 7B CUDA tets · Jul 22, 2023
ik/context_extend_cuda    b068f2f4 · Adjusted look ahead in ggml_cuda_pool_malloc to 5% · Jul 21, 2023
ik/metal_faster_q3k       d3c3624c · Better Q3_K for QK_K = 64 · Jul 21, 2023
ik/metal_faster_q2k       bdf3b6e0 · Fixed bug in new metal Q2_K implementation · Jul 21, 2023
ggml-backends-metal       d45c1631 · metal : rewrite to fit new backend interface correctly (WIP) · Jul 20, 2023
ik/metal_faster_q6k       5f2e4bd8 · Another Q5_K speedup · Jul 20, 2023
ik/metal_faster_q4k       8e03cfcb · Faster Q4_K on Metal · Jul 20, 2023
ik/q4_k_s                 30b5e45c · Even more Metal optimizations · Jul 20, 2023
fix-tensor-split          63ba9f33 · llama : make tensor_split ptr instead of array · Jul 19, 2023
ci                        25e14644 · ci : run ctest · Jul 17, 2023
refactor-mpi              04923631 · mpi : fix after master merge · Jul 09, 2023
llama_server_completions  26cc1bd7 · llama : uniform variable names + struct init · Jul 05, 2023
llama_server_timings      ff6e39f1 · use javascript generators as much cleaner API · Jul 05, 2023
test-mac-os-ci            f46db27e · ci : disable FMA on Mac OS · Jul 05, 2023