This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
ggml-backends-metal · d45c1631 · metal : rewrite to fit new backend interface correctly (WIP) · Jul 20, 2023
ik/metal_faster_q2k · bdf3b6e0 · Fixed bug in new metal Q2_K implementation · Jul 21, 2023
ik/metal_faster_q3k · d3c3624c · Better Q3_K for QK_K = 64 · Jul 21, 2023
ik/context_extend_cuda · b068f2f4 · Adjusted look ahead in ggml_cuda_pool_malloc to 5% · Jul 21, 2023
ci-cuda · 785a36ae · ci : add 7B CUDA tests · Jul 22, 2023
ik/cuda-q4k · 91317f7b · Speed up Q4_K · Jul 22, 2023
ggml-backends · d273bfd2 · allocator: cleanup, more comments · Jul 22, 2023
mem-opt · c530051c · llama : optimize memory buffers (wip) · Jul 22, 2023
cuda-70b-2 · f7bb5e91 · CUDA: GQA implementation · Jul 22, 2023
llama-v2-70b · f7bb5e91 · CUDA: GQA implementation · Jul 22, 2023
fix-op-params · 7ae100c6 · fix n_tasks · Jul 23, 2023
ik/cuda_fix_QKK_64 · 8b44eef2 · Some cleanup · Jul 23, 2023
ik/cuda_q5k · f3a92117 · Add some comments to satisfy PR reviewer · Jul 23, 2023
ik/cuda_fix_QKK_64_2 · e6dd6bc5 · Very slightly better Q5_K bit fiddling · Jul 24, 2023
ik/fix_scalar_q5k_64 · b32538da · Fix scalar version of Q5_K when QK_K = 64 · Jul 24, 2023
sync · 68c9fca9 · tests : remove unnecessary funcs · Jul 24, 2023
rms-norm-eps-param · 3855ea36 · use scientific notation for eps param in the help · Jul 24, 2023
webchat-escape-html · 27d0fcc3 · Merge remote-tracking branch 'origin/master' into webchat-escape-html · Jul 24, 2023
ik/metal_q4_0_1_new · 7f985612 · Have N_DST, etc., be template parameters · Jul 24, 2023
server-eps · 3d4359e2 · server: add rms_norm_eps parameter · Jul 24, 2023