This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 20, 2024.
llama-model-params · c8a9658e · Merge remote-tracking branch 'origin/master' into llama-model-params · Sep 28, 2023
ci-disable-freebsd · 666ca5ae · ci : disable freeBSD builds due to lack of VMs · Sep 28, 2023
custom-attention-mask · c5650ed4 · server : avoid context swaps by shifting the KV cache · Sep 28, 2023
cublas-f16 · 7d5674dd · restrict fp16 mat mul to volta and up · Sep 28, 2023
cam-simple-fix · 72e7ef4e · simple : fixes · Sep 26, 2023
llama-bench-readme · 3f9a4830 · llama-bench : add README · Sep 23, 2023
cont-reshape · 0fd462fd · ggml : revert change to ggml_cpy, add ggml_cont_Nd instead · Sep 20, 2023
cam-cuda-2 · d30ab79b · fix rope shift · Sep 20, 2023
cam-cuda · 93352769 · Merge branch 'custom-attention-mask' into cam-cuda · Sep 19, 2023
custom-attention-mask-no-roped-cache · 784d14ed · llama : store non-RoPEd K cache (WIP) · Sep 17, 2023
llama-bpw · a08e1a92 · llama.cpp : show model size and BPW on load · Sep 17, 2023
metal-fix-soft-max · 3e15ea9b · metal : fix bug in soft_max kernels (out-of-bounds access) · Sep 15, 2023
support-starcoder-fix · 92a4f868 · llama : make starcoder graph build more consistent with others · Sep 15, 2023
fix-cmake-out-of-source-install · c2217ca2 · Fix llama.h location when built outside of root directory · Sep 14, 2023
mul-mat-pad · e7e7b114 · llama : remove experimental stuff · Sep 14, 2023
ik/quantize_faster · 271785c3 · Allow to enable/disable mmap via command line · Sep 14, 2023
fix-rocm-shared-lib-build · 61436803 · Compile ggml-rocm with -fpic when building shared library · Sep 13, 2023
ik/metal_falcon_pp · c5da6f2c · Some cleanup · Sep 11, 2023
ik/combined_attn_ops · 76a0c903 · POC: combined scale + diagonal mask infinity + soft max op · Sep 11, 2023
ik/metal_pp · 211d82a8 · metal : minor (readibility) · Sep 11, 2023