This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.
sl/backend-sched-fix-null-deref · 44c93c67 · ggml : also check ggml_backend_is_cpu · Jan 07, 2024
gg/fix-server-n-decoded-4790 · 58de6736 · server : fix n_predict check · Jan 06, 2024
gg/remove-gqa-check-4657 · 7cfde781 · llama : remove redundant GQA check · Jan 06, 2024
gg/base-translate · 26fbb10f · examples : add few-shot translation example · Jan 05, 2024
gg/no-yield-on-blas · 4a0e7222 · ggml : simplify do_yield logic · Jan 04, 2024
ik/iq2_2.06bpw · 1e6b8e1f · iq2_xxs: add to llama ftype enum · Jan 04, 2024
gg/metal-opt-mul-mat-id · 9f51f3e6 · metal : opt mul_mm_id · Jan 02, 2024
cuda-cublas-opts · 4cc78d38 · ggml : force F32 precision for ggml_mul_mat · Jan 02, 2024
gg/avoid-mutex · b5af7ad8 · llama : refactor quantization to avoid <mutex> header · Jan 02, 2024
gg/fix-mingw-4707 · c92418dc · ggml : include stdlib.h before intrin.h · Jan 02, 2024
gg/hf-auto-dl · 120a1a55 · llama : auto download HF models if URL provided · Jan 02, 2024
sl/backend-sched · 7ed2d3db · wip · Jan 01, 2024
gg/fix-dotprod-4654 · 453ae052 · ggml : add ggml_vdotq_s32 alias · Dec 31, 2023
gg/server-token-probs-4088 · 82033c9b · server : send token probs for "stream == false" · Dec 31, 2023
gg/fix-ci-metal · a076050e · ci : check if these changes fix Github Actions (Metal) · Dec 30, 2023
gg/gpu-prec-tests · f64e4f04 · ggml : testing GPU FP precision via quantized CPY · Dec 30, 2023
gg/clip-refact · ec92e781 · clip : refactor + bug fixes · Dec 30, 2023
gg/test-arm · f32f30bc · test · Dec 26, 2023
gg/fix-dot-prod-arm · d232e0b3 · ggml : fix dot product (q2_k) · Dec 26, 2023
sl/cuda-virt-pool-fixes · 061d9652 · use cudaMemcpy3DPeerAsync · Dec 25, 2023