This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 20, 2024.
server-rev · c0f4d548 · server : add comment about changing slot_state to bool · Oct 22, 2023
upd-issue-templates · b9bb4cbe · Separate bug and enhancement template + no default title · Oct 23, 2023
cuda-batched-gemm-deq · 69664749 · cuda : play with faster Q4_0 dequantization · Oct 24, 2023
cuda-batched-gemm · d798a17c · cuda : add TODO for calling cublas from kernel + using mem pool · Oct 24, 2023
fix-server-system · 4b32c65d · server : minor · Oct 24, 2023
cuda-quantum-batch · 49af767f · build : add compile option to force use of MMQ kernels · Oct 27, 2023
cuda-multi-gpu · cd3e20fb · cuda : fix multi-gpu with tensor cores · Oct 27, 2023
starcoder-cuda · fb2f5f5b · starcoder : offload layers to GPU · Oct 28, 2023
sampling-greedy-with-probs · bbfc62ac · sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs · Oct 28, 2023
apply-3585 · de7e0912 · convert : ignore tokens if their IDs are within [0, vocab_size) · Oct 28, 2023
fix-kv-shift · fb645834 · llama : fix kv shift bug · Oct 28, 2023
ggml-quants · 8a86b95e · quantize : --pure option for disabling k-quant mixtures · Oct 28, 2023
scratch · 15267192 · llama : refactor tensor offloading as callback · Oct 29, 2023
llama-refactor-ffn · 3b778a4a · llama : add llm_build_ffn helper function · Oct 29, 2023
lto · bc28aaa8 · make : use -lfto=auto to avoid warnings and maintain perf · Oct 30, 2023
ggml-impl · 4b3cb98d · ggml-impl : move extern "C" to start of file · Oct 30, 2023
llama-refactor-norm · 7923b70c · llama : add llm_build_inp_embd helper · Oct 31, 2023
deploy · dab42893 · scripts : working curl pipe · Oct 31, 2023
test-mmv · 29fe5169 · wip · Oct 31, 2023
try-fix-3869 · 22cc9bef · cuda : check if this fixes Pascal card regression · Oct 31, 2023