This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 20, 2024.
llama-bpw · a08e1a92 · llama.cpp : show model size and BPW on load · Sep 17, 2023
custom-attention-mask-no-roped-cache · 784d14ed · llama : store non-RoPEd K cache (WIP) · Sep 17, 2023
cam-cuda · 93352769 · Merge branch 'custom-attention-mask' into cam-cuda · Sep 19, 2023
cam-cuda-2 · d30ab79b · fix rope shift · Sep 20, 2023
cont-reshape · 0fd462fd · ggml : revert change to ggml_cpy, add ggml_cont_Nd instead · Sep 20, 2023
llama-bench-readme · 3f9a4830 · llama-bench : add README · Sep 23, 2023
cam-simple-fix · 72e7ef4e · simple : fixes · Sep 26, 2023
cublas-f16 · 7d5674dd · restrict fp16 mat mul to volta and up · Sep 28, 2023
custom-attention-mask · c5650ed4 · server : avoid context swaps by shifting the KV cache · Sep 28, 2023
ci-disable-freebsd · 666ca5ae · ci : disable freeBSD builds due to lack of VMs · Sep 28, 2023
llama-model-params · c8a9658e · Merge remote-tracking branch 'origin/master' into llama-model-params · Sep 28, 2023
train-fix-kq-pos · 1eb4de0f · make sure KQ_pos is not reallocated in finetune · Sep 29, 2023
cparams-doc · 1e3781cd · add notice to hot topics · Sep 29, 2023
cuda-cmath · 64beaf76 · ggml-cuda : explicitly use cmath for log2 · Sep 29, 2023
cublas-q-f16 · 39ddda27 · disable fp16 mat mul completely with multi GPU · Sep 30, 2023
fix-sessions · 5418932b · llama : fix comments for llama_kv_cache API · Oct 03, 2023
server-parallel · 5ab6c213 · server-parallel : add "--reverse-prompt" + compiler warning fixes · Oct 06, 2023
per-layer-kv · f4f9367f · less code duplication, offload k and v separately · Oct 06, 2023
gguf-fix-publish · ba44776d · bump version · Oct 07, 2023
metal-improve-batching · 6b9554a7 · metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7 · Oct 08, 2023