This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 20, 2024.
Branch · Commit · Last commit message · Updated

llava-fix-offloading · 932589c0 · Honor -ngl option for Cuda offloading in llava · Oct 14, 2023
ttfs-alloc-fix · 32fe1a58 · train-text-from-scratch : fix assert failure in ggml-alloc · Oct 13, 2023
ggml-enum-finetune-fix · a85229c4 · ggml : add context enumeration functions · Oct 12, 2023
rev-sampling · 5261aee8 · sampling : one sequence per sampling context · Oct 12, 2023
llava · 0bd7e69d · do not use Wno-cast-qual for MSVC · Oct 12, 2023
fix-server-kv-cache-manage · 058e83ca · server : fix kv cache management · Oct 12, 2023
batched-bench · 2fcdf869 · batched-bench : add mmq CLI arg · Oct 11, 2023
alloc-assert-fix · ee745692 · ggml-alloc : fix assert in debug builds · Oct 09, 2023
fix-metal-mul-mm · fdd5ad9a · metal : do not use mul_mm kernels when ne00 < 64 · Oct 08, 2023
fix-kv-cache-access · ee268b54 · llama : no longer perform uninitialized access to the KV cache · Oct 08, 2023
fix-refact · acead654 · Merge branch 'master' into fix-refact · Oct 08, 2023
metal-improve-batching · 6b9554a7 · metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7 · Oct 08, 2023
gguf-fix-publish · ba44776d · bump version · Oct 07, 2023
per-layer-kv · f4f9367f · less code duplication, offload k and v separately · Oct 06, 2023
server-parallel · 5ab6c213 · server-parallel : add "--reverse-prompt" + compiler warning fixes · Oct 06, 2023
fix-sessions · 5418932b · llama : fix comments for llama_kv_cache API · Oct 03, 2023
cublas-q-f16 · 39ddda27 · disable fp16 mat mul completely with multi GPU · Sep 30, 2023
cuda-cmath · 64beaf76 · ggml-cuda : explicitly use cmath for log2 · Sep 29, 2023
cparams-doc · 1e3781cd · add notice to hot topics · Sep 29, 2023
train-fix-kq-pos · 1eb4de0f · make sure KQ_pos is not reallocated in finetune · Sep 29, 2023