Branches · mirrored_repos / MachineLearning / Llama.Cpp · GitLab

This project is mirrored from https://github.com/ggerganov/llama.cpp. Pull mirroring updated Sep 19, 2024.

gg/hellaswag-batched

9df62c25 · perplexity : remove HellaSwag restruction for n_batch · Jan 18, 2024
ik/winogrande

e3a17dcb · winogrande: add dataset instructions · Jan 18, 2024
ceb/nomic-vulkan-fixes

681f6a1f · kompute : fix rope_f32 and scale ops · Jan 17, 2024
gg/imatrix-gpu-4931

2917e6b5 · Merge branch 'master' into gg/imatrix-gpu-4931 · Jan 17, 2024
gg/fix-autorelease-4952

06b49791 · test : simplify · Jan 17, 2024
gg/fix-spm-added-tokens-dict-4958

23742deb · py : fix padded dummy tokens (I hope) · Jan 17, 2024
ik/better_q2_k_s

9fd1e83f · Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 · Jan 17, 2024
gg/iq2-refactor-and-tests

49bafe09 · tests : avoid creating RNGs for each tensor · Jan 17, 2024
cd/test-ggml-ci-run

29927a60 · ggml-ci · Jan 17, 2024
gg/hellaswag-clear-kv-cache

27f5fc6d · perplexity : fix kv cache handling for hellaswag · Jan 16, 2024
ik/imatrix_legacy_quants

bb9abb5c · imatrix: guard Q4_0/Q5_0 against ffn_down craziness · Jan 16, 2024
crasm_segfault-on-pthread

e6e34b2a · add test to tests/CMakeLists.txt · Jan 15, 2024
gg/sched-eval-callback-4931

40cdb397 · backend : clean-up the implementation · Jan 15, 2024
ik/quantize_iq2_notcompatible

dccaec76 · The check for 256 divisibility was missing for IQ2_XS, IQ2_XXS · Jan 15, 2024
ik/cuda_faster_legacy_dequantize

08b89f7e · CUDA: faster dequantize kernels for Q4_0 and Q4_1 · Jan 14, 2024
ik/imatrix_k_quants

90096a5f · Add ability to use importance matrix for all k-quants · Jan 14, 2024
gg/llama-trace

0abbe2fc · llama : check LLAMA_TRACE env for extra logging · Jan 14, 2024
ik/fix_qxm_moe

121eb066 · Fix the fix · Jan 14, 2024
gg/metal-rm-api

96cf0282 · metal : remove old API · Jan 13, 2024
sl/micro-batching

40b3c5ef · pipeline parallelism demo · Jan 13, 2024
